MSSQL to Snowflake Integration for Retail Giant

Client Overview

The client is one of the largest retailers in the US, offering a variety of food supplies and services. They have a strong reputation for providing high-quality products and outstanding customer service.

Client Requirement

Replacement of SSIS process with Python for Data Integration from MSSQL to Snowflake

Data is transferred from the MSSQL Server to an Amazon S3 bucket in the form of CSV files.

A Python-based ETL pipeline is to be established to replace the current SSIS process and move data from the MSSQL Server to the Amazon S3 bucket.

The data from the S3 bucket needs to be loaded into Snowflake.

Our Solution

Created configuration tables containing SELECT queries for all required fields, source table names, target table names, header names, process names, an S3 key, and a destination folder in AWS.
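As an illustration, one row of such a configuration table might look like the sketch below. All names (tables, keys, folders) are hypothetical placeholders, not the client's actual values.

```python
# Hypothetical shape of one configuration-table row; the real column names
# and values in the client's environment are not documented here.
SAMPLE_CONFIG_ROW = {
    "process_name": "daily_sales_export",    # assumed process identifier
    "source_table": "dbo.Sales",             # MSSQL source table
    "target_table": "RAW.SALES",             # Snowflake target table
    "select_query": "SELECT Id, StoreId, Amount, SoldAt FROM dbo.Sales",
    "header_names": "id,store_id,amount,sold_at",
    "s3_key": "exports",                     # S3 key prefix
    "destination_folder": "daily_sales",     # folder inside the bucket
}

def build_s3_object_key(cfg: dict, run_date: str) -> str:
    """Compose the full S3 object key for one export run."""
    return (
        f"{cfg['s3_key']}/{cfg['destination_folder']}/"
        f"{cfg['process_name']}_{run_date}.csv"
    )
```

Keeping this metadata in tables rather than in code is what lets one generic script serve every table import.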

Developed a Python script that runs daily at a specific time using a cron job on an EC2 instance.
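Such a schedule can be expressed as a crontab entry on the EC2 instance; the time and paths below are illustrative only:

```
# Run the export script every day at 02:00 (illustrative time and paths)
0 2 * * * /usr/bin/python3 /opt/etl/mssql_to_s3.py >> /var/log/etl/mssql_to_s3.log 2>&1
```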

The script retrieves metadata from the configuration tables and connects to the source MSSQL Server to fetch all records.

The fetched records are then written to delimited CSV files.
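The fetch-and-write steps above can be sketched as follows. The connection string is a placeholder (the real script would read credentials from configuration or a secrets store), and pyodbc is assumed as the MSSQL driver:

```python
import csv

def write_rows_to_csv(path, header_names, rows, delimiter=","):
    """Write fetched records to a delimited CSV file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter=delimiter)
        writer.writerow(header_names)
        writer.writerows(rows)

def export_table(cfg, csv_path):
    """Fetch all records for one configured process and write them to CSV.

    Server, database, and credentials below are hypothetical placeholders.
    """
    import pyodbc  # assumes the pyodbc driver is installed on the EC2 instance
    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 17 for SQL Server};"
        "SERVER=mssql.example.internal;DATABASE=Retail;"  # hypothetical
        "UID=etl_user;PWD=***"
    )
    try:
        cursor = conn.cursor()
        cursor.execute(cfg["select_query"])  # query comes from the config table
        rows = cursor.fetchall()
    finally:
        conn.close()
    write_rows_to_csv(csv_path, cfg["header_names"].split(","), rows)
```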

The CSV files are then transferred to the specified destination folder in the S3 bucket.
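The transfer step reduces to a boto3 upload; a minimal sketch, assuming the EC2 instance profile supplies AWS credentials and the bucket name is a placeholder:

```python
def s3_uri(bucket: str, object_key: str) -> str:
    """Human-readable destination string, useful for process logging."""
    return f"s3://{bucket}/{object_key}"

def upload_csv_to_s3(csv_path, bucket, object_key):
    """Upload one export file to its configured S3 destination.

    boto3 resolves credentials from the EC2 instance profile; the bucket
    and object key come from the configuration table.
    """
    import boto3  # assumed available on the EC2 instance
    s3 = boto3.client("s3")
    s3.upload_file(csv_path, bucket, object_key)
```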

Processes are logged in Snowflake tables.
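A run-logging step might look like the sketch below; the log table name and columns are hypothetical, and the %s placeholders use the Snowflake Python connector's default pyformat parameter style:

```python
LOG_INSERT_SQL = (
    "INSERT INTO ETL_PROCESS_LOG "  # hypothetical log table name
    "(process_name, status, row_count, s3_object_key, logged_at) "
    "VALUES (%s, %s, %s, %s, CURRENT_TIMESTAMP())"
)

def log_process_run(conn, process_name, status, row_count, object_key):
    """Record one pipeline run in a Snowflake log table.

    `conn` is an open snowflake.connector connection.
    """
    cur = conn.cursor()
    try:
        cur.execute(LOG_INSERT_SQL, (process_name, status, row_count, object_key))
    finally:
        cur.close()
```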

Finally, the data is loaded from the AWS S3 bucket into the Snowflake data warehouse using an AWS Lambda function.
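One common way to implement this step is a Lambda triggered by S3 ObjectCreated events that issues a COPY INTO against an external stage. The sketch below assumes such a stage already exists; stage, table, and credential values are placeholders, not the client's actual setup:

```python
def build_copy_statement(target_table, stage_name, file_name):
    """Compose a COPY INTO statement for one landed CSV file.

    Assumes an external stage (e.g. RAW.S3_STAGE) already points at the
    S3 bucket; all names here are illustrative.
    """
    return (
        f"COPY INTO {target_table} "
        f"FROM @{stage_name}/{file_name} "
        "FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)"
    )

def lambda_handler(event, context):
    """Triggered by S3 ObjectCreated events; loads each new file into Snowflake."""
    import snowflake.connector  # assumed packaged with the Lambda deployment
    conn = snowflake.connector.connect(
        account="xy12345",   # placeholder credentials; real values would
        user="etl_lambda",   # come from AWS Secrets Manager or similar
        password="***",
        warehouse="ETL_WH",
        database="RAW",
    )
    try:
        cur = conn.cursor()
        for record in event["Records"]:  # standard S3 event notification shape
            file_name = record["s3"]["object"]["key"]
            cur.execute(build_copy_statement("RAW.SALES", "RAW.S3_STAGE", file_name))
    finally:
        conn.close()
```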

GitLab is used as the version control tool for managing and collaborating on code development.

Client Benefits

The new design allows for the reuse of the generic Python script for future imports.

The need to create a new SSIS package or SQL Server Agent job for each table import is eliminated.

The new architecture allows the customer to reduce the size of its SSIS developer team.

This results in cost savings on SSIS licensing.

Start your journey towards data-driven excellence.