MSSQL to Snowflake Integration for Retail Giant

Replacement of SSIS process with Python for Data Integration from MSSQL to Snowflake
Client Overview

The client is one of the giant retailers in the US that offers a variety of food supplies, and services. They have a strong reputation for providing high-quality products and outstanding customer service.

Client Requirement

Replacement of SSIS process with Python for Data Integration from MSSQL to Snowflake

  • Data is transferred from the MSSQL Server to an Amazon S3 bucket in the form of CSV files.
  • An ETL pipeline, utilizing a Python script, is to be established to replace the current SSIS process and move data from the MSSQL Server to the Amazon S3 bucket.
  • The data from the S3 bucket needs to be loaded into Snowflake.

Our Solution

  • Created reconfiguration tables containing select queries for all necessary fields, source table names, target table names, header names, process names, an S3 key, and a destination folder in AWS.
  • Developed a Python script that runs daily at a specific time using a cron job on an EC2 instance.
  • The script retrieves metadata from the configuration tables and connects to the source MSSQL Server to fetch all records.
  • Delimited data is then written to CSV files.
  • The CSV files are then transferred to the specified destination folder in the S3 bucket.
  • Processes are logged in Snowflake tables.
  • Then the data is moved from the AWS S3 bucket to the Snowflake Data Warehouse using the lambda function.
  • GitLab is being used as a version control tool for managing and collaborating on code development.
Our Solution Data Integration - Snowflake

Tools

Cloud Stack: AWS - S3, Lambda and EC2
Cloud-based Data Warehousing Platform - Snowflake
Programming Language - Python
Version Control - Gitlab

Client Benefits

Orion_code 1

The new design allows for the reuse of the generic Python script for future imports.

Orion_import-server 1

The need to create a new SSIS package or SQL Server Agent job for each table import is eliminated.

Orion_refresh-database 1

This new design architecture allows customer to reduce the SSIS developer team.

Orion_pay 1

The result in cost savings on investing in SSIS licenses.

Get real - time data integration solutions with GetOnData

Blog

Recent Blog