The client is one of the retail giants in the US, offering a variety of food supplies and services. They have a strong
reputation for providing high-quality products and outstanding customer service.
Queried a data pipeline to transform and load raw data from various platforms into the target tables in Snowflake.
Developed a query to group the activity and conversion data from the target Snowflake tables by fiscal week and network for each platform.
Created an Excel template to compare the source and target data by fiscal week, network, and platform.
Fetched the source data directly from the various platforms and inserted it into the Excel template.
Ran the developed query to get the grouped data from the target Snowflake tables and loaded it into the Excel template.
Created calculated fields in Excel to compare the activity and conversion data between the source and Snowflake, and to express the discrepancies as percentages.
Updated both the source and target data in Excel every week to keep the comparison current.
Generated a weekly Excel report highlighting the percentage discrepancies for each network and fiscal week of the current fiscal year, for each platform.
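The grouping and comparison steps above can be sketched in a few lines of Python. This is only an illustration, not the actual Snowflake query or Excel formulas used on the project: the field names (platform, fiscal_week, network, activity, conversions), the function names, and the discrepancy formula (difference of the Snowflake value from the source value, as a percentage of the source value) are all assumptions.

```python
from collections import defaultdict

METRICS = ("activity", "conversions")  # assumed metric names


def group_totals(rows):
    """Sum each metric per (platform, fiscal_week, network)."""
    totals = defaultdict(lambda: {m: 0.0 for m in METRICS})
    for row in rows:
        key = (row["platform"], row["fiscal_week"], row["network"])
        for metric in METRICS:
            totals[key][metric] += row[metric]
    return dict(totals)


def pct_discrepancy(source_value, target_value):
    """Percent difference of the target relative to the source (assumed formula).

    Returns None when the source is zero but the target is not,
    because the percentage is undefined in that case.
    """
    if source_value == 0:
        return 0.0 if target_value == 0 else None
    return (target_value - source_value) * 100.0 / source_value


def compare(source_rows, target_rows):
    """Build one report line per group found in either dataset."""
    source = group_totals(source_rows)
    target = group_totals(target_rows)
    empty = {m: 0.0 for m in METRICS}
    report = []
    for key in sorted(set(source) | set(target)):
        src = source.get(key, empty)
        tgt = target.get(key, empty)
        line = {"platform": key[0], "fiscal_week": key[1], "network": key[2]}
        for metric in METRICS:
            line[metric + "_pct"] = pct_discrepancy(src[metric], tgt[metric])
        report.append(line)
    return report


if __name__ == "__main__":
    # Made-up sample rows, for illustration only.
    source_rows = [
        {"platform": "PlatformA", "fiscal_week": 1, "network": "Net1",
         "activity": 100, "conversions": 10},
    ]
    target_rows = [
        {"platform": "PlatformA", "fiscal_week": 1, "network": "Net1",
         "activity": 60, "conversions": 4},
        {"platform": "PlatformA", "fiscal_week": 1, "network": "Net1",
         "activity": 30, "conversions": 6},
    ]
    for line in compare(source_rows, target_rows):
        print(line)
```

Groups that appear in only one of the two datasets are still reported, with the missing side treated as zero, so rows that were never loaded into Snowflake show up instead of silently disappearing.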
With the developed model, data inaccuracy and inconsistency fell by 70%, leading to a more precise dataset.
The approach is cost-effective: it saves time and money by ensuring that the datasets collected and used in processing are clean and accurate.
The reduction in data inaccuracy and inconsistency helps top management make informed decisions.
Validating the data helps the organization reduce errors caused by data decay by identifying where data is missing, incomplete, inconsistent, or inaccurate.