Problem statement
- Schreiber has plants across the US, Europe, India & Mexico
- SFI uses legacy DWH technologies.
- The ask is to migrate to a cloud-based data lake for easy scaling, a uniform solution, self-service for business users, and adoption of big data and AI to get better insights from data
Domains & Stakeholders
- Domains: Supply Chain, Customer Experience, Operations, Finance, Project Lifecycle Management (PLM), Global IT
- Stakeholders: Product Owner, Scrum Master, BRM, Data Engineers, Power BI Developers, Business Analysts
Solution Provided
- Set up a cloud-based Data Lakehouse with Medallion architecture to organize data into progressively refined layers
- Migrated the legacy Oracle-based data warehouse to the cloud
- Migrated the legacy reports to Power BI
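The Medallion layering above can be sketched as a simple path convention. A minimal sketch in Python, using the raw/enriched/curated layer names mentioned later in these notes; the storage root and folder scheme are illustrative assumptions, not the actual lake layout:

```python
# Illustrative Medallion path convention; LAKE_ROOT and the folder
# scheme are assumptions, not the project's actual layout.
LAKE_ROOT = "abfss://lake@storage"          # hypothetical storage root
LAYERS = ("raw", "enriched", "curated")     # least to most refined

def layer_path(layer: str, domain: str, dataset: str) -> str:
    """Build a consistent folder path for a dataset in a given layer."""
    if layer not in LAYERS:
        raise ValueError(f"unknown layer: {layer}")
    return f"{LAKE_ROOT}/{layer}/{domain}/{dataset}"

print(layer_path("enriched", "supply_chain", "orders"))
```

Keeping the path logic in one helper like this is what lets downstream dataflows stay agnostic of where each layer physically lives.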
Methodology
- Project delivery methodology: Agile (moved from Waterfall to Agile)
- Number of squads: scaled up from 1 to 6 (each squad has 5 to 6 people)
Current Status
- Diver: out of 125 Diver models, 90 were converted to Power BI semantic models and the rest were decommissioned
- Tableau: out of 410 reports, 200 were converted to Power BI semantic models/reports and the rest were decommissioned
- Dimensions are built in the Curated layer for reuse across the project
- The Enriched layer holds massaged raw data for reporting and AI/ML use cases
- Data is sourced from Oracle, MS SQL, web APIs, manual Excel files, SharePoint, Parquet files, JSON, XML, etc.
- Reusable scripts are written to smooth the movement of data between layers and to be plugged into project-specific dataflows
- PoC done for streaming data to analyze power consumption in one of the plants
- AI/ML: demand-forecast models built with GCP Vertex AI and Databricks AutoML
- Gen AI: integrating with Palantir to build a pilot project in the Supply Chain domain
- Data Science: building out data science capability
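Sourcing from that many formats implies normalizing each one into a common record shape before it lands in the raw layer. A stdlib-only sketch for two of the formats (JSON and XML); the field names and payload shapes are illustrative assumptions:

```python
# Illustrative normalization of two source formats into a common
# list-of-dicts record shape. Field names here are assumptions.
import json
import xml.etree.ElementTree as ET

def from_json(payload: str) -> list[dict]:
    """Parse a JSON array of objects into records."""
    return json.loads(payload)

def from_xml(payload: str) -> list[dict]:
    """Parse <row> elements into records keyed by child tag."""
    root = ET.fromstring(payload)
    return [{child.tag: child.text for child in row} for row in root.iter("row")]

records = from_json('[{"plant": "US", "qty": "5"}]')
records += from_xml("<rows><row><plant>IN</plant><qty>3</qty></row></rows>")
```

Once every source funnels into the same record shape, the layer-movement scripts can treat all of them uniformly.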
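The reusable layer-movement scripts mentioned above can be sketched as a generic "promote" step that a project-specific dataflow plugs a transform into. This is a minimal in-memory stand-in, assuming a dict for lake storage; the real scripts would run on Spark/Databricks against actual layer paths:

```python
# Minimal sketch of a reusable layer-to-layer movement step. The in-memory
# `lake` dict is a stand-in for real lake storage (an assumption).
from typing import Callable

lake: dict[str, list[dict]] = {
    "raw/orders": [{"qty": " 5 ", "plant": "US"}, {"qty": "3", "plant": "IN"}],
}

def promote(src: str, dst: str, transform: Callable[[dict], dict]) -> int:
    """Apply a project-specific transform to every record in `src`
    and write the result to `dst`; return the number of records moved."""
    rows = [transform(r) for r in lake.get(src, [])]
    lake[dst] = rows
    return len(rows)

# A project-specific dataflow supplies only the transform:
moved = promote("raw/orders", "enriched/orders",
                lambda r: {**r, "qty": int(r["qty"].strip())})
```

Separating the movement mechanics from the per-project transform is what makes the same script reusable across squads and domains.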