Data Engineering Hands-On Lab
E-commerce Analytics Platform
🎯 Business Problem
Company: GlobalMart, a growing e-commerce company
Situation:
GlobalMart operates across multiple regions and has been storing data in various formats across different systems:
- Sales data comes from their web platform (CSV files)
- Product catalog is maintained in Excel spreadsheets by different teams
- Customer data is exported from the CRM system (JSON format)
- Inventory updates come from the warehouse management system (XML format)
- Marketing campaign data is exported from the analytics tool in Parquet format
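
To make the format mix above concrete, here is a minimal sketch of how three of these sources could be parsed into one common shape (a list of dictionaries) using only the Python standard library. The sample payloads, field names, and helper functions are hypothetical and are not GlobalMart's actual schemas. Excel and Parquet are left out because they need third-party libraries (for example `pandas.read_excel` and `pandas.read_parquet`).

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical sample payloads; real field names will differ per system.
SALES_CSV = "order_id,product_id,amount\n1001,P-1,19.99\n1002,P-2,5.50\n"
CUSTOMERS_JSON = '[{"customer_id": "C-1", "region": "EU"}]'
INVENTORY_XML = (
    "<inventory>"
    '<item product_id="P-1"><quantity>40</quantity></item>'
    '<item product_id="P-2"><quantity>0</quantity></item>'
    "</inventory>"
)


def read_csv_records(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def read_json_records(text):
    """Parse JSON text; wrap a single object so the result is always a list."""
    data = json.loads(text)
    return data if isinstance(data, list) else [data]


def read_xml_records(text):
    """Parse the assumed <inventory><item .../></inventory> layout into dicts."""
    root = ET.fromstring(text)
    return [
        {
            "product_id": item.get("product_id"),
            "quantity": item.findtext("quantity"),
        }
        for item in root.iter("item")
    ]


# Every source ends up in the same shape: a list of dicts.
sources = {
    "sales": read_csv_records(SALES_CSV),
    "customers": read_json_records(CUSTOMERS_JSON),
    "inventory": read_xml_records(INVENTORY_XML),
}

for name, records in sources.items():
    print(name, len(records), records[0])
```

Note that the CSV and XML readers return every value as a string, so type conversion (amounts to decimals, quantities to integers) would still be a separate step in a real pipeline.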
Current Pain Points:
- Data is scattered across multiple systems and formats
- Manual effort required to consolidate reports
- No single source of truth for business metrics
- Delayed insights due to manual data processing
- Data quality issues and inconsistencies
- No historical tracking of changes
Goal:
Build an end-to-end data pipeline that can: