GitHub: https://github.com/dhananjay93/data-engineering-end-to-end-demo
Tech Stack: PostgreSQL (Cloud SQL), Airbyte, dbt, Airflow, Tableau, GCP
Built a modern, cloud-native data pipeline using GCP and open-source tools to ingest, transform, orchestrate, and visualize retail data from the Sample Superstore dataset.
| Layer | Tool Used | Description |
|---|---|---|
| Storage | GCP Cloud SQL (Postgres) | Hosted raw and transformed data |
| Ingestion | Airbyte (Cloud) | Moved data from raw DB to staging DB (ELT) |
| Modeling | dbt (Local) | SQL transformations: intermediate → destination |
| Orchestration | Apache Airflow (Local) | Scheduled & automated dbt runs |
| Visualization | Tableau | Built dashboards from transformed tables |
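
The orchestration layer boils down to invoking the dbt CLI on a schedule. The following is a minimal sketch of that step, not the repository's actual DAG: the helper names `build_dbt_command` and `run_dbt_pipeline` and the project path are hypothetical, while `dbt run`, `dbt test`, `--project-dir`, and `--select` are standard dbt CLI options.

```python
import subprocess
from typing import Callable, List, Optional


def build_dbt_command(
    subcommand: str,
    project_dir: str,
    select: Optional[List[str]] = None,
) -> List[str]:
    """Build an argument list for the dbt CLI (hypothetical helper)."""
    cmd = ["dbt", subcommand, "--project-dir", project_dir]
    if select:
        # Restrict the invocation to the named models.
        cmd += ["--select", *select]
    return cmd


def run_dbt_pipeline(project_dir: str, runner: Callable = subprocess.run) -> None:
    """Run all models, then the tests; check=True stops at the first failure."""
    for cmd in (
        build_dbt_command("run", project_dir),
        build_dbt_command("test", project_dir),
    ):
        runner(cmd, check=True)


if __name__ == "__main__":
    # Assumed path; replace with the real dbt project directory.
    print(build_dbt_command("run", "/opt/dbt/superstore", ["master_table"]))
```

In a local Airflow setup, a function like `run_dbt_pipeline` could serve as the callable of a `PythonOperator`, or the same commands could be placed in a `BashOperator`. Which of the two this project uses is not stated here.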

Used Tableau's open-source Sample Superstore dataset, which contains sales, customer, region, and product information.
Excel (.csv)
↓
Cloud SQL (raw.orders, raw.returns)
↓ (via Airbyte)
Cloud SQL (intermediate.orders, intermediate.returns)
↓
dbt transformations
↓
Cloud SQL (destination.aggregated)
↓
Tableau dashboards
- master_table.sql: Cleaned and formatted raw data
- fct_sales_summary.sql: Aggregated metrics by region and category
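
To illustrate the kind of aggregation a model like `fct_sales_summary.sql` performs, here is a small sketch that groups order lines by region and category. It is an assumption-laden stand-in, not the model's actual SQL: the column names (`region`, `category`, `sales`, `profit`), the sample rows, and the chosen metrics are hypothetical, and an in-memory SQLite database replaces Cloud SQL (Postgres) so the example runs anywhere.

```python
import sqlite3

# Hypothetical rows shaped like a cleaned orders table:
# (region, category, sales, profit)
ROWS = [
    ("West", "Furniture", 200.0, 20.0),
    ("West", "Furniture", 100.0, -5.0),
    ("West", "Technology", 500.0, 125.0),
    ("East", "Furniture", 300.0, 30.0),
]

# Assumed shape of the summary query; the real model may compute other metrics.
SUMMARY_SQL = """
SELECT region,
       category,
       COUNT(*)    AS order_lines,
       SUM(sales)  AS total_sales,
       SUM(profit) AS total_profit
FROM master_table
GROUP BY region, category
ORDER BY region, category
"""


def summarize(rows):
    """Load rows into an in-memory table and return one row per (region, category)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE master_table "
        "(region TEXT, category TEXT, sales REAL, profit REAL)"
    )
    conn.executemany("INSERT INTO master_table VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(SUMMARY_SQL).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in summarize(ROWS):
        print(row)
```

Keeping the cleaning step and the aggregation step in separate models means the summary can be rebuilt from the cleaned table without touching the raw data again.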