🔗 GitHub: https://github.com/dhananjay93/data-engineering-end-to-end-demo

Tech Stack: PostgreSQL (Cloud SQL), Airbyte, dbt, Airflow, Tableau, GCP


🔧 Overview

Built a modern, cloud-native data pipeline using GCP and open-source tools to ingest, transform, orchestrate, and visualize retail data from the Sample Superstore dataset.


🛠️ Tools & Technologies

| Layer | Tool Used | Description |
|---|---|---|
| Storage | GCP Cloud SQL (Postgres) | Hosted raw and transformed data |
| Ingestion | Airbyte (Cloud) | Moved data from raw DB to staging DB (ELT) |
| Modeling | dbt (Local) | SQL transformations: intermediate → destination |
| Orchestration | Apache Airflow (Local) | Scheduled & automated dbt runs |
| Visualization | Tableau | Built dashboards from transformed tables |
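
To illustrate the orchestration layer, here is a minimal Python sketch of how a scheduled dbt run could be assembled. It is not the project's actual DAG: the project path and the model name `aggregated` are assumptions made for illustration, and the helper names are hypothetical. `dbt run`, `--project-dir`, and `--select` are standard dbt CLI options.

```python
import shlex
import subprocess
from typing import List, Optional


def build_dbt_command(project_dir: str, select: Optional[str] = None) -> List[str]:
    """Assemble a `dbt run` invocation as an argument list."""
    cmd = ["dbt", "run", "--project-dir", project_dir]
    if select:
        # Restrict the run to one model (or selector) instead of the whole project.
        cmd += ["--select", select]
    return cmd


def run_dbt(project_dir: str, select: Optional[str] = None) -> int:
    """Run dbt and raise CalledProcessError if it exits with a non-zero status."""
    result = subprocess.run(build_dbt_command(project_dir, select), check=True)
    return result.returncode


# Hypothetical path and model name, used only to show the resulting command.
command = build_dbt_command("/path/to/dbt_project", select="aggregated")
bash_command = shlex.join(command)
print(bash_command)
```

In an Airflow setup, a string like `bash_command` would typically be handed to a `BashOperator` task on a schedule; calling `run_dbt` directly is the equivalent for a one-off local run.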

🔄 Architecture Diagram

[Architecture diagram: image.png]

📥 Data Source

Used Tableau’s Sample Superstore dataset, a publicly available sample dataset that contains sales, customer, region, and product information.


🔄 Pipeline Architecture

Excel (.csv)
   ↓
Cloud SQL (raw.orders, raw.returns)
   ↓             ← Airbyte →
Cloud SQL (intermediate.orders, intermediate.returns)
   ↓
dbt transformations
   ↓
Cloud SQL (destination.aggregated)
   ↓
Tableau dashboards
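
The first hop in the flow above (CSV into `raw.orders`) can be sketched as follows. This is a minimal illustration, not the loader used in the project: the sample row, the three-column excerpt, and the helper names are hypothetical, and the snake_case column naming is an assumption about how the raw table is laid out.

```python
import csv
import io
import re
from typing import List, Tuple


def normalize_header(name: str) -> str:
    """Turn a header such as 'Order ID' or 'Sub-Category' into snake_case."""
    return re.sub(r"[^0-9a-z]+", "_", name.strip().lower()).strip("_")


def build_insert(table: str, columns: List[str]) -> str:
    """Build a parameterized INSERT using %s placeholders (psycopg2 style)."""
    cols = ", ".join(columns)
    placeholders = ", ".join(["%s"] * len(columns))
    return f"INSERT INTO {table} ({cols}) VALUES ({placeholders})"


def read_rows(csv_text: str) -> Tuple[List[str], List[Tuple[str, ...]]]:
    """Parse CSV text into normalized column names and row tuples."""
    reader = csv.reader(io.StringIO(csv_text))
    columns = [normalize_header(h) for h in next(reader)]
    rows = [tuple(r) for r in reader]
    return columns, rows


# Hypothetical three-column excerpt; the real file has many more columns.
sample = "Order ID,Sub-Category,Sales\nCA-0001,Chairs,120.50\n"
columns, rows = read_rows(sample)
statement = build_insert("raw.orders", columns)
print(statement)
```

With a PostgreSQL driver such as psycopg2, `cursor.executemany(statement, rows)` would then load the parsed rows. Values are passed as parameters rather than interpolated into the SQL string, which avoids quoting problems in fields like product names.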


🧱 dbt Models Created