GitHub Link
https://github.com/KEA-ACCELER/kafka-druid-superset
Presentation Video
https://youtu.be/kDg0ZoLWLRs
Presentation Material
Project Overview
This project built a system that collects, processes, analyzes, and visualizes real-time bus boarding, alighting, and bus stop data using Kafka, Kafka Streams, Druid, and Superset. The system was built in the following steps:

- First, Kafka was set up as the message bus that collects and delivers real-time bus boarding, alighting, and bus stop data. Kafka is known for its high throughput and scalability and can integrate with a wide range of data sources.
- Next, Kafka Streams was used to process the streaming boarding, alighting, and bus stop data. Kafka Streams is a library for processing data stored in Kafka, and it makes it possible to implement complex business logic on top of the stream. For example, it can compute and publish real-time statistics such as passenger counts, boarding ratios per bus stop, and bus operating status.
- Then, Druid was used as the real-time analytical database for the boarding, alighting, and bus stop data. Druid is an open-source, high-performance database designed for real-time analytics that can query and aggregate large volumes of data quickly. It can ingest and index data from Kafka in real time and provides a variety of aggregation functions and filters.
- Finally, Superset was used as the BI platform that visualizes the boarding, alighting, and bus stop data in charts and dashboards. Superset is an open-source BI platform that integrates with Druid, offers a user-friendly interface with many chart types, and supports dashboards that update in real time.
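
The stream-processing step above can be illustrated with a small sketch. This is not the project's Kafka Streams code (Kafka Streams is a Java library); it is a plain-Python illustration of the kind of per-stop statistics described, assuming a hypothetical event schema (`timestamp`, `stop_id`, `boarded`, `alighted`), a one-minute tumbling window, and "boarding ratio" interpreted as each stop's share of all boardings in the window. None of these names or choices come from the repository.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class BusEvent:
    """One boarding/alighting record (hypothetical schema, not the project's)."""
    timestamp: int  # event time in epoch seconds
    stop_id: str
    boarded: int
    alighted: int


def window_start(timestamp: int, window_seconds: int) -> int:
    """Align a timestamp to the start of its tumbling window."""
    return timestamp - (timestamp % window_seconds)


def aggregate_by_stop(events, window_seconds=60):
    """Sum boarded/alighted counts per (window start, stop_id)."""
    totals = defaultdict(lambda: {"boarded": 0, "alighted": 0})
    for event in events:
        key = (window_start(event.timestamp, window_seconds), event.stop_id)
        totals[key]["boarded"] += event.boarded
        totals[key]["alighted"] += event.alighted
    return dict(totals)


def boarding_ratios(totals):
    """Each stop's share of the boardings recorded in its window (assumed definition)."""
    boarded_per_window = defaultdict(int)
    for (start, _stop), counts in totals.items():
        boarded_per_window[start] += counts["boarded"]
    ratios = {}
    for (start, stop), counts in totals.items():
        window_total = boarded_per_window[start]
        # Avoid dividing by zero when nobody boarded in the window.
        ratios[(start, stop)] = counts["boarded"] / window_total if window_total else 0.0
    return ratios


if __name__ == "__main__":
    sample = [
        BusEvent(0, "A", 3, 0),
        BusEvent(30, "B", 1, 2),
        BusEvent(45, "A", 0, 1),
        BusEvent(61, "A", 2, 2),
    ]
    totals = aggregate_by_stop(sample)
    for key, ratio in sorted(boarding_ratios(totals).items()):
        print(key, totals[key], ratio)
```

In a Kafka Streams topology the same idea would typically be expressed as a grouped, windowed aggregation whose results are written to an output topic for Druid to ingest; the sketch only shows the arithmetic on an in-memory list.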

Team Composition
Team Members: 5
Responsibilities