What: Set up Docker Compose stack with Kafka, Postgres, Schema Registry and Kafbat UI. Designed the database schema.
How: Ran docker compose up and verified services came up healthy.
Expected: All containers healthy, Kafbat visible at localhost:8080.
Could go wrong: Kafka listener config, image availability, healthcheck paths.
Bitnami Kafka image no longer publicly available — moved to paid tier. Found two alternatives: apache/kafka and confluent/cp-kafka. Went with apache/kafka:3.9.0 — actively maintained, clean KRaft config, no Zookeeper needed.
kafka:
  image: apache/kafka:3.9.0
  environment:
    KAFKA_NODE_ID: 0
    KAFKA_PROCESS_ROLES: broker,controller
    KAFKA_LISTENERS: PLAINTEXT://:9092,CONTROLLER://:9093,EXTERNAL://:9094
    KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092,EXTERNAL://localhost:9094
    KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT,EXTERNAL:PLAINTEXT
    KAFKA_CONTROLLER_QUORUM_VOTERS: 0@kafka:9093
    KAFKA_CONTROLLER_LISTENER_NAMES: CONTROLLER
    KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
    KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
    KAFKA_AUTO_CREATE_TOPICS_ENABLE: "true"
  ports:
    - "9094:9094"
Why two Kafka listeners? Containers reach Kafka via kafka:9092 (internal Docker network). The host machine reaches it via localhost:9094. A single advertised address can't serve both, because clients connect to whatever hostname the broker advertises, and kafka: only resolves inside the compose network.
# CONTROLLER - internal broker coordination
# PLAINTEXT - Docker → Kafka (kafka:9092)
# EXTERNAL - host → Kafka (localhost:9094)
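The split can be captured in a small helper (a sketch, not code from this project; the /.dockerenv check is a common but not bulletproof heuristic for detecting that code is running inside a container):

```python
import os

def kafka_bootstrap(in_container=None):
    """Pick the bootstrap address matching the advertised listeners above."""
    if in_container is None:
        # /.dockerenv exists inside most Docker containers.
        in_container = os.path.exists("/.dockerenv")
    # Inside the compose network: PLAINTEXT listener. From the host: EXTERNAL.
    return "kafka:9092" if in_container else "localhost:9094"
```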
Kafbat didn't come up on the first try. Tried adding a persistent volume config for it; that didn't help, so I removed it for now and will revisit later.
Fixed the healthcheck: unlike the Bitnami image, apache/kafka doesn't put kafka-topics.sh on PATH, so the check needs the full script path and more breathing room:
healthcheck:
  test: ["CMD-SHELL", "/opt/kafka/bin/kafka-topics.sh --bootstrap-server localhost:9092 --list"]
  interval: 10s
  retries: 10
  start_period: 30s
  timeout: 10s
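To complement the in-container healthcheck, a quick host-side smoke test against the EXTERNAL listener (a stdlib-only sketch, not part of the compose stack):

```python
import socket

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# After `docker compose up`, port_open("localhost", 9094) should report True
# once the broker is accepting connections on the EXTERNAL listener.
```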
Decision: topics, partitions, and clusters are created programmatically via AdminClient on producer startup — not manually in the UI. UI is for debugging only.
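A sketch of that startup step, assuming the kafka-python client (the log doesn't say which client the producer uses); the topic name and partition count below are placeholders:

```python
def ensure_topics(admin, specs):
    """Create any topics from `specs` that don't exist yet; return their names.

    `admin` needs list_topics() and create_topics(new_topics=...), matching
    kafka-python's KafkaAdminClient; `specs` are NewTopic-like objects.
    """
    existing = set(admin.list_topics())
    missing = [t for t in specs if t.name not in existing]
    if missing:
        admin.create_topics(new_topics=missing)
    return [t.name for t in missing]

# On producer startup (host side goes through the EXTERNAL listener):
#   from kafka.admin import KafkaAdminClient, NewTopic
#   admin = KafkaAdminClient(bootstrap_servers="localhost:9094")
#   ensure_topics(admin, [NewTopic("events", num_partitions=3, replication_factor=1)])
#   admin.close()
```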
db/init.sql — two timestamps needed:
occurred_at — when the event actually happened (set by producer)
processed_at — when the consumer wrote it to the DB (set by DB default)
Two IDs: