Kafka Partition

A topic on its own is just a single stream of messages. But what if millions of messages arrive every second? A single stream that lives on one broker can't keep up with that.

That's where partitions come in. A partition is a subsection of a topic: Kafka splits a topic into multiple partitions, and those partitions can be spread across multiple brokers so data is written and read in parallel.

Think of it like a highway. One lane (a single partition) → traffic jam. Multiple lanes (multiple partitions) → cars move in parallel, much faster.

Broker
 ├── Topic A
 │     ├── Partition 0  → [msg0, msg1, msg2, msg3, msg4, msg5, msg6...]
 │     └── Partition 1  → [msg0, msg1, msg2, msg3, msg4, msg5...]
 │
 └── Topic B
       └── Partition 0  → [msg0, msg1, msg2...]
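
The layout in the diagram can be sketched as a toy in-memory model, assuming each partition is simply an independent, append-only list. The `Topic` class and its `append` method are hypothetical names made up for this illustration; they are not part of any Kafka client library.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """Toy model of a topic: one append-only list per partition."""
    name: str
    num_partitions: int
    partitions: list = field(init=False)

    def __post_init__(self):
        # Each partition is its own independent, ordered log.
        self.partitions = [[] for _ in range(self.num_partitions)]

    def append(self, partition: int, message: str) -> int:
        """Append to one partition and return the message's position in it."""
        log = self.partitions[partition]
        log.append(message)
        return len(log) - 1


topic_a = Topic("topic-a", num_partitions=2)
topic_a.append(0, "ride-requested")
topic_a.append(1, "ride-requested")
topic_a.append(0, "ride-accepted")
```

Each partition keeps its own numbering starting from 0, which is why both partitions in the diagram begin with msg0.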

Key things to know:

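How does a producer decide which partition to send to? The producer can attach a key to each message. Kafka's default partitioner hashes the key and takes the result modulo the number of partitions. Messages sent without a key are spread across the partitions instead.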

key = "driver-123"
Partitions = 3

hash("driver-123") % 3 = 0   (example result; the actual value depends on the hash function)

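This is important: if you want all events for a specific driver to stay in order, use their driver ID as the key. Every message with the same key hashes to the same partition, and Kafka preserves order within a partition, so that driver's events are stored in the order they were written. This holds only as long as the number of partitions does not change.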
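
The key-to-partition idea above can be sketched in a few lines. This is a minimal illustration, not Kafka's actual partitioner: it assumes CRC32 as a stand-in hash (Kafka's built-in partitioner uses a murmur2 hash), so the partition numbers it produces will not match a real cluster. The function name `choose_partition` and the sample events are hypothetical.

```python
import zlib


def choose_partition(key: str, num_partitions: int) -> int:
    """Map a key to a partition index in [0, num_partitions).

    Stand-in hash: CRC32. Kafka's built-in partitioner uses murmur2,
    so real partition numbers will differ from what this returns.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


NUM_PARTITIONS = 3
partitions = [[] for _ in range(NUM_PARTITIONS)]

# Events from two drivers, interleaved in arrival order.
events = [
    ("driver-123", "location-1"),
    ("driver-456", "location-1"),
    ("driver-123", "location-2"),
    ("driver-123", "location-3"),
]

for key, value in events:
    partitions[choose_partition(key, NUM_PARTITIONS)].append((key, value))

# All of driver-123's events sit in one partition, in arrival order.
driver_partition = choose_partition("driver-123", NUM_PARTITIONS)
driver_events = [v for k, v in partitions[driver_partition] if k == "driver-123"]
```

Because the result depends on the modulus, calling `choose_partition("driver-123", 4)` may return a different partition than with 3. That is why the ordering guarantee assumes a fixed partition count.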


Kafka Offset