Kafka only understands bytes. It can't store a JSON object or a string directly. So before sending, data must be converted to bytes. When reading, bytes must be converted back.
The developer hands the producer an object (for example a JSON message); the producer converts it to bytes before sending, and the consumer converts the bytes back after receiving.
Serialization = Object/JSON → Bytes (producer side, before sending)
Deserialization = Bytes → Object/JSON (consumer side, after reading)
Raw Data (JSON Object) → Serialize (Convert to Bytes) → Kafka Broker (Stores Bytes)
Kafka Broker (Bytes) → Deserialize (Convert to JSON) → Consumer (Reads JSON Object)
Simple analogy: It's like sending a file over WhatsApp. Your phone compresses/encodes it before sending (serialize), and the receiver's phone decodes it back (deserialize). The network in between only sees raw binary — it doesn't care what's inside.
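To make this concrete, here is a minimal producer sketch using the official Java Kafka client, where the serializers are declared up front as configuration. The broker address, the `orders` topic, and the JSON payload are placeholder values for illustration, not anything prescribed by Kafka itself.

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class OrderProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        // Serializers turn the key and value objects into bytes before sending
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // The JSON is already a String here; StringSerializer converts it to UTF-8 bytes
            String json = "{\"orderId\": 101, \"amount\": 250}";
            producer.send(new ProducerRecord<>("orders", "order-101", json));
        }
    }
}
```

On the consumer side the mirror-image properties (`key.deserializer`, `value.deserializer`) do the reverse conversion; a consumer sketch appears in the consumer-group section below.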
Why does this matter? Because you have to pick a serialization format, and each one trades off readability, size, and schema safety:
| Format | When to use |
|---|---|
| String | Simple text, log messages |
| JSON | Most common — easy to read, flexible |
| Avro | Schema-enforced, compact binary, best for production at scale |
| Protobuf | Google's format, very fast and compact |
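If you go with JSON, a common pattern is a small custom serializer that delegates to a JSON library such as Jackson. This is a sketch under that assumption; the class name `JsonSerializer` is invented here, and you would register it through the producer's `value.serializer` property just like `StringSerializer` above.

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.kafka.common.serialization.Serializer;

// Minimal JSON serializer: converts any object into UTF-8 JSON bytes.
public class JsonSerializer<T> implements Serializer<T> {
    private final ObjectMapper mapper = new ObjectMapper();

    @Override
    public byte[] serialize(String topic, T data) {
        try {
            // Kafka only sees the resulting byte array, never the object itself
            return data == null ? null : mapper.writeValueAsBytes(data);
        } catch (Exception e) {
            throw new RuntimeException("JSON serialization failed", e);
        }
    }
}
```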
Events are appended to the end of a partition — Kafka never overwrites old messages. This is why Kafka is so fast — appending to a log is one of the cheapest operations a disk can do.
A Consumer Group is a group of consumers that work together to read from a topic. Instead of one consumer reading everything, the work is split across multiple consumers in the group.
The rule: Each partition is assigned to only one consumer in a group at a time.
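A rough sketch of a consumer joining a group: every instance started with the same `group.id` shares the topic's partitions, so running this program twice splits the partitions between the two processes. The group name `order-processors`, the topic, and the broker address are assumed values for illustration.

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class OrderConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.put("group.id", "order-processors"); // consumers sharing this id split the partitions
        // Deserializers turn the stored bytes back into key/value objects
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("orders"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition=%d offset=%d value=%s%n",
                            record.partition(), record.offset(), record.value());
                }
            }
        }
    }
}
```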