Serialization

Kafka only understands bytes. It can't store a JSON object or a string directly. So before sending, data must be converted to bytes. When reading, bytes must be converted back.

The developer hands the producer a message as an object; the producer converts it to bytes before sending, and the consumer converts the bytes back after reading.

Serialization = Object/JSON → Bytes (producer side, before sending)
Deserialization = Bytes → Object/JSON (consumer side, after reading)
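The two definitions above can be shown as a round trip in plain Python (the event payload here is a made-up example):

```python
import json

# An example event the producer wants to send (hypothetical payload).
event = {"user_id": 42, "action": "click"}

# Serialization: Object/JSON → bytes (what the producer does before sending).
raw = json.dumps(event).encode("utf-8")
print(type(raw))  # <class 'bytes'> — this is all Kafka ever stores

# Deserialization: bytes → Object/JSON (what the consumer does after reading).
decoded = json.loads(raw.decode("utf-8"))
assert decoded == event
```

Note that Kafka itself never runs `json.dumps` or `json.loads` — both conversions happen in the client, on either side of the broker.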

Producer Side — Serialization

Raw Data (JSON Object)  →  Serialize (Convert to Bytes)  →  Kafka Broker (Stores Bytes)

Consumer Side — Deserialization

Kafka Broker (Bytes)  →  Deserialize (Convert to JSON)  →  Consumer (Reads JSON Object)
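A hedged sketch of how both sides are wired up. The serializer and deserializer are real runnable functions; the commented-out lines show where they would plug into the kafka-python client (assumptions: library installed, topic name and broker address are hypothetical):

```python
import json

def value_serializer(obj):
    # Producer side: Object → bytes, applied to every message before sending.
    return json.dumps(obj).encode("utf-8")

def value_deserializer(raw):
    # Consumer side: bytes → Object, applied to every message after reading.
    return json.loads(raw.decode("utf-8"))

# With kafka-python (requires a running broker; addresses are assumptions):
# producer = KafkaProducer(bootstrap_servers="localhost:9092",
#                          value_serializer=value_serializer)
# consumer = KafkaConsumer("events", bootstrap_servers="localhost:9092",
#                          value_deserializer=value_deserializer)

# Round trip without a broker, just to show the two halves agree:
msg = {"order_id": 7, "status": "shipped"}
assert value_deserializer(value_serializer(msg)) == msg
```

The key design point: the broker stays format-agnostic, so producer and consumer must agree on the format out of band.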

Simple analogy: It's like sending a file over WhatsApp. Your phone compresses/encodes it before sending (serialize), and the receiver's phone decodes it back (deserialize). The network in between only sees raw binary — it doesn't care what's inside.

Why does this matter? The serializer and deserializer must match: if the producer writes JSON bytes and the consumer tries to deserialize them as Avro, reads fail or return garbage. The format you choose also affects message size, speed, and how safely your schema can evolve.

Common Serializer formats

Format   | When to use
---------|------------
String   | Simple text, log messages
JSON     | Most common — easy to read, flexible
Avro     | Schema-enforced, compact binary, best for production at scale
Protobuf | Google's format, very fast and compact
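To see why "compact binary" matters, compare the same record encoded as JSON versus a fixed binary layout. This uses Python's `struct` module rather than Avro or Protobuf, but it illustrates the same trade-off: field names and quoting disappear from the wire, and the schema lives outside the bytes.

```python
import json
import struct

# The same record in JSON vs. a fixed binary layout (not Avro/Protobuf —
# just the stdlib struct module, to illustrate why binary formats are smaller).
record = {"user_id": 42, "temperature": 21.5}

as_json = json.dumps(record).encode("utf-8")

# "<If" = little-endian unsigned int (4 bytes) + float (4 bytes).
# The "schema" lives in code, not in the bytes — the same trade-off
# that Avro and Protobuf make.
as_binary = struct.pack("<If", record["user_id"], record["temperature"])

print(len(as_json), len(as_binary))  # the binary encoding is several times smaller
assert len(as_binary) < len(as_json)
```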

Events are appended to the end of a partition — Kafka never overwrites old messages. This is why Kafka is so fast — appending to a log is one of the cheapest operations a disk can do.
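A toy model of that append-only behavior (plain Python, not Kafka internals): every write goes to the end of the log and receives the next offset, and old records are never overwritten.

```python
# Toy append-only partition log (illustration only, not Kafka code).
partition = []

def append(message):
    partition.append(message)
    return len(partition) - 1  # the offset of the newly written record

assert append(b"first") == 0
assert append(b"second") == 1
# Old records stay put — reading offset 0 still returns the first message.
assert partition[0] == b"first"
```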

Consumer Groups

A Consumer Group is a group of consumers that work together to read from a topic. Instead of one consumer reading everything, the work is split across multiple consumers in the group.

The rule: Each partition is assigned to only one consumer in a group at a time (though a single consumer can read from multiple partitions).
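The rule above can be sketched with a simple round-robin assignment (a hypothetical helper, not Kafka's actual rebalance protocol): every partition ends up with exactly one consumer, while a consumer may hold several partitions.

```python
def assign(partitions, consumers):
    # Round-robin: each partition goes to exactly one consumer in the group,
    # but one consumer may end up with several partitions.
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment

result = assign(partitions=[0, 1, 2, 3], consumers=["c1", "c2"])
print(result)  # {'c1': [0, 2], 'c2': [1, 3]}
```

With 4 partitions and 2 consumers, each consumer reads 2 partitions; add a third consumer and a rebalance would spread the 4 partitions across all three.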