1. Explain the difference between consumer groups and partitions.
A partition is a division of a topic — it is how Kafka splits data for parallel storage and processing. A consumer group is a set of consumers that cooperate to consume a topic, sharing its partitions between them so each message is processed once per group.
The relationship between them is: each partition in a topic is assigned to exactly one consumer within a group at a time. So if a topic has 3 partitions and a consumer group has 3 consumers, each consumer gets one partition. If the group has 2 consumers, one consumer handles 2 partitions and the other handles 1. If the group has 4 consumers, one sits idle because there are not enough partitions.
Partitions decide how data is split. Consumer groups decide how that split data is read in parallel. More partitions means more consumers can work in parallel, which means higher throughput.
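The assignment rule above can be sketched as a round-robin distribution of partitions over group members. This is a simplified stand-in for Kafka's real assignors (range, round-robin, cooperative-sticky), kept only to illustrate the one-partition-one-consumer rule:

```python
# Simplified sketch of how a consumer group divides partitions.
# Kafka's actual assignment protocol is more involved; this only
# shows that each partition goes to exactly one consumer in the group.

def assign(partitions: int, consumers: list[str]) -> dict[str, list[int]]:
    assignment = {c: [] for c in consumers}
    for p in range(partitions):
        # Each partition is owned by exactly one consumer at a time.
        assignment[consumers[p % len(consumers)]].append(p)
    return assignment

print(assign(3, ["c1", "c2", "c3"]))        # each consumer gets one partition
print(assign(3, ["c1", "c2"]))              # c1 gets two partitions, c2 gets one
print(assign(3, ["c1", "c2", "c3", "c4"]))  # c4 is left idle
```

Running it reproduces the three scenarios described above: balanced, uneven, and one idle consumer.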
2. How does Kafka handle message retention and log compaction?
Kafka does not delete messages immediately after they are consumed. It retains them for a configured period of time (retention.ms, 7 days by default) or until a size limit is reached (retention.bytes). This means consumers can re-read old messages, replay events, or a new consumer can join and read from the beginning.
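Because retained messages stay on the log with their offsets, a consumer can rewind and replay. A minimal in-memory sketch of the idea (not the real client API):

```python
# Toy model of a retained partition log: messages keep their offsets,
# so any consumer can start reading from any earlier point.

log = ["event-0", "event-1", "event-2"]  # retained messages, indexed by offset

def read_from(offset: int) -> list[str]:
    # A replay, or a brand-new consumer, simply starts at an older offset.
    return log[offset:]

print(read_from(0))  # full replay from the beginning
print(read_from(2))  # only the newest message
```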
Log compaction is a different retention strategy. Instead of keeping messages by time, Kafka keeps only the latest message for each key. So if key A was updated 5 times, Kafka only keeps the last value. This is useful for things like a user profile topic where you only care about the current state, not the full history.
Two modes: cleanup.policy=delete, which discards whole log segments once the retention time or size limit is reached, and cleanup.policy=compact, which keeps only the latest value per key. The two can be combined as cleanup.policy=compact,delete.
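The keep-latest-per-key behavior of compaction can be sketched in a few lines:

```python
# Sketch of log compaction: from a stream of (key, value) records,
# keep only the most recent value for each key. A plain dict works
# because later writes overwrite earlier ones.

def compact(records: list[tuple[str, str]]) -> dict[str, str]:
    latest = {}
    for key, value in records:
        latest[key] = value  # newer value replaces the older one
    return latest

updates = [("A", "v1"), ("B", "v1"), ("A", "v2"), ("A", "v3")]
print(compact(updates))  # key A was updated 3 times; only v3 survives
```

Matching the user-profile example above: the topic ends up holding just the current state per key, not the full history.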
3. How does Kafka achieve high throughput and scalability?
Kafka achieves high throughput through several design decisions working together.
Partitioning — a topic is split into multiple partitions spread across brokers. Producers write to multiple partitions in parallel and consumers read from multiple partitions in parallel. More partitions means more parallelism.
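On the producer side, a record's key determines which partition it lands on. A simplified stand-in for Kafka's default partitioner (which actually uses murmur2 hashing; crc32 here is just to show the hash-mod-partitions idea):

```python
import zlib

# Simplified key-based partitioner. Kafka's default uses murmur2;
# crc32 is a stand-in to show the hash(key) % num_partitions idea.

def partition_for(key: bytes, num_partitions: int) -> int:
    return zlib.crc32(key) % num_partitions

# The same key always lands on the same partition, which is what
# preserves per-key ordering while still spreading load.
p1 = partition_for(b"user-42", 3)
p2 = partition_for(b"user-42", 3)
assert p1 == p2
print(p1)
```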
Sequential disk writes — Kafka does not write to disk randomly. It only appends messages to the end of a log file. Sequential writes are far faster than random writes, and the operating system's page cache serves most reads from memory.
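The append-only pattern can be sketched with a plain file opened in append mode (a toy segment, not Kafka's actual on-disk format):

```python
import os
import tempfile

# Toy log segment: writes only ever go to the end of the file,
# which is the sequential-write pattern Kafka relies on.

path = os.path.join(tempfile.mkdtemp(), "00000000.log")

for record in [b"msg-0\n", b"msg-1\n", b"msg-2\n"]:
    with open(path, "ab") as segment:  # "ab": append-only, never seek back
        segment.write(record)

with open(path, "rb") as segment:
    print(segment.read())  # records come back in write order
```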
Batching — producers do not send one message at a time. They batch multiple messages together and send in one network call. This reduces overhead significantly.
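Batching can be sketched as a buffer that flushes once it hits a size threshold. This is a simplification of the producer's batch.size behavior; real producers also flush on a time limit (linger.ms), which is omitted here:

```python
# Sketch of producer batching: messages accumulate in a buffer and go
# out as one simulated "network call" when the batch fills up.

class BatchingProducer:
    def __init__(self, batch_size: int):
        self.batch_size = batch_size
        self.buffer: list[str] = []
        self.sends = 0  # number of simulated network calls

    def send(self, msg: str) -> None:
        self.buffer.append(msg)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        if self.buffer:
            self.sends += 1  # one round trip for the whole batch
            self.buffer.clear()

producer = BatchingProducer(batch_size=100)
for i in range(1000):
    producer.send(f"msg-{i}")
producer.flush()
print(producer.sends)  # 10 network calls instead of 1000
```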
Zero copy — Kafka uses the operating system's sendfile mechanism, so data moves from the page cache straight to the network socket without being copied into the application's user-space buffers first. This reduces CPU usage and memory pressure on the broker.
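A minimal sketch of the underlying syscall using Python's os.sendfile (Linux-only; Kafka itself does this via Java NIO's FileChannel.transferTo, and on a broker the destination would be a network socket rather than a file):

```python
import os
import tempfile

# Zero-copy sketch: os.sendfile moves bytes between two file
# descriptors inside the kernel, without staging them in this
# process's memory. Linux-only; destination is a file here for
# simplicity, where Kafka would use a socket.

tmpdir = tempfile.mkdtemp()
src_path = os.path.join(tmpdir, "segment.log")
dst_path = os.path.join(tmpdir, "out.bin")

with open(src_path, "wb") as f:
    f.write(b"x" * 1024)

with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
    sent = os.sendfile(dst.fileno(), src.fileno(), 0, 1024)

print(sent)  # bytes transferred kernel-side
```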
Horizontal scaling — you add more brokers to the cluster and rebalance partitions across them. More brokers means more storage and more parallel processing without changing anything in the application.
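Spreading partitions over brokers follows the same round-robin idea as consumer assignment. A toy sketch of how per-broker load drops as brokers are added (real reassignment also moves replicas and the data itself):

```python
# Toy sketch of rebalancing: the same 6 partitions spread over a
# growing cluster, so each broker ends up with less to handle.

def spread(partitions: int, brokers: list[str]) -> dict[str, list[int]]:
    layout = {b: [] for b in brokers}
    for p in range(partitions):
        layout[brokers[p % len(brokers)]].append(p)
    return layout

print(spread(6, ["broker-1", "broker-2"]))              # 3 partitions each
print(spread(6, ["broker-1", "broker-2", "broker-3"]))  # 2 partitions each
```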