chapter two

2 Kafka cluster data architecture

 

This chapter covers

  • Organizing related messages through topics
  • Utilizing partitions to process data in parallel
  • The composition of Kafka messages: keys, values, and headers
  • Using replication to ensure availability and fault tolerance
  • Working with compacted topics for persistent data storage

Let’s start with the building blocks of Kafka from the cluster’s point of view: topics, partitions, replication, and how data is physically stored. We’ll begin with topics and partitions—how to process data in parallel, preserve ordering where it matters, and replicate partitions. Then we’ll look inside a topic: message structure (keys, values, headers), batches and offsets, the on-disk layout, retention policies, and how to select the number of partitions. Finally, we’ll look at compacted topics—the rationale, mechanics, and when compaction runs. These fundamentals are essential for grasping Kafka’s architecture.

2.1 Inside the Kafka cluster

In this chapter, we’ll step away from the business patterns for applying Kafka and look at how its design ideas are implemented from an architectural perspective. We’ll explore the fundamental building blocks of Kafka:

  • Topics—Destinations events are dispatched to
  • Partitions—Scalability and redundancy units
  • Messages—Carriers of event information

Additionally, we will discuss two types of topics:

  • Streaming topics—For event streaming
  • State storage topics—For storing state
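To make these building blocks concrete before the detailed sections, here is a toy in-memory model written in Python. It is a sketch for orientation only, not how Kafka is implemented: the names Message, Topic, and latest_by_key are hypothetical, every message is assumed to carry a key, and CRC32 stands in for the hash a real client partitioner would use.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple
from zlib import crc32


@dataclass
class Message:
    """Carrier of event information: key, value, and optional headers."""
    key: bytes
    value: bytes
    headers: Dict[str, bytes] = field(default_factory=dict)


@dataclass
class Topic:
    """A named destination split into append-only partitions."""
    name: str
    num_partitions: int
    partitions: List[List[Message]] = field(init=False)

    def __post_init__(self) -> None:
        self.partitions = [[] for _ in range(self.num_partitions)]

    def append(self, msg: Message) -> Tuple[int, int]:
        # Assumption: CRC32 is a stand-in hash. The point is only that
        # equal keys always map to the same partition.
        index = crc32(msg.key) % self.num_partitions
        log = self.partitions[index]
        log.append(msg)
        # The position in the partition plays the role of the offset.
        return index, len(log) - 1


def latest_by_key(log: List[Message]) -> Dict[bytes, bytes]:
    """State-storage view of one partition: the latest value per key."""
    latest: Dict[bytes, bytes] = {}
    for msg in log:
        latest[msg.key] = msg.value
    return latest


topic = Topic("orders", num_partitions=3)
first = topic.append(Message(b"customer-1", b"created"))
second = topic.append(Message(b"customer-1", b"paid", {"source": b"web"}))
print(first[0] == second[0])                       # same key, same partition
print(latest_by_key(topic.partitions[first[0]]))   # latest value per key
```

Read as a stream, the partition holds both events in the order they were appended. Read as state, only the most recent value for each key matters, which is the idea behind the compacted topics covered later in the chapter.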

2.2 Core concepts of data processing

2.2.1 Partitioning the topic

2.2.2 Processing data concurrently

2.2.3 Ordering within a topic

2.3 Replicating partitions

2.3.1 Replica leaders and followers

2.3.2 Choosing the replication factor and minimum number of in-sync replicas

2.4 Inside the topic

2.4.1 Messages: Keys, values, and headers

2.4.2 Message batches and offsets

2.4.3 Physical representation of a topic

2.4.4 Data retention

2.4.5 Selecting the number of partitions

2.5 Compacted topics

2.5.1 The idea of compaction

2.5.2 How compaction works

2.5.3 When compaction happens

2.6 Online resources

2.7 Summary