6 Brokers

This chapter covers

The role of brokers and their duties
Evaluating options for certain broker configuration values
Explaining replicas and how they stay up to date

So far, we have looked at Kafka from the perspective of an application developer interacting with it from external applications and processes. However, Kafka is a distributed system that deserves attention in its own right. In this chapter, let’s look at the parts that make the Kafka brokers work.

6.1 Introducing the broker

Although we have concentrated on the client side of Kafka so far, our focus now shifts to another powerful component of the ecosystem: brokers. Brokers work together with other brokers to form the core of the system.

As we start to discover Kafka, those who are familiar with big data concepts or who have worked with Hadoop before might recognize terms such as rack awareness (knowing which physical server rack a machine is hosted on) and partitions. Kafka has a rack awareness feature that lets the replicas for a partition be placed on physically separate racks [1]. Seeing familiar data terms should make us feel at home as we draw new parallels between what we’ve worked with before and what Kafka can do for us. When setting up our own Kafka cluster, it is important to know that we have another cluster to be aware of: Apache ZooKeeper. This, then, is where we’ll begin.

6.2 Role of ZooKeeper

6.3 Options at the broker level

6.3.1 Kafka’s other logs: Application logs

6.3.2 Server log

6.3.3 Managing state

6.4 Partition replica leaders and their role

6.4.1 Losing data

6.5 Peeking into Kafka

6.5.1 Cluster maintenance

6.5.2 Adding a broker

6.5.3 Upgrading your cluster

6.5.4 Upgrading your clients

6.5.5 Backups

6.6 A note on stateful systems

6.7 Exercise

Summary

References