Appendix C. Understanding Kafka Streams architecture

 

In this book, you’ve learned that Kafka Streams represents a program as a directed acyclic graph of processing nodes called a topology. You’ve seen how to add processing nodes to a topology to process events from a Kafka topic. But we still need to discuss how Kafka Streams gets events into a topology, how the processing occurs, and how processed events are written back to a Kafka topic. We’ll take a deeper look at these questions in this appendix.
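To make the topology idea concrete, here’s a minimal sketch of building one with the `StreamsBuilder` API. The topic names and the uppercase transform are illustrative placeholders, not examples from the book; the point is only that each operator you add becomes a node in the graph:

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Produced;

public class TopologySketch {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        // Source node: consume events from a Kafka topic (name is illustrative)
        builder.stream("input-topic", Consumed.with(Serdes.String(), Serdes.String()))
               // Processing node: a simple stateless transformation
               .mapValues(value -> value.toUpperCase())
               // Sink node: write processed events back to a Kafka topic
               .to("output-topic", Produced.with(Serdes.String(), Serdes.String()));

        // Build the topology and print its node graph
        Topology topology = builder.build();
        System.out.println(topology.describe());
    }
}
```

Calling `topology.describe()` prints the source, processor, and sink nodes and how they connect, which is a handy way to see the graph you’ve assembled before running it.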

Here’s an illustration showing a high-level view of what we’re going to discuss:

Figure C.1 Componentized view of a Kafka Streams application; there are three sections: consuming, processing, and producing

As you can see from the illustration, at a high level, we can break up how a Kafka Streams application works into three categories:

  1. Consuming events from a Kafka topic
  2. Assigning, distributing, and processing events
  3. Producing processed results to a Kafka topic

Given that we’ve already covered the Kafka clients in a previous chapter, and that Kafka Streams is an abstraction over them, we won’t get into those details here. Instead, I’ll combine points one and three into a more general discussion of the clients and then go deeper into the Kafka Streams architecture for point two.
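One practical consequence of Kafka Streams being an abstraction over the clients is that you configure its embedded consumer and producer through the same `Properties` object you pass to the application, using prefixed configuration keys. A minimal sketch, where the application id, broker address, and the particular overrides are illustrative assumptions:

```java
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.streams.StreamsConfig;

public class ClientConfigSketch {
    public static void main(String[] args) {
        Properties props = new Properties();

        // Required Kafka Streams settings (values here are placeholders)
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "appendix-c-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        // Tune the embedded consumer via the "consumer." prefix
        props.put(StreamsConfig.consumerPrefix(ConsumerConfig.MAX_POLL_RECORDS_CONFIG), "500");

        // Tune the embedded producer via the "producer." prefix
        props.put(StreamsConfig.producerPrefix(ProducerConfig.LINGER_MS_CONFIG), "100");
    }
}
```

The `consumerPrefix` and `producerPrefix` helpers route a setting to the corresponding embedded client, so you rarely need to touch the clients directly.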

C.1 Consumer and producer clients in Kafka Streams

C.2 Assigning, distributing and processing events

C.3 Threads in Kafka Streams - the Stream Thread

C.4 Processing records
