6 Developing Kafka Streams
This chapter covers
- Introducing the Kafka Streams API
- Building our first Kafka Streams application
- Working with customer data and creating more complex applications
- Splitting, merging, and branching streams
A Kafka Streams application is a graph of processing nodes that transforms event data as it streams through each node. In this chapter, you'll learn how to build the graph that makes up a stream-processing application with Kafka Streams.
6.1 A look at Kafka Streams
Let's look at what this means in figure 6.1, which represents the generic structure of most Kafka Streams applications: a source node consumes event records from a Kafka broker; any number of processing nodes, each performing a distinct task, transform those records; and, finally, a sink node produces the transformed records back out to Kafka. In chapter 4, we discussed how to use the Kafka clients to produce and consume records. Much of what you learned there applies to Kafka Streams because, at its heart, Kafka Streams is an abstraction over the producers and consumers, leaving you free to focus on your stream-processing requirements.
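To make the structure concrete, here is a minimal sketch of such a graph using the Kafka Streams DSL: a source node reading from a topic, one processing node applying a simple transformation, and a sink node writing the result back to Kafka. The topic names (`input-topic`, `output-topic`), the application id, and the broker address are assumptions for illustration, not part of the chapter's example domain.

```java
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;

public class FirstStreamsApp {

    // Builds the source -> processor -> sink graph sketched in figure 6.1.
    static Topology buildTopology() {
        StreamsBuilder builder = new StreamsBuilder();

        // Source node: consumes event records from a Kafka topic
        // (topic name is hypothetical).
        KStream<String, String> source =
            builder.stream("input-topic",
                Consumed.with(Serdes.String(), Serdes.String()));

        // Processing node: one distinct task -- here, upper-casing the value.
        KStream<String, String> transformed =
            source.mapValues(value -> value.toUpperCase());

        // Sink node: produces the transformed records back out to Kafka.
        transformed.to("output-topic",
            Produced.with(Serdes.String(), Serdes.String()));

        return builder.build();
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "first-streams-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        KafkaStreams streams = new KafkaStreams(buildTopology(), props);
        streams.start();
        // Close the streams client cleanly when the JVM shuts down.
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```

Notice that the graph is described declaratively with `StreamsBuilder`; the producer and consumer machinery from chapter 4 is created and managed for you when `KafkaStreams.start()` is called.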
Figure 6.1 Kafka Streams is a graph with a source node, any number of processing nodes, and a sink node.
Note While Kafka Streams is the native stream-processing library for Apache Kafka, it does not run inside the cluster or on the brokers; it connects to the cluster as a client application.
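Because a Kafka Streams application is just a client, its only link to the cluster is configuration, exactly as with a plain producer or consumer. A minimal sketch of that configuration might look as follows; the application id and broker addresses are placeholder assumptions.

```java
import java.util.Properties;

import org.apache.kafka.streams.StreamsConfig;

public class StreamsClientConfig {

    // Returns the two settings every Kafka Streams application requires.
    // The app runs wherever you start this JVM -- laptop, VM, container --
    // and reaches the brokers over the network via bootstrap.servers.
    static Properties baseConfig() {
        Properties props = new Properties();
        // Identifies this application (and its consumer group) to the cluster.
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");
        // Hypothetical broker addresses; same setting a producer/consumer uses.
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG,
                  "broker-1:9092,broker-2:9092");
        return props;
    }
}
```

A useful consequence of this client model is that scaling out is simply starting another copy of the same application with the same `application.id`; the instances divide the work among themselves.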