6 Developing Kafka Streams

 

This chapter covers

  • Introducing the Kafka Streams API
  • Building our first Kafka Streams application
  • Working with customer data; creating more complex applications
  • Splitting, merging, and branching streams, oh my!

Simply stated, a Kafka Streams application is a graph of processing nodes that transforms event data as it streams through each node. Let’s take a look at an illustration of what this means:

Figure 6.1. Kafka Streams is a graph with a source node, any number of processing nodes, and a sink node (shown here processing a shipping event stream)

This illustration represents the generic structure of most Kafka Streams applications. There is a source node that consumes event records from a Kafka broker, then any number of processing nodes, each performing a distinct task, and finally a sink node that writes the transformed records back out to Kafka. In a previous chapter we discussed how to use the Kafka clients for producing and consuming records. Much of what you learned there applies to Kafka Streams because, at its heart, Kafka Streams is an abstraction over the producers and consumers, leaving you free to focus on your stream processing requirements.
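To make the source-processor-sink shape concrete, here is a minimal sketch of such a topology built with the Streams DSL covered in this chapter. The topic names "purchases-in" and "purchases-out" and the uppercase transformation are placeholder assumptions, not the chapter's actual example:

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;

public class TopologySketch {
    public static Topology build() {
        StreamsBuilder builder = new StreamsBuilder();

        // Source node: consumes event records from a Kafka topic
        KStream<String, String> source =
            builder.stream("purchases-in",
                Consumed.with(Serdes.String(), Serdes.String()));

        // Processing node: performs one distinct transformation
        KStream<String, String> transformed =
            source.mapValues(value -> value.toUpperCase());

        // Sink node: writes the transformed records back out to Kafka
        transformed.to("purchases-out",
            Produced.with(Serdes.String(), Serdes.String()));

        return builder.build();
    }
}

Each DSL call adds one node to the graph; the Topology returned by builder.build() is the complete graph from figure 6.1.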

Important

While Kafka Streams is the native stream processing library for Apache Kafka®, it does not run inside the cluster or on the brokers; it connects to the cluster as a client application.
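As a rough sketch of what "connects as a client application" means in practice: you configure a bootstrap server address, hand your topology to a KafkaStreams instance, and run it in your own JVM. The application id and broker address below are placeholder assumptions:

import java.util.Properties;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsConfig;

public class StreamsClientSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Hypothetical application id and broker address
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "topology-sketch-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        // The topology from the earlier sketch runs inside this JVM;
        // Kafka Streams talks to the cluster purely as a client.
        KafkaStreams streams = new KafkaStreams(TopologySketch.build(), props);
        streams.start();

        // Close cleanly on shutdown
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}

Because it is just a client, you can scale the application by starting more instances of the same program, and Kafka Streams balances the work across them.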

In this chapter, you’ll learn how to build such a graph that makes up a stream processing application with Kafka Streams.

6.1 The Streams DSL

6.2 Hello World for Kafka Streams

6.2.1 Creating the topology for the Yelling App

6.2.2 Kafka Streams configuration

6.2.3 Serde creation

6.3 Masking credit card numbers and tracking purchase rewards in a retail sales setting

6.3.1 Building the source node and the masking processor

6.3.2 Adding the patterns processor

6.3.3 Building the rewards processor

6.3.4 Using Serdes to encapsulate serializers and deserializers in Kafka Streams

6.5.1 Filtering purchases
