5 Stateless transformations with Kafka Streams

This chapter covers

  • Setting up and configuring a Kafka Streams application
  • Mapping Kafka records
  • Filtering Kafka records
  • Routing Kafka records

So far, we have focused on replicating data changes from data sources to data sinks without processing them along the way. Many use cases, however, require transforming the data while it is being streamed. For instance, to comply with data protection regulations, Excellent Toys might want to mask the IP addresses of website visitors when streaming data from its analytics tool to downstream consumers.

Kafka Streams is a Java library for building stream processing applications that consume data from Kafka topics as their input, apply processing logic of arbitrary complexity to that data, and produce the processed data to Kafka topics as their output. Kafka Streams applications are developed, tested, packaged, and deployed like regular Java applications, which makes stream processing accessible to software developers and anyone with basic programming experience.
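Because the processing logic is ordinary Java code, a stateless transformation such as the IP masking mentioned above can be written and tested as a plain function before it is wired into a stream application. The following is a minimal sketch under stated assumptions, not Excellent Toys' actual implementation: the class name `IpMasker`, the method `maskIpv4`, the choice to zero the last octet of an IPv4 address, and the `REDACTED` fallback for anything that is not a valid dotted-quad address are all hypothetical.

```java
// Hypothetical helper: masks IPv4 addresses by zeroing the last octet.
// The masking policy and the "REDACTED" fallback are assumptions for illustration.
final class IpMasker {

    private IpMasker() {
        // Utility class, not meant to be instantiated.
    }

    /**
     * Returns the address with its last octet replaced by 0.
     * Inputs that are not valid dotted-quad IPv4 addresses are fully redacted,
     * so that unexpected formats never leak through unmasked.
     */
    static String maskIpv4(String ip) {
        if (ip == null) {
            return null;
        }
        // Limit -1 keeps trailing empty strings, so "1.2.3." yields four parts.
        String[] octets = ip.trim().split("\\.", -1);
        if (octets.length != 4) {
            return "REDACTED";
        }
        for (String octet : octets) {
            // Each part must be 1 to 3 digits and at most 255.
            if (!octet.matches("\\d{1,3}") || Integer.parseInt(octet) > 255) {
                return "REDACTED";
            }
        }
        return octets[0] + "." + octets[1] + "." + octets[2] + ".0";
    }
}
```

A function like this depends only on the value of the record it receives, which is what makes the transformation stateless. In a Kafka Streams application, it could be applied to every record value with an operation such as `mapValues()`, which section 5.2.1 covers.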

5.1 Setting up a Kafka Streams application

5.1.1 Configuration

5.1.2 Processing topology

5.1.3 Executing the Kafka Streams application

5.2 Mapping data

5.2.1 map() and mapValues()

5.2.2 flatMap() and flatMapValues()

5.3 Filtering data

5.4 Routing data to different data sinks

5.5 Summary