8 Designing Streaming Applications
This chapter covers
- An introduction to real-time processing and its key principles
- Design considerations for building streaming applications
- The architecture of the Kafka Streams framework
- ksqlDB and Apache Flink as alternatives for real-time data processing
To implement real-time processing use cases, you need a clear grasp of the underlying concepts and frameworks. This chapter covers the building blocks for streaming on Kafka: how to transform, join, and aggregate events as they arrive. You’ll weigh dedicated stream-processing frameworks against traditional service code. Using Kafka Streams, we explain the core concepts and operators, both stateless (map, filter) and stateful (joins, windows, aggregates), and when to use the Processor API. Finally, we compare the alternatives (ksqlDB, Apache Flink, and managed cloud services), with guidance on when to choose them over Kafka Streams.
8.1 Field notes: Transforming data in motion
The team gathered for their regular meeting, notebooks and laptops at the ready. The discussion quickly turned to how to transform and aggregate their data as part of the Customer360 project.
Eva: Since we’re doing all this research, I think it’s worth considering transforming the data not just at the final service but somewhere in the middle. We could use a streaming framework for that.
Max: A streaming framework? What do you mean by that, Eva?