Chapter 5. Stateful stream processing

 

This chapter covers

  • Processing multiple events from a stream by using state
  • The most popular stream processing frameworks
  • Using Apache Samza for detecting abandoned shopping carts
  • Deploying a Samza job on Apache Hadoop YARN

In chapter 3, we introduced the idea of processing continuous event streams and implemented a simple application that processed individual shopping events from the Nile website. The app we wrote did a few neat things: it read individual events from Kafka, filtered out bad input events, enriched each remaining event with location information, and finally wrote the enriched event back out to Kafka.

Chapter 3’s app was relatively simple because it operated on only a single event at a time: it read each event off a Kafka topic and then either filtered the event (discarded it) or enriched it and wrote the enriched event to a new Kafka topic. In the terminology introduced in chapter 3, our app was performing single-event processing, whereby one input event generates zero or more output events. This is in contrast to what we call multiple-event processing, whereby one or more input events generate zero or more output events.
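To make the distinction concrete, here is a minimal Python sketch of the two processing styles. It is not the chapter 3 app: there is no Kafka involved, and the event fields (`shopper`, `ip`), the `LOCATIONS` lookup table, and both function names are hypothetical, chosen only for illustration.

```python
from typing import Dict, Iterable, List

# Hypothetical lookup table standing in for a real IP-to-location service.
LOCATIONS: Dict[str, str] = {"203.0.113.7": "Lisbon"}


def process_single(event: dict) -> List[dict]:
    """Single-event processing: one input event yields zero or more outputs."""
    if "shopper" not in event or "ip" not in event:
        return []  # filter: discard the bad event
    # enrich: copy the event and attach a location field
    enriched = dict(event, location=LOCATIONS.get(event["ip"], "unknown"))
    return [enriched]


def process_multiple(events: Iterable[dict], threshold: int = 3) -> List[dict]:
    """Multiple-event processing: output depends on state kept across events."""
    seen: Dict[str, int] = {}  # state: number of events seen per shopper
    outputs: List[dict] = []
    for event in events:
        shopper = event["shopper"]
        seen[shopper] = seen.get(shopper, 0) + 1
        # emit one output only when a shopper reaches the (assumed) threshold
        if seen[shopper] == threshold:
            outputs.append({"shopper": shopper, "events_seen": threshold})
    return outputs
```

The first function needs nothing but the event in hand, so it can forget each event as soon as it has been handled. The second must remember something about earlier events (here, a per-shopper count) in order to decide what to emit, and that remembered information is the state this chapter is concerned with.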

5.1. Detecting abandoned shopping carts

 
 

5.2. Modeling our new events

 
 
 

5.3. Stateful stream processing

 
 
 
 

5.4. Detecting abandoned carts

 
 

5.5. Running our Samza job

 
 
 
 

Summary

 
 
 