10 Stateful computation

 

In this chapter

  • an introduction to stateful and stateless components
  • how stateful components work
  • related techniques

Have you tried turning it off and on again?

—The IT Crowd

We talked about state in chapter 5. In most computer programs, it is an important concept. For example, the progress in a game, the current content in a text editor, the rows in a spreadsheet, and the opened pages in a web browser are all states of the programs. When a program is closed and opened again, we would like to recover to the desired state. In streaming systems, handling states correctly is very important. In this chapter, we are going to discuss in more detail how states are used and managed in streaming systems.

The migration of the streaming jobs

System maintenance is part of our day-to-day work with distributed systems. A few examples are: releasing a new build with bug fixes and new features, upgrading software or hardware to make the systems more secure or efficient, and handling software and hardware failures to keep the systems running.

AJ and Sid have decided to migrate the streaming jobs to new and more efficient hardware to reduce cost and improve reliability. This is a major maintenance task, and it is important to proceed carefully.

Stateful components in the system usage job

Revisit: State

The states in different components

State data vs. temporary data

Stateful vs. stateless components: The code

The stateful source and operator in the system usage job

States and checkpoints

Checkpoint creation: Timing is hard

Event-based timing

Creating checkpoints with checkpoint events

A checkpoint event is handled by instance executors

A checkpoint event flowing through a job

Summary