chapter five

5 Fault tolerance

This chapter covers

Self-healing systems and the let-it-crash principle
The actor lifecycle signals
Supervision strategies and their signals
Monitoring and watching

This chapter covers Akka’s tools for making applications more resilient. These tools, which follow the let-it-crash principle, are supervision, monitoring, and the actor lifecycle features. We look at examples that show how to apply them to typical failure scenarios.

NOTE

The source code for this chapter is available at www.manning.com/books/akka-in-action-second-edition or https://github.com/franciscolopezsancho/akka-topics/tree/main/chapter05. You can find the contents of any snippet or listing in the .scala file with the same name as the class, object, or trait.

5.1 What fault tolerance is (and what it isn’t)

Let’s start by defining what we mean here by a fault-tolerant system and why you’d write code that embraces the notion of failure. In an ideal world, a system is always available and can guarantee that every action it undertakes succeeds. There are only two paths to this ideal: using components that can never fail, or accounting for every possible fault by providing a recovery action that is itself assured of success. In most architectures, what you have instead is a catch-all mechanism that terminates the program as soon as an uncaught failure arises.
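To make the catch-all idea concrete, here is a minimal sketch in plain Scala, with no Akka involved. The names CatchAllSketch, process, and runAll are hypothetical and chosen only for illustration. It assumes a batch of records handled one after another, where a single unforeseen fault ends the whole run.

```scala
import scala.util.control.NonFatal

object CatchAllSketch {
  // Hypothetical unit of work: fails on an empty record to simulate an unforeseen fault.
  def process(record: String): Int =
    if (record.isEmpty) throw new IllegalStateException("empty record")
    else record.length

  // Catch-all: the first failure aborts the whole run,
  // and the results of the records that did succeed are discarded.
  def runAll(records: List[String]): Either[String, List[Int]] =
    try Right(records.map(process))
    catch { case NonFatal(e) => Left(s"terminated: ${e.getMessage}") }

  def main(args: Array[String]): Unit =
    println(runAll(List("a", "", "ccc"))) // prints Left(terminated: empty record)
}
```

The record "ccc" is never processed, even though nothing is wrong with it. The only recovery available at this level is to give up on everything, which is the limitation that a fault-tolerant design tries to avoid.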

5.1.1 Plain old objects and exceptions
