2 First cup of chaos and blast radius


This chapter covers

  • Setting up a virtual machine to run through accompanying code
  • Using basic Linux forensics—why did your process die?
  • Performing your first chaos experiment with a simple bash script
  • Understanding the blast radius

The previous chapter covered what chaos engineering is and what a chaos experiment template looks like. It is now time to get your hands dirty and implement an experiment from scratch! I’m going to take you step by step through building your first chaos experiment, using nothing more than a few lines of bash. I’ll also use the occasion to introduce and illustrate new concepts like blast radius.

Just one last pit stop before we set off on our journey: let’s set up the workspace.


I’ll bet you’re wondering what a blast radius is. Let me explain. Much like an explosion, a failing software component can break the other things it connects to. We often speak of a blast radius to describe the maximum number of things that can be affected by something going wrong. I’ll teach you more about it as you read this chapter.
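To make that definition a little more concrete, here is a minimal sketch in Python. It is only an illustration: the component names, the DEPENDENTS map, and the blast_radius function are all hypothetical and don't come from the code accompanying this book. It treats a system as a map from each component to the components that depend on it directly, and counts everything a single failure could reach.

```python
from collections import deque

# Hypothetical dependency map (an assumption for illustration only):
# each key is a component, each value lists the components that depend
# on it directly.
DEPENDENTS = {
    "database": ["api"],
    "cache": ["api"],
    "api": ["web", "mobile"],
    "web": [],
    "mobile": [],
    "batch-job": [],
}


def blast_radius(failed, dependents):
    """Return every component that a failure of `failed` could reach."""
    affected = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, []):
            # Skip components already counted, and the failed one itself.
            if dependent != failed and dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return affected


if __name__ == "__main__":
    for component in DEPENDENTS:
        reachable = sorted(blast_radius(component, DEPENDENTS))
        print(f"{component}: {len(reachable)} affected {reachable}")
```

In this toy system, a failing database can reach the api, web, and mobile components, while a failing batch-job affects nothing else. Treat the result as an upper bound: it assumes every dependent breaks when something it relies on breaks, which a real system with retries or fallbacks may well avoid.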

2.1 Setup: Working with the code in this book

2.2 Scenario

2.3 Linux forensics 101

2.3.1 Exit codes

2.3.2 Killing processes

2.3.3 Out-Of-Memory Killer

2.4 The first chaos experiment

2.4.1 Ensure observability

2.4.2 Define a steady state

2.4.3 Form a hypothesis

2.4.4 Run the experiment

2.5 Blast radius

2.6 Digging deeper

2.6.1 Saving the world