It’s time to take a deep dive—all the way to the OS—to learn how to do chaos engineering at the syscall level. I want to show you that even in a simple system, like a single process running on a host, you can create plenty of value by applying chaos engineering and learning just how resilient that system is to failure. And, oh, it’s good fun too!
In this chapter, you’ll learn how to:
- Understand what a process does without looking at its source code
- List and block the syscalls that a process can make
- Experimentally test your assumptions about how a process deals with failure
If I do my job well, you’ll finish this chapter realizing that it’s hard to find a piece of software that can’t benefit from chaos engineering, even if it’s closed source. Whoa, did I just say closed source? The same guy who always goes on about how great open source software is and who maintains some himself? Why would you work with closed source software? Well, sometimes it all starts with a promotion.