chapter six

6 Who you gonna call? Syscall-busters!

This chapter covers

  • Observing syscalls of a running process using strace and BPF
  • Working with black-box software and understanding what it does without reading its source code
  • Designing chaos experiments at the syscall level
  • Blocking syscalls using strace and seccomp

It’s time to take a deep dive, all the way down to the OS, to learn how to do chaos engineering at the syscall level. I want to show you that even in a simple system, such as a single process running on a host, you can create plenty of value by applying chaos engineering and learning just how resilient that system is to failure. And oh, it’s good fun too!

In this chapter, we’ll start with a brief refresher on syscalls. We’ll then see how to do the following:

  • Understand what a process does without looking at its source code
  • List and block the syscalls that it can make
  • Experimentally test our assumptions about how it deals with failure

If I do my job well, you’ll finish this chapter realizing that it’s hard to find a piece of software that can’t benefit from some chaos engineering, even if it’s closed source. Whoa, did I just say “closed source?” The same guy who always goes on about how great open source software is, and who maintains some himself? Why would you do closed source? Well, sometimes it all starts with a promotion.

6.1   Scenario - congratulations on your promotion!

6.1.1   System X: if everyone is using it, but no one maintains it, is it abandonware?

6.2   A brief refresher on syscalls

6.2.1   Finding out about syscalls

6.2.2   Standard C library and glibc

6.3   How to observe a process’ syscalls?

6.3.1   strace and sleep

6.3.2   strace and System X

6.3.3   strace’s problem - overhead

6.3.4   BPF

6.3.5   Other options

6.4   Blocking syscalls for fun and profit part 1 - strace

6.4.1   Experiment 1: breaking the close syscall

6.4.2   Experiment 1: steady state

6.4.3   Experiment 1: implementation

6.4.4   Experiment 1: analysis

6.4.5   Experiment 2: breaking the write syscall

6.4.6   Experiment 2: steady state

6.4.7   Experiment 2: implementation

6.5   Blocking syscalls for fun and profit part 2 - seccomp

6.5.1   Seccomp the easy way - Docker

6.5.2   Seccomp the hard way - libseccomp

6.6   Summary