10 Chaos in Kubernetes

 

This chapter covers

  • Quick introduction to Kubernetes
  • Designing chaos experiments for software running on Kubernetes
  • Killing subsets of applications running on Kubernetes to test their resilience
  • Injecting network slowness using a proxy

It’s time to cover Kubernetes (https://kubernetes.io/). Anyone working in software engineering would have a hard time not hearing it mentioned, at the very least. I have never seen an open source project become so popular so quickly. I remember going to one of the first editions of KubeCon, in London in 2016, to evaluate whether investing any time in this entire Kubernetes thing was worth it. Fast-forward to 2020, and Kubernetes expertise is now one of the most in-demand skills!

Kubernetes solves (or at least makes it easier to solve) a lot of problems that arise when running software across a fleet of machines. Its wide adoption indicates that it might be doing something right. But, like everything else, it’s not perfect, and it adds its own complexity to the system. That complexity needs to be managed and understood, and it lends itself well to the practices of chaos engineering.

Kubernetes is a big topic, so I’ve split it into three chapters. The sections of this chapter are:

10.1 Porting things onto Kubernetes

10.1.1 High-Profile Project documentation

10.1.2 What’s Goldpinger?

10.2 What’s Kubernetes (in 7 minutes)?

10.2.1 A very brief history of Kubernetes

10.2.2 What can Kubernetes do for you?

10.3 Setting up a Kubernetes cluster

10.3.1 Using Minikube

10.3.2 Starting a cluster

10.4 Testing out software running on Kubernetes

10.4.1 Running the ICANT Project

10.4.2 Experiment 1: Kill 50% of pods

10.4.3 Party trick: Kill pods in style

10.4.4 Experiment 2: Introduce network slowness

Summary