4 Using cgroups for processes in our Pods

 

This chapter covers

  • Exploring the basics of cgroups
  • Identifying Kubernetes processes
  • Learning how to create and manage cgroups
  • Using Linux commands to investigate cgroup hierarchies
  • Understanding cgroup v2 versus cgroup v1
  • Installing Prometheus and looking at Pod resource usage

The last chapter was pretty granular, and you might have found it a bit theoretical. After all, nobody really needs to build their own Pods from scratch nowadays (unless you’re Facebook). Never fear: from here on out, we’ll start moving a little further up the stack.

In this chapter, we’ll dive a bit deeper into cgroups: the kernel control structures that limit and account for the resources a group of processes can use. In the previous chapter, we implemented a simple cgroup boundary for a Pod that we made all by ourselves. This time around, we’ll create a “real” Kubernetes Pod and investigate how the kernel manages that Pod’s cgroup footprint. Along the way, we’ll go through some silly, but nevertheless instructive, examples of why cgroups exist. We’ll conclude with a look at Prometheus, the time-series metrics aggregator that has become the de facto standard for metrics and observability platforms in the cloud native space.
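Before we get to Pods, you can get a feel for cgroups on any Linux box. The sketch below (paths and output formats are standard Linux interfaces, not anything Kubernetes-specific) shows how to check which cgroup your own shell belongs to and whether the host is running cgroup v1 or v2, a distinction we’ll return to later in this chapter:

```shell
# Which cgroup does the current process belong to?
# On cgroup v2 (the unified hierarchy) this prints a single line,
# "0::<path>"; on cgroup v1 you get one line per controller
# (cpu, memory, pids, and so on).
cat /proc/self/cgroup

# Which cgroup version does the host mount at /sys/fs/cgroup?
# "cgroup2fs" means v2; "tmpfs" (with per-controller subdirectories
# underneath) indicates a v1 layout.
stat -fc %T /sys/fs/cgroup
```

On a Kubernetes node, running the first command inside a container shows the cgroup path the kubelet and container runtime carved out for that Pod, which is exactly the footprint we’ll be investigating.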

4.1 Pods are idle until the prep work completes

4.2 Processes and threads in Linux

4.2.1 systemd and the init process

4.2.2 cgroups for our process

4.2.3 Implementing cgroups for a normal Pod

4.3 Testing the cgroups

4.4 How the kubelet manages cgroups

4.5 Diving into how the kubelet manages resources

4.5.1 Why can’t the OS use swap in Kubernetes?

4.5.2 Hack: The poor man’s priority knob

4.5.3 Hack: Editing HugePages with init containers

4.5.4 QoS classes: Why they matter and how they work

4.5.5 Creating QoS classes by setting resources

4.6 Monitoring the Linux kernel with Prometheus, cAdvisor, and the API server