5 Resource Management
This chapter covers:
- How Kubernetes allocates the finite resources in your cluster
- Configuring your workload to request just the resources it needs
- Overcommitting resources to improve your performance-to-cost ratio
- Avoiding a single point of failure with a highly available deployment strategy
- Targeting and avoiding specific groups of nodes for deployments
In Chapter 2, we talked about how containers provide a new level of isolation, each with their own resources, and in Chapter 3 we saw that the schedulable unit in Kubernetes is the Pod (basically just a collection of containers). In this chapter we go deeper into the “container orchestrator” at the heart of Kubernetes, the scheduler, and look at how Pods are allocated to machines, as well as the information you need to give the system to have things configured correctly.
Knowing how Pods are scheduled to Nodes helps you make better architectural decisions around allocation, bursting, overcommit, availability, and reliability.
5.1 Pod Scheduling
The Kubernetes scheduler performs (at a basic level) a resource-based allocation of Pods to Nodes, and is really the brains of the whole system. When you submit your configuration to Kubernetes (as we did in Chapters 3 and 4), it’s the scheduler that does the heavy lifting of finding a node in your cluster with enough resources, and then tasks that node with booting and running the containers in your Pods.
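To give a sense of what that configuration contains, here is a minimal sketch of a Pod manifest that declares resource requests, which are the primary signal the scheduler uses when picking a node. The name resource-demo, the nginx image, and the specific request values are placeholders chosen for illustration; requesting the right amounts for your own workload is covered later in this chapter.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: resource-demo      # hypothetical name, for illustration only
spec:
  containers:
  - name: app
    image: nginx           # any container image works here; nginx is just an example
    resources:
      requests:
        cpu: 250m          # a quarter of a CPU core
        memory: 256Mi      # 256 mebibytes of memory
```

When you submit a manifest like this (for example with kubectl create, as in earlier chapters), the scheduler looks for a node with at least this much unallocated CPU and memory and assigns the Pod there.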