Chapter 15. Automatic scaling of pods and cluster nodes
This chapter covers
- Configuring automatic horizontal scaling of pods based on CPU utilization
- Configuring automatic horizontal scaling of pods based on custom metrics
- Understanding why vertical scaling of pods isn’t possible yet
- Understanding automatic horizontal scaling of cluster nodes
Applications running in pods can be scaled out manually by increasing the replicas field in the ReplicationController, ReplicaSet, Deployment, or other scalable resource. Pods can also be scaled vertically by increasing their containers' resource requests and limits (though this can currently only be done at pod creation time, not while the pod is running). Although manual scaling is fine when you can anticipate load spikes in advance or when the load changes gradually over longer periods of time, requiring manual intervention to handle sudden, unpredictable traffic increases isn't ideal.
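For example, you can scale a Deployment out by hand either declaratively, by raising the replicas field in its spec and reapplying the manifest, or imperatively with the kubectl scale command. The Deployment name kubia and the replica count used here are only illustrative:

# Declaratively: set the desired replica count in the Deployment's spec
spec:
  replicas: 4

# Imperatively: the same change with kubectl (Deployment name "kubia" is just an example)
$ kubectl scale deployment kubia --replicas=4

Either way, you're the one deciding when to scale and by how much; the controller only makes sure the actual number of pods matches the number you asked for.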
Luckily, Kubernetes can monitor your pods and scale them up automatically as soon as it detects an increase in the CPU usage or some other metric. If running on a cloud infrastructure, it can even spin up additional nodes if the existing ones can’t accept any more pods. This chapter will explain how to get Kubernetes to do both pod and node autoscaling.
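As a quick preview of what the following sections explain in detail, a horizontal autoscaler can be created imperatively with the kubectl autoscale command; the target Deployment name and the thresholds shown here are arbitrary example values:

$ kubectl autoscale deployment kubia --cpu-percent=30 --min=1 --max=5

This creates a HorizontalPodAutoscaler object that adjusts the Deployment's replica count to keep the pods' average CPU utilization around 30%, while never going below one or above five replicas.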
The autoscaling feature in Kubernetes was completely rewritten between versions 1.6 and 1.7, so be aware that you may find outdated information on this subject online.