Chapter 15. Automatic scaling of pods and cluster nodes

 

This chapter covers

  • Configuring automatic horizontal scaling of pods based on CPU utilization
  • Configuring automatic horizontal scaling of pods based on custom metrics
  • Understanding why vertical scaling of pods isn’t possible yet
  • Understanding automatic horizontal scaling of cluster nodes

Applications running in pods can be scaled out manually by increasing the replicas field in a ReplicationController, ReplicaSet, Deployment, or other scalable resource. Pods can also be scaled vertically by increasing their containers’ resource requests and limits (though this can currently be done only at pod creation time, not while the pod is running). Manual scaling is fine when you can anticipate load spikes in advance or when the load changes gradually over longer periods of time, but requiring manual intervention to handle sudden, unpredictable traffic increases isn’t ideal.

Luckily, Kubernetes can monitor your pods and scale them out automatically as soon as it detects an increase in CPU usage or some other metric. When running on cloud infrastructure, it can even spin up additional nodes if the existing ones can’t accept any more pods. This chapter explains how to get Kubernetes to do both pod and node autoscaling.
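To make the idea of metric-driven scaling concrete, here is a minimal Python sketch of the ratio-based calculation that the Kubernetes documentation describes for horizontal pod autoscaling: the desired replica count is the current count multiplied by the ratio of the observed metric to the target, rounded up. The function name, the default bounds, and the tolerance value are assumptions chosen for illustration, and the real controller applies further rules (such as handling pods that aren't ready and stabilization windows) that this sketch leaves out.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Illustrative sketch only; not the actual controller code.

    current_utilization is assumed to be the average across all pods,
    expressed in the same unit as target_utilization (e.g. percent of
    the requested CPU).
    """
    if current_replicas <= 0 or target_utilization <= 0:
        raise ValueError("replica count and target must be positive")

    ratio = current_utilization / target_utilization

    # Assumed tolerance band: leave the replica count alone when the
    # observed value is already close to the target, to avoid flapping.
    if abs(ratio - 1.0) <= tolerance:
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)

    # Keep the result within the configured (here: hypothetical) bounds.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # Three pods averaging 90% CPU against a 30% target.
    print(desired_replicas(3, 90, 30))
```

With these inputs the observed utilization is three times the target, so the sketch asks for three times as many pods. Rounding up rather than down is the conservative choice: it favors having slightly too much capacity over too little.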

The autoscaling feature in Kubernetes was completely rewritten between versions 1.6 and 1.7, so be aware that you may find outdated information on this subject online.

15.1. Horizontal pod autoscaling

 
 
 

15.1.1. Understanding the autoscaling process

 
 

15.1.2. Scaling based on CPU utilization

 
 
 
 

15.1.3. Scaling based on memory consumption

 
 

15.1.4. Scaling based on other and custom metrics

 
 

15.1.5. Determining which metrics are appropriate for autoscaling

 
 

15.1.6. Scaling down to zero replicas

 
 

15.2. Vertical pod autoscaling

 
 
 
 

15.2.1. Automatically configuring resource requests

 

15.2.2. Modifying resource requests while a pod is running

 
 
 

15.3. Horizontal scaling of cluster nodes

 

15.3.1. Introducing the Cluster Autoscaler

 
 
 

15.3.2. Enabling the Cluster Autoscaler

 
 

15.3.3. Limiting service disruption during cluster scale-down

 
 
 
 

15.4. Summary

 
 
 