Chapter 5. Autoscaling with metrics


This chapter covers

  • Using container metrics
  • Creating a Horizontal Pod Autoscaler
  • Setting resource requests and limits
  • Autoscaling applications
  • Load testing with the Apache HTTP server benchmarking tool

In the last chapter, you learned how to monitor the health and status of an application. OpenShift deployments use replication controllers (RCs) under the covers to ensure that a static number of pods is always running, and readiness and liveness probes verify that those pods start as expected and keep behaving as expected. The number of pods servicing a given workload can also be changed to a new static number with a single command or the click of a button.
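
For example, changing a deployment to a new static pod count is a one-line operation. The following is a minimal sketch, assuming a deployment configuration named app-cli (a hypothetical name) already exists in the current project:

  oc scale dc/app-cli --replicas=3   # set the desired pod count to 3
  oc get pods                        # confirm that three app-cli pods are running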

This new deployment model gives you much better resource utilization than the traditional virtual machine (VM) model, but it’s not a silver bullet for operational efficiency. One of the big IT challenges with VMs is resource utilization: when deploying VMs, developers traditionally ask for far more CPU and RAM than the application actually needs. Not only is changing VM resources difficult, but many developers have no real idea what resources their application requires in the first place. Even at large companies like Google and Netflix, predicting application workload demand is so challenging that tools are used to scale applications automatically as demand changes.
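
In OpenShift, those guesses can be replaced with explicit, adjustable settings: resource requests and limits on the deployment, plus an autoscaler that changes the pod count for you. As a preview of where this chapter is headed, here is a minimal sketch using the same hypothetical app-cli deployment configuration; each piece is explained in the sections that follow:

  # Declare how much CPU and memory each pod requests and may consume
  oc set resources dc/app-cli --requests=cpu=100m,memory=256Mi --limits=cpu=500m,memory=512Mi

  # Keep between 1 and 5 pods, adding pods when average CPU use
  # exceeds 75% of each pod's requested CPU
  oc autoscale dc/app-cli --min 1 --max 5 --cpu-percent=75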

5.1. Determining expected workloads is difficult

5.2. Installing OpenShift metrics

5.3. Using pod metrics to trigger pod autoscaling

5.4. Summary