6 Scaling up
This chapter covers:
- Scaling Pods and Nodes manually
- Configuring Kubernetes platforms to add and remove nodes automatically based on the resources your Pods require
- Using request-based metrics to dynamically scale Pod replicas
- Using low priority “balloon” pods to provision burst capacity
- Architecting apps so that they can be scaled
Now that we have the application deployed, with health checks in place to keep it running without intervention, it’s a good time to look at how you’re going to scale up. I’ve named this chapter “Scaling up” because I think everyone cares deeply about whether their system architecture can handle being scaled up when the application becomes wildly successful and needs to serve all those new users. But don’t worry: I’ll also cover scaling down, so you can save money during the quiet periods.
Our goal is ultimately to operationalize our deployment using automatic scaling. That way, we can be fast asleep, or relaxing on a beach in Australia, while our application responds to traffic spikes dynamically. To get there, we’ll need to ensure that the application is capable of scaling, understand the scaling interactions of Pods and Nodes in the Kubernetes cluster, and choose the right metrics to configure an autoscaler to do it all for us.
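As a preview of where we’re headed, here is a minimal sketch of the kind of autoscaling configuration this chapter builds up to: a HorizontalPodAutoscaler that scales a Deployment based on CPU utilization relative to the Pods’ resource requests. The Deployment name `timeserver` and the replica bounds here are illustrative assumptions, not fixed requirements.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: timeserver        # hypothetical name, matching an example Deployment
spec:
  scaleTargetRef:         # the workload to scale
    apiVersion: apps/v1
    kind: Deployment
    name: timeserver      # hypothetical Deployment name
  minReplicas: 1          # floor for quiet periods
  maxReplicas: 10         # ceiling for traffic spikes
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50   # target 50% of the Pods' CPU requests
```

Don’t worry if the details aren’t clear yet; the rest of the chapter covers how request-based metrics like this one drive scaling decisions, and how node provisioning keeps up with the Pods the autoscaler creates.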