Autoscaling awakens the engineering imagination in a way that few topics do. Most of the systems we build seem lifeless or mindless. But to build a system that appears to breathe is somehow uniquely fascinating. Depressingly, though, autoscaling turns out to be easy to spell, yet hard to achieve. The system that breathes peacefully today is yelling obscenities tomorrow.
My goal in this chapter is to explain the basic structure and functioning of the components that manage scaling in Knative Serving: the Autoscaler, the Activator, and the Queue-Proxy. Most of the time you will not need to think about them, because they embody the accumulated observations and insights of the Knative authors. But these are dynamic systems that exhibit dynamic complexity, which means that you will occasionally be surprised. A grasp of the components will help you to moderate your surprise.