Errors happen. Code has bugs. Hardware, software, and networks are unreliable. Failures happen regularly for all types of applications, not just for microservices. But microservices applications are more complex, and so problems can become considerably more difficult to debug as we grow our application. The more microservices we maintain, the greater the chance at any given time that some of those microservices are misbehaving!
We can’t avoid problems entirely. It doesn’t matter whether they are caused by human error or unreliable infrastructure: problems will happen. But the fact that problems can’t always be avoided doesn’t mean we shouldn’t try to mitigate them. A well-engineered application expects problems, even when the specific nature of some problems can’t be anticipated.
As our application evolves to be more complex, we’ll need techniques to combat problems and keep our microservices healthy. Our industry has developed many best practices and patterns for dealing with problems. We’ll cover some of the most useful ones in this chapter. Following this guidance will make our application run more smoothly and be more reliable, resulting in less stress and easier recovery from problems when they do happen.