At this point in the book, you should have a good idea of what ROI-driven observability means. There is, however, one operations topic we haven’t discussed yet that is needed to complete the picture: how satisfied is the consumer of a service, and how can we tell, based on data? The consumer doesn’t have to be an external customer; in larger organizations especially, consumers could be other business units. Now, don’t get me wrong: there’s nothing more motivating than a snarky tweet or a thoughtful comment on the orange site (aka Hacker News). However, wouldn’t it be nice if we could automate the whole process?
If you step back, you will find that DevOps and site reliability engineering (SRE) took off in the past decade, with the former being more bottom-up and the latter clearly driven by Google. The core concepts and ideas in this chapter are, indeed, a Google invention, and if you want to study every last detail, including best practices, I encourage you to head over to the Google SRE books site (https://sre.google/books/) and read everything. In this chapter, we will take a more practical approach, covering the fundamentals quickly and then showing how to use them.