5 Evolutionary Observability

 

This chapter covers

  • Why observability is critical for both a platform and its users
  • Providing observability as a service to platform users
  • Understanding how observability platforms work and when they are needed
  • Using Service Level Objectives (SLOs) to gain user confidence

Imagine that work on the platform at Epetech starts with a backlog of stories, and things are going well. Prioritizing those stories leads to a steady stream of delivery, and because observability-driven development practices have been evangelized across the team, plenty of telemetry data is available to uncover and diagnose issues. More importantly, the business has had a clear picture of the value that platform efforts are returning right from the start. Across the engineering department, we are improving observability and seeing the benefits! Beyond diagnosing and troubleshooting issues quickly, leaders at the organization are starting to see that data can be correlated across applications and services to show how systems are performing across the whole portfolio, and they want to know more. They want to define new queries quickly and easily so that they can spot trends in the areas that are succeeding and in those that are failing.

5.1 Why observability matters

5.1.1 Observability is more than metrics and alerts

5.1.2 Use cases for observability beyond basic monitoring of applications

5.1.3 What does good look like?

5.1.4 Viewing observability through a single pane of glass

5.2 Observability as a platform service

5.2.1 The end-user access experience

5.2.2 Automatic collection of customer data

5.2.3 Who needs to respond when things need attention?

5.3 Observability platform as a separate internal product

5.3.1 Architecture of an observability platform

5.3.2 Should you build or buy?

5.3.3 Cross-platform observability

5.3.4 Strategies to drive adoption

5.4 Observability of published SLIs, SLOs, and SLAs

5.4.1 SLOs as code

5.5 Summary