7 Observability: Understanding the behavior of your services

 

This chapter covers

  • Collecting basic request-level metrics
  • Understanding Istio’s standard service-to-service metrics
  • Using Prometheus to scrape workload and control-plane metrics
  • Adding new metrics in Istio to track in Prometheus

Recently, you may have noticed the term observability creeping into the vocabulary of software engineers, operations staff, and site-reliability teams. These teams have to deal with the near-exponential increase in complexity that comes with operating a microservices-style architecture on cloud infrastructure. When we deploy an application as tens or hundreds of services (or more), we increase the number of moving pieces, our reliance on the network for requests to succeed, and the number of things that can and do go wrong.

7.1 What is observability?

7.1.1 Observability vs. monitoring

7.1.2 How Istio helps with observability

7.2 Exploring Istio metrics

7.2.1 Metrics in the data plane

7.2.2 Metrics in the control plane

7.3 Scraping Istio metrics with Prometheus

7.3.1 Setting up Prometheus and Grafana

7.3.2 Configuring the Prometheus Operator to scrape the Istio control plane and workloads

7.4 Customizing Istio’s standard metrics

7.4.1 Configuring existing metrics

7.4.2 Creating new metrics

7.4.3 Grouping calls with new attributes

Summary