1 Introduction


Telemetry is the feedback you get from your production systems that tells you what’s going on in there, feedback that improves your ability to make decisions about those systems. For NASA, the production system might be a rover on Mars, but most of the rest of us have our production systems right here on Earth (and sometimes in orbit around Earth). Whether it’s the amount of power left in a rover’s batteries or the number of containers live in production right now, everything is telemetry. Modern computing systems, especially those operating at scale, live and breathe telemetry; it is what makes managing systems that large possible at all. Telemetry is ubiquitous in our industry.

1.1 Defining the styles of telemetry

1.1.1 Defining centralized logging

1.1.2 Defining metrics

1.1.3 Defining distributed tracing

1.1.4 Defining SIEM

1.2 How telemetry is consumed by different teams

1.2.1 Telemetry use by Operations, DevOps, and SRE teams

1.2.2 Telemetry use by Security and Compliance teams

1.2.3 Telemetry use by Software Engineering and SRE teams

1.2.4 Telemetry use by Customer Support teams

1.2.5 Telemetry use by business intelligence

1.3 Challenges facing telemetry systems

1.3.1 Chronic underinvestment harms decision-making

1.3.2 Diverse needs resist standardization

1.3.3 Information spills and cleaning them up to avoid legal problems

1.3.4 Court orders break your assumptions

1.4 What you will learn