1 Introduction

 

Chapter 1 from Software Telemetry by Jamie Riedesel

This chapter covers

  • What telemetry systems are
  • What telemetry means to different technical groups
  • Challenges unique to telemetry systems

Telemetry is the feedback you get from your production systems that tells you what’s going on inside them, all to improve your ability to make decisions about those systems. For NASA, the production system might be a rover on Mars, but most of the rest of us have our production systems right here on Earth (and sometimes in orbit around Earth). Whether it’s the amount of power left in a rover’s batteries or the number of containers live in production right now, it’s all telemetry. Modern computing systems, especially those operating at scale, live and breathe telemetry; it’s how we can manage systems that large at all. Telemetry use is ubiquitous in our industry.

1.1       Defining the styles of telemetry

1.1.1   Defining centralized logging

1.1.2   Defining metrics

1.1.3   Defining distributed tracing

1.1.4   Defining Security Information Event Management

1.2       How telemetry is consumed by different teams

1.2.1   Telemetry usage by Operations, DevOps, and SRE teams

1.2.2   Telemetry usage by Security and Compliance teams

1.2.3   Telemetry usage by Software Engineering and SRE teams

1.2.4   Telemetry usage by Customer Support teams

1.2.5   Telemetry usage by Business Intelligence

1.3       Challenges facing telemetry systems

1.3.1   Chronic under-investment harms decision making

1.3.2   Diverse needs resist standardization

1.3.3   Information spills and cleaning them up

1.3.4   Court orders break your assumptions

1.4       What you will learn

1.5       Summary