1 Introduction
This chapter covers
- Defining what latency means
- Measuring latency
- Understanding why to optimize for latency
- Comparing latency to throughput and bandwidth
- Weighing trade-offs when optimizing for latency
This book is about how to build low-latency applications.
Latency is crucial across a wide range of use cases today. When your application suddenly slows down under load, when a database query that should take milliseconds stretches into seconds, or when users abandon your service because pages won't load, you need concrete solutions. However, many low-latency techniques are effectively developer folklore, hidden in blog posts, mailing lists, and side notes in books on performance optimization.
This book provides the specific techniques, tools, and mental models you need to diagnose latency problems and address them systematically. Instead of hunting through scattered blog posts and forum discussions for fragments of information, you get a comprehensive guide to how latency works across the entire stack and what to do about it. It is the book I always wished I had when I was grappling with latency issues. Although this book focuses on applying the techniques in practice, we'll also cover enough background theory to keep the two in balance.