chapter one

1 Introduction

This chapter covers

Defining what latency means
Measuring latency
Understanding why optimizing for latency matters
Comparing latency to throughput and bandwidth
Trade-offs when optimizing for latency

This book is about how to build low-latency applications. Latency is crucial across a wide range of use cases today. When your application suddenly slows down under load, when a database query that should take milliseconds stretches into seconds, or when users abandon your service because pages won’t load, you need concrete solutions. However, many low-latency techniques are effectively developer folklore, hidden in blog posts, mailing lists, and side notes in books on performance optimization.

This book provides the specific techniques, tools, and mental models you need to diagnose latency problems and address them systematically. Instead of hunting through scattered blog posts and forum discussions for fragments of a solution, you get a comprehensive guide to how latency works across the entire stack and what to do about it. It is the book I always wished I had when I was grappling with latency issues. Although the focus is on applying the techniques in practice, we’ll also cover enough background to balance theory and practice.

1.1 What is latency?

1.2 How is latency measured?

1.3 Why does latency matter?

1.3.1 User experience

1.3.2 Real-time systems

1.3.3 Efficiency

1.4 What latency is not

1.5 Latency vs. bandwidth

1.6 Latency vs. energy

Summary