1 Introduction
This chapter covers
- Defining what latency means
- Measuring latency
- Understanding why optimizing for latency matters
- Comparing latency to throughput and bandwidth
- Examining the trade-offs when optimizing for latency
This book is about building low-latency systems.
What you will get from reading this book is a solid understanding of what latency is, why it matters, and how to build low-latency systems. Our approach is to strike a balance between practice and theory, between engineering and academic research, so that you will not only know how to build low-latency systems but also understand which solutions to apply and when they are appropriate.
In this first chapter, we will define what latency is and why it matters, because we first need intuition about what we are trying to minimize before we can build low-latency systems. We will also discuss how latency relates to bandwidth and throughput, and the trade-offs with throughput and energy efficiency that come with optimizing for latency.
So let’s get started!
1.1 What is latency?
Latency is a performance metric that measures time delay in your system. As a developer, you have probably talked informally about response time or lag, which are everyday ways of describing latency. However, since this book is about building low-latency systems, we need a more precise definition of latency before we can optimize for it.
The definition of latency we are going to use in this book is as follows:
Latency is the time delay between a cause and its observed effect.
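To make this definition concrete, below is a minimal sketch of how you might measure latency in code. The handle_request() function is a hypothetical placeholder for whatever operation you care about; the sketch simply records the time between triggering the operation (the cause) and seeing its result (the observed effect):

```cpp
#include <chrono>
#include <cstdio>

// Hypothetical stand-in for the operation whose latency we measure.
void handle_request() {
    // ... do some work ...
}

int main() {
    // A monotonic clock is immune to wall-clock adjustments (NTP slews,
    // manual changes), so the measured interval is always meaningful.
    auto start = std::chrono::steady_clock::now();  // the cause
    handle_request();
    auto end = std::chrono::steady_clock::now();    // the observed effect

    // Latency is the elapsed time between cause and observed effect.
    auto elapsed =
        std::chrono::duration_cast<std::chrono::microseconds>(end - start);
    std::printf("latency: %lld us\n", static_cast<long long>(elapsed.count()));
}
```

Note that the sketch uses std::chrono::steady_clock rather than the system clock: the system clock can jump backward or forward when it is adjusted, which would corrupt the measurement.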