1 Introduction
This chapter covers
- Defining what latency means
- Measuring latency
- Understanding why optimizing for latency matters
- Comparing latency to throughput and bandwidth
- Examining the trade-offs when optimizing for latency
This book is about building low-latency systems.
What you will get from reading this book is a solid understanding of what latency is, why it matters, and how to build low-latency systems. Our approach is to strike a balance between practice and theory, between engineering and academic research, so that you will not only know how to build low-latency systems but also understand which solutions to apply and when they are appropriate.
In this first chapter, we will define what latency is and why it matters, because we first need intuition about what we are trying to minimize before we can build low-latency systems. We will also discuss how latency relates to bandwidth and throughput, and the trade-offs with throughput and energy efficiency that come with optimizing for latency.
So let’s get started!
1.1 What is latency?
Latency is a performance metric that measures time delay in your system. As a developer, you have probably talked informally about response time or lag, which are everyday ways of describing latency. However, since this book is about building low-latency systems, we need a more precise definition of latency before we can optimize for it.
The definition of latency we are going to use in this book is as follows:
Latency is the time delay between a cause and its observed effect.
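To make this definition concrete, below is a minimal sketch of how you might measure latency in code. The handle_request() function is a hypothetical placeholder for whatever operation you care about; the sketch simply records the time between triggering the operation (the cause) and seeing its result (the observed effect):

```cpp
#include <chrono>
#include <cstdio>

// Hypothetical stand-in for the operation whose latency we measure.
void handle_request() {
    // ... do some work ...
}

int main() {
    // A monotonic clock is immune to wall-clock adjustments (NTP slews,
    // manual changes), so the measured interval is always meaningful.
    auto start = std::chrono::steady_clock::now();  // the cause
    handle_request();
    auto end = std::chrono::steady_clock::now();    // the observed effect

    // Latency is the elapsed time between cause and observed effect.
    auto elapsed =
        std::chrono::duration_cast<std::chrono::microseconds>(end - start);
    std::printf("latency: %lld us\n", static_cast<long long>(elapsed.count()));
}
```

Note that the sketch uses std::chrono::steady_clock rather than the system clock: the system clock can jump backward or forward when it is adjusted, which would corrupt the measurement.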