2 Modeling and measuring latency


This chapter covers

  • Designing with laws of latency in mind
  • Thinking of latency as a distribution
  • Discovering common sources of latency
  • Understanding how latency compounds
  • Measuring latency correctly

In the previous chapter, we discussed what latency is and is not, and why we care. We defined latency as the time delay between a cause and its observed effect, which is context-specific. For example, when you browse a web page, we are generally interested in the request-response latency between the client and the server. For lower-level system components, we might instead be interested in the latency between a packet arriving from the network and the userspace application finishing processing it. We also discussed how latency relates to bandwidth and throughput, and the trade-offs between latency and energy efficiency.

In this chapter, we dive into the details of how to model and measure latency, which is essential as you build for low latency. First, you will learn about two important principles: Little's Law and Amdahl's Law. Both are theoretical results that yield practical insights for system design and performance. Little's Law relates latency, throughput, and concurrency: in a stable system, the average number of requests in flight equals the arrival rate multiplied by the average latency. Amdahl's Law concerns latency and parallelism: the speedup you can achieve by parallelizing a task is bounded by the fraction of the task that must run serially.
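To make the two laws concrete before we examine them in detail, here is a minimal sketch in Python. The language choice and the function names (littles_law_concurrency, amdahl_speedup) are ours, for illustration only; the formulas are the standard statements of the two laws.

    def littles_law_concurrency(arrival_rate_per_s, mean_latency_s):
        # Little's Law: L = lambda * W. In a stable system, the average
        # number of requests in flight (L) equals the arrival rate
        # (lambda) times the average latency (W).
        return arrival_rate_per_s * mean_latency_s

    def amdahl_speedup(parallel_fraction, workers):
        # Amdahl's Law: speedup = 1 / ((1 - p) + p / N), where p is the
        # fraction of the task that parallelizes and N is the number of
        # parallel workers. The serial fraction (1 - p) caps the speedup.
        serial_fraction = 1.0 - parallel_fraction
        return 1.0 / (serial_fraction + parallel_fraction / workers)

    # A service handling 1,000 requests/s at 50 ms mean latency keeps
    # about 50 requests in flight at any moment.
    print(littles_law_concurrency(1_000, 0.050))  # 50.0

    # If 90% of a task parallelizes, 16 workers yield at most ~6.4x.
    print(amdahl_speedup(0.9, 16))  # 6.4

Note what the second example implies: even with 16 workers, the 10% serial portion limits the speedup to 6.4x, and no amount of additional hardware can push it past 10x.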

2.1 Laws of latency

2.1.1 Little’s Law

2.1.2 Amdahl’s Law

2.2 Latency distribution

2.3 Common sources of latency

2.3.1 Physics

2.3.2 CPU and hardware

2.3.3 Virtualization

2.3.4 Operating system, drivers, and firmware

2.3.5 Managed runtime

2.3.6 Application

2.4 Compounding latency

2.5 Measuring latency

2.6 Putting it together: Measuring network latency

2.6.1 Plotting with histograms

2.6.2 Plotting with eCDF

2.7 Summary