7 Eliminating work


This chapter covers

  • Eliminating work by taming algorithmic complexity
  • Reducing serialization overheads
  • Managing memory for low latency
  • Mitigating OS overheads
  • Replacing slow computation with precomputation

Welcome to the third part of the book!

In the previous part of the book, we examined techniques for organizing data when designing a low-latency application, so that data access does not become a latency bottleneck:

  • Colocation to bring two components closer together.
  • Replication to maintain multiple (consistent) copies of the data.
  • Partitioning to reduce synchronization costs.
  • Caching to temporarily keep a copy of the data.

In other words, we looked at how your data organization decisions impact latency and what you can do to mitigate that. In this third part of the book, we switch gears to the computational perspective: how to structure your application logic itself when building for low latency.

7.1 Overview

7.2 Algorithmic complexity

7.3 Serializing and deserializing

7.4 Memory management

7.4.1 Dynamic memory allocation

7.4.2 Garbage collection

7.4.3 Virtual and physical memory

7.4.4 Demand paging

7.4.5 Memory topology

7.5 Operating system overheads

7.5.1 Scheduling delay and context switching

7.5.2 Background tasks and interrupts

7.5.3 Network stack

7.6 Precomputation

7.7 Putting it together: Benchmarking with Criterion

7.8 Summary