3 Colocation
This chapter covers
- Colocating code and data as a latency optimization
- Optimizing for low latency in distributed systems
- Optimizing for low latency in multicore systems
Welcome to the second part of the book!
In the book's first part, Chapters 1 and 2, we explored latency as a performance metric and learned how to model and measure it, an essential prerequisite for optimizing for low latency. In this second part of the book, we turn our focus to optimizing data-related latency.
The first data-related optimization we'll look at is colocation. Consider the following scenario: your web application queries a database server located 6000 kilometers (~3700 miles) away, adding 60 ms of network round-trip latency to every request. By moving that database to the same data center as your application, you reduce query latency to under 10 ms, a 6x improvement from a simple change in placement.
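The 60 ms figure is no accident: it is close to the physical lower bound imposed by the speed of light in fiber. A back-of-the-envelope sketch (the ~200,000 km/s figure is the commonly used approximation of two-thirds the speed of light in vacuum; real links add routing and queuing delay on top):

```python
# Minimum round-trip latency dictated by distance alone.
SPEED_OF_LIGHT_IN_FIBER_KM_S = 200_000  # roughly 2/3 of c in vacuum

def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound on round-trip time over a fiber link of the given length."""
    one_way_s = distance_km / SPEED_OF_LIGHT_IN_FIBER_KM_S
    return 2 * one_way_s * 1000  # there and back, in milliseconds

print(min_round_trip_ms(6000))  # 6000 km away: 60.0 ms per round trip
print(min_round_trip_ms(100))   # same region: 1.0 ms
```

No protocol tuning can beat this bound; only moving the data closer can.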
Colocation works by minimizing the physical distance data must travel. If your application accesses a database, bringing the database closer makes data access faster. Taking this to the extreme, embedding a database directly in your application eliminates network communication entirely. Colocation also applies to system-to-system communication, as high-frequency traders discovered long ago: placing their automated trading servers in the same data center as the exchange gives them a crucial latency advantage over competitors located elsewhere.
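To make the extreme case concrete, here is a minimal sketch using SQLite (one example of an embedded database, not necessarily the one you'd choose): the database runs inside the application process, so a query is a function call rather than a network round trip. The table and data are illustrative.

```python
import sqlite3

# An embedded database lives in the application's own process and
# address space, so no network hop is involved in a query.
conn = sqlite3.connect(":memory:")  # in-memory database in this process
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("Ada",))
row = conn.execute("SELECT name FROM users WHERE id = 1").fetchone()
print(row[0])  # prints "Ada"
conn.close()
```

The trade-off is that the data now lives with a single application instance, which is why later chapters treat colocation as one placement strategy among several rather than a universal answer.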