This chapter covers
- Understanding the benefits and downsides of partitioning
- Partitioning strategies and how to choose one
- Request routing when partitioning
- Mitigating skewed workloads and hot partitions
In the previous two chapters, we explored techniques for reducing latency with colocation and replication. Colocation places related components, such as business logic and the data it uses, close to each other, which reduces latency by shortening the network distance that communication between them has to cover. For example, an application that uses serverless functions for backend logic may benefit from a database that is colocated with the serverless runtime.
Replication, on the other hand, is a technique for copying relevant data to multiple locations while maintaining consistency between the copies. With replication, you get the same benefits as with colocation, but in multiple locations. However, replicating the entire dataset to many locations can be impractical due to storage costs and network bandwidth requirements. Additionally, keeping the replicas consistent can introduce significant coordination overhead: every write may need to be synchronized across locations, which creates latency bottlenecks instead of removing them.