chapter five

5 Partitioning

This chapter covers

  • Understanding the benefits and downsides of partitioning
  • Partitioning strategies and how to choose one
  • Request routing when partitioning
  • Skewed workloads and hot partitions, and how to mitigate them

In the previous two chapters, we explored techniques for reducing latency with colocation and replication. Colocation places related components, such as business logic and the data it uses, close to each other, which minimizes the network distance between them and therefore the communication latency. For example, an application that uses serverless functions for backend logic may benefit from a database that is colocated with the serverless runtime. Replication, on the other hand, is a technique for keeping copies of relevant data at multiple locations while maintaining consistency between the copies. With replication, you get the same kind of benefits as with colocation, but at multiple locations.

5.1 Why partition data?

5.2 Physical partitioning strategies

5.2.1 Horizontal partitioning

5.2.2 Vertical partitioning

5.2.3 Hybrid partitioning

5.3 Logical partitioning strategies

5.3.1 Functional partitioning

5.3.2 Geographical partitioning

5.3.3 User-based partitioning

5.3.4 Time-based partitioning

5.3.5 Overpartitioning

5.4 Request routing

5.4.1 Direct routing

5.4.2 Proxy routing

5.4.3 Forward routing

5.5 Partition imbalance

5.5.1 Hot partitions

5.5.2 Skewed workloads

5.6 Putting it together: Horizontal partitioning with SQLite

5.7 Summary