7 Partitioning

 

This chapter covers

  • The concept of partitioning
  • Static and dynamic partitioning
  • Vertical and horizontal partitioning
  • Item-based lookup versus directory-based lookup
  • Common strategies

Two techniques are essential in guaranteeing the scalability and reliability of a distributed system: partitioning and replication. Partitioning seeks to overcome scalability limitations of a single component while replication seeks to overcome reliability limitations of a single component.

First up: Partitioning!

7.1 Encyclopedias and volumes

When I was a teenager, one of my most cherished possessions was a small encyclopedia. An encyclopedia is a compilation of entries that are sorted alphabetically. However, my encyclopedia was too extensive to fit into a single book. Instead, my encyclopedia was divided into multiple volumes (see Figure 7.1).

Figure 7.1 Encyclopedias and volumes

The encyclopedia represents a logical object while the volumes represent physical objects. Logically, the encyclopedia contains all entries, but physically, the volumes contain disjoint subsets of entries.

The authors could have placed any item in a randomly selected volume. However, doing so would have made the encyclopedia completely useless. If I needed to find an item, I would be forced to scan the entire encyclopedia to locate the item or come to the realization that the item doesn't exist.

7.2 Thinking in partitions

7.3 The mechanics of partitioning and balancing

7.4 (Re)Partitioning

7.4.1 Types of partitioning

7.4.2 Data item to partition assignment strategies

7.5 Common item-based assignment strategies

7.5.1 Range partitioning

7.5.2 Hash partitioning

7.6 Repartitioning

7.6.1 Range partitioning

7.6.2 Hash partitioning

7.7 Consistent hashing

7.8 (Re)Balancing and over partitioning

7.9 Summary