Chapter 12. Scaling your system with sharding

 

This chapter covers

  • Sharding motivation and architecture
  • Setting up and loading a sample shard cluster
  • Querying and indexing a sharded cluster
  • Choosing a shard key
  • Deploying sharding in production

With the increasing scale of modern applications, it’s become more and more expensive, and in some cases impossible, to get a single machine powerful enough to handle the load. One solution to the problem is to pool the capacity of a large number of less powerful machines. Sharding in MongoDB is designed to do just that: partition your database into smaller pieces so that no single machine has to store all the data or handle the entire load. On top of that, sharding in MongoDB is transparent to the application, which means the interface for querying a sharded cluster is exactly the same as the interface for querying a replica set or a single mongod server instance.

We’ll begin the chapter with an overview of sharding. We’ll go into detail about what problems it tries to solve and how to know when you need it. Next, we’ll talk about the components that make up a sharded cluster. Then, we’ll cover the two different ways to shard, and scratch the surface of MongoDB’s range-based partitioning.

12.1. Sharding overview

12.2. Understanding components of a sharded cluster

12.3. Distributing data in a sharded cluster

12.4. Building a sample shard cluster

12.5. Querying and indexing a shard cluster

12.6. Choosing a shard key

12.7. Sharding in production

12.8. Summary