4 Replication
This chapter covers
- Understanding latency of strong and eventual consistency
- Replicating with single-leader, multi-leader, or leaderless approach
- Replicating synchronously or asynchronously
- Applying replicated state machines for replication
In the previous chapter, we explored colocation as a technique to reduce latency by bringing data close to computation. However, colocation to a single location creates a fundamental problem: what happens when that location fails or becomes unavailable? But also, what if you want to colocate in multiple locations?
High availability and fault tolerance are essential for most production systems. For example, if the network fails or your database node crashes, you still want to keep your application operational. To achieve this, you fundamentally need redundancy in your data by maintaining multiple copies across different nodes and, in some cases, even across different data centers. Replication is a technique that maintains multiple copies of the same data across various locations while ensuring consistency between the copies.