4 Replication

This chapter covers

  • Understanding the latency of strong and eventual consistency
  • Replicating with a single-leader, multi-leader, or leaderless approach
  • Replicating synchronously or asynchronously
  • Applying replicated state machines for replication

In the previous chapter, we explored colocation as a technique to reduce latency by bringing data close to computation. However, colocating data in a single location creates a fundamental problem: what happens when that location fails or becomes unavailable? And what if you want to colocate data in multiple locations?

High availability and fault tolerance are essential for most production systems. For example, if the network fails or your database node crashes, you still want your application to stay operational. To achieve this, you need redundancy: multiple copies of your data maintained across different nodes and, in some cases, across different data centers. Replication is a technique that maintains multiple copies of the same data across multiple locations while ensuring consistency between the copies.

4.1 Why replicate data?

4.2 Availability and scalability

4.3 Consistency model

4.3.1 Strong consistency

4.3.2 Eventual consistency

4.3.3 Other consistency models

4.4 Replication strategies

4.4.1 Single-leader replication

4.4.2 Multi-leader replication

4.4.3 Leaderless replication

4.4.4 Read your writes property

4.4.5 Local-first approach

4.5 Asynchronous vs. synchronous replication

4.6 State machine replication

4.7 Case study: Viewstamped replication

4.8 Putting it together: Replicating a key-value store

4.9 Summary