6 Caching
This chapter covers
- Caching with different strategies
- Cache consistency, coherence, and invalidation
- Maximizing cache hit ratio
- Cache replacement policies
Welcome to the last chapter in this part of the book, which covers latency optimizations involving data. We have discussed colocation, replication, and partitioning as techniques for optimizing for low latency, and we have seen that each has its upsides and downsides in complexity, consistency, and performance. To wrap up the discussion, we now turn to a technique most developers are familiar with: caching.
Caching is a technique that speeds up data retrieval by keeping a temporary copy of the data closer to where it is accessed. With caching, a backing store such as a database holds the primary copy of the data, and you cache that data in one or more locations to speed up access. If this sounds similar to colocation or replication, that’s because there are many similarities. However, caching comes with its own trade-offs in latency and complexity, which are the topic of this chapter.
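To make the relationship between the cache and the backing store concrete, here is a minimal sketch of one common caching strategy, often called cache-aside, in Python. The in-memory `cache` dictionary, the `fetch_from_database` function, and the `TTL_SECONDS` constant are hypothetical stand-ins for a real cache server and database client:

```python
import time

# Hypothetical in-memory cache: maps a key to (value, expiry timestamp).
cache = {}
TTL_SECONDS = 60  # how long a cached copy stays fresh (assumed value)

def fetch_from_database(key):
    # Stand-in for a query against the backing store (the primary copy).
    return f"value for {key}"

def get(key):
    entry = cache.get(key)
    if entry is not None:
        value, expires_at = entry
        if time.monotonic() < expires_at:
            return value  # cache hit: serve the temporary copy
        del cache[key]    # entry expired: drop the stale copy
    # Cache miss: read the primary copy, then keep a temporary copy.
    value = fetch_from_database(key)
    cache[key] = (value, time.monotonic() + TTL_SECONDS)
    return value

print(get("user:42"))  # miss: reads the backing store, fills the cache
print(get("user:42"))  # hit: served from the cache
```

On a hit, the request never touches the database; on a miss, the value is read once and then served from the cache until its time to live expires. The time to live is one simple way to bound how stale the temporary copy can get, a concern we return to when discussing cache consistency and invalidation.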
6.1 Why cache data?
Typically, you should consider caching over other latency optimization techniques if your application or system:
- Doesn’t need transactions or complex queries.
- Cannot be changed, which makes using techniques such as replication hard.
- Has compute or storage constraints that rule out other techniques.