6 Caching
This chapter covers
- Caching with different strategies
- Cache consistency, coherence, and invalidation
- Maximizing cache hit ratio
- Cache replacement policies
Welcome to the last chapter in this part of the book, which covers latency optimizations involving data. We have discussed colocation, replication, and partitioning as techniques for optimizing for low latency, and we have seen that each has its upsides and downsides in complexity, consistency, and performance. To wrap up the discussion, we now turn to a technique most developers are familiar with: caching.
Caching is a technique that speeds up data retrieval by keeping a temporary copy of the data closer to where it is accessed. With caching, a backing store such as a database holds the primary copy of the data, and you cache that data in one or more locations to speed up access. If this sounds similar to colocation or replication, that’s because there are many similarities. However, caching comes with its own trade-offs in latency and complexity, which are the topic of this chapter.
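To make the relationship between the cache and the backing store concrete, here is a minimal sketch of one common caching strategy, often called cache-aside, in Python. The in-memory `cache` dictionary, the `fetch_from_database` function, and the `TTL_SECONDS` constant are hypothetical stand-ins for a real cache server and database client:

```python
import time

# Hypothetical in-memory cache: maps a key to (value, expiry timestamp).
cache = {}
TTL_SECONDS = 60  # how long a cached copy stays fresh (assumed value)

def fetch_from_database(key):
    # Stand-in for a query against the backing store (the primary copy).
    return f"value for {key}"

def get(key):
    entry = cache.get(key)
    if entry is not None:
        value, expires_at = entry
        if time.monotonic() < expires_at:
            return value  # cache hit: serve the temporary copy
        del cache[key]    # entry expired: drop the stale copy
    # Cache miss: read the primary copy, then keep a temporary copy.
    value = fetch_from_database(key)
    cache[key] = (value, time.monotonic() + TTL_SECONDS)
    return value

print(get("user:42"))  # miss: reads the backing store, fills the cache
print(get("user:42"))  # hit: served from the cache
```

On a hit, the request never touches the database; on a miss, the value is read once and then served from the cache until its time to live expires. The time to live is one simple way to bound how stale the temporary copy can get, a concern we return to when discussing cache consistency and invalidation.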
6.1 Why cache data?
Typically, you should consider caching over other latency optimization techniques if your application or system:
- Doesn’t need transactions or complex queries.
- Cannot be changed, which makes using techniques such as replication hard.
- Has compute or storage constraints that rule out other techniques.