Chapter 12. Realtime views

This chapter covers

  • The theoretical model of the speed layer
  • How the batch layer eases the responsibilities of the speed layer
  • Using random-write databases for realtime views
  • The CAP theorem and its implications
  • The challenges of incremental computation
  • Expiring data from the speed layer

Up to this point, our discussion of the Lambda Architecture has revolved around the batch and serving layers—components that involve computing functions over every piece of data you have. These layers satisfy all the desirable properties of a data system save one: low-latency updates. The sole job of the speed layer is to satisfy this final requirement.

Running functions over the entire master dataset (potentially petabytes of data) is a resource-intensive operation. To lower the latency of updates as much as possible, the speed layer must take a fundamentally different approach from the batch and serving layers: it is based on incremental computation instead of batch computation.
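The difference between the two styles of computation can be illustrated with a minimal Python sketch. The pageview-count view, the record layout, and the function names below are assumptions made purely for illustration. The batch function rescans every record on each run, whereas the incremental function touches only the newly arrived records and updates an existing view.

```python
from collections import Counter

# Hypothetical record layout for illustration: (url, timestamp) pageview events.

def batch_view(master_dataset):
    """Batch computation: rebuild the view from scratch over every record."""
    return Counter(url for url, _timestamp in master_dataset)

def incremental_update(view, new_records):
    """Incremental computation: update an existing view using only new records."""
    for url, _timestamp in new_records:
        view[url] += 1
    return view

master = [("/home", 1), ("/about", 2), ("/home", 3)]
new_records = [("/home", 4), ("/docs", 5)]

# Start from a view computed over the existing data, then fold in new arrivals.
view = batch_view(master)
incremental_update(view, new_records)

# Both approaches agree on the result; they differ in how much data they read.
assert view == batch_view(master + new_records)
```

In this sketch the cost of an incremental update grows with the number of new records, while the cost of a batch run grows with the size of the whole dataset, which is why the incremental style suits low-latency updates.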

12.1. Computing realtime views

12.2. Storing realtime views

12.3. Challenges of incremental computation

12.4. Asynchronous versus synchronous updates

12.5. Expiring realtime views

12.6. Summary