Chapter 8. An example batch layer: Architecture and algorithms
This chapter covers
- Building a batch layer from end to end
- Practical examples of precomputation
- Iterative graph algorithms
- HyperLogLog for efficient set-cardinality operations
You’ve now learned all the pieces of the batch layer: formulating a schema for your data, storing a master dataset, and running computations at scale with a minimum of complexity. In this chapter you’ll tie these pieces together into a coherent batch layer. No new theory is introduced here; our goal is to reinforce the concepts of the previous chapters by walking through a batch layer design from start to finish. There is great value in seeing how the theory maps to a non-trivial example.
Specifically, you’ll learn how to create the batch layer for our running example of SuperWebAnalytics.com. SuperWebAnalytics.com is complex enough to require a fairly sophisticated batch layer, but not so complex as to lose you in the details. You’ll see that the various batch layer abstractions fit together nicely and that the resulting batch layer for SuperWebAnalytics.com is quite elegant.