Chapter 1. A new paradigm for Big Data
Figure 1.1. Relational schema for simple analytics application
Figure 1.2. Batching updates with queue and worker
Figure 1.3. Fully incremental architecture
Figure 1.4. Using replication to increase availability
Figure 1.5. Adding logging to fully incremental architectures
Figure 1.6. Lambda Architecture
Figure 1.7. Architecture of the batch layer
Figure 1.8. Batch layer
Figure 1.9. Serving layer
Figure 1.10. Speed layer
Figure 1.11. Lambda Architecture diagram
Chapter 2. Data model for Big Data
Figure 2.1. The master dataset in the Lambda Architecture serves as the source of truth for your Big Data system. Errors at the serving and speed layers can be corrected, but corruption of the master dataset is irreparable.
Figure 2.2. Three possible options for storing friendship information for FaceSpace. Each option can be derived from the one to its left, but it’s a one-way process.
Figure 2.3. The relationships between data, views, and queries
Figure 2.4. Classifying information as data or a view depends on your perspective. To FaceSpace, Tom’s birthday is a view because it’s derived from the user’s birthdate. But the birthday is considered data to a third-party advertiser.
Figure 2.5. A summary of one day of trading for Google, Apple, and Amazon stocks: previous close, opening, high, low, close, and net change.