List of Figures

Chapter 1. A new paradigm for Big Data

Figure 1.1. Relational schema for simple analytics application

Figure 1.2. Batching updates with queue and worker

Figure 1.3. Fully incremental architecture

Figure 1.4. Using replication to increase availability

Figure 1.5. Adding logging to fully incremental architectures

Figure 1.6. Lambda Architecture

Figure 1.7. Architecture of the batch layer

Figure 1.8. Batch layer

Figure 1.9. Serving layer

Figure 1.10. Speed layer

Figure 1.11. Lambda Architecture diagram

Chapter 2. Data model for Big Data

Figure 2.1. The master dataset in the Lambda Architecture serves as the source of truth for your Big Data system. Errors at the serving and speed layers can be corrected, but corruption of the master dataset is irreparable.

Figure 2.2. Three possible options for storing friendship information for FaceSpace. Each option can be derived from the one to its left, but it’s a one-way process.

Figure 2.3. The relationships between data, views, and queries

Figure 2.4. Classifying information as data or a view depends on your perspective. To FaceSpace, Tom’s birthday is a view because it’s derived from the user’s birthdate. But the birthday is considered data to a third-party advertiser.

Figure 2.5. A summary of one day of trading for Google, Apple, and Amazon stocks: previous close, opening, high, low, close, and net change.