chapter one

1 The world of the data lakehouse

This chapter covers

  • What a data lakehouse is and how it differs from traditional data architectures
  • How Apache Iceberg shapes the lakehouse paradigm
  • When and why to implement an Apache Iceberg lakehouse

Data architecture has evolved less through innovation for its own sake and more through persistent failure. Organizations have long struggled to deliver analytics that are fast, affordable, and trustworthy at scale. Systems that performed well tended to be expensive and rigid. Systems that were flexible and cheap were often slow, fragile, and hard to govern. Each new generation of architecture promised to resolve these tradeoffs, but each introduced technical and business constraints of its own that limited how the data could be used.

1.1 Evolution from database to data lakehouse

1.2 The rise of data warehouses

1.3 The move to cloud data warehouses

1.4 The data lake and the Hadoop era

1.5 Apache Iceberg: Giving data lakes data warehouse capabilities

1.6 The data lakehouse: Best of both worlds

Summary