chapter nine

9 Understanding the consumption layer

This chapter covers

  • Semantic consistency across tools
  • Open interfaces such as JDBC, ODBC, Arrow Flight, and MCP
  • Evaluating BI tools, notebook environments, and AI platforms for integration
  • Choosing the right consumption tools

Now that your lakehouse has a solid foundation, from storage and ingestion to catalog and federation, it’s time to focus on where data creates value: consumption. This is where your lakehouse architecture begins to yield insights, drive decisions, and power innovation. Whether you’re enabling real-time dashboards, supporting ad hoc exploration in Python notebooks, running SQL queries, offering natural language analytics through AI agents, or training large-scale machine learning models, the consumption layer is the bridge between your technical investment and practical outcomes.

In traditional data architectures, consumption was often constrained by the limits of data movement, format compatibility, and tool lock-in. Accessing data meant replicating it into specialized databases, BI tools, and other systems, each with its own constraints. Apache Iceberg’s emphasis on openness and table-format portability has reshaped this paradigm. Now the data remains in place, and tools come to the data rather than the data being copied out to each tool. This shift removes much of that friction and lets teams use the tools they prefer without compromising governance, consistency, or performance.

9.1 Revisiting the benefits of the lakehouse for consumption

9.1.1 Connecting the lakehouse to the people

9.2 Revisiting requirements from our audit

9.2.1 Interpreting requirements for consumption

9.2.2 Requirements for BI tools

9.2.3 Requirements for interactive notebook environments

9.2.4 Requirements for AI and specialized data consumption tools

9.3 Open interfaces for seamless consumption

9.3.1 JDBC and ODBC

9.3.2 Arrow Flight

9.3.3 Model Context Protocol (MCP)

9.4 Business intelligence tools in the lakehouse

9.4.1 Open source BI tools

9.4.2 Commercial BI tools

9.4.3 Tools for AI and machine learning workloads

9.5 Choosing the right consumption tools: Ten illustrated scenarios

9.5.1 Startup with a data science focus

9.5.2 Large financial institution with strict governance