
Appendix A The metadata tables

 

Apache Iceberg tracks rich metadata to efficiently manage its tables. This metadata isn't just for internal bookkeeping; it’s also exposed to users through special metadata tables that can be queried like regular tables. These metadata tables provide visibility into the physical and logical layout of an Iceberg table, offering insights into data file sizes, partitions, snapshots, and more.

Understanding how to use these metadata tables is essential for monitoring table health, diagnosing performance issues, and automating maintenance tasks like compaction and snapshot expiration. This appendix provides an overview of these tables, demonstrates how to query them using common engines like Spark and Dremio, and explains how to interpret their outputs.

We'll walk through each of Iceberg’s primary metadata tables and describe how they can be used in practical maintenance scenarios. We'll also examine how these tables enable proactive optimization workflows by serving as a foundation for dynamic maintenance triggers.

A.1 Querying Iceberg metadata tables

Iceberg exposes its metadata through a rich set of system tables that let users analyze and manage the physical and logical aspects of their datasets. These metadata tables are integral to understanding table evolution, data layout, and snapshot management. They can be queried from both Spark and Dremio, so the same inspection workflows carry across analytics engines.

Accessing metadata tables
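
As a rough illustration of the two access styles, the sketch below builds the query text for each engine. It assumes Spark's convention of appending the metadata table name to the table identifier, and Dremio's table functions (table_history, table_snapshot, table_manifests, table_files, table_partitions). The helper names and the my_catalog.db.orders identifier are hypothetical, chosen only for this example.

```python
# Metadata tables covered in this appendix, addressed in Spark as
# <catalog>.<db>.<table>.<metadata_table>.
SPARK_METADATA_TABLES = {
    "history", "snapshots", "metadata_log_entries", "manifests",
    "partitions", "files", "position_deletes", "all_data_files",
    "all_delete_files", "all_entries", "all_manifests", "refs",
}

# Assumption: Dremio exposes a subset of the metadata through table functions.
DREMIO_TABLE_FUNCTIONS = {
    "history": "table_history",
    "snapshots": "table_snapshot",
    "manifests": "table_manifests",
    "files": "table_files",
    "partitions": "table_partitions",
}


def spark_metadata_query(table: str, metadata_table: str) -> str:
    """Return Spark SQL that reads one metadata table of `table`."""
    if metadata_table not in SPARK_METADATA_TABLES:
        raise ValueError(f"unknown metadata table: {metadata_table}")
    return f"SELECT * FROM {table}.{metadata_table}"


def dremio_metadata_query(table: str, metadata_table: str) -> str:
    """Return Dremio SQL that reads one metadata table of `table`."""
    func = DREMIO_TABLE_FUNCTIONS.get(metadata_table)
    if func is None:
        raise ValueError(f"no Dremio table function for: {metadata_table}")
    return f"SELECT * FROM TABLE({func}('{table}'))"


if __name__ == "__main__":
    # Hypothetical table identifier, used only for illustration.
    print(spark_metadata_query("my_catalog.db.orders", "snapshots"))
    print(dremio_metadata_query("my_catalog.db.orders", "snapshots"))
```

In a Spark session configured with an Iceberg catalog, the first string could be passed to spark.sql(...) and displayed with .show(); the second could be run from any Dremio SQL client.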

A.2 The history metadata table

A.3 The snapshots metadata table

A.4 The metadata_log_entries metadata table

A.5 The manifests metadata table

A.6 The partitions metadata table

A.7 The files metadata table

A.8 The manifests metadata table

A.9 The partitions metadata table

A.10 The position_deletes metadata table

A.11 The all_data_files metadata table

A.12 The all_delete_files metadata table

A.13 The all_entries metadata table

A.14 The all_manifests metadata table

A.15 The refs metadata table

A.16 Monitoring table health with metadata tables
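
One way a metadata table can feed a maintenance trigger is sketched below: summarize the files metadata table per partition, then flag partitions that hold many small files as compaction candidates. The query relies on the partition and file_size_in_bytes columns of the files table. The function names, the PartitionStats type, and both thresholds are assumptions made for illustration, not recommended values.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class PartitionStats:
    """Per-partition summary derived from the files metadata table."""
    partition: str
    file_count: int
    avg_file_size_bytes: float


def small_file_stats_query(table: str) -> str:
    """Spark SQL that summarizes data file counts and sizes per partition."""
    return (
        "SELECT partition, COUNT(*) AS file_count, "
        "AVG(file_size_in_bytes) AS avg_file_size_bytes "
        f"FROM {table}.files GROUP BY partition"
    )


def partitions_needing_compaction(
    stats: Iterable[PartitionStats],
    min_files: int = 10,                          # hypothetical threshold
    target_file_bytes: int = 128 * 1024 * 1024,   # hypothetical threshold
) -> List[str]:
    """Return partitions with many files whose average size is below target."""
    return [
        s.partition
        for s in stats
        if s.file_count >= min_files and s.avg_file_size_bytes < target_file_bytes
    ]


if __name__ == "__main__":
    # Made-up sample rows standing in for the query result.
    sample = [
        PartitionStats("day=2024-01-01", 250, 2_000_000.0),
        PartitionStats("day=2024-01-02", 4, 1_000_000.0),
        PartitionStats("day=2024-01-03", 40, 300_000_000.0),
    ]
    print(partitions_needing_compaction(sample))
```

Keeping the decision logic separate from the query means the thresholds can be tuned and tested without an engine; a scheduler could run the query, map its rows into PartitionStats, and start compaction only for the partitions returned.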