4 Preparing for your move to Apache Iceberg
This chapter covers
- Performing an infrastructure audit
- Engaging stakeholders to surface technical and organizational needs
- Documenting current tooling, storage systems, and governance practices
- Turning audit findings into prioritized, actionable requirements
- Weighing considerations for data storage, ingestion, catalog, federation, and consumption
Adopting Apache Iceberg isn’t just a technical upgrade. It’s a shift in how your organization handles data, from ingestion and storage to governance and analytics. To bring it into production, you need more than experimentation. You need a clear view of your current setup.
Every organization’s data landscape is unique. You may be using a mix of file formats, relying on legacy ETL tools, or navigating region-specific compliance constraints. If you try to integrate Iceberg without first auditing what you have, you risk inefficiencies, unexpected costs, or governance gaps. A thoughtful audit helps you avoid these pitfalls by narrowing the options down to a clear strategy.
This chapter won’t tell you which catalog to pick or which query engine is “best.” That’s by design. The lakehouse ecosystem is too diverse for a one-size-fits-all recommendation, and we’ll cover lakehouse component options in depth in later chapters. Instead, the goal here is to help you clarify your requirements so you can use them as a filter when evaluating tools. That filter matters because many tools are marketed to highlight strengths that may not align with your goals.