Metadata Layer Architecture
This chapter covers:
- A definition of data platform metadata and how it differs from business metadata
- How to architect a metadata layer that fits the size and complexity of your system and organization
- Designing a metadata model with multiple domains: Pipeline Configuration, Data Quality Checks, and Pipeline Activity
- Metadata layer implementation options
- Existing commercial and open source options for metadata layer implementation
By the end of this chapter you’ll be able to:
- Architect an appropriate metadata layer for the size and complexity of your system and organization
- Leverage metadata to simplify the management of your data platform
- Evaluate which of the commercial and open source options might be worth exploring for your use case
In this chapter, we’ll explain what we mean by data platform internal metadata and why it is important to the operation of a data platform.
We’ll cover the difference between configuration metadata and activity metadata and show how each can be used, drawing on examples of a data platform that grows in complexity. We’ll also show why the metadata layer should become the primary interface for data engineers and advanced data users.