8 Metadata
In this chapter:
- Managing metadata for understanding data
- Introducing Azure Purview
- Maintaining a data dictionary and a data glossary
- Understanding advanced features of Azure Purview
This chapter is all about metadata, in other words, data about the data. Metadata is one aspect of data governance. We will cover two other important aspects in the following chapters: data quality (in chapter 9) and compliance (in chapter 10). Figure 8.1 highlights our current area of focus. We won’t show this map of our data platform again until the last chapter, which covers data distribution.
Figure 8.1 Data governance deals with multiple aspects of managing data, including metadata, data quality, access control, and compliance with laws and standards.
We’ll start by outlining the information architecture challenges a big data platform encounters and how metadata can help address these. We’ll introduce two important concepts: data dictionaries and data glossaries. Using these, we can inventory our datasets and queries.
Next, we’ll look at Azure Purview, the Azure data governance service that can help us manage our metadata. We’ll spin up a new instance of Azure Purview and go over some of its key features. At the time of writing, the service has only recently launched, and there is no Azure CLI support for it. Unlike other chapters, where we were able to automate via Azure CLI, this time around we will rely more on the UI.