8 Metadata

 

This chapter covers

  • Managing metadata for understanding data
  • Introducing Azure Purview
  • Maintaining a data dictionary and a data glossary
  • Understanding advanced features of Azure Purview

This chapter is all about metadata: in other words, data about data. This is one aspect of data governance. We will cover two other important aspects in the following chapters: data quality in chapter 9 and compliance in chapter 10. Figure 8.1 highlights our current area of focus. We won’t view this map of our data platform again until the last chapter, which covers data distribution.

Figure 8.1 Data governance deals with multiple aspects of managing data, including metadata, data quality, access control, and compliance with laws and standards.

We’ll start by outlining the information architecture challenges a big data platform encounters and how metadata can help address them. We’ll introduce two important concepts: data dictionaries and data glossaries. Using these, we can inventory our datasets and queries.
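To make the two concepts concrete before we meet them in Azure Purview, here is a minimal Python sketch. It assumes a data dictionary is an inventory of datasets (name, location, schema) and a data glossary is a list of business terms that datasets can be tagged with. The class names, fields, and sample datasets are hypothetical and used only for illustration; they are not part of Azure Purview.

```python
from dataclasses import dataclass, field


@dataclass
class GlossaryTerm:
    """A business term, defined independently of any particular dataset."""
    name: str
    definition: str


@dataclass
class DictionaryEntry:
    """One dataset in the data dictionary: where it lives and what it contains."""
    name: str
    location: str
    columns: dict[str, str]  # column name -> data type
    glossary_terms: list[str] = field(default_factory=list)


def datasets_for_term(term: str, dictionary: list[DictionaryEntry]) -> list[str]:
    """Return the names of all datasets tagged with the given glossary term."""
    return [entry.name for entry in dictionary if term in entry.glossary_terms]


# Hypothetical sample data, for illustration only.
glossary = [
    GlossaryTerm("Active user", "A user who signed in during the past 30 days."),
]
dictionary = [
    DictionaryEntry(
        name="signins",
        location="telemetry/signins",
        columns={"user_id": "string", "timestamp": "datetime"},
        glossary_terms=["Active user"],
    ),
    DictionaryEntry(
        name="page_views",
        location="telemetry/page_views",
        columns={"page": "string", "timestamp": "datetime"},
    ),
]

matching = datasets_for_term("Active user", dictionary)
```

In this sketch the dictionary answers "what data do we have and where is it?", while the glossary answers "what does this term mean?". Tagging a dataset with a term is what links the two.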

Next, we’ll look at Azure Purview, the Azure data governance service that helps us manage our metadata. We’ll spin up a new instance of Azure Purview and go over some of its key features.

Note

At the time of writing, the Azure Purview service has only recently launched, and there is no Azure CLI support for it yet.

8.1 Making sense of the data

8.2 Introducing Azure Purview

8.3 Maintaining a data inventory

8.3.1 Setting up a scan

8.3.2 Browsing the data dictionary

8.3.3 Data dictionary recap

8.4 Managing a data glossary

8.4.1 Adding a new glossary term

8.4.2 Curating terms

8.4.3 Custom templates and bulk import

8.4.4 Data glossary recap
