10 Compliance

 

This chapter covers

  • Compliance for a data platform
  • Data classification and handling
  • Designing a compliant access model
  • Supporting GDPR requirements

Data platforms, by definition, deal with data. While some types of data are harmless, other types carry liability. In this chapter, we will talk about compliance and data handling. First, we’ll see some examples of data classification and data handling standards. Depending on the nature of the data we process, we will see where we can store it, who can access it, what we can do with it, how long we can keep it, and so on. We will also look at some techniques we can use to change the classification of the data. These include anonymization and pseudonymization of personally identifiable information, and aggregation of sensitive data.

Next, we’ll look at implementing an access model that properly restricts access, including some advanced features provided by storage solutions, like row-level security and access control lists.

The General Data Protection Regulation (GDPR) is a regulation passed by the European Union that has worldwide impact. We’ll look at a few key points we need to be aware of and see how we can make our data platform GDPR-compliant. Before we dig in, please read the important note that follows.

10.1 Data classification

10.1.1 Feature data

10.1.2 Telemetry

10.1.3 User data

10.1.4 User-owned data

10.1.5 Business data

10.1.6 Data classification recap

10.2 Changing classification through processing

10.2.1 Aggregation

10.2.2 Anonymization

10.2.3 Pseudonymization

10.2.4 Masking

10.2.5 Processing classification changes recap

10.3 Implementing an access model

10.3.1 Security groups