21 Data masking

Data protection is heavily regulated in many countries, and failing to protect certain types of data may violate national and international laws, such as the GDPR in the European Union and HIPAA in the United States. Because of these laws, and for many other reasons, one of a DBA's most important duties is preventing data leaks.

SQL Server implements many security mechanisms, such as authentication and authorization, to help protect data from unauthorized access. These measures can be bypassed, however, when databases are moved from production to other environments, such as development and testing, or when databases are given to vendors for troubleshooting.

To reduce the potential for data breaches when sharing databases that contain sensitive data, we must consider protecting data privacy by replacing any personally identifiable information (PII) with fabricated data, while also keeping the resulting data meaningful to the consuming applications or test suites. PII includes, but is not limited to, name, birth date, passport number, home address, and phone number.

In this chapter, we will focus on static data masking. This is the process of permanently replacing sensitive data at rest with new values by updating the data in our database.
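To make the idea of static masking concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the tooling this chapter uses: it relies on an in-memory SQLite database as a stand-in for SQL Server, and the `customers` table, its columns, the name lists, and the helper functions are all hypothetical. The point it shows is that masking is done with ordinary updates, so the original values are gone once the changes are committed.

```python
import random
import sqlite3

# Hypothetical pools of fabricated values; a real tool would use far larger sets.
FIRST_NAMES = ["Alex", "Sam", "Jordan", "Taylor", "Morgan"]
LAST_NAMES = ["Reed", "Lane", "Frost", "Hale", "Brook"]


def fake_phone(rng):
    # Numbers of the form 555-01xx are reserved for fictional use in North America.
    return f"555-01{rng.randint(0, 99):02d}"


def mask_customers(conn, seed=None):
    """Overwrite the PII columns in place and return the number of rows masked."""
    rng = random.Random(seed)
    ids = [row[0] for row in conn.execute("SELECT id FROM customers")]
    for customer_id in ids:
        fake_name = f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}"
        conn.execute(
            "UPDATE customers SET name = ?, phone = ? WHERE id = ?",
            (fake_name, fake_phone(rng), customer_id),
        )
    conn.commit()  # after this point the original values cannot be recovered
    return len(ids)


def build_demo_db():
    """Create a throwaway database with made-up sample rows (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, phone TEXT)"
    )
    conn.executemany(
        "INSERT INTO customers (name, phone) VALUES (?, ?)",
        [("Ada Lovelace", "020 7946 0001"), ("Grace Hopper", "020 7946 0002")],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = build_demo_db()
    mask_customers(conn, seed=42)
    for row in conn.execute("SELECT id, name, phone FROM customers ORDER BY id"):
        print(row)
```

The masked rows keep the same shape as the originals (a two-part name, a phone-like string), which is what keeps the data meaningful to consuming applications or test suites. Passing a seed only makes the sketch repeatable; it is a convenience for the demo, not a requirement of masking.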

21.1 Getting started

21.2 A common approach

21.3 The better approach

21.3.1 Generating random data

21.4 The process

21.4.1 Finding potential PII data

21.4.2 Generating a configuration file for masking

21.4.3 Applying static data masking

21.4.4 Validating a data masking configuration file

21.5 Hands-on lab