2 Differential privacy for machine learning

This chapter covers

What differential privacy is
Using differential privacy mechanisms in algorithms and applications
Implementing properties of differential privacy

In the previous chapter, we investigated various privacy-related threats and vulnerabilities in machine learning (ML) and the concepts behind privacy-enhancing technologies. From now on, we will focus on the details of essential and popular privacy-enhancing technologies. The one we will discuss in this chapter and the next is differential privacy (DP).

Differential privacy is one of the most popular and influential privacy protection schemes used in applications today. It is based on the concept of making the results computed from a dataset robust enough that any single substitution in the dataset will not reveal any private information. This is typically achieved by calculating the patterns of groups within the dataset, which we call complex statistics, while withholding information about individuals in the dataset.

For instance, we can consider an ML model to be a complex statistic describing the distribution of its training data. Differential privacy, in turn, allows us to quantify the degree of privacy protection provided by an algorithm on the (private) dataset it operates on. In this chapter, we’ll look at what differential privacy is and how it has been widely adopted in practical applications. You’ll also learn about its various essential properties.
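To make the idea of releasing group-level statistics while protecting individuals a little more concrete, here is a minimal sketch in Python. It assumes a counting query answered with Laplace noise, one standard differential privacy mechanism. The dataset `ages`, the helper names `laplace_noise` and `noisy_count`, and the value of `epsilon` are all hypothetical choices made for illustration, not something prescribed by this chapter.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential variables with mean `scale`
    # follows a Laplace distribution centered at 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(records, predicate, epsilon, rng):
    # True group-level statistic: how many records satisfy the predicate.
    true_count = sum(1 for r in records if predicate(r))
    # Substituting one record changes a count by at most 1, so the
    # sensitivity is 1 and the noise scale is 1 / epsilon.
    return true_count + laplace_noise(1.0 / epsilon, rng)


def over_40(age):
    return age > 40


# Hypothetical toy dataset (assumption for illustration only).
ages = [34, 45, 29, 61, 52, 38, 47, 55]

# A neighboring dataset: exactly one record has been substituted.
neighbor = list(ages)
neighbor[0] = 70

true_a = sum(1 for a in ages if over_40(a))
true_b = sum(1 for a in neighbor if over_40(a))

rng = random.Random(0)   # seeded only so the sketch is reproducible
epsilon = 0.5            # assumed privacy parameter; smaller means more noise

print("true counts:", true_a, true_b)
print("noisy count (original):", noisy_count(ages, over_40, epsilon, rng))
print("noisy count (neighbor):", noisy_count(neighbor, over_40, epsilon, rng))
```

The two true counts differ by at most one, and the added noise is large enough relative to that difference that a single released answer says little about whether the substituted record was present. How much noise is needed, and how to quantify the resulting protection, is what the rest of the chapter develops.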

2.1 What is differential privacy?

2.1.1 The concept of differential privacy
