7 Privacy-preserving data mining techniques

 

This chapter covers

  • The importance of privacy preservation in data mining
  • Privacy protection mechanisms for processing and publishing data
  • Exploring privacy-enhancing techniques for data mining
  • Implementing privacy techniques in data mining with Python

So far we have discussed different privacy-enhancing technologies that the research community and the industry have partnered together on. This chapter focuses on how these privacy techniques can be utilized for data mining and management operations. In essence, data mining is the process of discovering new relationships and patterns in data to achieve further meaningful analysis. This usually involves machine learning, statistical operations, and data management systems. In this chapter we will explore how various privacy-enhancing technologies can be bundled with data mining operations to achieve privacy-preserving data mining.

First, we will look at the importance of privacy preservation in data mining and how private information can be disclosed to the outside world. Then we will walk through the different approaches that can be utilized to ensure privacy guarantees in data mining operations, along with some examples. Toward the end of this chapter, we will discuss the recent evolution of data management techniques and how these privacy mechanisms can be instrumented in database systems to design a tailor-made privacy-enriched database management system.

CH07_00_UN01_Zhuang

7.1 The importance of privacy preservation in data mining and management

7.2 Privacy protection in data processing and mining

7.2.1 What is data mining and how is it used?

7.2.2 Consequences of privacy regulatory requirements

7.3 Protecting privacy by modifying the input

7.3.1 Applications and limitations

7.4 Protecting privacy when publishing data

7.4.1 Implementing data sanitization operations in Python

7.4.2 k-anonymity

7.4.3 Implementing k-anonymity in Python

Summary