6 Privacy-preserving synthetic data generation
- Concepts and techniques of synthetic data generation
- Generating synthetic data for anonymization
- Using differential privacy mechanisms for privacy-preserving synthetic data generation
- Working through a case study, “Private Synthetic Data Release via Feature-level Micro-aggregation,” on how to design a privacy-preserving synthetic data generation scheme for machine learning tasks
6.1 Overview of Synthetic Data Generation
6.1.1 What is synthetic data? Why is it important?
6.1.2 Application aspects of using synthetic data for privacy preservation
6.1.3 How to generate synthetic data?
6.2 Assuring Privacy via Data Anonymization
6.2.1 Private information sharing vs. privacy concerns
6.2.2 Use of k-anonymity against re-identification attacks
6.2.3 Anonymization beyond k-anonymity
6.3 Differential Privacy for Privacy-preserving Synthetic Data Generation
6.3.1 Differentially Private Synthetic Histogram Representation Generation
6.3.2 Differentially Private Synthetic Tabular Data Generation
6.3.3 Differentially Private Synthetic Multi-Marginal Data Generation
6.4 Case study on Private Synthetic Data Release via Feature-level Micro-aggregation
6.4.1 Generating Synthetic Data
6.4.2 Evaluating the Performance of the Generated Synthetic Data
6.5 Summary