7 Data deletion

 

This chapter covers

  • What we mean by data deletion
  • Why companies need to delete data
  • How modern data collection works
  • Deleting account-level data
  • Deleting warehouse data and sensitive data
  • How to structure data ownership

So far, we have looked at privacy as both a business differentiator and a risk mitigator, supported by processes such as classifying data, building an inventory, sharing data securely, and conducting technical privacy reviews. Another key concept in data privacy is data deletion. It is critical because most security and privacy risks stem from data misuse, leakage, and exfiltration. Chapter 5 provided techniques for obfuscating data to mitigate privacy harms if the data is mishandled. In some cases, however, it is more practical to delete the data altogether, since the best way to prevent data misuse is to not have the data at all.

This chapter walks you through a system architecture for deleting data in a highly distributed environment. You will need to adapt what we discuss here to your own systems, since every company differs in its architecture and data, but the chapter will give you hands-on skills to start this complex but necessary initiative. You will also learn how to approach both operational and archival data from a privacy perspective.

7.1 Why must a company delete data?

7.2 What does a modern data collection architecture look like?

7.2.1 Distributed architecture and microservices: How companies collect data

7.2.2 How real-time data is stored and accessed

7.2.3 Archival data storage

7.2.4 Other data storage locations

7.2.5 How data storage grows from collection to archival

7.3 How the data collection architecture works

7.4 Deleting account-level data: A starting point

7.4.1 Account deletion: Building the tooling and process

7.4.2 Scaling account deletion