
3 Data privacy and safety: Technical and legal controls


This chapter covers

  • Sources of bias in training data
  • Improving the safety of outputs from LLMs
  • Mitigating privacy risks with user inputs to LLMs
  • Data protection laws and their application to generative AI systems

In the previous chapter, we discussed how large language models (LLMs) are trained on massive datasets from the internet. In practice, that data is likely to contain personal information, bias, and other undesirable content. We also introduced the concept of post-training and the primary post-training techniques. While some LLM developers use the unrestricted nature of their models as a selling point, most major LLM providers define policies about the kinds of content they don't want their models to produce, and they devote considerable effort, through post-training and other methods, to ensuring that their models follow those policies as closely as possible. For example, commercial LLM providers don't want their models to generate hate speech or discriminatory content, because such output could reflect poorly on the company in the eyes of consumers. Although these policies vary with an organization's values and external pressures, improving an LLM's safety always means exercising control over the model's generations, and that control requires technical interventions.
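
To make "technical interventions" concrete before we dive in, here is a minimal sketch of the simplest kind of control: a post-processing check applied to a model's output before it reaches the user, a theme we return to later in this chapter. The blocklist, function name, and refusal message are illustrative placeholders, not any provider's actual policy or API; real systems typically rely on learned classifiers rather than keyword matching.

# A minimal, hypothetical post-processing safety filter.
# BLOCKED_TERMS and the refusal text are placeholders for illustration only.
BLOCKED_TERMS = {"blocked_term_1", "blocked_term_2"}

def filter_output(generated_text: str) -> str:
    """Return the model's output unchanged unless it contains a blocked term."""
    lowered = generated_text.lower()
    if any(term in lowered for term in BLOCKED_TERMS):
        # Replace the unsafe generation with a refusal instead of showing it.
        return "I can't help with that request."
    return generated_text

if __name__ == "__main__":
    print(filter_output("Here is a policy-compliant answer."))

Even this toy example illustrates the basic trade-off that runs through the chapter: any control over generations is only as good as its ability to recognize undesirable content, which is why the sections that follow cover detection, filtering, post-training, and unlearning in turn.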

What’s in the training data?

Encoding bias

Linguistic diversity

Sensitive information

Safety-focused improvements for LLM generations

Post-processing detection algorithms

Content filtering or conditional pretraining

Safety post-training

Machine unlearning

Navigating user privacy and commercial risks

Inadvertent data leakage

Best practices when interacting with LLMs

Data protection and privacy in the age of AI

International standards and data protection laws

Are generative AI systems GDPR-compliant?

Privacy regulations in academia

Corporate policies

Governing data in an AI-driven world

Conclusion

Summary