3 Data privacy and safety: Technical and legal controls
This chapter covers
- Sources of bias in training data
- Improving the safety of outputs from LLMs
- Mitigating privacy risks from user inputs to LLMs
- Data protection laws and their application to generative AI systems
In the previous chapter, we discussed how large language models (LLMs) are trained on massive datasets from the internet. In practice, that data is likely to contain personal information, biased language, and other undesirable content. We also introduced the concept of post-training and the primary post-training techniques.

While some LLM developers market the unrestricted nature of their models as a selling point, most major providers maintain policies defining the kinds of content they don't want their models to produce, and they devote substantial effort to making their models follow those policies as closely as possible, through post-training and other methods. For example, commercial LLM providers don't want their models to generate hate speech or discriminatory content, because such outputs could reflect poorly on the company in the eyes of consumers. Although the specific policies vary with an organization's values and external pressures, improving an LLM's safety always comes down to exercising control over the model's generations, and that requires technical interventions.
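To make the idea of a technical intervention concrete, here is a minimal sketch of one such control: wrapping an arbitrary text generator with a post-hoc policy check before its output reaches the user. Everything in it is hypothetical, including the `BLOCKED_TERMS` set and the `safe_generate` wrapper, and the keyword match is only a stand-in for the trained safety classifiers that production systems typically use. We'll look at those real techniques later in this chapter.

```python
# A sketch of a post-hoc output filter: screen a model's response against
# a policy check before returning it. The keyword match below is a toy
# stand-in for a trained safety classifier.

BLOCKED_TERMS = {"example-slur", "example-threat"}  # hypothetical policy terms


def violates_policy(text: str) -> bool:
    """Return True if the generated text matches any blocked term."""
    lowered = text.lower()
    return any(term in lowered for term in BLOCKED_TERMS)


def safe_generate(prompt: str, generate) -> str:
    """Wrap any `generate(prompt) -> str` callable with a policy check."""
    response = generate(prompt)
    if violates_policy(response):
        return "I can't help with that."  # refuse instead of emitting unsafe output
    return response


if __name__ == "__main__":
    # Stand-in generator so the example runs without a real model.
    print(safe_generate("Hello!", generate=lambda p: f"Echo: {p}"))
```

Note that this kind of filter sits entirely outside the model: it changes what the user sees, not what the model has learned. Post-training, by contrast, changes the model's behavior itself; in practice, providers layer both kinds of control.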