This chapter covers
- Improving the safety of outputs from LLMs
- Mitigating privacy risks with user inputs to chatbots
- Understanding data protection laws in the United States and the European Union
In the previous chapter, we discussed how large language models (LLMs) are trained on massive datasets from the internet that are likely to contain personal information, bias, and other types of undesirable content. While some LLM developers use the unrestricted nature of their models as a selling point, most major LLM providers maintain policies about the kinds of content they don’t want their models to produce and devote considerable effort to ensuring that their models follow those policies as closely as possible. For example, commercial LLM providers often don’t want their models to generate hate speech or discriminatory content because doing so could reflect poorly on the company in the eyes of consumers. Although the specific policies vary with organizational values and external pressures, improving the safety of an LLM ultimately comes down to exercising control over the model’s generations, and that requires technical interventions.
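To make “technical interventions” a little more concrete before we dig in, the sketch below shows one of the simplest forms such an intervention can take: a post-generation output filter that checks a model’s response against a content policy before returning it. The blocked phrases, refusal message, and function name here are hypothetical placeholders, and real systems rely on trained safety classifiers rather than keyword matching, but the basic control flow (generate, check against policy, decide what to return) is the same.

```python
# A minimal sketch of a post-generation output filter.
# The policy below is a hypothetical placeholder; production systems
# typically use a trained safety classifier instead of keyword matching.

REFUSAL_MESSAGE = "I can't help with that request."

# Hypothetical policy: phrases the provider never wants in model outputs.
BLOCKED_PHRASES = {
    "example slur",
    "example insult",
}


def filter_output(generated_text: str) -> str:
    """Return the model's output, or a refusal if it violates the policy."""
    lowered = generated_text.lower()
    if any(phrase in lowered for phrase in BLOCKED_PHRASES):
        return REFUSAL_MESSAGE
    return generated_text


if __name__ == "__main__":
    print(filter_output("Here is a helpful, policy-compliant answer."))
    print(filter_output("Some text containing an example slur."))
```

Even this toy version illustrates the key point of the chapter: safety is enforced not by the training data alone but by deliberate mechanisms wrapped around (or built into) the model.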