This chapter covers
- Improving the safety of outputs from LLMs
- Mitigating privacy risks with user inputs to chatbots
- Understanding data protection laws in the United States and the European Union
In the previous chapter, we discussed how large language models (LLMs) are trained on massive datasets from the internet that are likely to contain personal information, bias, and other types of undesirable content. While some LLM developers use the unrestricted nature of their models as a selling point, most major LLM providers maintain policies about the kinds of content they don’t want their models to produce and devote considerable effort to ensuring that their models follow those policies as closely as possible. For example, commercial LLM providers often don’t want their models to generate hate speech or discriminatory content because doing so could reflect poorly on the company in the eyes of consumers. Although the specific policies vary with organizational values and external pressures, improving the safety of an LLM ultimately comes down to exercising control over the model’s generations, and that requires technical interventions.
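To make “technical interventions” a little more concrete before we dig in, the sketch below shows one of the simplest forms such an intervention can take: a post-generation output filter that checks a model’s response against a content policy before returning it. The blocked phrases, refusal message, and function name here are hypothetical placeholders, and real systems rely on trained safety classifiers rather than keyword matching, but the basic control flow (generate, check against policy, decide what to return) is the same.

```python
# A minimal sketch of a post-generation output filter.
# The policy below is a hypothetical placeholder; production systems
# typically use a trained safety classifier instead of keyword matching.

REFUSAL_MESSAGE = "I can't help with that request."

# Hypothetical policy: phrases the provider never wants in model outputs.
BLOCKED_PHRASES = {
    "example slur",
    "example insult",
}


def filter_output(generated_text: str) -> str:
    """Return the model's output, or a refusal if it violates the policy."""
    lowered = generated_text.lower()
    if any(phrase in lowered for phrase in BLOCKED_PHRASES):
        return REFUSAL_MESSAGE
    return generated_text


if __name__ == "__main__":
    print(filter_output("Here is a helpful, policy-compliant answer."))
    print(filter_output("Some text containing an example slur."))
```

Even this toy version illustrates the key point of the chapter: safety is enforced not by the training data alone but by deliberate mechanisms wrapped around (or built into) the model.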