10 Ethical and responsible large language models

 

This chapter covers

  • Identifying model bias
  • Model interpretability
  • Responsible LLMs
  • Safeguarding LLMs

Navigating the ethical and responsible aspects of using powerful transformer-based language models goes beyond compliance with AI regulations. Ethical and safe AI systems are a fundamental goal for researchers and practitioners alike.

All engineers and developers of AI systems need strategies for uncovering and understanding biases inherent in LLMs, which is crucial for mitigating discrimination. It’s also increasingly critical to increase the transparency of LLMs using analytical tools like to gain a deeper understanding of how decisions are made by these models. It’s also essential to safeguard your LLMs using input and output scanners and other tools to to validate input prompts and a model’s response. This a chapter will introduce you to the core tools and practices of safeguarding your LLMs.

10.1 Understanding biases in large language models

The data you feed into your machine learning model is mostly responsible for how your model "behaves" later on during inference. Consequently, understanding the contents of the pretraining data is crucial for enhancing transparency and identifying the origins of bis and other possible downstream problems.

10.2 Transparency and explainability of large language models

10.2.1 Using Captum to analyze the behavior of generative language models

10.2.2 Using LIME to explain a model prediction

10.3 Responsible use of large language models

10.3.1 The foundation model transparency index

10.4 Safeguarding your language model

10.4.1 Jailbreaks and lifecycle vulnerabilities

10.4.2 Shielding your model against hazardous abuse

10.5 Summary