chapter ten

10 Ethical and responsible large language models

 

This chapter covers

  • Identifying model bias
  • Model interpretability
  • Responsible large language models
  • Safeguarding large language models

Navigating the ethical and responsible aspects of using powerful transformer-based language models goes beyond compliance with AI regulations. Ethical and safe AI systems are a fundamental goal for researchers and practitioners alike.

All engineers and developers of AI systems need strategies for uncovering and understanding biases inherent in large language models (LLMs), which is crucial for mitigating discrimination. It’s also getting more and more critical to increase the transparency of LLMs using analytical tools to gain a deeper understanding of how decisions are made by these models. It’s essential to safeguard your LLMs using input and output scanners and other tools to validate input prompts and a model’s response. This chapter will introduce you to the core tools and practices of safeguarding your LLMs.

10.1 Understanding biases in LLMs

The data you feed into your machine learning model is mostly responsible for how your model behaves later on during inference. Consequently, understanding the contents of the pretraining data is crucial for enhancing transparency and identifying the origins of bias and other possible downstream problems.

10.1.1 Identifying bias

10.1.2 Model interpretability and bias in AI

10.2 Transparency and explainability of LLMs

10.2.1 Using Captum to analyze the behavior of generative language models

10.2.2 Using local interpretable model-agnostic explanations to explain a model prediction

10.3 Responsible use of LLMs

10.3.1 The foundation model transparency index

10.4 Safeguarding your language model

10.4.1 Jailbreaks and lifecycle vulnerabilities

10.4.2 Shielding your model against hazardous abuse

Summary