Now that you have learned about human-in-the-loop architectures in the first two chapters, we will spend four chapters on active learning: the set of techniques for sampling the most important data for humans to review.
Chapter 3 covers uncertainty sampling, introducing the most widely used techniques for understanding a model’s uncertainty. The chapter starts by introducing different ways to interpret uncertainty from a single neural model and then looks at uncertainty from different types of machine learning architectures. The chapter also covers how to calculate uncertainty when you have multiple predictions for each data item, such as when you are using an ensemble of models.
Chapter 4 tackles the complicated problem of identifying where your model might be confident but wrong due to undersampled or nonrepresentative data. It introduces a variety of data sampling approaches that are useful for identifying gaps in your model’s knowledge, such as clustering, representative sampling, and methods that identify and reduce real-world bias in your models. Collectively, these techniques are known as diversity sampling.
Uncertainty sampling and diversity sampling are most effective when combined, so chapter 5 introduces ways to combine different strategies into a comprehensive active learning system. Chapter 5 also covers some advanced transfer learning techniques that allow you to adapt machine learning models to predict which items to sample.