chapter eight

8 Testing and selection

This chapter covers:

Structuring testing environments, then migrating code and artifacts to them
Measuring the properties of models
Understanding how to test ML-discovered models offline and online
Understanding how the test results can be used to select models
Using both qualitative and quantitative measures for evaluation and selection
Avoiding deceptive traps when evaluating your models

So far in sprint 2, the team has designed the model using their understanding of the data, the client’s challenges and context, and the application that they expect to build. They’ve used a structured process to develop the model and tracked their progress using an experiment tracker and a model repository. They’ve also applied their common sense and experience to find and reject models that are suspicious or problematic. Now it’s important to make sense of the model’s results and to properly evaluate competing models, so that the team makes good choices about which models to take into production and application development.

8.1 Why test and select?

8.2 Testing processes

8.2.1 Offline testing

8.2.2 Offline test environments

8.2.3 Online testing

8.2.4 Field trials

8.2.5 A/B testing

8.2.6 Multi-armed bandits (MABs)

8.2.7 Nonfunctional testing

8.3 Model selection

8.3.1 Quantitative selection

8.3.2 Choosing with comparable tests

8.3.3 Choosing with many tests

8.3.4 Qualitative selection measures

8.4 Post-modelling checklist

8.5 The Bike Shop: sprint 2