7 Validation schemas
This chapter covers
- Ensuring reliable evaluation
- Standard validation schemas
- Nontrivial validation schemas
- Split updating procedure
- Validation schemas as part of the design document
A robust evaluation process is essential for a machine learning (ML) system, and in this chapter, we will cover how to build a proper validation schema that yields confident estimates of system performance. We will touch on typical validation schemas, how to select the right one based on the specifics of a given problem, and what factors to consider when designing the evaluation process in the wild.
A proper validation procedure aims to imitate the conditions of a real-life environment: it uses only the knowledge we are supposed to have at prediction time and drops the knowledge we won't have. This is closely connected to overfitting and generalization, which we'll cover in detail in chapter 9.
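To make this concrete, here is a minimal sketch of a time-based split in Python. It is an illustration under stated assumptions, not code from this book: the DataFrame, column name, and cutoff date are all hypothetical.

```python
import pandas as pd

def time_based_split(df: pd.DataFrame, cutoff: str, ts_col: str = "timestamp"):
    """Split a dataset so the model trains only on knowledge
    available before `cutoff`.

    Rows at or after the cutoff imitate the future we will face in
    production, so they must stay out of the training set.
    """
    train = df[df[ts_col] < cutoff]   # the "past" we are allowed to know
    valid = df[df[ts_col] >= cutoff]  # the "future" we evaluate against
    return train, valid

# Hypothetical usage: everything before June 2024 is the knowledge
# we are supposed to have; everything after it is dropped from training.
# train_df, valid_df = time_based_split(events_df, cutoff="2024-06-01")
```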
A proper validation procedure also provides a reliable and robust estimate of a system's performance, ideally with some theoretical guarantees. For example, we may guarantee that the real value falls between the lower and upper confidence bounds 95 times out of 100 (a case we will revisit in a campfire story from Valerii later in the chapter). Finally, it helps detect and prevent data leaks, overfitting, and divergence between offline and online performance.
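As an illustration of the 95-out-of-100 idea, the sketch below estimates a percentile-bootstrap confidence interval for an arbitrary metric on a validation set. The function name and parameters are our assumptions for this example, not an API defined in the chapter.

```python
import numpy as np

def bootstrap_ci(y_true, y_pred, metric, n_boot=10_000, alpha=0.05, seed=42):
    """Percentile-bootstrap confidence interval for a metric.

    If the interval is well calibrated, the real metric value falls
    between the returned bounds in roughly 95 of 100 such experiments
    (for alpha=0.05).
    """
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    n = len(y_true)
    scores = []
    for _ in range(n_boot):
        # Resample the validation set with replacement and rescore.
        idx = rng.integers(0, n, size=n)
        scores.append(metric(y_true[idx], y_pred[idx]))
    lower = np.quantile(scores, alpha / 2)
    upper = np.quantile(scores, 1 - alpha / 2)
    return lower, upper

# Hypothetical usage with accuracy as the metric:
# lo, hi = bootstrap_ci(y_val, model.predict(X_val),
#                       metric=lambda t, p: np.mean(t == p))
```

The percentile bootstrap gives only approximate coverage; it is one common way to attach an uncertainty range to an offline metric, not the only one.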