Chapter 5. Choosing and evaluating models

This chapter covers

Mapping business problems to machine learning tasks
Evaluating model quality
Validating model soundness

As a data scientist, your ultimate goal is to solve a concrete business problem: increase the look-to-buy ratio, identify fraudulent transactions, predict and manage the losses of a loan portfolio, and so on. Many different statistical modeling methods can be used to solve any given problem, and each has its own advantages and disadvantages for a given business goal and set of business constraints. This chapter presents an outline of the most common machine learning and statistical methods used in data science.

To make progress, you must be able to measure model quality during training and also ensure that your model will work as well in the production environment as it did on your training data. In general, we’ll call these two tasks model evaluation and model validation. To prepare for these statistical tests, we always split our data into training data and test data, as illustrated in figure 5.1.

Figure 5.1. Schematic model construction and evaluation
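
The split into training data and test data described above can be sketched in a few lines. This is a minimal illustration only, not the chapter's own procedure: the function name train_test_split, the 20% test fraction, and the fixed seed are all assumptions chosen for the example.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Randomly partition rows into (train, test) lists.

    test_fraction and seed are illustrative defaults, not recommendations.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    rng = random.Random(seed)      # private generator, so the split is repeatable
    shuffled = list(rows)          # copy, so the caller's data is left untouched
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    test = shuffled[:n_test]       # held out; never used to fit the model
    train = shuffled[n_test:]      # used to fit the model
    return train, test


# Hypothetical data set: 100 row identifiers.
rows = list(range(100))
train, test = train_test_split(rows, test_fraction=0.2)
print(len(train), len(test))
```

The model is fit only on train, and its quality is then measured on test, which stands in for data the model will see in production. Fixing the seed keeps the split reproducible, so repeated evaluations are compared on the same held-out rows.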

5.1. Mapping problems to machine learning tasks

5.2. Evaluating models

5.3. Validating models

5.4. Summary
