6 Choosing and evaluating models

This chapter covers

  • Mapping business problems to machine learning tasks
  • Evaluating model quality
  • Explaining model predictions

This chapter discusses the modeling process (figure 6.1). We cover this process before getting into the details of specific machine learning approaches because the topics here apply to any kind of model. First, let’s look at how to choose an appropriate modeling approach.

Figure 6.1. Mental model

6.1. Mapping problems to machine learning tasks

As a data scientist, your task is to map a business problem to a good machine learning method. Let’s look at a real-world situation. Suppose that you’re a data scientist at an online retail company. Your team might be called on to address a number of business problems:

  • Predicting what customers might buy, based on past transactions
  • Identifying fraudulent transactions
  • Determining the price elasticity of various products or product classes (the rate at which a price increase reduces sales, or a price decrease raises them)
  • Determining the best way to present product listings when a customer searches for an item
  • Customer segmentation: grouping customers with similar purchasing behavior
  • AdWord valuation: how much the company should spend to buy certain AdWords on search engines
  • Evaluation of marketing campaigns
  • Organizing new products into a product catalog

6.1.1. Classification problems

6.1.2. Scoring problems

6.1.3. Grouping: working without known targets

6.1.4. Problem-to-method mapping

6.2. Evaluating models

6.2.1. Overfitting

6.2.2. Measures of model performance

6.2.3. Evaluating classification models