Chapter 8. Predicting numeric values: regression


This chapter covers

  • Linear regression
  • Locally weighted linear regression
  • Ridge regression and stagewise linear regression
  • Predicting the age of an abalone and an antique's selling price

The previous chapters focused on classification, which predicts only nominal values for the target variable. With the tools in this chapter you'll be able to start predicting target values that are continuous. You may be asking yourself, "What can I do with these tools?" "Just about anything" would be my answer. Companies may use this for mundane things such as sales forecasts or forecasting manufacturing defects. One creative example I've seen recently is predicting the probability of celebrity divorce.

In this chapter, we'll first discuss linear regression, where it comes from, and how to do it in Python. We'll next look at a technique for locally smoothing our estimates to better fit the data. We'll then explore shrinkage and a technique for getting a regression estimate in "poorly formulated" problems, along with the theoretical notions of bias and variance. Finally, we'll put all of these techniques to use in forecasting the age of abalone and the future selling price of antique toys. To get the data on the antique toys, we'll first use Python to do some screen scraping. It's an action-packed chapter.
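To give a flavor of what's ahead, here is a minimal sketch of fitting a best-fit line in Python with NumPy by solving the ordinary least-squares normal equations. The function name and toy data are illustrative only, not the chapter's code; the full treatment comes in section 8.1.

```python
import numpy as np

def stand_regres(x, y):
    """Solve the normal equations w = (X^T X)^{-1} X^T y for the
    weights minimizing squared error. x is (n, d), y is (n,)."""
    xtx = x.T @ x
    if np.linalg.det(xtx) == 0.0:
        raise ValueError("Matrix is singular; cannot invert")
    # Solving the linear system is numerically safer than inverting.
    return np.linalg.solve(xtx, x.T @ y)

# Toy data lying exactly on y = 1 + 2*x; the first column of ones
# lets the model learn an intercept term.
xs = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
ys = np.array([1.0, 3.0, 5.0, 7.0])
w = stand_regres(xs, ys)  # recovers intercept 1.0 and slope 2.0
```

Because the toy points are exactly collinear, the fit recovers the intercept and slope exactly; on real data like the abalone set, the weights instead minimize the total squared error.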

8.1. Finding best-fit lines with linear regression

8.2. Locally weighted linear regression

8.3. Example: predicting the age of an abalone

8.4. Shrinking coefficients to understand our data

8.5. The bias/variance tradeoff

8.6. Example: forecasting the price of LEGO sets

8.7. Summary