Take a moment to look back on what you’ve learned so far. Assuming you’ve completed parts 1 and 2 of this book, you now possess the skills you need to tackle a wide range of classification problems. In this part of the book, we’ll shift our focus from predicting categorical variables to predicting continuous ones.
As you learned in chapter 1, we use the term regression for supervised machine learning that predicts a continuous outcome variable. In chapters 9 through 12, you’re going to learn a variety of regression algorithms that will help you deal with different data situations. Some are suited to situations in which the relationships between the predictor variables and the outcome are linear, and these tend to be highly interpretable. Others can model nonlinear relationships but may not be quite so interpretable.
We’ll start by covering linear regression—which, as you’ll learn, is closely related to the logistic regression we worked with in chapter 4. In fact, if you’re already familiar with linear regression, you may be wondering why I’ve waited until now to cover it, when the theory of logistic regression is built on it. It’s because, to make your learning simpler and more enjoyable, I wanted to cover classification, regression, dimension reduction, and clustering separately, so that each of these topics is distinct in your mind. I hope the theory we’ll cover in this part will solidify your understanding of logistic regression.