Chapter 12. Regression with kNN, random forest, and XGBoost

 

This chapter covers

  • Using the k-nearest neighbors algorithm for regression
  • Using tree-based algorithms for regression
  • Comparing k-nearest neighbors, random forest, and XGBoost models

You’re going to find this chapter a breeze, because you’ve done everything in it before (sort of). In chapter 3, I introduced you to the k-nearest neighbors (kNN) algorithm as a tool for classification. In chapter 7, I introduced you to decision trees, and in chapter 8 I expanded on them to cover random forest and XGBoost for classification. Well, conveniently, these algorithms can also be used to predict continuous variables, so in this chapter I’ll help you extend these skills to solve regression problems.

By the end of this chapter, I hope you’ll understand how kNN and tree-based algorithms can be extended to predict continuous variables. As you learned in chapter 7, decision trees suffer from a tendency to overfit their training data and so are often vastly improved by using ensemble techniques. Therefore, in this chapter, you’ll train a random forest model and an XGBoost model, and benchmark their performance against the kNN algorithm.

12.1. Using k-nearest neighbors to predict a continuous variable
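The idea carries over from classification almost unchanged: to predict the outcome for a new case, kNN finds the k cases in the training set whose predictor values are closest to it, but instead of taking a majority vote over their classes, it returns the mean (or a distance-weighted mean) of their outcome values. Here is a minimal from-scratch sketch of that idea in base R; the function name and toy data are illustrative, not taken from the chapter.

# A minimal, from-scratch sketch of kNN regression (illustrative only):
# the prediction for a new case is the mean outcome of its k nearest
# training cases, measured by Euclidean distance on the predictors.
knnRegress <- function(newX, trainX, trainY, k = 3) {
  diffs <- sweep(trainX, 2, newX)      # subtract the new case from every row
  dists <- sqrt(rowSums(diffs^2))      # Euclidean distance to each training case
  nearest <- order(dists)[1:k]         # indices of the k closest cases
  mean(trainY[nearest])                # predict the mean of their outcomes
}

# Toy usage with made-up data
set.seed(42)
trainX <- matrix(runif(40), ncol = 2)
trainY <- 3 * trainX[, 1] + rnorm(20, sd = 0.1)
knnRegress(c(0.5, 0.5), trainX, trainY, k = 5)

Because the prediction is a local average, kNN regression makes no assumptions about the shape of the relationship between predictors and outcome, but it is sensitive to the scale of the predictors and to the choice of k.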

 
 
 
 

12.2. Using tree-based learners to predict a continuous variable
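Regression trees are grown in much the same way as the classification trees from chapter 7, except that candidate splits are scored by how much they reduce the sum of squared errors of the outcome within the resulting nodes, and each leaf predicts the mean outcome of the training cases that land in it. Random forest and XGBoost then combine many such trees: the former by averaging trees grown on bootstrapped samples, the latter by sequentially fitting trees to the current residuals. As a minimal sketch of a single regression tree, here is what growing one with the rpart package on the built-in mtcars data might look like (a stand-in example, not the chapter's own dataset):

library(rpart)

data(mtcars)  # built-in data, used purely for illustration

# method = "anova" grows a regression tree: splits are chosen to minimize the
# within-node sum of squared errors, and each leaf predicts the mean mpg of
# the training cases it contains
carTree <- rpart(mpg ~ ., data = mtcars, method = "anova")

print(carTree)                              # inspect the splits and leaf means
predict(carTree, newdata = mtcars[1:3, ])   # predictions are the leaf means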

 
 
 

12.3. Building your first kNN regression model
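As a minimal sketch of how building this model might look, assuming the mlr workflow (makeRegrTask(), makeLearner(), tuneParams()) used for the classification tasks in earlier chapters, and a small synthetic dataset standing in for the chapter's own data: the regr.kknn learner predicts the (distance-weighted) mean outcome of the k nearest neighbors, and k is tuned by cross-validation.

library(mlr)

# Synthetic stand-in data (illustrative; not the chapter's dataset)
set.seed(123)
regrData <- data.frame(x1 = runif(100), x2 = runif(100))
regrData$target <- 2 * regrData$x1 - regrData$x2 + rnorm(100, sd = 0.2)

# Define the regression task and a kNN regression learner
# (the kknn package does the work under the hood)
regrTask <- makeRegrTask(data = regrData, target = "target")
knn <- makeLearner("regr.kknn")

# Tune k with a grid search and 10-fold cross-validation
# (mean squared error is mlr's default measure for regression tasks)
knnParamSpace <- makeParamSet(makeDiscreteParam("k", values = 1:12))
gridSearch <- makeTuneControlGrid()
kFold <- makeResampleDesc("CV", iters = 10)
tunedK <- tuneParams(knn, task = regrTask, resampling = kFold,
                     par.set = knnParamSpace, control = gridSearch)

# Train the final model with the winning value of k
tunedKnn <- setHyperPars(knn, par.vals = tunedK$x)
knnModel <- train(tunedKnn, regrTask)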

 
 
 

12.4. Building your first random forest regression model
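Under the same assumptions (the mlr interface and the synthetic stand-in data from the kNN sketch), a random forest regression model can be built the same way; here the main forest hyperparameters are tuned with a random search rather than a grid.

library(mlr)

# Same synthetic stand-in data as in the kNN sketch (illustrative only)
set.seed(123)
regrData <- data.frame(x1 = runif(100), x2 = runif(100))
regrData$target <- 2 * regrData$x1 - regrData$x2 + rnorm(100, sd = 0.2)
regrTask <- makeRegrTask(data = regrData, target = "target")

# Random forest regression learner (randomForest package under the hood),
# with a fixed number of trees and the other hyperparameters left for tuning
forest <- makeLearner("regr.randomForest", par.vals = list(ntree = 100))

# Random search over mtry (predictors tried at each split; at most 2 here,
# because the toy data has only 2 predictors) and nodesize (minimum cases
# per leaf), evaluated with 5-fold cross-validation
forestParamSpace <- makeParamSet(
  makeIntegerParam("mtry", lower = 1, upper = 2),
  makeIntegerParam("nodesize", lower = 1, upper = 10)
)
randSearch <- makeTuneControlRandom(maxit = 20)
kFold <- makeResampleDesc("CV", iters = 5)
tunedForestPars <- tuneParams(forest, task = regrTask, resampling = kFold,
                              par.set = forestParamSpace, control = randSearch)

# Train the final forest with the tuned hyperparameters
tunedForest <- setHyperPars(forest, par.vals = tunedForestPars$x)
forestModel <- train(tunedForest, regrTask)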

 
 

12.5. Building your first XGBoost regression model
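Again assuming the mlr interface and the same synthetic stand-in data, an XGBoost regression model follows the same pattern; the difference is that boosting has more hyperparameters worth tuning, such as the learning rate, tree depth, and number of boosting rounds.

library(mlr)

# Same synthetic stand-in data as before (illustrative only)
set.seed(123)
regrData <- data.frame(x1 = runif(100), x2 = runif(100))
regrData$target <- 2 * regrData$x1 - regrData$x2 + rnorm(100, sd = 0.2)
regrTask <- makeRegrTask(data = regrData, target = "target")

# XGBoost regression learner; note that xgboost needs all predictors
# to be numeric already
xgb <- makeLearner("regr.xgboost")

# Random search over the most influential boosting hyperparameters,
# evaluated with 5-fold cross-validation
xgbParamSpace <- makeParamSet(
  makeNumericParam("eta", lower = 0.01, upper = 0.3),     # learning rate
  makeIntegerParam("max_depth", lower = 1, upper = 6),    # tree depth
  makeNumericParam("subsample", lower = 0.5, upper = 1),  # row sampling
  makeIntegerParam("nrounds", lower = 20, upper = 200)    # number of trees
)
randSearch <- makeTuneControlRandom(maxit = 20)
kFold <- makeResampleDesc("CV", iters = 5)
tunedXgbPars <- tuneParams(xgb, task = regrTask, resampling = kFold,
                           par.set = xgbParamSpace, control = randSearch)

# Train the final boosted model with the tuned hyperparameters
tunedXgb <- setHyperPars(xgb, par.vals = tunedXgbPars$x)
xgbModel <- train(tunedXgb, regrTask)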

 
 

12.6. Benchmarking the kNN, random forest, and XGBoost model-building processes
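A fair comparison means evaluating all three learners on the same cross-validation folds and the same regression measure. A minimal sketch of how that might look, again assuming mlr's benchmark() function and the synthetic stand-in data:

library(mlr)

# Same synthetic stand-in data as before (illustrative only)
set.seed(123)
regrData <- data.frame(x1 = runif(100), x2 = runif(100))
regrData$target <- 2 * regrData$x1 - regrData$x2 + rnorm(100, sd = 0.2)
regrTask <- makeRegrTask(data = regrData, target = "target")

# The three learners to compare; hyperparameters are left near their defaults
# here just to keep the sketch short (in practice you would tune each first)
learners <- list(
  makeLearner("regr.kknn"),
  makeLearner("regr.randomForest", par.vals = list(ntree = 100)),
  makeLearner("regr.xgboost", par.vals = list(nrounds = 50))
)

# Cross-validate all three learners on the same folds and compare their MSE
kFold <- makeResampleDesc("CV", iters = 10)
bench <- benchmark(learners, tasks = regrTask, resamplings = kFold,
                   measures = mse)
bench   # the learner with the lowest mean MSE wins this benchmark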

 
 
 

12.7. Strengths and weaknesses of kNN, random forest, and XGBoost

 

Summary

 
 
 
 

Solutions to exercises

 