
12 Combining models to maximize results: Ensemble learning

 

This chapter covers

  • What ensemble learning is, and how it is used to combine weak classifiers into a stronger one.
  • Using bagging to combine classifiers in a random way.
  • Using boosting to combine classifiers in a smarter way.
  • Some of the most popular ensemble methods: random forests, AdaBoost, gradient boosting, and XGBoost.

After learning many interesting and useful machine learning models, it is natural to wonder whether we can combine them. Thankfully, we can, and in this chapter I show you several ways to build stronger models by combining weaker ones. The two main methods we learn in this chapter are bagging and boosting. In a nutshell, bagging consists of building several models on random portions of the data and combining their predictions. Boosting, on the other hand, consists of building the models in a smarter way, picking each model strategically so that it focuses on the previous models’ mistakes. The results that ensemble methods have shown in important machine learning problems have been tremendous. For example, the Netflix Prize, which was awarded to the best model fit to a large dataset of Netflix viewership data, was won by a team that used ensemble methods.
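As a rough preview of the two ideas, and assuming scikit-learn is available, the short sketch below trains one bagging model and one boosting model on a small synthetic dataset. The dataset and the parameter choices are placeholders only; the chapter develops both methods properly, and by hand, in the sections that follow.

# A minimal sketch contrasting bagging and boosting with scikit-learn.
# The data here is synthetic and used only as a placeholder.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import train_test_split

# Placeholder classification data
features, labels = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    features, labels, random_state=0)

# Bagging: train several decision trees (scikit-learn's default base
# learner) on random samples of the data and combine their votes
bagging_model = BaggingClassifier(n_estimators=10, random_state=0)
bagging_model.fit(X_train, y_train)

# Boosting (AdaBoost): train the learners one after another, each one
# paying more attention to the points the previous ones misclassified
boosting_model = AdaBoostClassifier(n_estimators=10, random_state=0)
boosting_model.fit(X_train, y_train)

print("Bagging accuracy:", bagging_model.score(X_test, y_test))
print("Boosting accuracy:", boosting_model.score(X_test, y_test))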

12.1  With a little help from our friends

12.2  Bagging - Joining some weak learners randomly to build a strong learner

12.2.1    First, (over)fitting a decision tree

12.2.2    Fitting a random forest manually

12.2.3    Training a random forest in sklearn

12.3  AdaBoost - Joining weak learners in a clever way to build a strong learner

12.3.1    A big picture of AdaBoost - Building the weak learners

12.3.2    Combining the weak learners into a strong learner

12.3.3    Coding AdaBoost in sklearn

12.4  Gradient boosting - Using decision trees to build strong learners

12.5  XGBoost - An extreme way to do gradient boosting

12.5.1    XGBoost similarity score - A new and effective way to measure similarity in a set

12.5.2    Building the weak learners

12.5.3    Tree pruning - A way to reduce overfitting by simplifying the weak learners

12.5.4    Making the predictions

12.5.5    Training an XGBoost model in Python

12.6  Applications of ensemble methods