12 Combining models to maximize results: Ensemble learning

 

In this chapter

  • what ensemble learning is, and how it is used to combine weak classifiers into a stronger one
  • using bagging to combine classifiers in a random way
  • using boosting to combine classifiers in a cleverer way
  • some of the most popular ensemble methods: random forests, AdaBoost, gradient boosting, and XGBoost

After learning many interesting and useful machine learning models, it is natural to wonder whether it is possible to combine these classifiers. Thankfully, we can, and in this chapter, we learn several ways to build stronger models by combining weaker ones. The two main methods we learn in this chapter are bagging and boosting. In a nutshell, bagging consists of constructing a few models in a random way and joining them together. Boosting, on the other hand, consists of building these models in a smarter way, picking each model strategically to focus on the previous models’ mistakes. The results that these ensemble methods have shown on important machine learning problems have been tremendous. For example, the Netflix Prize, which was awarded to the model that best fit a large dataset of Netflix viewership data, was won by a team that used an ensemble of different models.
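To make the contrast concrete before we dive into the details, here is a minimal sketch (not from the book) that compares a single weak learner with a bagged ensemble and a boosted ensemble using scikit-learn. The synthetic dataset and all parameter values are illustrative assumptions, and the `estimator` argument assumes scikit-learn 1.2 or later.

```python
# A minimal sketch (illustrative, not the book's code): comparing a single
# shallow decision tree with a bagged ensemble and a boosted ensemble.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import BaggingClassifier, AdaBoostClassifier

# Illustrative synthetic dataset; any labeled classification dataset works here.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A weak learner: a decision tree of depth 1 (a "stump").
weak_learner = DecisionTreeClassifier(max_depth=1, random_state=0)

# Bagging: train many stumps on random samples of the data, then combine their votes.
bagging = BaggingClassifier(estimator=weak_learner, n_estimators=100, random_state=0)

# Boosting (AdaBoost): train stumps sequentially, each one focusing on the
# points that the previous ones misclassified.
boosting = AdaBoostClassifier(estimator=weak_learner, n_estimators=100, random_state=0)

for name, model in [("single stump", weak_learner),
                    ("bagging", bagging),
                    ("boosting", boosting)]:
    model.fit(X_train, y_train)
    print(name, model.score(X_test, y_test))
```

On most datasets of this kind, both ensembles score noticeably higher than the single stump, which is exactly the effect the rest of the chapter explains.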

With a little help from our friends

Bagging: Joining some weak learners randomly to build a strong learner

AdaBoost: Joining weak learners in a clever way to build a strong learner

Gradient boosting: Using decision trees to build strong learners

XGBoost: An extreme way to do gradient boosting

Applications of ensemble methods

Summary

Exercises
