10 Combining models to maximize results: Ensemble learning

This chapter covers

  • What ensemble learning is.
  • Joining several weak classifiers to form a strong classifier.
  • Bagging: A method to randomly join several classifiers.
  • Boosting: A method to join several classifiers in a smarter way.
  • AdaBoost: A very successful example of boosting methods.

After learning many interesting and useful machine learning classifiers, a good question to ask is “Is there a way to combine them?” Thankfully, the answer is yes! In this chapter we learn several ways to build stronger classifiers by combining weaker ones. The two methods we learn are bagging and boosting. In a nutshell, bagging consists of constructing a few classifiers in a random way and putting them together. Boosting, on the other hand, consists of building these models in a smarter way, by picking each model strategically to focus on the previous models’ mistakes. One of the most popular examples of boosting is the AdaBoost (Adaptive Boosting) algorithm, which we study later in the chapter.
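To make the bagging idea concrete before the detailed sections, here is a minimal sketch that uses only Python's standard library. The toy dataset, the helper names `train_stump` and `bagging`, and the choice of a one-split "stump" as the weak classifier are all illustrative assumptions, not the implementation developed later in the chapter. Each weak classifier is trained on a random sample of the data drawn with replacement, and the ensemble predicts by majority vote.

```python
import random
from collections import Counter


def train_stump(points):
    """Fit a one-split weak classifier on (x, label) pairs with labels 0 or 1."""
    best = None
    for threshold, _ in points:
        for left_label in (0, 1):
            # Predict left_label when x <= threshold, the other label otherwise.
            errors = sum(
                (left_label if x <= threshold else 1 - left_label) != y
                for x, y in points
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, left_label)
    _, threshold, left_label = best
    return lambda x: left_label if x <= threshold else 1 - left_label


def bagging(points, n_learners=5, seed=0):
    """Train several stumps on random resamples and combine them by voting."""
    rng = random.Random(seed)
    learners = []
    for _ in range(n_learners):
        # Bootstrap sample: draw len(points) items with replacement.
        sample = [rng.choice(points) for _ in points]
        learners.append(train_stump(sample))

    def predict(x):
        # Majority vote; an odd n_learners avoids ties between the two labels.
        votes = Counter(learner(x) for learner in learners)
        return votes.most_common(1)[0][0]

    return predict


# Made-up toy data: small x values have label 0, large ones label 1.
data = [(1, 0), (2, 0), (3, 0), (4, 1), (5, 1), (6, 1)]
ensemble = bagging(data)
print([ensemble(x) for x, _ in data])
```

Boosting would differ in the training loop: instead of drawing each sample independently at random, each new classifier would be trained with extra emphasis on the points the previous ones misclassified.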

10.1       With a little help from our friends

10.2       Why an ensemble of learners? Why not just one really good learner?

10.3       Bagging - Joining some classifiers together to build a stronger classifier

10.3.1                Building random forests by joining several trees

10.3.2                Coding a random forest in sklearn

10.4       Boosting - Joining classifiers together in a smarter way to get a stronger classifier

10.4.1                A big picture of AdaBoost

10.4.2                A detailed (mathematical) picture of AdaBoost

10.4.3                Coding AdaBoost in sklearn

10.5       XGBoost - An extreme way to do gradient boosting

10.5.1                XGBoost similarity score

10.5.2                Building the learners

10.6       Applications of ensemble methods

10.7       Summary