6 Sequential ensembles: Newton boosting

 

This chapter covers

  • Using Newton’s descent to optimize loss functions for training models
  • Implementing and understanding how Newton boosting works
  • Learning with regularized loss functions
  • Introducing XGBoost as a powerful framework for Newton boosting
  • Avoiding overfitting with XGBoost

In the previous two chapters, we saw two approaches to constructing sequential ensembles. In chapter 4, we introduced adaptive boosting (AdaBoost), which uses example weights to identify the most misclassified examples. In chapter 5, we introduced gradient boosting, which uses gradients (residuals) for the same purpose. The fundamental intuition behind both boosting methods is to target the most misclassified (essentially, the worst-behaving) examples at every iteration to improve classification.
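As a preview of the key ingredient this chapter adds, the short sketch below contrasts a basic gradient-descent update, which uses only first-derivative information, with a Newton update, which also uses second-derivative (curvature) information to rescale the step. This is a minimal illustrative sketch, not an example from the book: the loss function f(x) = x^2 - 4 ln(x), the starting point, and the learning rate are all made-up choices. Section 6.1 develops Newton's method for minimization in detail.

import numpy as np

# A simple, made-up 1-d "loss": f(x) = x**2 - 4*ln(x), convex for x > 0.
# Its true minimizer is x = sqrt(2) ~ 1.4142.
def f(x):
    return x**2 - 4 * np.log(x)

def grad(x):   # first derivative f'(x)
    return 2 * x - 4 / x

def hess(x):   # second derivative f''(x)
    return 2 + 4 / x**2

x_gd, x_newton = 3.0, 3.0
lr = 0.1       # learning rate, needed only by gradient descent

for step in range(10):
    # Gradient descent: step along the negative gradient, scaled by a learning rate.
    x_gd -= lr * grad(x_gd)
    # Newton's method: divide the gradient by the curvature; no learning rate needed.
    x_newton -= grad(x_newton) / hess(x_newton)

print(f"gradient descent after 10 steps: x = {x_gd:.4f}")
print(f"Newton's method after 10 steps:  x = {x_newton:.4f}")

Because the Newton step accounts for curvature, it typically reaches the minimizer in far fewer iterations than plain gradient descent on this kind of smooth, convex function; the same second-order idea is what Newton boosting applies to loss functions over ensembles.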

6.1 Newton’s method for minimization

 
 

6.1.1 Newton’s method with an illustrative example

 

6.1.2 Newton’s descent over loss functions for training

 
 
 

6.2 Newton boosting: Newton’s method + boosting

 

6.2.1 Intuition: Learning with weighted residuals

 
 

6.2.2 Intuition: Learning with regularized loss functions

 
 

6.2.3 Implementing Newton boosting

 
 
 

6.3 XGBoost: A framework for Newton boosting

 
 
 

6.3.1 What makes XGBoost “extreme”?

 

6.3.2 Newton boosting with XGBoost

 

6.4 XGBoost in practice

 
 
 

6.4.1 Learning rate

 
 

6.4.2 Early stopping

 
 

6.5 Case study redux: Document retrieval

 

6.5.1 The LETOR data set

 
 
 

6.5.2 Document retrieval with XGBoost

 
 