7 Learning with continuous and count labels

 

This chapter covers

  • Regression in machine learning
  • Loss and likelihood functions for regression
  • When to use different loss and likelihood functions
  • Adapting parallel and sequential ensembles for regression problems
  • Using ensembles for regression in practical settings

Many real-world modeling, prediction, and forecasting problems are best framed and solved as regression problems. Regression has a rich history predating the advent of machine learning and has long been a part of the standard statistician’s toolkit.

Regression techniques have been developed and widely applied in many areas. Here are just a few examples:

  • Weather forecasting—To predict the precipitation tomorrow using data from today, including temperature, humidity, cloud cover, wind, and more
  • Insurance analytics—To predict the number of automobile insurance claims over a period of time, given various vehicle and driver attributes
  • Financial forecasting—To predict stock prices using historical stock data and trends
  • Demand forecasting—To predict the residential energy load for the next three months using historical, demographic, and weather data

Whereas chapters 2–6 introduced ensembling techniques for classification problems, in this chapter we'll see how to adapt those same techniques to regression problems.

7.1 A brief review of regression

7.1.1 Linear regression for continuous labels
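Before diving in, it helps to recall what fitting a linear model actually computes: a weight vector that minimizes the squared error between predictions and continuous labels. The following is a minimal sketch (using synthetic data of my own invention, not the book's case-study data) that solves the least-squares problem directly with NumPy:

```python
import numpy as np

# Synthetic data: y = 3x + 1 plus a little Gaussian noise
rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(100, 1))
y = 3.0 * X[:, 0] + 1.0 + rng.normal(scale=0.1, size=100)

# Prepend a column of ones so the first weight acts as the intercept
Xb = np.column_stack([np.ones(len(X)), X])

# lstsq returns the weight vector minimizing the sum of squared errors
w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
print(w)  # close to [1.0, 3.0], the true intercept and slope
```

The squared-error loss minimized here is exactly the loss function revisited later in the chapter when tree ensembles are adapted to continuous labels.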

7.1.2 Poisson regression for count labels
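Poisson regression models count labels by assuming the label is Poisson-distributed with a mean linked to the features through an exponential (log link). As a rough illustration of the idea (a from-scratch sketch on made-up data, not the book's implementation), the maximum-likelihood weights can be found with a few Newton–Raphson steps:

```python
import numpy as np

# Synthetic count data: y ~ Poisson(exp(X @ true_w))
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(200), rng.uniform(-1, 1, 200)])
true_w = np.array([0.5, 1.2])
y = rng.poisson(np.exp(X @ true_w))

# Newton-Raphson ascent on the Poisson log-likelihood
w = np.zeros(2)
for _ in range(25):
    mu = np.exp(X @ w)                    # predicted means via the log link
    grad = X.T @ (y - mu)                 # gradient of the log-likelihood
    hess = X.T @ (mu[:, None] * X)        # (negative) Hessian
    w += np.linalg.solve(hess, grad)

print(w)  # recovers approximately [0.5, 1.2]
```

The log link guarantees the predicted mean is always positive, which is what makes the Poisson model a natural fit for counts such as the insurance-claims example above.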

7.1.3 Logistic regression for classification labels

7.1.4 Generalized linear models

7.1.5 Nonlinear regression

7.2 Parallel ensembles for regression

7.2.1 Random forests and Extra Trees
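In scikit-learn, moving a parallel tree ensemble from classification to regression is largely a matter of swapping in the regressor variants, which split nodes by reducing squared error rather than impurity. A quick sketch on a synthetic task (the dataset and hyperparameters here are illustrative choices, not the book's case study):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import ExtraTreesRegressor, RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic regression task with 10 features and moderate label noise
X, y = make_regression(n_samples=500, n_features=10,
                       noise=10.0, random_state=42)
Xtrn, Xtst, ytrn, ytst = train_test_split(X, y, test_size=0.25,
                                          random_state=42)

# Both ensembles average the predictions of their component trees
rf = RandomForestRegressor(n_estimators=100, random_state=42).fit(Xtrn, ytrn)
xt = ExtraTreesRegressor(n_estimators=100, random_state=42).fit(Xtrn, ytrn)

# score() reports the coefficient of determination (R^2) for regressors
print('random forest R^2:', rf.score(Xtst, ytst))
print('extra trees   R^2:', xt.score(Xtst, ytst))
```

Note that `score()` on a regressor returns R² rather than accuracy, so the familiar evaluation workflow from the classification chapters carries over with only a change in metric.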

7.2.2 Combining regression models
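For regression, the majority vote used to combine classifiers is replaced by averaging the base models' continuous predictions. A minimal sketch with hypothetical prediction values:

```python
import numpy as np

# Predictions of three hypothetical base regressors on five test examples
preds = np.array([
    [2.1, 0.9, 3.8, 1.2, 2.5],   # model 1
    [1.9, 1.1, 4.1, 1.0, 2.4],   # model 2
    [2.0, 1.0, 4.0, 1.1, 2.6],   # model 3
])

# The ensemble prediction is the mean across models for each example
combined = preds.mean(axis=0)
print(combined)  # first entry: (2.1 + 1.9 + 2.0) / 3 = 2.0
```

A weighted average (weighting each model by its validation performance) is a common refinement of this simple mean combiner.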