7 Learning with continuous and count labels

 

This chapter covers

  • Regression in machine learning
  • Loss and likelihood functions for regression
  • When to use different loss and likelihood functions
  • Adapting parallel and sequential ensembles for regression problems
  • Using ensembles for regression in practical settings

Many real-world modeling, prediction, and forecasting problems are best framed and solved as regression problems. Regression has a rich history predating the advent of machine learning and has long been a part of the standard statistician’s toolkit.

Regression techniques have been developed and widely applied in many areas. Here are just a few examples:

  • Weather forecasting—To predict the precipitation tomorrow using data from today, including temperature, humidity, cloud cover, wind, and more
  • Insurance analytics—To predict the number of automobile insurance claims over a period of time, given various vehicle and driver attributes
  • Financial forecasting—To predict stock prices using historical stock data and trends
  • Demand forecasting—To predict the residential energy load for the next three months using historical, demographic, and weather data

Whereas chapters 2–6 introduced ensembling techniques for classification problems, in this chapter we'll see how to adapt those same techniques to regression problems.

7.1 A brief review of regression

7.1.1 Linear regression for continuous labels
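Before diving in, it helps to recall what fitting a linear model actually computes: a weight vector that minimizes the squared error between predictions and continuous labels. The following is a minimal sketch (using synthetic data of my own invention, not the book's case-study data) that solves the least-squares problem directly with NumPy:

```python
import numpy as np

# Synthetic data: y = 3x + 1 plus a little Gaussian noise
rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(100, 1))
y = 3.0 * X[:, 0] + 1.0 + rng.normal(scale=0.1, size=100)

# Prepend a column of ones so the first weight acts as the intercept
Xb = np.column_stack([np.ones(len(X)), X])

# lstsq returns the weight vector minimizing the sum of squared errors
w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
print(w)  # close to [1.0, 3.0], the true intercept and slope
```

The squared-error loss minimized here is exactly the loss function revisited later in the chapter when tree ensembles are adapted to continuous labels.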

7.1.2 Poisson regression for count labels
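Poisson regression models count labels by assuming the label is Poisson-distributed with a mean linked to the features through an exponential (log link). As a rough illustration of the idea (a from-scratch sketch on made-up data, not the book's implementation), the maximum-likelihood weights can be found with a few Newton–Raphson steps:

```python
import numpy as np

# Synthetic count data: y ~ Poisson(exp(X @ true_w))
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(200), rng.uniform(-1, 1, 200)])
true_w = np.array([0.5, 1.2])
y = rng.poisson(np.exp(X @ true_w))

# Newton-Raphson ascent on the Poisson log-likelihood
w = np.zeros(2)
for _ in range(25):
    mu = np.exp(X @ w)                    # predicted means via the log link
    grad = X.T @ (y - mu)                 # gradient of the log-likelihood
    hess = X.T @ (mu[:, None] * X)        # (negative) Hessian
    w += np.linalg.solve(hess, grad)

print(w)  # recovers approximately [0.5, 1.2]
```

The log link guarantees the predicted mean is always positive, which is what makes the Poisson model a natural fit for counts such as the insurance-claims example above.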

7.1.3 Logistic regression for classification labels

7.1.4 Generalized linear models

7.1.5 Nonlinear regression

7.2 Parallel ensembles for regression

7.2.1 Random forests and Extra Trees
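In scikit-learn, moving a parallel tree ensemble from classification to regression is largely a matter of swapping in the regressor variants, which split nodes by reducing squared error rather than impurity. A quick sketch on a synthetic task (the dataset and hyperparameters here are illustrative choices, not the book's case study):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import ExtraTreesRegressor, RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic regression task with 10 features and moderate label noise
X, y = make_regression(n_samples=500, n_features=10,
                       noise=10.0, random_state=42)
Xtrn, Xtst, ytrn, ytst = train_test_split(X, y, test_size=0.25,
                                          random_state=42)

# Both ensembles average the predictions of their component trees
rf = RandomForestRegressor(n_estimators=100, random_state=42).fit(Xtrn, ytrn)
xt = ExtraTreesRegressor(n_estimators=100, random_state=42).fit(Xtrn, ytrn)

# score() reports the coefficient of determination (R^2) for regressors
print('random forest R^2:', rf.score(Xtst, ytst))
print('extra trees   R^2:', xt.score(Xtst, ytst))
```

Note that `score()` on a regressor returns R² rather than accuracy, so the familiar evaluation workflow from the classification chapters carries over with only a change in metric.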

7.2.2 Combining regression models
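For regression, the majority vote used to combine classifiers is replaced by averaging the base models' continuous predictions. A minimal sketch with hypothetical prediction values:

```python
import numpy as np

# Predictions of three hypothetical base regressors on five test examples
preds = np.array([
    [2.1, 0.9, 3.8, 1.2, 2.5],   # model 1
    [1.9, 1.1, 4.1, 1.0, 2.4],   # model 2
    [2.0, 1.0, 4.0, 1.1, 2.6],   # model 3
])

# The ensemble prediction is the mean across models for each example
combined = preds.mean(axis=0)
print(combined)  # first entry: (2.1 + 1.9 + 2.0) / 3 = 2.0
```

A weighted average (weighting each model by its validation performance) is a common refinement of this simple mean combiner.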