3 Heterogeneous Parallel Ensembles: Combining Strong Learners
This chapter covers
- Combining base learning models by performance-based weighting
- Combining base learning models with meta-learning: stacking
- Avoiding overfitting by ensembling with cross-validation
- A large-scale, real-world text-mining case study with heterogeneous ensembles
In the previous chapter, we introduced two parallel ensemble methods: bagging and random forests. These methods (and their variants) train homogeneous ensembles, where every base estimator is trained using the same base learning algorithm. For example, in bagging classification, all the base estimators are typically decision tree classifiers. In this chapter, we continue exploring parallel ensemble methods, this time focusing on heterogeneous ensembles.
Heterogeneous ensemble methods use different base learning algorithms to directly ensure ensemble diversity. For example, a heterogeneous ensemble might consist of three base estimators: a decision tree, a support vector machine (SVM), and an artificial neural network. These base estimators are still trained independently of one another.
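To make this concrete, here is a minimal sketch in scikit-learn that trains exactly such a trio of base estimators independently on the same training set. The data set, the specific estimator classes, and their hyperparameters are illustrative choices, not prescriptions from this chapter; how to combine the trained estimators' predictions is what the rest of the chapter is about.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier

# A synthetic two-class data set, used here purely for illustration
X, y = make_moons(n_samples=500, noise=0.25, random_state=42)
Xtrn, Xtst, ytrn, ytst = train_test_split(X, y, test_size=0.25,
                                          random_state=42)

# Three different base learning algorithms: a decision tree,
# an SVM, and a neural network (hyperparameters are arbitrary)
estimators = [DecisionTreeClassifier(max_depth=5),
              SVC(gamma='auto'),
              MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000)]

# Each base estimator is fit independently on the same training data;
# combining their predictions comes later in the chapter
for estimator in estimators:
    estimator.fit(Xtrn, ytrn)
    print(type(estimator).__name__, estimator.score(Xtst, ytst))
```

Note that each call to `fit` is completely independent of the others, which is what makes these ensembles parallel: the base estimators could just as well be trained simultaneously on separate processors.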