Chapter 5. Learning models
This chapter covers
- Implementing model-learning algorithms
- Using Spark’s model-learning capabilities
- Handling third-party code
Continuing our journey through the phases of a machine learning system, we now arrive at model learning (see figure 5.1). You can think of this phase as being like the day when you were very young, looked up at a dark sky, and decided that, based on past experience, it just might rain. The model you learned was that dark clouds lead to rain. Although you may not remember it well, you figured out that model by reasoning about your past experiences with dark and bright days and whether you got rained on.
The process you went through, reasoning about past experiences to develop a model that could be applied to future situations, is analogous to what we do in the model-learning phase of a machine learning system. As I defined it in chapter 1, machine learning is learning from data, and this is the step where we do that learning. We’ll run a model-learning algorithm over our features to produce a model. In the context of a machine learning system, a model is a way of encoding the mapping from features to concepts. It’s a way of generalizing from all the information in the training instances.
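To make the rain example concrete, here's a minimal sketch in Scala of what learning that model could look like. The names Instance, RainModel, and learn are hypothetical, invented for this illustration, and the majority-vote rule is only about the simplest learning algorithm available; treat it as a toy under those assumptions, not as how you'd build a real model.

```scala
// Hypothetical names throughout: Instance, RainModel, and learn are
// illustrative only and don't come from any library.
object RainModelSketch {

  // One past experience: the feature (was the sky dark?) and the concept (did it rain?).
  final case class Instance(darkSky: Boolean, rained: Boolean)

  // The learned model: a mapping from each feature value to a predicted concept.
  final case class RainModel(mapping: Map[Boolean, Boolean]) {
    // Feature values never seen in training default to "no rain" (an arbitrary choice).
    def predict(darkSky: Boolean): Boolean = mapping.getOrElse(darkSky, false)
  }

  // A toy learning algorithm: for each feature value, predict rain only if it
  // rained on a strict majority of the matching past days.
  def learn(instances: Seq[Instance]): RainModel = {
    val mapping = instances.groupBy(_.darkSky).map { case (darkSky, group) =>
      val rainyDays = group.count(_.rained)
      darkSky -> (rainyDays * 2 > group.size)
    }
    RainModel(mapping)
  }

  def main(args: Array[String]): Unit = {
    val pastExperience = Seq(
      Instance(darkSky = true, rained = true),
      Instance(darkSky = true, rained = true),
      Instance(darkSky = true, rained = false),
      Instance(darkSky = false, rained = false),
      Instance(darkSky = false, rained = false)
    )
    val model = learn(pastExperience)
    println(model.predict(darkSky = true))
    println(model.predict(darkSky = false))
  }
}
```

With these five made-up instances, it rained on two of the three dark days and on neither of the bright days, so the learned model predicts rain for a dark sky and no rain for a bright one. The instances are consumed once by learn, and after that the model answers questions using only the feature it's handed.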
In software terms, a model is a program that was instantiated with instances and can now return predictions when called with features. This definition is shown in a stub implementation in the following listing.