5 The Mechanics of Learning

This chapter covers:

  • Understanding how algorithms can learn from data
  • Reframing learning as parameter estimation, using differentiation and gradient descent
  • Walking through a very simple learning algorithm from scratch
  • Seeing how PyTorch supports learning with autograd

With the blossoming of machine learning over the last decade, the notion of machines that learn from experience has become a mainstream theme in both technical and journalistic circles. So how exactly does a machine learn? What are the mechanics of it, or, in other words, the algorithm behind it? From the point of view of an outside observer, a learning algorithm is presented with input data that is paired with desired outputs. Once learning has occurred, the algorithm will be capable of producing correct outputs when it is fed new data that is similar enough to the data it was trained on. With deep learning, this process works even when the input data and the desired outputs come from very different domains, like an image and a sentence describing it, as we saw in Chapter 2.
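To make this concrete before we dig into the details, here is a minimal sketch of what the chapter builds up to: estimating the two parameters of a linear model, w * x + b, by gradient descent. The data points below are illustrative placeholders, not the measurements we will work with later, but the shape of the loop is exactly what the following sections unpack step by step:

import torch

# Placeholder input/output pairs; the desired outputs roughly follow y = 2x + 1
x = torch.tensor([1.0, 2.0, 3.0, 4.0])
y = torch.tensor([3.1, 4.9, 7.2, 8.8])

# Parameters of the linear model w * x + b, to be estimated from the data
w = torch.zeros((), requires_grad=True)
b = torch.zeros((), requires_grad=True)

for epoch in range(1000):
    y_pred = w * x + b                 # the model produces outputs from inputs
    loss = ((y_pred - y) ** 2).mean()  # mean squared error: how wrong are we?
    loss.backward()                    # autograd computes d(loss)/dw and d(loss)/db
    with torch.no_grad():              # one gradient descent step on the parameters
        w -= 1e-2 * w.grad
        b -= 1e-2 * b.grad
        w.grad.zero_()                 # clear gradients so they don't accumulate
        b.grad.zero_()

print(w.item(), b.item())              # estimates land close to 2 and 1

Each ingredient here, the model, the loss, the call to backward, and the parameter update, gets its own treatment below: section 5.1 first implements the gradients by hand, and section 5.2 shows how autograd takes over that bookkeeping.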

5.1  Learning is just parameter estimation

5.1.1  A hot problem

5.1.2  Choosing a linear model as a first try

5.1.3  Less loss is what we want

5.1.4  From problem to PyTorch

5.1.5  Down along the gradient

5.1.6  Getting analytical

5.1.7  The training loop

5.2  PyTorch’s autograd: Backpropagate all things

5.2.1  Optimizers a la carte

5.2.2  Training, validation, and overfitting

5.2.3  Autograd nits and switching it off

5.3  Conclusion

5.4  Exercises

5.5  Summary