8 Training neural networks: Forward propagation and backpropagation
This chapter covers
- Sigmoid functions as differentiable surrogates for Heaviside step functions
- Layering in neural networks: expressing linear layers as matrix-vector multiplication
- Regression loss, forward and backward propagation, and their math
So far, we have seen that neural networks make complicated real-life decisions by modeling the decision-making process with mathematical functions. These functions can become arbitrarily involved, but fortunately, we have a simple building block called a perceptron that can be repeated systematically to model virtually any function. We need not even know the function being modeled explicitly, in closed form. All we need is a reasonably sized set of sample inputs and their corresponding correct outputs. This collection of input-output pairs is known as training data. Armed with this training data, we can train a multilayer perceptron (MLP, aka neural network) to emit reasonably correct outputs on inputs it has never seen before.
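To make this concrete before the detailed math, here is a minimal NumPy sketch of the whole idea. The toy target function f(x) = x², the layer sizes, the learning rate, and the iteration count are all illustrative assumptions for this sketch, not code from this chapter: the network never sees the formula, only a finite set of sampled input-output pairs, and is then asked to predict at an input it was never trained on.

```python
import numpy as np

rng = np.random.default_rng(0)

# Training data: sample inputs and their correct outputs.
# The "unknown" function is f(x) = x**2, seen only through these samples.
X = np.linspace(0.0, 1.0, 21).reshape(-1, 1)   # 21 sample inputs in [0, 1]
y = X ** 2                                      # corresponding correct outputs

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# A tiny MLP: one hidden layer of 8 sigmoid units, one linear output unit.
W1 = rng.normal(scale=1.0, size=(1, 8)); b1 = np.zeros(8)
W2 = rng.normal(scale=1.0, size=(8, 1)); b2 = np.zeros(1)

lr = 0.5
for _ in range(20000):
    # Forward propagation: compute the network's output for every sample.
    h = sigmoid(X @ W1 + b1)          # hidden-layer activations
    y_hat = h @ W2 + b2               # linear output, suitable for regression

    # Backward propagation of the mean-squared-error loss.
    d_out = (y_hat - y) / len(X)              # gradient at the output
    d_hid = (d_out @ W2.T) * h * (1 - h)      # gradient at the hidden layer

    # Gradient-descent updates of the weights and biases.
    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_hid;  b1 -= lr * d_hid.sum(axis=0)

# Prediction on an input the network never saw during training.
x_new = np.array([[0.333]])
pred = sigmoid(x_new @ W1 + b1) @ W2 + b2
print(pred, x_new ** 2)   # the prediction should be close to 0.333**2 ≈ 0.111
```

The training loop already contains, in miniature, the two phases this chapter develops: a forward pass that turns inputs into outputs layer by layer, and a backward pass that propagates the regression loss back to every weight so it can be nudged in the right direction.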