Chapter 5. Learning multiple weights at a time: generalizing gradient descent
In this chapter
- Gradient descent learning with multiple inputs
- Freezing one weight: what does it do?
- Gradient descent learning with multiple outputs
- Gradient descent learning with multiple inputs and outputs
- Visualizing weight values
- Visualizing dot products
“You don’t learn to walk by following rules. You learn by doing and by falling over.”
Richard Branson, http://mng.bz/oVgd
In the preceding chapter, you learned how to use gradient descent to update a single weight. In this chapter, we’ll reveal how the same techniques can be used to update a network that contains multiple weights. Let’s start by jumping in the deep end, shall we? The following diagram shows how a network with multiple inputs can learn.
There’s nothing new in this diagram. Each weight_delta is calculated by taking the output delta and multiplying it by that weight’s input value. In this case, because the three weights share the same output node, they also share that node’s delta. But the weights end up with different weight_deltas owing to their different input values. Notice further that you can reuse the ele_mul function from before, because you’re multiplying each value in input by the same value, delta.
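To make this concrete, here is a minimal sketch of one gradient descent step with multiple inputs. The specific input values, target, and learning rate alpha are illustrative assumptions, and the bodies of w_sum and ele_mul are written to match the description above rather than copied from earlier code; what matters is that every weight_delta is the shared output delta multiplied by that weight’s own input.

```python
def w_sum(a, b):
    # Weighted sum (dot product) of two equal-length vectors.
    assert len(a) == len(b)
    output = 0
    for i in range(len(a)):
        output += a[i] * b[i]
    return output

def neural_network(input, weights):
    # The prediction is just the weighted sum of the inputs.
    return w_sum(input, weights)

def ele_mul(number, vector):
    # Multiplies every element of vector by the same scalar.
    output = [0] * len(vector)
    for i in range(len(vector)):
        output[i] = number * vector[i]
    return output

weights = [0.1, 0.2, -0.1]   # one weight per input
input = [8.5, 0.65, 1.2]     # placeholder input values
goal_pred = 1.0              # placeholder target

pred = neural_network(input, weights)
delta = pred - goal_pred     # one delta, shared by all three weights

# Each weight_delta = shared delta * that weight's input value,
# which is why ele_mul can be reused here.
weight_deltas = ele_mul(delta, input)

alpha = 0.01                 # learning rate (assumed value)
for i in range(len(weights)):
    weights[i] -= alpha * weight_deltas[i]

print(weights)
print(weight_deltas)
```

Running this once prints three different weight_deltas even though delta is the same for all of them, because each is scaled by a different input value.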