Chapter 4. Introduction to neural learning: gradient descent

 

In this chapter

  • Do neural networks make accurate predictions?
  • Why measure error?
  • Hot and cold learning
  • Calculating both direction and amount from error
  • Gradient descent
  • Learning is just reducing error
  • Derivatives and how to use them to learn
  • Divergence and alpha

“The only relevant test of the validity of a hypothesis is comparison of its predictions with experience.”

Milton Friedman, Essays in Positive Economics (University of Chicago Press, 1953)

Predict, compare, and learn

In chapter 3, you learned about the paradigm “predict, compare, learn,” and we dove deep into the first step: predict. Along the way, you learned about the major parts of neural networks (nodes and weights), how datasets fit into networks (the network’s input size matches the number of datapoints coming in at one time), and how to use a neural network to make a prediction.

Perhaps this process raised the question, “How do we set weight values so the network predicts accurately?” Answering this question is the main focus of this chapter, as we cover the next two steps of the paradigm: compare and learn.

Compare

Comparing gives a measurement of how much a prediction “missed” by.

Once you’ve made a prediction, the next step is to evaluate how well you did. This may seem like a simple concept, but you’ll find that coming up with a good way to measure error is one of the most important and complicated subjects of deep learning.
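To make the compare step concrete, here is a minimal sketch of a one-weight network that predicts and then measures how far off it was. The values of `weight`, `input_value`, and `goal_prediction` are hypothetical, and squared error is assumed here as one common way to measure a miss; this section has not yet committed to a particular measure.

```python
# Illustrative values only; none of these come from a real dataset.
weight = 0.5
input_value = 0.5
goal_prediction = 0.8


def predict(input_value, weight):
    """Predict: a one-weight network scales its input by the weight."""
    return input_value * weight


def squared_error(prediction, goal):
    """Compare: square the raw miss so the result is never negative
    and large misses count for more than small ones."""
    return (prediction - goal) ** 2


prediction = predict(input_value, weight)
error = squared_error(prediction, goal_prediction)

print(f"prediction={prediction:.4f}, error={error:.4f}")
# prints: prediction=0.2500, error=0.3025
```

An error of zero would mean the prediction matched the goal exactly. Squaring is a design choice rather than a requirement: it discards the sign of the miss, so positive and negative misses cannot cancel each other out when errors are averaged.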

Learn

Compare: Does your network make good predictions?

Why measure error?

What’s the simplest form of neural learning?

Hot and cold learning

Characteristics of hot and cold learning

Calculating both direction and amount from error

One iteration of gradient descent

Learning is just reducing error

Let’s watch several steps of learning

Why does this work? What is weight_delta, really?

Tunnel vision on one concept

A box with rods poking out of it

Derivatives: Take two

What you really need to know

What you don’t really need to know

How to use a derivative to learn

Look familiar?