This chapter covers
- Classifying images of handwritten digits represented as vector data
- Designing a type of neural network called a multilayer perceptron
- Evaluating a neural network as a vector transformation
- Fitting a neural network to data with a cost function and gradient descent
- Calculating partial derivatives for neural networks with backpropagation
In this final chapter of the book, we combine almost everything you’ve learned so far to introduce one of the most famous machine learning tools in use today: artificial neural networks. Artificial neural networks, or neural networks for short, are mathematical functions whose structure is loosely modeled on that of the human brain. They are called artificial to distinguish them from the “organic” neural networks in the brain. Modeling the brain might sound like a lofty and complex goal, but it all rests on a simple metaphor for how the brain works.
Before explaining the metaphor, I’ll remind you that I’m not a neuroscientist. The rough idea is that the brain is a big clump of interconnected cells called neurons, and when you think certain thoughts, what’s actually happening is electrical activity at specific neurons. You can see this electrical activity in the right kind of brain scan, where various parts of the brain light up (figure 16.1).