Chapter 5. Deep learning for computer vision


This chapter covers

  • Understanding convolutional neural networks (convnets)
  • Using data augmentation to mitigate overfitting
  • Using a pretrained convnet to do feature extraction
  • Fine-tuning a pretrained convnet
  • Visualizing what convnets learn and how they make classification decisions

This chapter introduces convolutional neural networks, also known as convnets, a type of deep-learning model almost universally used in computer vision applications. You’ll learn to apply convnets to image-classification problems—in particular those involving small training datasets, which are the most common use case if you aren’t a large tech company.

5.1. Introduction to convnets

We’re about to dive into the theory of what convnets are and why they have been so successful at computer vision tasks. But first, let’s take a practical look at a simple convnet example. It uses a convnet to classify MNIST digits, a task we performed in chapter 2 using a densely connected network (our test accuracy then was 97.8%). Even though the convnet will be basic, its accuracy will far exceed that of the densely connected model from chapter 2.

The following lines of code show you what a basic convnet looks like. It’s a stack of Conv2D and MaxPooling2D layers. You’ll see in a minute exactly what they do.
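To make the idea of a stack concrete, here is a minimal sketch in plain Python (not Keras itself) that traces how the feature-map shape changes as an image passes through alternating convolution and max-pooling layers. Everything specific in it is an assumption chosen for illustration rather than something stated in the text: the filter counts (32, 64, 64), the 3 × 3 convolution windows, the 2 × 2 pooling windows, unpadded stride-1 convolutions, and the helper names `conv2d_shape`, `max_pool2d_shape`, and `trace_shapes`, which are hypothetical. The input shape (28, 28, 1) is that of a grayscale MNIST digit.

```python
# Plain-Python sketch (not Keras): trace output shapes through a stack of
# convolution and max-pooling layers. Assumes unpadded convolutions with
# stride 1 and non-overlapping pooling windows. All layer sizes are
# illustrative assumptions.

def conv2d_shape(shape, filters, kernel):
    """Output shape of an unpadded, stride-1 2D convolution."""
    height, width, _ = shape
    return (height - kernel + 1, width - kernel + 1, filters)


def max_pool2d_shape(shape, pool):
    """Output shape of non-overlapping max pooling with a square window."""
    height, width, channels = shape
    return (height // pool, width // pool, channels)


def trace_shapes(input_shape, stack):
    """Return the list of shapes, starting with the input, after each layer."""
    shapes = [input_shape]
    for kind, *args in stack:
        if kind == "conv2d":
            shapes.append(conv2d_shape(shapes[-1], *args))
        elif kind == "max_pool2d":
            shapes.append(max_pool2d_shape(shapes[-1], *args))
        else:
            raise ValueError(f"unknown layer kind: {kind}")
    return shapes


# Hypothetical stack: (kind, filters, kernel) for convolutions,
# (kind, pool) for pooling layers.
STACK = [
    ("conv2d", 32, 3),
    ("max_pool2d", 2),
    ("conv2d", 64, 3),
    ("max_pool2d", 2),
    ("conv2d", 64, 3),
]

shapes = trace_shapes((28, 28, 1), STACK)
for layer, shape in zip([("input",)] + STACK, shapes):
    print(layer[0], shape)
```

Under these assumptions, the height and width shrink at every layer while the channel count is set by the number of filters in the most recent convolution.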

5.2. Training a convnet from scratch on a small dataset

5.3. Using a pretrained convnet

5.4. Visualizing what convnets learn