8 Using convolutions to generalize

This chapter covers

  • Understanding convolution
  • Building a convolutional neural network
  • Creating custom nn.Module subclasses
  • The difference between the module and functional APIs
  • Design choices for neural networks

In the previous chapter, we built a simple neural network that could fit (or overfit) the data, thanks to the many parameters available for optimization in the linear layers. Our model used fully connected layers, where each neuron connects to every neuron in the previous layer, so the image was treated as a flattened vector with no spatial structure preserved. The result was a model that was better at memorizing the training set than at generalizing the properties of birds and airplanes. Based on our model architecture, we’ve got a guess as to why that’s the case. A fully connected layer has to learn separately how to detect the bird or airplane at each possible position in the image, so we end up with both too many parameters (making it easier for the model to memorize the training set) and no position independence (making it harder to generalize). As we discussed in the last chapter, we could augment our training data with a wide variety of re-cropped images to try to force generalization, but that won’t address the issue of having too many parameters.
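To make the "too many parameters" point concrete, here is a rough back-of-the-envelope sketch in plain Python. The shapes are assumptions chosen for illustration and are not stated in the text above: a 3-channel 32 × 32 input image, a hypothetical 512-unit hidden layer, two output classes, and a single 3 × 3 convolution with 16 output channels. The helper names `linear_params` and `conv2d_params` are made up for this sketch.

```python
# Assumed shapes (illustrative only): 3-channel 32x32 input,
# a hypothetical 512-unit hidden layer, and 2 output classes.

def linear_params(in_features, out_features):
    """Weights plus biases of one fully connected layer."""
    return in_features * out_features + out_features

def conv2d_params(in_channels, out_channels, kernel_size):
    """Weights plus biases of one 2D convolution with a square kernel."""
    return out_channels * in_channels * kernel_size * kernel_size + out_channels

in_features = 3 * 32 * 32  # the image flattened into a vector

# Two fully connected layers: flattened image -> hidden -> classes
fc_total = linear_params(in_features, 512) + linear_params(512, 2)

# One convolution: 3 input channels -> 16 output channels, 3x3 kernel
conv_total = conv2d_params(3, 16, 3)

print(f"fully connected: {fc_total:,} parameters")
print(f"3x3 convolution: {conv_total:,} parameters")
```

Under these assumptions, the fully connected count grows with the number of pixels, while the convolution count depends only on the channel counts and kernel size, because the same small kernel is reused at every position in the image.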

8.1 The case for convolutions

8.1.1 What convolutions do

8.2 Convolutions in action

8.2.1 Padding the boundary

8.2.2 Detecting features with convolutions

8.2.3 Looking further with depth and pooling

8.2.4 Putting it all together for our network

8.3 Subclassing nn.Module

8.3.1 Our network as an nn.Module

8.3.2 How PyTorch keeps track of parameters and submodules

8.3.3 The functional API

8.4 Training our convolutional neural network

8.4.1 Measuring accuracy

8.4.2 Saving and loading our model

8.4.3 Training on the GPU

8.5 Model design

8.5.1 Adding memory capacity: Width

8.5.2 Helping our model to converge and generalize: Regularization

8.5.3 Going deeper to learn more complex structures: Depth

8.5.4 Comparing the designs from this section