10 Convolutions in neural networks

This chapter covers

The graphical and algebraic view of neural networks
Two-dimensional and three-dimensional convolution with custom weights
Adding convolution layers to a neural network

Image analysis typically involves identifying local patterns. For instance, to do face recognition, we need to analyze local patterns of neighboring pixels corresponding to eyes, noses, and ears. The subject of the photograph may be standing on a beach in front of the ocean, but the big picture involving sand and water is irrelevant.

Convolution is a specialized operation that examines local patterns in an input signal. Convolution operators are typically used to analyze images: that is, the input is a 2D array of pixels. To illustrate this, we examine a few special-purpose convolution operations that detect the edges, corners, and average illumination in a small neighborhood of pixels from an image. Once we have detected such local properties, we can combine them to recognize higher-level patterns like ears, noses, and eyes. We can combine those in turn to detect still higher-level structures like faces. This hierarchy maps naturally onto multilayer convolutional neural networks: the lowest layers (closest to the input) detect edges and corners, and the next layers detect ears, eyes, noses, and so forth.
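To make the idea of a local-pattern detector concrete before the formal treatment, here is a minimal sketch in plain Python, not the PyTorch code used later in the chapter. The helper name `conv2d_valid`, the toy image, and the two 3x3 kernels are all assumptions chosen for illustration. As is conventional in deep learning, the kernel is slid over the image without being flipped.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the response map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Weighted sum over the kh x kw neighborhood whose top-left pixel is (r, c).
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        output.append(row)
    return output


# Toy 4x6 image (hypothetical data): dark on the left, bright on the right,
# so it contains exactly one vertical edge.
image = [[0, 0, 0, 9, 9, 9] for _ in range(4)]

# Averaging kernel: each output is the mean illumination of a 3x3 neighborhood.
smooth_kernel = [[1 / 9] * 3 for _ in range(3)]

# Horizontal-difference kernel: responds where brightness changes from left to right.
edge_kernel = [[-1, 0, 1] for _ in range(3)]

smoothed = conv2d_valid(image, smooth_kernel)
edges = conv2d_valid(image, edge_kernel)

print("smoothed:", smoothed)
print("edges:   ", edges)
```

In this sketch, the edge kernel responds strongly only in the neighborhoods that straddle the dark-to-bright boundary and gives zero in the uniform regions, while the averaging kernel turns the abrupt step into a gradual ramp. Each output value depends only on a 3x3 neighborhood, which is the sense in which convolution examines local patterns.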

10.1 One-dimensional convolution: Graphical and algebraic view

10.1.1 Curve smoothing via 1D convolution

10.1.2 Curve edge detection via 1D convolution

10.1.3 One-dimensional convolution as matrix multiplication

10.1.4 PyTorch: One-dimensional convolution with custom weights

10.2 Convolution output size

10.3 Two-dimensional convolution: Graphical and algebraic view

10.3.1 Image smoothing via 2D convolution

10.3.2 Image edge detection via 2D convolution

10.3.3 PyTorch: 2D convolution with custom weights

10.3.4 Two-dimensional convolution as matrix multiplication

10.4 Three-dimensional convolution

10.4.1 Video motion detection via 3D convolution

10.4.2 PyTorch: Three-dimensional convolution with custom weights

10.5 Transposed convolution or fractionally strided convolution

10.5.1 Application of transposed convolution: Autoencoders and embeddings

10.5.2 Transposed convolution output size

10.5.3 Upsampling via transposed convolution

10.6 Adding convolution layers to a neural network

10.7 Pooling