10 One, Two and Three Dimensional Convolution and Transposed Convolution in Neural Networks

 

Image analysis typically involves identification of local patterns. For instance, if one wants to do face recognition, one needs to analyze local patterns of neighboring pixels corresponding to eyes, noses and ears. The subject of the photograph may be standing on a beach in front of the ocean, but the big picture involving sand and water is irrelevant.

Convolution is a specialized operation that examines local patterns in an input signal. It is typically used to analyze images, i.e., inputs that are 2D arrays of pixels. To illustrate this, we will study a few examples of special-purpose convolution operations that detect, respectively, edges, corners, and the average illumination in a small neighborhood of pixels in an image. Once we have detected such local properties, we can combine them to recognize higher-level patterns like ears, noses and eyes. Those, in turn, can be combined to detect still higher-level structures like faces. This naturally lends itself to multi-layer convolutional neural networks: the lowest layers (closest to the input) detect edges and corners, and the next layers detect ears, eyes, noses and so forth.

10.1 One Dimensional Convolution: Graphical and Algebraic view

 
 
 
 

10.1.1 Curve Smoothing via 1D Convolution
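
As a minimal sketch of the idea, a noisy 1D signal can be smoothed by convolving it with a short averaging kernel; the signal values and the three-tap kernel below are illustrative assumptions, not values taken from the text.

```python
import torch
import torch.nn.functional as F

# A noisy 1D signal, shaped (batch=1, channels=1, length=8) as conv1d expects.
signal = torch.tensor([[[1.0, 3.0, 2.0, 8.0, 7.0, 9.0, 4.0, 6.0]]])

# Three-tap moving-average kernel, shaped (out_channels=1, in_channels=1, width=3).
kernel = torch.tensor([[[1/3, 1/3, 1/3]]])

smoothed = F.conv1d(signal, kernel)   # output length = 8 - 3 + 1 = 6
print(smoothed)
```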

 
 

10.1.2 Curve Edge Detection via 1D Convolution
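
As a rough sketch, sudden jumps in a 1D signal can be picked out with a difference kernel such as [-1, 1]; large output magnitudes mark the "edges". The example values below are assumptions chosen only to demonstrate the effect.

```python
import torch
import torch.nn.functional as F

# A step-like signal: flat, then a jump, then flat again.
signal = torch.tensor([[[1.0, 1.0, 1.0, 5.0, 5.0, 5.0]]])

# Difference kernel: responds strongly where neighboring values change abruptly.
kernel = torch.tensor([[[-1.0, 1.0]]])

edges = F.conv1d(signal, kernel)
print(edges)   # tensor([[[0., 0., 4., 0., 0.]]]) - the jump shows up as 4
```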

 
 

10.1.3 One Dimensional Convolution as Matrix Multiplication
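
One way to see the equivalence is to lay the kernel along the rows of a banded (Toeplitz-style) matrix and multiply it with the input vector. The sketch below, with assumed example values, checks that this matches conv1d (note that PyTorch's "convolution" is really cross-correlation, so the kernel is not flipped when building the matrix).

```python
import torch
import torch.nn.functional as F

x = torch.tensor([1.0, 2.0, 4.0, 3.0, 5.0])     # input of length n = 5
w = torch.tensor([2.0, 1.0, 0.5])               # kernel of length k = 3
n, k = x.numel(), w.numel()

# Build an (n-k+1) x n matrix whose i-th row holds the kernel at offset i.
M = torch.zeros(n - k + 1, n)
for i in range(n - k + 1):
    M[i, i:i + k] = w

as_matmul = M @ x                                          # matrix-vector product
as_conv = F.conv1d(x.view(1, 1, -1), w.view(1, 1, -1)).flatten()
print(torch.allclose(as_matmul, as_conv))                  # True
```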

 
 
 

10.1.4 PyTorch: One-dimensional convolution with custom weights
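
A minimal sketch of how one might load hand-chosen weights into an nn.Conv1d layer; the averaging-kernel values here are placeholders rather than weights from the text.

```python
import torch
import torch.nn as nn

# One input channel, one output channel, kernel of width 3, no bias term.
conv = nn.Conv1d(in_channels=1, out_channels=1, kernel_size=3, bias=False)

# Overwrite the randomly initialized weights with custom values.
with torch.no_grad():
    conv.weight.copy_(torch.tensor([[[1/3, 1/3, 1/3]]]))   # averaging kernel

x = torch.tensor([[[1.0, 3.0, 2.0, 8.0, 7.0, 9.0]]])
print(conv(x))
```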

 
 

10.2 Convolution Output Size
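
For an input of length n, kernel size k, padding p and stride s (with dilation 1), the standard output-size formula is out = floor((n + 2p - k) / s) + 1; the same formula applies independently to each spatial dimension in 2D and 3D. A quick sanity-check sketch, with arbitrary example numbers:

```python
import torch
import torch.nn as nn

def conv_output_size(n, k, p=0, s=1):
    """Output length of a convolution along one dimension (dilation = 1)."""
    return (n + 2 * p - k) // s + 1

print(conv_output_size(32, k=5, p=2, s=2))          # 16

# Cross-check against PyTorch on a (batch, channels, length) input.
x = torch.randn(1, 1, 32)
y = nn.Conv1d(1, 1, kernel_size=5, padding=2, stride=2)(x)
print(y.shape)                                      # torch.Size([1, 1, 16])
```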

 
 

10.3 Two Dimensional Convolution: Graphical and Algebraic view

 
 
 
 

10.3.1 Image Smoothing via 2D Convolution
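
A rough sketch of smoothing (blurring) an image with a 3x3 box filter and F.conv2d; the random tensor below merely stands in for real pixel values.

```python
import torch
import torch.nn.functional as F

image = torch.rand(1, 1, 6, 6)             # (batch, channels, height, width)

# 3x3 box filter: every output pixel is the average of its 3x3 neighborhood.
box = torch.full((1, 1, 3, 3), 1.0 / 9.0)

blurred = F.conv2d(image, box, padding=1)  # padding=1 keeps the 6x6 size
print(blurred.shape)                       # torch.Size([1, 1, 6, 6])
```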

 
 
 
 

10.3.2 Image Edge Detection via 2D Convolution
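
As an illustrative sketch, a Sobel-style kernel responds to horizontal changes in intensity, i.e., vertical edges; the synthetic half-dark, half-bright image below is an assumption used only to show the response.

```python
import torch
import torch.nn.functional as F

# Synthetic image: left half dark (0), right half bright (1).
image = torch.zeros(1, 1, 6, 6)
image[..., 3:] = 1.0

# Sobel kernel for horizontal gradients (detects vertical edges).
sobel_x = torch.tensor([[[[-1.0, 0.0, 1.0],
                          [-2.0, 0.0, 2.0],
                          [-1.0, 0.0, 1.0]]]])

edges = F.conv2d(image, sobel_x)
print(edges)   # nonzero values appear only near the dark-to-bright boundary
```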

 
 
 

10.3.3 PyTorch: Two-dimensional convolution with custom weights
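
A minimal sketch of putting custom weights into an nn.Conv2d layer; the Sobel values are one plausible choice, not necessarily the ones used in the text.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3, bias=False)

# Replace the random initialization with a hand-written Sobel kernel.
sobel_x = torch.tensor([[[[-1.0, 0.0, 1.0],
                          [-2.0, 0.0, 2.0],
                          [-1.0, 0.0, 1.0]]]])
with torch.no_grad():
    conv.weight.copy_(sobel_x)

image = torch.rand(1, 1, 8, 8)
print(conv(image).shape)     # torch.Size([1, 1, 6, 6])
```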

 
 
 
 

10.3.4 Two Dimensional Convolution as Matrix Multiplication
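
One common way to see this equivalence is im2col: unfold every kernel-sized patch into a column, multiply by the flattened kernel in a single matrix product, and reshape. The sketch below, with assumed shapes and random data, checks the result against F.conv2d.

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 1, 5, 5)          # (batch, channels, height, width)
w = torch.randn(1, 1, 3, 3)          # (out_channels, in_channels, kH, kW)

# im2col: every 3x3 patch becomes a column -> shape (1, 9, 9) for a 5x5 input.
patches = F.unfold(x, kernel_size=3)

# Flatten the kernel to a row vector and multiply: one matmul covers all patches.
out = (w.view(1, -1) @ patches).view(1, 1, 3, 3)

print(torch.allclose(out, F.conv2d(x, w), atol=1e-6))   # True
```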

 
 

10.4 Three Dimensional Convolution

 

10.4.1 Video Motion Detection via 3D Convolution
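
As a rough sketch, a 3D kernel that differences consecutive frames responds only where pixel values change over time, which is the essence of motion detection; the tiny synthetic "video" below is an assumption for illustration.

```python
import torch
import torch.nn.functional as F

# Tiny synthetic video: (batch, channels, frames, height, width).
video = torch.zeros(1, 1, 2, 4, 4)
video[:, :, 1, 1, 1] = 1.0             # one pixel changes between the two frames

# Temporal difference kernel: frame(t+1) - frame(t) at each pixel.
kernel = torch.tensor([-1.0, 1.0]).view(1, 1, 2, 1, 1)

motion = F.conv3d(video, kernel)
print(motion.abs().max())              # nonzero only where something changed
```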

 
 
 

10.4.2 PyTorch: Three-dimensional convolution with custom weights
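
A minimal sketch of an nn.Conv3d layer with hand-set weights, reusing the temporal-difference idea; the shapes and values are assumptions.

```python
import torch
import torch.nn as nn

# Kernel spans 2 frames but only 1x1 spatially.
conv = nn.Conv3d(in_channels=1, out_channels=1, kernel_size=(2, 1, 1), bias=False)

with torch.no_grad():
    conv.weight.copy_(torch.tensor([-1.0, 1.0]).view(1, 1, 2, 1, 1))

video = torch.rand(1, 1, 8, 16, 16)    # (batch, channels, frames, height, width)
print(conv(video).shape)               # torch.Size([1, 1, 7, 16, 16])
```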

 
 
 

10.5 Transposed Convolution or Fractionally Strided Convolution
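
One way to see where the name comes from: if an ordinary 1D convolution is multiplication by a banded matrix M (as in section 10.1.3), then the transposed convolution is multiplication by M transposed, mapping the short output back to the original length. A sketch with assumed values:

```python
import torch
import torch.nn.functional as F

x = torch.tensor([1.0, 2.0, 4.0, 3.0, 5.0])
w = torch.tensor([2.0, 1.0, 0.5])
n, k = x.numel(), w.numel()

# Same banded matrix M used to express conv1d as a matrix product.
M = torch.zeros(n - k + 1, n)
for i in range(n - k + 1):
    M[i, i:i + k] = w

y = M @ x                                     # ordinary convolution: length 3

# Transposed convolution multiplies by M.T, going from length 3 back to length 5.
via_matrix = M.t() @ y
via_torch = F.conv_transpose1d(y.view(1, 1, -1), w.view(1, 1, -1)).flatten()
print(torch.allclose(via_matrix, via_torch))  # True
```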

 
 

10.5.1 Application of Transposed Convolution: AutoEncoders and Embeddings
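
A compact sketch of the usual pattern: Conv2d layers shrink the image to a small embedding, and ConvTranspose2d layers expand it back to the original resolution. The layer sizes and the 28x28 input below are arbitrary assumptions, not the architecture from the text.

```python
import torch
import torch.nn as nn

# Toy convolutional autoencoder: the encoder compresses a 1x28x28 image to an
# 8x7x7 embedding, and the decoder reconstructs the original resolution.
autoencoder = nn.Sequential(
    # encoder
    nn.Conv2d(1, 16, kernel_size=3, stride=2, padding=1),   # 28x28 -> 14x14
    nn.ReLU(),
    nn.Conv2d(16, 8, kernel_size=3, stride=2, padding=1),   # 14x14 -> 7x7
    nn.ReLU(),
    # decoder (transposed convolutions undo the downsampling)
    nn.ConvTranspose2d(8, 16, kernel_size=3, stride=2, padding=1, output_padding=1),   # 7x7 -> 14x14
    nn.ReLU(),
    nn.ConvTranspose2d(16, 1, kernel_size=3, stride=2, padding=1, output_padding=1),   # 14x14 -> 28x28
    nn.Sigmoid(),
)

x = torch.rand(1, 1, 28, 28)
print(autoencoder(x).shape)    # torch.Size([1, 1, 28, 28])
```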

 
 
 
 

10.5.2 Transposed Convolution Output Size
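
For input size n, kernel k, stride s, padding p and output padding op (dilation 1), the transposed-convolution output size along one dimension is out = (n - 1) * s - 2p + k + op. A quick check against PyTorch, with arbitrary example numbers:

```python
import torch
import torch.nn as nn

def conv_transpose_output_size(n, k, s=1, p=0, output_padding=0):
    """Output length of a transposed convolution along one dimension (dilation = 1)."""
    return (n - 1) * s - 2 * p + k + output_padding

print(conv_transpose_output_size(7, k=3, s=2, p=1, output_padding=1))   # 14

x = torch.randn(1, 1, 7, 7)
up = nn.ConvTranspose2d(1, 1, kernel_size=3, stride=2, padding=1, output_padding=1)
print(up(x).shape)    # torch.Size([1, 1, 14, 14])
```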

 

10.5.3 Upsampling via Transpose Conv
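
A minimal sketch of upsampling: with kernel size 2 and stride 2, each input pixel expands into its own 2x2 output block, doubling the spatial size. The all-ones kernel below is an assumption chosen so the result is easy to read (it reproduces nearest-neighbor upsampling).

```python
import torch
import torch.nn as nn

# Kernel 2, stride 2: every input pixel expands into its own 2x2 output block.
up = nn.ConvTranspose2d(1, 1, kernel_size=2, stride=2, bias=False)

# With an all-ones kernel this reproduces simple nearest-neighbor upsampling.
with torch.no_grad():
    up.weight.fill_(1.0)

x = torch.tensor([[[[1.0, 2.0],
                    [3.0, 4.0]]]])
print(up(x))
# tensor([[[[1., 1., 2., 2.],
#           [1., 1., 2., 2.],
#           [3., 3., 4., 4.],
#           [3., 3., 4., 4.]]]])
```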

 
 
 

10.6 Adding Convolution Layers to a Neural Network

 

10.6.1 PyTorch: Adding Convolution Layers to a Neural Network
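
A minimal sketch of a small CNN classifier that combines convolution, pooling and fully connected layers in an nn.Module; the layer sizes, the 1x28x28 input and the 10-class output are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SmallCNN(nn.Module):
    """Tiny convolutional classifier for 1x28x28 inputs (e.g., MNIST-sized images)."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 16, kernel_size=3, padding=1)   # 28x28 -> 28x28
        self.conv2 = nn.Conv2d(16, 32, kernel_size=3, padding=1)  # 14x14 -> 14x14
        self.pool = nn.MaxPool2d(2)                               # halves H and W
        self.fc = nn.Linear(32 * 7 * 7, num_classes)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))   # -> (N, 16, 14, 14)
        x = self.pool(F.relu(self.conv2(x)))   # -> (N, 32, 7, 7)
        x = x.flatten(1)                       # -> (N, 32*7*7)
        return self.fc(x)                      # -> (N, num_classes)

model = SmallCNN()
print(model(torch.rand(4, 1, 28, 28)).shape)   # torch.Size([4, 10])
```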

 
 
 

10.7 Pooling
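
As a short sketch, max pooling keeps the largest value in each window while average pooling keeps the mean, and both shrink the feature map; the 4x4 input below is made up for illustration.

```python
import torch
import torch.nn as nn

x = torch.tensor([[[[ 1.0,  2.0,  5.0,  6.0],
                    [ 3.0,  4.0,  7.0,  8.0],
                    [ 9.0, 10.0, 13.0, 14.0],
                    [11.0, 12.0, 15.0, 16.0]]]])

max_pool = nn.MaxPool2d(kernel_size=2)   # keeps the maximum of each 2x2 window
avg_pool = nn.AvgPool2d(kernel_size=2)   # keeps the mean of each 2x2 window

print(max_pool(x))   # tensor([[[[ 4.,  8.], [12., 16.]]]])
print(avg_pool(x))   # tensor([[[[ 2.5,  6.5], [10.5, 14.5]]]])
```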

 
 

10.8 Chapter Summary

 
 
 
 