So far, we’ve looked at convolutional networks that go deeper and convolutional networks that go wider. In particular, we’ve seen how the corresponding connectivity patterns, both between and within convolutional blocks, addressed the issues of vanishing and exploding gradients, as well as the problem of memorization from overcapacity.
Those approaches of going deeper and wider, along with regularization (adding noise to reduce overfitting) at the deeper layers, reduced the memorization problem but did not eliminate it. So researchers explored other connectivity patterns within and between residual convolutional blocks to further reduce memorization without substantially increasing the number of parameters or compute operations.
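To ground the discussion, here is a minimal sketch of the kind of residual convolutional block whose connectivity these alternative patterns rework. It uses tf.keras; the filter count, dropout rate, and input shape are illustrative assumptions, not code from earlier chapters. The skip connection joined in the `Add()` step is the connectivity component, and the `Dropout` layer is the noise-based regularization mentioned above.

```python
import tensorflow as tf
from tensorflow.keras import layers

def residual_block(x, filters=64, dropout_rate=0.2):
    """Identity-link residual block: output = F(x) + x (illustrative sketch)."""
    shortcut = x                                    # skip (identity) connection
    x = layers.Conv2D(filters, 3, padding='same')(x)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)
    x = layers.Dropout(dropout_rate)(x)             # noise-based regularization
    x = layers.Conv2D(filters, 3, padding='same')(x)
    x = layers.BatchNormalization()(x)
    x = layers.Add()([shortcut, x])                 # the connectivity component
    return layers.ReLU()(x)

# Assumed input shape for illustration only
inputs = tf.keras.Input(shape=(32, 32, 64))
outputs = residual_block(inputs)
model = tf.keras.Model(inputs, outputs)
```

The patterns covered next keep this overall block structure but change how feature maps flow into and out of it, for example by concatenating rather than adding, or by reweighting channels, rather than by simply stacking more such blocks.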
We’ll cover three of those alternative connectivity patterns in this chapter: DenseNet, Xception, and SE-Net. These patterns all had a similar goal, reducing computational complexity in the connectivity component, but they differed in their approaches to the problem. Let’s first get an overview of those differences. Then we’ll spend the rest of the chapter looking at the specifics of each pattern.