chapter three

3 ResNet Revolution

This chapter covers

The challenges of training deep neural networks
Residual connections and how they revolutionized deep learning
Evolution to ResNet v2 and training networks exceeding 1,000 layers
Dilated convolutions as a replacement for pooling and striding in dense prediction tasks
CS231n as the first deep learning course at Stanford

Before 2015, increasing depth did not reliably translate into improved performance. In fact, deeper models often exhibited higher training error than their shallower counterparts. The heart of the problem was a double bind: depth was required for representational capacity, yet depth impaired optimization; reducing depth alleviated the optimization problems but capped capacity. Just as despair seemed justified, a deceptively simple idea emerged.

3.1 Telephone Game

3.2 Comparison with Other Major Architectures

3.2.1 ResNet vs. AlexNet & ZFNet

3.2.2 ResNet vs. VGGNet

3.2.3 ResNet vs. GoogLeNet

3.3 Real-World Adoption

3.4 ResNet-v2

3.4.1 Post-Activation to Pre-Activation

3.4.2 Ablation Studies

3.5 Extending ResNet with Dense Prediction

3.6 Human Measuring Sticks

3.7 CS231n

3.8 Scale What?