3 Residual Revolution (ResNet)

 

This chapter covers

  • The challenges of training deep neural networks
  • Residual connections and how they revolutionized deep learning
  • ResNet’s breakthrough results on ImageNet and other benchmarks
  • How ResNet reshaped scalability and optimization
  • Real-world impact and industry adoption of ResNet models
  • Extending ResNet with dilated convolutions for dense prediction tasks
  • Evolution to ResNet-v2 and training networks exceeding 1,000 layers

In 2015, “deep learning” was still something of a misnomer. Despite the success of AlexNet, networks remained relatively shallow. At the heart of the problem were two knotted and relentless challenges, both forms of signal loss: the original signal degraded as it passed forward through many layers, and gradients weakened as they flowed backward during training. The two compounded each other, making extra depth doubly punishing. Just when despair seemed justified, a deceptively simple idea emerged.

Residual connections create shortcuts that preserve the integrity of the original signal, protecting it from cumulative distortions and maintaining strong gradients throughout backpropagation. ResNet earns its place on Sutskever’s List because it marks a pivotal shift in design philosophy: “add a residual connection” has since become an indispensable tool in the machine learning toolkit.[1]
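
To make the idea concrete, the following is a minimal sketch of a single residual block in PyTorch. It is an illustrative identity-shortcut variant, not the exact block from the ResNet paper; the class name and the assumption that input and output channel counts match are choices made for the sketch. The block computes a learned residual F(x) and adds the unchanged input back in, so the original signal always has a direct path forward and gradients have a direct path backward through the addition.

import torch
from torch import nn

class BasicResidualBlock(nn.Module):
    """Illustrative identity-shortcut residual block (assumes the input and
    output have the same number of channels)."""

    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        identity = x                      # shortcut: carry the original signal forward untouched
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))   # the residual branch learns F(x)
        return self.relu(out + identity)  # output is F(x) + x

Because the shortcut is a plain addition, the gradient of the output with respect to the input always contains an identity term, which is what keeps the backward signal from fading as blocks are stacked many layers deep.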

3.1 Telephone Game

3.2 Comparison with Other Major Architectures

3.2.1 ResNet vs. AlexNet & ZFNet

3.2.2 ResNet vs. VGGNet

3.2.3 ResNet vs. GoogLeNet

3.3 Real-World Adoption

3.4 ResNet-v2

3.4.1 Post-Activation to Pre-Activation

3.4.2 Ablation Studies

3.5 Extending ResNet for Dense Prediction

3.6 Human Measuring Sticks

3.7 ResNet Revolution