14 Advanced building blocks

 

This chapter covers

  • Improving translation invariance with anti-aliased pooling
  • Converging faster with improved residual connections
  • Fighting overfitting by mixing data

The exercises in this book have so far been designed so that you can learn about real techniques in a minimal amount of compute time. Real-world problems, however, often require hundreds of epochs of training, models deeper than any you have trained so far, and inputs larger than the examples in this book.

Tackling these larger data sets sometimes takes extra tools to get the best results. This chapter covers some of the latest techniques researchers have developed to improve deep learning models, techniques that tend to pay off most when training on large data sets for many epochs. We focus on approaches that are simple, broadly useful, effective, and easy to implement. You often will not see their full benefit on small models or when training for only 10 to 20 epochs, as we have done for most of this book; in practice, they bear the greatest fruit when training for 100 to 300 epochs. I have designed the experiments to show some of the benefits in a relatively short amount of time, but you should expect more significant gains on bigger problems.

14.1 Problems with pooling

14.1.1 Aliasing compromises translation invariance

14.1.2 Anti-aliasing by blurring

14.1.3 Applying anti-aliased pooling

14.2 Improved residual blocks

14.2.1 Effective depth

14.2.2 Implementing ReZero

14.3 MixUp training reduces overfitting

14.3.1 Picking the mix rate

14.3.2 Implementing MixUp

Exercises

Summary