13 Transfer learning

 

This chapter covers

  • Transferring a pretrained network to a new problem
  • Understanding the difference between frozen and warm weights
  • Learning with less data via transfer learning
  • Transfer learning for text problems with transformer-based models

You now know a range of techniques for training models from scratch on new data. But what if you do not have time to wait for a big model to train? Or what if you do not have a lot of data to begin with? Ideally, we could use information from a bigger, well-curated dataset to help us learn a more accurate model in fewer epochs for our new, smaller dataset.

That is where transfer learning comes into play. The idea behind transfer learning is that if someone has already gone through the effort of training a big model on a lot of data, you can probably use that already-trained model as a starting point for your own problem. In essence, you want to transfer to your problem all the information the model has extracted from some related problem. When that’s possible, transfer learning can save you weeks of time, improve your accuracy, and just generally work better. It is especially valuable because you can get better results with less labeled data, which is a big time- and money-saver. This makes transfer learning one of the most practical tools you should know for on-the-job work.
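To make the idea concrete before diving into the sections below, here is a minimal sketch (not one of the chapter's listings) of what "using an already-trained model as a starting point" looks like in PyTorch. It assumes a recent torchvision (0.13 or later, for the weights argument), loads a ResNet-18 pretrained on ImageNet, freezes its existing weights, and swaps in a new output layer sized for a hypothetical 10-class problem; the number of classes and learning rate are placeholders for illustration only.

import torch
from torch import nn
from torchvision import models

# Load a ResNet-18 whose weights were already trained on ImageNet.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze every pretrained parameter so only the new layer will learn.
for param in model.parameters():
    param.requires_grad = False

# Replace the final classification layer with one sized for our new
# problem (10 classes is just a placeholder for this sketch).
num_new_classes = 10
model.fc = nn.Linear(model.fc.in_features, num_new_classes)

# Only the new layer's parameters are handed to the optimizer.
optimizer = torch.optim.AdamW(model.fc.parameters(), lr=1e-3)

Skipping the freezing loop and passing all of model.parameters() to the optimizer instead gives a warm start, where the pretrained weights keep updating on the new data; the trade-offs between frozen weights and warm starts are what sections 13.2.3 and 13.2.4 explore.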

13.1 Transferring model parameters

13.1.1  Preparing an image dataset

13.2 Transfer learning and training with CNNs

13.2.1  Adjusting pretrained networks

13.2.2  Preprocessing for pretrained ResNet

13.2.3  Training with warm starts

13.2.4  Training with frozen weights

13.3 Learning with fewer labels

13.4 Pretraining with text

13.4.1  Transformers with the Hugging Face library

13.4.2  Freezing weights with no-grad

Exercises

Summary