chapter two
2 Pretrained networks
This chapter covers
- Running pretrained image-recognition models
- Working with pretrained transformers and diffusion models
- Accessing models through Hugging Face
- Captioning images with a pretrained model
In our first chapter, we hinted at the transformative potential of deep learning, and now it’s time to deliver. Computer vision is certainly one of the fields most affected by the advent of deep learning, for a variety of reasons. As the need to classify or interpret the content of natural images grew, very large datasets became available, and new constructs such as convolutional layers were invented that could be run quickly on GPUs and that reached unprecedented accuracy. All of these factors combined with the internet giants’ desire to understand the pictures that millions of users take with their mobile devices and manage on those companies’ platforms. Quite the perfect storm.