11 Image segmentation

 

This chapter covers

  • The different branches of computer vision: image classification, image segmentation, and object detection
  • Building a segmentation model from scratch
  • Using the pretrained Segment Anything Model

Chapter 8 gave you a first introduction to deep learning for computer vision via a simple use case: binary image classification. But there’s more to computer vision than image classification! This chapter dives deeper into another essential computer vision application—image segmentation.

11.1 Computer vision tasks

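So far, we’ve focused on image classification models: an image goes in, a label comes out. “This image likely contains a cat; this other one likely contains a dog.” But image classification is only one of several possible applications of deep learning in computer vision. In general, there are three essential computer vision tasks you need to know about:

  • Image classification, where the goal is to assign one or more labels to an entire image
  • Image segmentation, where the goal is to partition an image into regions by assigning each pixel to a category
  • Object detection, where the goal is to draw bounding boxes around objects of interest and associate each box with a class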
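One way to keep the three tasks apart (classification, segmentation, and detection, as named in the chapter overview) is to look at the shape of what each one returns. The following is a minimal pure-Python sketch on a toy 4 × 4 image. The class indices, the score, the box convention, and the `mask_to_box` helper are all illustrative assumptions rather than part of any library API.

```python
# Toy 4x4 "image": pixel values are irrelevant here, only the output shapes matter.
HEIGHT, WIDTH = 4, 4

# Image classification: one label (here with a made-up score) for the whole image.
classification_output = {"label": "cat", "score": 0.93}

# Image segmentation: one class index per pixel, i.e. a mask the same size as the image.
# Hypothetical class indices: 0 = background, 1 = cat.
segmentation_output = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

# Object detection: a list of labeled bounding boxes.
# Assumed convention: (left, top, right, bottom), with right and bottom exclusive.
detection_output = [{"label": "cat", "box": (1, 1, 3, 3)}]


def mask_to_box(mask, class_index):
    """Return the tightest box around all pixels of class_index, or None if absent."""
    coords = [
        (row, col)
        for row, values in enumerate(mask)
        for col, value in enumerate(values)
        if value == class_index
    ]
    if not coords:
        return None
    rows = [row for row, _ in coords]
    cols = [col for _, col in coords]
    return (min(cols), min(rows), max(cols) + 1, max(rows) + 1)


# A segmentation mask carries enough information to derive a bounding box,
# but a bounding box cannot be turned back into a pixel-accurate mask.
print(mask_to_box(segmentation_output, 1))
```

The helper shows why segmentation is the finer-grained of the two localization tasks: the box is a lossy summary of the mask.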

11.1.1 Types of image segmentation

11.2 Training a segmentation model from scratch

11.2.1 Downloading a segmentation dataset

11.2.2 Building and training the segmentation model

11.3 Using a pretrained segmentation model

11.3.1 Downloading the Segment Anything Model

11.3.2 How Segment Anything works

11.3.3 Preparing a test image

11.3.4 Prompting the model with a target point

11.3.5 Prompting the model with a target box

Summary