7 Computer vision and making sense of images

 

This chapter covers

  • How computer vision works
  • Training an image classification model
  • Using ML to detect objects in images
  • Integrating ML.NET with TensorFlow

Computer vision is the area of machine learning that deals with digital images. Several kinds of tasks fall under this category. One of them is image classification. It’s similar to standard shallow-learning classification, but the input is an image. A model designed for this task assigns a category to an image based on what the image shows. If it’s a picture of a pizza, the image will be classified as “food” or “pizza”, depending on the labels the model was trained with. If the model sees a picture of a teddy bear, it may classify it as a “toy” or a “teddy bear”. Figure 7.1 shows how it works.

Figure 7.1 Image classification in action

Another computer vision task is object detection, which finds objects in images. Unlike image classification, which typically assigns one label to the whole image, object detection can find and label any number of objects in the same image.
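To make the difference between the two tasks concrete, here is a minimal Python sketch of what each one returns. It doesn't use a real model or any library API. The type names (`ClassificationResult`, `DetectedObject`), the helper functions, the scores, and the box coordinates are all made up for illustration. Classification yields one label for the whole image, while detection yields a list of labeled regions.

```python
from dataclasses import dataclass


@dataclass
class ClassificationResult:
    """One label for the whole image (hypothetical type)."""
    label: str
    confidence: float


@dataclass
class DetectedObject:
    """One labeled region inside the image (hypothetical type)."""
    label: str
    confidence: float
    box: tuple[int, int, int, int]  # (left, top, width, height) in pixels


def classify(scores: dict[str, float]) -> ClassificationResult:
    """Pick the single highest-scoring label for the whole image."""
    label = max(scores, key=scores.get)
    return ClassificationResult(label, scores[label])


def detect(candidates: list[DetectedObject], threshold: float = 0.5) -> list[DetectedObject]:
    """Keep every candidate region whose confidence clears the threshold."""
    return [c for c in candidates if c.confidence >= threshold]


# Made-up scores standing in for a real classifier's output.
image_scores = {"pizza": 0.91, "teddy bear": 0.06, "other": 0.03}
print(classify(image_scores))

# Made-up candidate regions standing in for a real detector's output.
candidate_regions = [
    DetectedObject("pizza", 0.88, (40, 60, 200, 180)),
    DetectedObject("teddy bear", 0.79, (300, 50, 120, 160)),
    DetectedObject("pizza", 0.12, (10, 10, 30, 30)),  # low confidence, dropped
]
for obj in detect(candidate_regions):
    print(obj)
```

The classifier can only answer with labels it was trained on, which is why the same pizza picture may come back as “food” from one model and “pizza” from another. The detector can report both the pizza and the teddy bear from a single image, each with its own bounding box.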

Because these models work with unstructured image data rather than structured textual and numeric data, they have to be trained with deep learning techniques, which rely on artificial neural networks.

7.1 Training an image classification model

7.1.1 A brief overview of TensorFlow

7.1.2 Training pipeline for image classification

7.1.3 Consuming an image classification model

7.2 Training an object detection model

7.2.1 Training pipeline for object detection

7.2.2 Consuming an object detection model

7.3 Integrating with TensorFlow for image classification

7.3.1 Adding TensorFlow integration

7.4 How computer vision works

7.4.1 Image classification: recognizing what’s in an image

7.4.2 Object detection: locating and identifying objects

7.5 Project: building an intelligent shopping system

7.6 Summary