Chapter 14. Cloud Vision: image recognition

 

This chapter covers

  • An overview of image recognition
  • The different types of recognition supported by Cloud Vision
  • How Cloud Vision pricing is calculated
  • An example evaluating whether profile images are acceptable

For humans, image recognition is one of those things that’s easy to understand but difficult to define. We can ask toddlers, “What’s this picture of?” and get an answer, but asking them to explain what it means to recognize an image will probably get a blank stare. To move into slightly more philosophical territory, you might say that we know what it means to “understand an image” but find it tough to explain exactly what constitutes that understanding.

That same fuzziness makes it difficult to get a computer to recognize an image. Things that are hard to define are typically tricky to express as code, and understanding an image falls into that category. As with many definition problems, we get around this by choosing a specific definition and sticking to it. In the case of Cloud Vision, we’re going to look at image recognition as being able to slap a bunch of annotations on a given image, as shown in figure 14.1, where each annotation covers a visual area and provides some structured context about that region.

Figure 14.1. Vision as annotations
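
To make the "image as a set of annotations" definition a little more concrete, here is a minimal sketch of how such a result could be modeled as data. The type names (Annotation, AnnotatedImage), the score field, and the sample values are all hypothetical and chosen for illustration; this is not the shape of the actual Cloud Vision response.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


# Hypothetical types for illustration only; NOT the Cloud Vision response format.
@dataclass
class Annotation:
    description: str               # structured context about the region, e.g. a label
    score: float                   # assumed confidence value between 0 and 1
    region: List[Tuple[int, int]]  # (x, y) vertices outlining the visual area


@dataclass
class AnnotatedImage:
    path: str
    annotations: List[Annotation] = field(default_factory=list)

    def describe(self, min_score: float = 0.5) -> List[str]:
        """Return descriptions with score >= min_score, highest score first."""
        kept = [a for a in self.annotations if a.score >= min_score]
        kept.sort(key=lambda a: a.score, reverse=True)
        return [a.description for a in kept]


# Made-up sample data: two confident annotations and one weak one.
image = AnnotatedImage(
    path="dog.jpg",
    annotations=[
        Annotation("grass", 0.80, [(0, 300), (640, 300), (640, 480), (0, 480)]),
        Annotation("dog", 0.95, [(120, 80), (420, 80), (420, 400), (120, 400)]),
        Annotation("cat", 0.10, [(120, 80), (420, 80), (420, 400), (120, 400)]),
    ],
)

print(image.describe())
```

Under this definition, "recognizing" the image amounts to producing the list of annotations; what you do with them afterward (here, filtering by an assumed confidence threshold) is up to the application.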

14.1. Annotating images

14.2. Understanding pricing

14.3. Case study: enforcing valid profile photos

Summary