Continuing our journey through dealing with unstructured data leads us to our image case study. Just as it was an issue with our NLP case study, the big question of this chapter is, how do we represent images in a machine-readable format? Throughout this chapter, we will take a look at ways to construct, extract, and learn feature representations of images for the purpose of solving an object recognition problem.
Object recognition simply means we are going to work with labeled images, where each image contains a single object, and the purpose of the model is to classify the image as a category that specifies what object is in the image. Object recognition is considered a relatively simple computer vision problem, as we don’t have to worry about finding the object or objects within an image using bounding boxes, nor do we have to do anything beyond pure classification into (usually) mutually exclusive categories. Let’s jump right into taking a look at the dataset for this case study—the CIFAR-10 dataset.