In the previous chapters, we explained how we can use deep neural networks for image classification tasks. In image classification, we assume that there is only one main target object in the image, and the model’s sole focus is to identify the target category. However, in many situations, we are interested in multiple targets in the image. We want to not only classify them, but also obtain their specific positions in the image. In computer vision, we refer to such tasks as object detection. Figure 7.1 explains the difference between image classification and object detection tasks.