6 Teaching machines to see: Image classification with CNNs


This chapter covers

  • Exploratory data analysis on image data in Python
  • Preprocessing and feeding data via image pipelines
  • Using the Keras functional API to implement a complex CNN model
  • Training and evaluating the CNN model

We have already done a fair bit of work on CNNs. CNNs are a type of network designed to operate on two-dimensional data, such as images. A CNN uses the convolution operation to create feature maps of an image (i.e., a grid of pixels) by moving a kernel (i.e., a smaller grid of values) over the image to produce new values. A CNN stacks several of these layers, generating increasingly high-level feature maps as the network gets deeper. You can also place max or average pooling layers between convolutional layers to reduce the dimensionality of the feature maps. Pooling layers likewise move a kernel over the feature maps to create a smaller representation of the input. The final feature maps are connected to a series of fully connected layers, where the last layer produces the prediction (e.g., the probability of an image belonging to a certain category).
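As a quick refresher, here is a minimal sketch (assuming TensorFlow 2 and an arbitrary 64 × 64 RGB input; these values are placeholders, not data from this chapter) of how a single convolution layer followed by a pooling layer turns an image into smaller feature maps:

import tensorflow as tf

# A dummy batch containing one 64 x 64 RGB image (values are arbitrary).
image = tf.random.normal(shape=(1, 64, 64, 3))

# The convolution layer slides 16 different 3 x 3 kernels over the image,
# producing 16 feature maps.
conv = tf.keras.layers.Conv2D(filters=16, kernel_size=(3, 3), activation='relu')
feature_maps = conv(image)
print(feature_maps.shape)  # (1, 62, 62, 16): no padding, so each side shrinks by 2

# The max pooling layer slides a 2 x 2 window over each feature map and keeps
# only the largest value, halving the height and width.
pool = tf.keras.layers.MaxPool2D(pool_size=(2, 2))
pooled = pool(feature_maps)
print(pooled.shape)  # (1, 31, 31, 16)

With the default "valid" (i.e., no) padding, the 3 × 3 kernel trims one pixel from each border, and the 2 × 2 pooling window halves the height and width of every feature map.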

We have implemented CNNs using the Keras Sequential API, relying on layers such as Conv2D, MaxPool2D, and Dense. We have also studied various parameters of the Conv2D and MaxPool2D layers, such as window size, stride, and padding.
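For reference, a model of that kind can be sketched as follows. The input size (64 × 64 RGB) and the number of classes (10) are arbitrary placeholders, not values used in this chapter:

from tensorflow.keras import layers, models

# A minimal Sequential CNN in the style described above.
model = models.Sequential([
    # Window size 3 x 3, stride 1, "same" padding keeps the spatial size at 64 x 64.
    layers.Conv2D(32, kernel_size=(3, 3), strides=(1, 1), padding='same',
                  activation='relu', input_shape=(64, 64, 3)),
    layers.MaxPool2D(pool_size=(2, 2)),      # 64 x 64 -> 32 x 32
    layers.Conv2D(64, (3, 3), padding='same', activation='relu'),
    layers.MaxPool2D(pool_size=(2, 2)),      # 32 x 32 -> 16 x 16
    layers.Flatten(),                        # flatten feature maps for the Dense layers
    layers.Dense(128, activation='relu'),
    layers.Dense(10, activation='softmax')   # class probabilities
])

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()

Calling model.summary() prints the output shape and parameter count of every layer, which is a handy way to verify how the feature maps shrink as the network gets deeper.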

6.1 Putting the data under the microscope: Exploratory data analysis

6.1.1 The folder/file structure

6.1.2 Understanding the classes in the data set

6.1.3 Computing simple statistics on the data set

6.2 Creating data pipelines using the Keras ImageDataGenerator

6.3 Inception net: Implementing a state-of-the-art image classifier

6.3.1 Recap on CNNs

6.3.2 Inception net v1

6.3.3 Putting everything together

6.3.4 Other Inception models

6.4 Training the model and evaluating performance

Summary