chapter nine

9 Advanced deep learning for computer vision

This chapter covers:

The different branches of computer vision: image classification, image segmentation, and object detection
Modern convnet architecture patterns: residual connections, batch normalization, and depthwise separable convolutions
Techniques for visualizing and interpreting what convnets learn

The previous chapter gave you a first introduction to deep learning for computer vision via simple models (stacks of layer_conv_2d() and layer_max_pooling_2d() layers) and a simple use case (binary image classification). But there’s more to computer vision than image classification! This chapter dives deeper into more diverse applications and advanced best practices.

9.1 Three essential computer vision tasks

So far, we’ve focused on image classification models: an image goes in, a label comes out. “This image likely contains a cat; this other one likely contains a dog.” But image classification is only one of several possible applications of deep learning in computer vision. In general, there are three essential computer vision tasks you need to know about: image classification, image segmentation, and object detection.

9.2 An image segmentation example

9.3 Modern convnet architecture patterns

9.3.1 Modularity, hierarchy, and reuse

9.3.2 Residual connections

9.3.3 Batch normalization

9.3.4 Depthwise separable convolutions

9.3.5 Putting it together: A mini Xception-like model

9.4 Interpreting what convnets learn

9.4.1 Visualizing intermediate activations

9.4.2 Visualizing convnet filters

9.4.3 Visualizing heatmaps of class activation

Summary