5 Mobile Convolutional Neural Networks
This chapter covers
- The design principles and unique requirements of mobile convolutional networks
- The design patterns behind MobileNet v1/v2, SqueezeNet, and ShuffleNet
- Coding examples of these models using the procedural design pattern
- Quantizing models to make them more compact
- Executing quantized models with TensorFlow Lite (TFLite)
Compact models, which run on mobile and IoT devices, face a special challenge compared with their desktop and cloud counterparts: they must operate in substantially less memory, and therefore cannot rely on overcapacity to achieve high accuracy. To fit into these constrained memory footprints, compact models need substantially fewer parameters for inference (prediction). Their architecture therefore rests on a tradeoff between accuracy and latency: the more of the device's memory the model occupies, the higher its accuracy, but the longer its response time (latency), and vice versa.
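To make the parameter-reduction idea concrete, the sketch below compares the parameter count of a standard convolution against the depthwise separable convolution that MobileNet is built on. The function names and the example layer sizes are illustrative choices, not from this chapter, and bias terms are ignored for simplicity.

```python
# Parameter counts for a standard KxK convolution versus a depthwise
# separable convolution (illustrative sketch; bias terms omitted).

def standard_conv_params(k, c_in, c_out):
    # One KxK filter per output channel, spanning all input channels.
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    # Depthwise: one KxK filter per input channel.
    # Pointwise: a 1x1 convolution projecting c_in channels to c_out.
    return k * k * c_in + c_in * c_out

std = standard_conv_params(3, 128, 128)        # 147,456 parameters
sep = depthwise_separable_params(3, 128, 128)  # 17,536 parameters
print(f"standard: {std}, separable: {sep}, savings: {std / sep:.1f}x")
```

For a 3x3 convolution with 128 input and 128 output channels, the separable form uses roughly 8x fewer parameters, which is the kind of saving that lets these architectures fit within a mobile device's memory budget.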