You’ve come far since the beginning of this book. You can now train image classification models, image segmentation models, models for classification or regression on vector data, timeseries forecasting models, text classification models, sequence-to-sequence models, and even generative models for text and images. You’ve got all the bases covered.
However, your models so far have all been trained at a small scale—on small datasets, with a single GPU—and they generally haven’t reached the best achievable performance on each dataset we looked at. This is, after all, an introductory book. If you are to go out into the real world and achieve state-of-the-art results on brand-new problems, there’s still a bit of a chasm that you’ll need to cross.
This penultimate chapter is about bridging that gap and giving you the best practices you’ll need as you go from machine learning student to fully fledged machine learning engineer. We’ll review essential techniques for systematically improving model performance: hyperparameter tuning and model ensembling. Then we’ll look at how you can speed up and scale up model training, with multi-GPU and TPU training, mixed precision, and remote computing resources in the cloud.