14 Training and deployment pipeline


This chapter covers

  • Feeding models training data in a production environment
  • Scheduling for continuous retraining
  • Using version control and evaluating models before and after deployment
  • Deploying models for large-scale on-demand and batch requests, in both monolithic and distributed deployments

In the previous chapter, we went through the data pipeline portion of an end-to-end production ML pipeline. Here, in the final chapter of the book, we cover the remaining portion of the end-to-end pipeline: training, deployment, and serving.

As a visual reminder, figure 14.1 shows the whole pipeline from chapter 13, with the part of the system we'll address in this chapter circled.

Figure 14.1 The production end-to-end pipeline, with this chapter's emphasis on training and deployment

You may ask: what exactly is a pipeline, and why do we use one, whether for ML production or any other programmatic production operation? You typically use a pipeline when a job managed by orchestration, such as training, consists of multiple steps that must occur in sequential order: do step A, then step B, and so on.
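To make this concrete, here is a minimal sketch of a two-step pipeline written with the TFX v1 API, which this chapter builds on. The input path ./data, the module file trainer_module.py, the pipeline name, and the step counts are hypothetical placeholders, not part of the book's running example.

from tfx import v1 as tfx

# Step A: ingest CSV training data as TF examples.
# The input path is a hypothetical placeholder.
example_gen = tfx.components.CsvExampleGen(input_base='./data')

# Step B: train a model. The module file (hypothetical) must
# define a run_fn, per the TFX Trainer contract.
trainer = tfx.components.Trainer(
    module_file='trainer_module.py',
    examples=example_gen.outputs['examples'],
    train_args=tfx.proto.TrainArgs(num_steps=1000),
    eval_args=tfx.proto.EvalArgs(num_steps=100))

# Wire the steps into a pipeline; ML Metadata records each run.
pipeline = tfx.dsl.Pipeline(
    pipeline_name='train_and_deploy',
    pipeline_root='./pipeline_root',
    metadata_connection_config=(
        tfx.orchestration.metadata.sqlite_metadata_connection_config(
            './metadata.db')),
    components=[example_gen, trainer])

# The orchestrator executes the steps in dependency order: A, then B.
tfx.orchestration.LocalDagRunner().run(pipeline)

Because trainer consumes example_gen's output, the orchestrator always runs step A before step B; that guaranteed sequential ordering is exactly what makes a pipeline worthwhile.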

14.1 Model feeding

14.1.1 Model feeding with tf.data.Dataset

14.1.2 Distributed feeding with tf.distribute.Strategy

14.1.3 Model feeding with TFX

14.2 Training schedulers

14.2.1 Pipeline versioning

14.2.2 Metadata

14.2.3 History

14.3 Model evaluations

14.3.1 Candidate vs. blessed model

14.3.2 TFX evaluation

14.4 Serving predictions
