10 Training pipelines

 

This chapter covers

  • The essence of training pipelines
  • Tools and platforms you can use to build and maintain training pipelines
  • Scalability and configurability of training pipelines
  • Methods of testing pipelines

There’s an empirical heuristic for distinguishing experienced machine learning (ML) engineers from newcomers: ask them to describe a working system’s training procedure in one sentence. Newcomers tend to focus on the model, while somewhat more experienced engineers also mention data processing. Mature engineers often describe the pipeline: the sequence of stages required to produce a trained ML model. In this chapter, we will walk in ML engineers’ shoes to analyze these stages and discuss how to interconnect and orchestrate them.

10.1 Training pipeline: What are you?

10.1.1 Training pipeline vs. inference pipeline

10.2 Tools and platforms

10.3 Scalability

10.4 Configurability

10.5 Testing

10.5.1 Property-based testing

10.6 Design document: Training pipelines

10.6.1 Training pipeline for Supermegaretail

10.6.2 Training pipeline for PhotoStock Inc.

Summary