9 Workflow orchestration

 

This chapter covers

  • Defining workflow and workflow orchestration
  • Why deep learning systems need to support workflows
  • Designing a general workflow orchestration system
  • Introducing three open source orchestration systems: Airflow, Argo Workflows, and Metaflow

In this chapter, we will discuss the last, but still critical, piece of a deep learning system: workflow orchestration, a service that manages, executes, and monitors workflow automation. Workflow is an abstract and broad concept; it is essentially a sequence of operations that are part of some larger task. If you can devise a plan made up of a set of tasks that completes a piece of work, that plan is a workflow. For example, we can define a sequential workflow for training a machine learning (ML) model. This workflow can be composed of the following tasks: fetching raw data, rebuilding the training dataset, training the model, evaluating the model, and deploying the model.
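To make the idea concrete, the five-task training workflow above can be sketched as an ordered list of Python functions that a small runner executes one after another. This is only an illustrative sketch: the task bodies are toy stand-ins, and every name in it (`run_workflow`, `WORKFLOW`, the context keys) is hypothetical rather than part of any orchestration system covered in this chapter.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical convention: each task receives a context dict and returns
# an updated copy, so the output of one task feeds the next.
Context = Dict[str, object]
Task = Callable[[Context], Context]


def fetch_raw_data(ctx: Context) -> Context:
    # Toy stand-in for fetching raw data from a data source.
    return {**ctx, "raw_data": [1, 2, 3, 4]}


def rebuild_training_dataset(ctx: Context) -> Context:
    # Toy stand-in: pair each input x with the label 2 * x.
    return {**ctx, "dataset": [(x, 2 * x) for x in ctx["raw_data"]]}


def train_model(ctx: Context) -> Context:
    # Toy "model": a single slope fitted by least squares through the origin.
    data = ctx["dataset"]
    slope = sum(x * y for x, y in data) / sum(x * x for x, _ in data)
    return {**ctx, "model": slope}


def evaluate_model(ctx: Context) -> Context:
    # Largest absolute prediction error on the training pairs.
    error = max(abs(ctx["model"] * x - y) for x, y in ctx["dataset"])
    return {**ctx, "max_error": error}


def deploy_model(ctx: Context) -> Context:
    # Toy stand-in: "deploy" only if the evaluation error is small enough.
    return {**ctx, "deployed": ctx["max_error"] < 0.01}


# The workflow itself is just the plan: an ordered sequence of tasks.
WORKFLOW: List[Task] = [
    fetch_raw_data,
    rebuild_training_dataset,
    train_model,
    evaluate_model,
    deploy_model,
]


def run_workflow(tasks: List[Task], ctx: Optional[Context] = None) -> Context:
    """Run tasks strictly in order and record which ones completed."""
    ctx = dict(ctx or {})
    completed: List[str] = []
    for task in tasks:
        ctx = task(ctx)
        completed.append(task.__name__)
    ctx["completed"] = completed
    return ctx


result = run_workflow(WORKFLOW)
print(result["completed"])
```

The point of the sketch is that the plan (`WORKFLOW`) is separate from whatever executes it (`run_workflow`). Here the executor is a bare loop with no scheduling, retries, or monitoring.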

Because a workflow is an execution plan, it can be performed manually. For instance, a data scientist can complete each task of the model training workflow we just described by hand. To complete the “fetching raw data” task, the data scientist can craft web requests and send them to the dataset management (DM) service to fetch a dataset, all with no help from the engineers.

9.1 Introducing workflow orchestration

9.1.1 What is workflow?

9.1.2 What is workflow orchestration?

9.1.3 The challenges for using workflow orchestration in deep learning

9.2 Designing a workflow orchestration system

9.2.1 User scenarios

9.2.2 A general orchestration system design

9.2.3 Workflow orchestration design principles

9.3 Touring open source workflow orchestration systems

9.3.1 Airflow

9.3.2 Argo Workflows

9.3.3 Metaflow

9.3.4 When to use

Summary