chapter six

6 Defining dependencies between tasks

 

This chapter covers

  • Defining task dependencies in an Airflow DAG
  • Implementing joins using trigger rules
  • Making tasks execute on certain conditions
  • Seeing how trigger rules affect task execution
  • Using XComs to share state between tasks
  • Simplifying DAGs with the Airflow TaskFlow API

We’ve seen how to build a basic directed acyclic graph (DAG) and define simple dependencies between tasks. In this chapter, we’ll look more closely at how dependencies are defined and explore more complex constructs, such as conditional tasks, branches, and joins. Toward the end of the chapter, we’ll investigate XComs, which pass data between tasks in a DAG run, and discuss the merits and drawbacks of this approach. We’ll also show how the Airflow TaskFlow API can simplify DAGs.

6.1 Basic dependencies

Before going into complex task dependency patterns such as branching and conditional tasks, let’s examine the task dependencies we’ve already encountered, including linear chains of tasks (tasks that are executed one after another) and fan-out/fan-in patterns (which involve one task linking to multiple downstream tasks, or vice versa).
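
To make these two shapes concrete, the following is a minimal sketch in plain Python. It deliberately does not use Airflow: it models each pattern as a dictionary that maps a task to the upstream tasks it waits for, and uses the standard library's graphlib module to group tasks into "waves" that could run at the same time. The task names (fetch, clean, train, start, fetch_a, fetch_b, join) and the execution_waves helper are hypothetical and exist only for illustration.

```python
from graphlib import TopologicalSorter

# Each key is a task; its value is the set of upstream tasks it waits for.
# All task names here are hypothetical.

# Linear chain: fetch -> clean -> train
linear = {
    "clean": {"fetch"},
    "train": {"clean"},
}

# Fan-out from start to two tasks, then fan-in to join.
fan = {
    "fetch_a": {"start"},
    "fetch_b": {"start"},
    "join": {"fetch_a", "fetch_b"},
}


def execution_waves(graph):
    """Group tasks into waves; every task in a wave has all of its
    upstream tasks completed, so the wave could run in parallel."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for a stable order
        waves.append(ready)
        sorter.done(*ready)
    return waves


linear_waves = execution_waves(linear)
fan_waves = execution_waves(fan)
```

In the linear chain every wave holds a single task, so the tasks run strictly one after another. In the fan-out/fan-in graph the two middle tasks share a wave, and the joining task has to wait for both of them.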

6.1.1 Linear dependencies

6.1.2 Fan-in/fan-out dependencies

6.2 Branching

6.2.1 Branching within tasks

6.2.2 Branching within the DAG

6.3 Conditional tasks

6.3.1 Conditions within tasks

6.3.2 Making tasks conditional

6.3.3 Using built-in operators

6.4 Exploring trigger rules

6.4.1 What is a trigger rule?

6.4.2 The effect of failures

6.4.3 Other trigger rules

6.5 Sharing data between tasks

6.5.1 Sharing data using XComs

6.5.2 When and when not to use XComs