chapter six

6 Defining dependencies between tasks

This chapter covers

Defining task dependencies in an Airflow DAG
Implementing joins using trigger rules
Running tasks only when certain conditions are met
Understanding how trigger rules affect the execution of your tasks
Using XComs to share state between tasks
Simplifying DAGs with the Airflow TaskFlow API

By now, we have seen how to build a basic DAG and define simple dependencies between tasks. Next, we will look more closely at how dependencies are defined and explore more complex constructs, such as conditional tasks, branches, and joins. Toward the end of the chapter, we’ll also investigate XComs (which allow passing data between different tasks in a DAG run) and discuss the merits and drawbacks of this approach. Finally, we’ll show how the Airflow TaskFlow API can help simplify DAGs.

6.1 Basic dependencies

Before going into more complex task dependency patterns such as branching and conditional tasks, let’s first take a moment to examine the different patterns of task dependencies that we’ve encountered so far. These include both linear chains of tasks (tasks that are executed one after another) and fan-out/fan-in patterns (in which one task links to multiple downstream tasks, or multiple upstream tasks link to one task). To make sure we’re all on the same page, we’ll briefly go into the implications of these patterns in the next few sections.
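To make the two patterns concrete before looking at them in Airflow, the following is a minimal sketch in plain Python (standard library only, not the Airflow API). It models each task as a node with a set of upstream tasks and groups the tasks into "waves" that could run together. The task names and the `execution_waves` helper are hypothetical and chosen only for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical task names, used only for illustration.
# Each key maps a task to the set of upstream tasks it depends on.

# Linear chain: fetch, then clean, then train.
linear = {
    "clean": {"fetch"},
    "train": {"clean"},
}

# Fan-out/fan-in: start links to two downstream tasks (fan-out),
# and join waits for both of them to finish (fan-in).
fan_out_in = {
    "fetch_a": {"start"},
    "fetch_b": {"start"},
    "join": {"fetch_a", "fetch_b"},
}


def execution_waves(graph):
    """Group tasks into waves; a wave only starts once all earlier waves are done."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks whose upstream tasks are all done
        waves.append(ready)
        sorter.done(*ready)
    return waves


linear_waves = execution_waves(linear)
fan_waves = execution_waves(fan_out_in)

print(linear_waves)  # [['fetch'], ['clean'], ['train']]
print(fan_waves)     # [['start'], ['fetch_a', 'fetch_b'], ['join']]
```

In the linear chain every wave holds a single task, so nothing can run in parallel. In the fan-out/fan-in graph the two middle tasks share a wave, which is what allows a scheduler to run them side by side before the join. In an Airflow DAG the same shapes would typically be written with the `>>` operator, for example `start >> [fetch_a, fetch_b] >> join`.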

6.1.1 Linear dependencies

6.1.2 Fan-in/-out dependencies

6.2 Branching

6.2.1 Branching within tasks

6.2.2 Branching within the DAG

6.3 Conditional tasks

6.3.1 Conditions within tasks

6.3.2 Making tasks conditional

6.3.3 Using built-in operators

6.4 More about trigger rules

6.4.1 What is a trigger rule?

6.4.2 The effect of failures

6.4.3 Other trigger rules

6.5 Sharing data between tasks

6.5.1 Sharing data using XComs

6.5.2 When (not) to use XComs

6.5.3 Using custom XCom backends

6.5.4 XCom cleanup

6.6 Chaining Python tasks with the TaskFlow API

6.6.1 Simplifying Python tasks with the TaskFlow API

6.6.2 Using the TaskFlow API to define a new DAG

6.6.3 When (not) to use the TaskFlow API

6.7 Summary