5 Defining dependencies between tasks

 

This chapter covers:

  • Examining how to define task dependencies in an Airflow DAG.
  • Explaining how to use trigger rules to implement joins at specific points in an Airflow DAG.
  • Showing how to make tasks in an Airflow DAG conditional, so that they can be skipped under certain conditions.
  • Giving a basic idea of how trigger rules function in Airflow and how they affect the execution of your tasks.
  • Demonstrating how to use XComs to share state between tasks.
  • Examining how Airflow 2’s Taskflow API can help simplify DAGs with many Python tasks and XComs.

In previous chapters, we’ve seen how to build a basic DAG and define simple dependencies between tasks. In this chapter, we will explore exactly how task dependencies are defined in Airflow and how these capabilities can be used to implement more complex patterns, including conditional tasks, branches, and joins. Towards the end of the chapter, we’ll dive into XComs (which allow passing data between different tasks in a DAG run) and discuss the merits and drawbacks of this approach. Finally, we’ll show how Airflow 2’s new Taskflow API can help simplify DAGs that make heavy use of Python tasks and XComs.

5.1      Basic dependencies

5.1.1   Linear dependencies

5.1.2   Fan-in/-out dependencies

5.2      Branching

5.2.1   Branching within tasks

5.2.2   Branching within the DAG

5.3      Conditional tasks

5.3.1   Conditions within tasks

5.3.2   Making tasks conditional

5.3.3   Using built-in operators

5.4      More about trigger rules

5.4.1   What is a trigger rule?

5.4.2   The effect of failures

5.4.3   Other trigger rules

5.5      Sharing data between tasks

5.5.1   Sharing data using XComs

5.5.2   When (not) to use XComs

5.5.3   Using custom XCom backends

5.6      Chaining Python tasks with the Taskflow API

5.6.1   Simplifying Python tasks with the Taskflow API

5.6.2   When (not) to use the Taskflow API

5.7      Summary