6 Triggering Workflows

 

This chapter covers:

  • Waiting for certain conditions to be met with sensors
  • Deciding how to set dependencies between tasks in different DAGs
  • Executing workflows via the CLI and REST API

In chapter 3, we explored how to schedule workflows in Airflow based on a time interval. An interval can be given as a convenience string (for example, “@daily”), a timedelta object (for example, timedelta(days=3)), or a cron string (for example, “30 14 * * *”, which triggers every day at 14:30). Each of these notations instructs the workflow to trigger at a certain time or interval. Given the interval, Airflow computes the next time to run the workflow and starts the first task(s) in the workflow at that date and time.
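To make the idea of "computing the next time to run" concrete, here is a minimal sketch in plain Python. It is not Airflow's scheduler code; the helper names next_run_after and next_daily_at are hypothetical and only mimic the arithmetic behind a timedelta(days=3) interval and a daily 14:30 schedule such as “30 14 * * *”.

```python
from datetime import datetime, timedelta


def next_run_after(last_run: datetime, interval: timedelta) -> datetime:
    """Next run for a fixed interval, e.g. timedelta(days=3) (hypothetical helper)."""
    return last_run + interval


def next_daily_at(now: datetime, hour: int, minute: int) -> datetime:
    """Next occurrence of hour:minute strictly after `now`.

    Mimics a daily cron schedule such as "30 14 * * *" (hypothetical helper).
    """
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        # Today's slot has already passed, so move to tomorrow.
        candidate += timedelta(days=1)
    return candidate


if __name__ == "__main__":
    # Interval-based: three days after the previous run.
    print(next_run_after(datetime(2024, 1, 1), timedelta(days=3)))
    # Cron-like: the next 14:30, seen from 09:00 and from 15:00 on the same day.
    print(next_daily_at(datetime(2024, 1, 1, 9, 0), 14, 30))
    print(next_daily_at(datetime(2024, 1, 1, 15, 0), 14, 30))
```

Seen from 09:00 the next 14:30 falls on the same day, whereas from 15:00 it rolls over to the following day. Either way, the trigger moment depends only on the clock, which is what this chapter's alternatives move away from.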

In this chapter, we explore other ways to trigger workflows. Whereas time-based intervals start workflows at predefined times, it is often desirable to start a workflow in response to a certain action. Such trigger actions are often the result of external events: think of a file being uploaded to a shared drive, a developer pushing code to a repository, or a partition appearing in a Hive table, any of which could be a reason to start running your workflow.
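As a rough illustration of reacting to such an external event, the sketch below polls for the existence of a file until it appears or a timeout expires. It uses only the Python standard library and is not Airflow's sensor implementation; the function name wait_for_file and the parameters poll_interval and timeout are assumptions made for this example.

```python
import time
from pathlib import Path


def wait_for_file(path, poll_interval=1.0, timeout=10.0):
    """Poll until `path` exists; return False if `timeout` seconds pass first.

    Hypothetical helper for illustration, not part of Airflow.
    """
    deadline = time.monotonic() + timeout
    while True:
        if Path(path).exists():
            return True  # Condition met: the workflow could start now.
        if time.monotonic() >= deadline:
            return False  # Gave up waiting.
        time.sleep(poll_interval)


if __name__ == "__main__":
    # The path below is a made-up example location.
    if wait_for_file("/tmp/data/upload.csv", poll_interval=0.5, timeout=2.0):
        print("File arrived, start processing")
    else:
        print("Timed out waiting for file")
```

The timeout matters: without it, a condition that never becomes true would keep the loop waiting indefinitely.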

6.1 Polling conditions with sensors

6.1.1 Polling custom conditions

6.1.2 Sensors outside the happy flow

6.2 Triggering other DAGs

6.2.1 Backfilling with the TriggerDagRunOperator

6.2.2 Polling the state of other DAGs

6.3 Starting workflows with REST/CLI

6.4 Summary
