Part 2 Beyond the basics
Now that you’re familiar with Airflow’s basics and able to build your own data pipelines, you’re ready to learn some advanced techniques for handling more complex use cases involving external systems, custom components, and more.
In chapter 7, we’ll examine how to trigger workflows with external input. This allows you to run pipelines in response to certain events, such as the arrival of new files or a call from an HTTP service.
Chapter 8 demonstrates how to use Airflow’s built-in functionality to run tasks on external systems. This extremely powerful feature allows you to build pipelines that coordinate data flows across many systems, such as databases, computational frameworks like Apache Spark, and storage systems.
Next, chapter 9 shows how to build custom components for Airflow, allowing you to execute tasks on systems that aren’t supported out of the box. You can also use this approach to build components that are easily reused across pipelines to support common workflows.
To help increase the robustness of your pipelines, chapter 10 elaborates on strategies for testing your data pipelines and custom components. This topic comes up frequently in the Airflow community, so we’ll spend some time exploring it.