14 Operating Airflow in production
This chapter covers
- Dissecting the Airflow scheduler
- Configuring Airflow to scale horizontally using different executors
- Monitoring the status and performance of Airflow visually
- Sending out alerts in case of task failures
In most of the previous chapters, we focused on various parts of Airflow from a programmer’s perspective. In this chapter, we explore Airflow from an operations perspective. A general understanding of concepts such as (distributed) software architecture, logging, monitoring, and alerting is assumed, but no experience with any specific technology is required.
14.1 Revisiting the Airflow architecture
Back in chapter 1, we introduced the Airflow architecture; it is shown again in Figure 14.1.
At a minimum, Airflow consists of a few components:
- Webserver
- Scheduler
- Database (also known as the metastore)
- Workers
- Triggerer (optional component, required when working with deferrable operators)
- Executor (not shown in the figure)
Figure 14.1 High-level Airflow architecture

The webserver and scheduler are both Airflow processes. The database is a separate service you must provide to Airflow, in which the webserver and scheduler store their metadata. A folder with DAG definitions must be accessible to the scheduler.
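To make this concrete, these pieces are wired together in the Airflow configuration. The fragment below is a minimal sketch, assuming a Postgres metadata database; the DAG folder path, connection string, and credentials are placeholders, and in Airflow releases before 2.3 the `sql_alchemy_conn` setting lives under `[core]` instead of `[database]`:

```ini
[core]
# Folder the scheduler scans for DAG definition files
dags_folder = /opt/airflow/dags
# The executor decides how and where task instances are run
executor = LocalExecutor

[database]
# Connection string for the metadata database shared by the
# webserver and scheduler (placeholder credentials)
sql_alchemy_conn = postgresql+psycopg2://airflow:airflow@localhost/airflow
```

Each setting can also be supplied as an environment variable following the `AIRFLOW__{SECTION}__{KEY}` convention, for example `AIRFLOW__CORE__DAGS_FOLDER`, which is often more convenient in containerized deployments.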
The webserver’s responsibility is to visually display information about the status of the pipelines and to allow the user to perform certain actions, such as triggering a DAG.
The scheduler’s responsibility is twofold: