9 Communicating with External Systems
This chapter covers:
- Working with Airflow operators that perform actions on systems outside Airflow
- Applying operators specific to external systems
- Implementing an Airflow operator that performs A-to-B operations, in case you need to write your own
- Testing tasks that connect to external systems
In all previous chapters, we focused on various aspects of writing Airflow code, mostly demonstrated with examples using generic operators such as the BashOperator and PythonOperator. While these operators can run arbitrary code and could therefore run any workload, the Airflow project also provides operators for more specific use cases, for example running a query on a Postgres database. Such an operator serves one and only one use case, such as running a query. As a result, it is easy to use: you simply provide the query to the operator, and the operator handles the querying logic internally. With a PythonOperator, you would have to write that querying logic yourself.
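To make the difference concrete, the sketch below contrasts a dedicated Postgres operator with an equivalent PythonOperator task. It assumes the apache-airflow-providers-postgres package (for PostgresOperator) and psycopg2 are installed; the DAG id, connection id, table name, and credentials are hypothetical, and the exact import paths and DAG arguments may vary between Airflow and provider versions.

```python
import datetime as dt

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.postgres.operators.postgres import PostgresOperator

with DAG(
    dag_id="example_external_query",          # hypothetical DAG id
    start_date=dt.datetime(2023, 1, 1),
    schedule_interval=None,
) as dag:
    # Dedicated operator: supply the query and a connection id;
    # the operator handles connecting and executing internally.
    fetch_rentals = PostgresOperator(
        task_id="fetch_rentals",
        postgres_conn_id="my_postgres",       # hypothetical connection id
        sql="SELECT * FROM rentals WHERE rental_date = '{{ ds }}'",
    )

    # Generic operator: the same query means writing the connection
    # and execution logic yourself inside the callable.
    def _fetch_rentals(**context):
        import psycopg2  # assumes psycopg2 is installed

        conn = psycopg2.connect(
            host="localhost", dbname="rentals", user="user", password="pass"
        )
        cursor = conn.cursor()
        cursor.execute(
            "SELECT * FROM rentals WHERE rental_date = %s", (context["ds"],)
        )
        rows = cursor.fetchall()
        cursor.close()
        conn.close()
        # In a real task you would process or store the results here.
        return len(rows)

    fetch_rentals_by_hand = PythonOperator(
        task_id="fetch_rentals_by_hand",
        python_callable=_fetch_rentals,
    )
```

Both tasks run the same query, but the PythonOperator version forces you to manage connection details, parameter passing, and cleanup by hand, logic the dedicated operator already encapsulates.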
For the record, by “external system” we mean any technology other than Airflow itself and the machine Airflow is running on. This could be Microsoft Azure Blob Storage, an Apache Spark cluster, or a Google BigQuery data warehouse.