8 Communicating with external systems
This chapter covers
- Working with Airflow operators that perform actions on systems outside Airflow
- Applying operators specific to external systems
- Implementing Airflow operators to perform A-to-B operations
- Testing tasks connecting to external systems
In previous chapters, we mainly used generic operators such as the BashOperator and the PythonOperator to keep the focus on understanding the basics of Airflow. This is hardly the best use of Airflow, however. Airflow’s main power lies in its capability to connect to a broad variety of systems (e.g., an Apache Spark cluster, a Google BigQuery data warehouse, or a PostgreSQL database) and orchestrate workloads between them.
To demonstrate, this chapter explores how to install and use additional operators from the Airflow ecosystem to integrate with external systems without having to write custom integration logic. For illustration, we’ll develop two use cases connecting to different external systems and see how specific operators help us move and transform data between these systems.
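In Airflow 2, operators for external systems are distributed as separate provider packages that you install alongside core Airflow. As a sketch of what this looks like for the systems mentioned above (the exact package set you need depends on your use case and Airflow version):

```shell
# Install provider packages for the external systems used as examples
# in this chapter; each package bundles the operators and hooks for
# that system (e.g., the google provider includes BigQuery operators).
pip install apache-airflow-providers-postgres
pip install apache-airflow-providers-apache-spark
pip install apache-airflow-providers-google
```

Once a provider is installed, its operators become importable from the corresponding `airflow.providers.*` namespace, ready to use in your DAGs.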
NOTE Operators are always under development. By the time you read this chapter, there may be new operators that suit your use case but are not described here.