7 Building Custom Components
This chapter covers:
- How to make your DAGs more modular and succinct by implementing custom components for interacting with (remote) systems.
- How to implement a custom hook and use it to interact with an external system.
- How to design and implement your own custom operator to perform a specific task.
- How to design and implement your own custom sensor.
- How to distribute your custom components as a basic Python library.
One of Airflow's strengths is that it was designed to be easily extended to coordinate jobs across many different types of systems. We have already seen some of this functionality in earlier chapters, where we executed a Spark job on a Spark cluster using the SparkSubmitOperator. Similarly, you can (for example) use Airflow to run jobs on an AWS Elastic Container Service (ECS) cluster using the EcsOperator, to perform queries on a Postgres database using the PostgresOperator, and much more.
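For example, a minimal sketch of how such a built-in operator is used inside a DAG might look like the following. The DAG ID, the connection ID `my_postgres`, and the `ratings` table are hypothetical, chosen only to illustrate the pattern:

```python
import datetime as dt

from airflow import DAG
from airflow.providers.postgres.operators.postgres import PostgresOperator

with DAG(
    dag_id="example_postgres",  # hypothetical DAG for illustration
    start_date=dt.datetime(2023, 1, 1),
    schedule_interval="@daily",
) as dag:
    fetch_ratings = PostgresOperator(
        task_id="fetch_ratings",
        postgres_conn_id="my_postgres",  # assumes this connection is configured in Airflow
        sql="SELECT * FROM ratings WHERE run_date = '{{ ds }}'",
    )
```

Note how the operator hides all the connection handling behind the `postgres_conn_id` argument. Providing this kind of convenience for systems Airflow does not yet support is exactly what this chapter teaches you to do.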
However, at some point you may want to execute a task on a system that Airflow does not support out of the box. Or you may have a task that you can implement with the PythonOperator, but that requires so much boilerplate code that others cannot easily reuse it across different DAGs. How should you go about this?
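To make the boilerplate problem concrete, here is a sketch of such a PythonOperator task. The REST API at `http://example.com/api`, its credentials, and the `_fetch_ratings` callable are purely hypothetical; the point is that every DAG calling the API would have to repeat the same session setup and error handling:

```python
import datetime as dt

import requests

from airflow import DAG
from airflow.operators.python import PythonOperator

def _fetch_ratings(api_url, user, password, **context):
    # Connection handling and error checking that every DAG using this
    # (hypothetical) API would have to copy verbatim into its own callable.
    session = requests.Session()
    session.auth = (user, password)
    response = session.get(f"{api_url}/ratings", params={"date": context["ds"]})
    response.raise_for_status()
    return response.json()

with DAG(
    dag_id="example_boilerplate",  # hypothetical DAG for illustration
    start_date=dt.datetime(2023, 1, 1),
    schedule_interval="@daily",
) as dag:
    fetch_ratings = PythonOperator(
        task_id="fetch_ratings",
        python_callable=_fetch_ratings,
        op_kwargs={
            "api_url": "http://example.com/api",  # hypothetical endpoint
            "user": "airflow",
            "password": "airflow",
        },
    )
```

Moving the session setup into a custom hook and the task logic into a custom operator, as the following sections demonstrate, lets every DAG reuse this code with a single import.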