7 Building Custom Components
This chapter covers:
- Making your DAGs more modular and succinct by implementing custom components for interacting with (remote) systems.
- Implementing a custom hook and using it to interact with an external system.
- Designing and implementing your own custom operator to perform a specific task.
- Designing and implementing your own custom sensor.
- Distributing your custom components as a basic Python library.
One of Airflow's strengths is that it can easily be extended to coordinate jobs across many different types of systems. We have already seen some of this functionality in earlier chapters, where we executed a Spark job on a Spark cluster using the SparkSubmitOperator. You can (for example) also use Airflow to run jobs on an ECS (Elastic Container Service) cluster in AWS using the EcsOperator, to perform queries on a Postgres database using the PostgresOperator, and much more.
However, at some point you may want to execute a task on a system that is not supported by Airflow. Or you may have a task that you can implement using the PythonOperator, but that requires so much boilerplate code that others cannot easily reuse it across different DAGs. How should you go about this?