8 Extending Airflow with custom operators and sensors
This chapter covers
- Making your DAGs more modular and concise with custom components
- Designing and implementing a custom hook
- Designing and implementing a custom operator
- Designing and implementing a custom sensor
- Designing and implementing a custom deferrable sensor
- Distributing your custom components as a basic Python library
One of Airflow's strengths is that it can easily be extended to coordinate jobs across many different types of systems. We have already seen some of this functionality in earlier chapters, where we trained a machine learning model on Amazon’s SageMaker service using the SageMakerTrainingOperator. You can also (for example) use Airflow to run jobs on an ECS (Elastic Container Service) cluster in AWS using the ECSOperator, perform queries on a Postgres database with the SQLExecuteQueryOperator, and much more.
However, at some point, you may want to execute a task on a system that is not supported by Airflow, or you may have a task that you can implement using the PythonOperator but that requires a lot of boilerplate code, which prevents others from easily reusing your code across different DAGs. How should you go about this?