11 Running Tasks in Containers
This chapter covers:
- Identifying some of the challenges involved in managing Airflow deployments that use many different operators and complex dependencies.
- Examining how containerized approaches can help simplify Airflow deployments by providing a uniform way of building and running tasks, whilst also simplifying dependency management.
- Running containerized tasks in Airflow on Docker using the DockerOperator and on Kubernetes clusters using the KubernetesPodOperator.
- Establishing a high-level overview of the workflows involved in developing containerized DAGs based on Docker and Kubernetes.
In previous chapters, we implemented several DAGs using different Airflow operators, each specialized to perform a specific type of task. In this chapter, we touch on some of the drawbacks of using many different operators, especially with an eye to creating Airflow DAGs that are easy to build, deploy, and maintain. In light of these issues, we look at how we can use Airflow to run tasks in containers using Docker and Kubernetes, and at some of the benefits that this containerized approach can bring.
11.1 Challenges of many different operators
Operators are arguably one of Airflow's strongest features, as they provide great flexibility to coordinate jobs across many different types of systems. However, creating and managing DAGs that use many different operators can be quite challenging because of the complexity involved.