10 Testing

This chapter covers

  • Testing Airflow tasks in a CI/CD pipeline
  • Structuring a project for testing with pytest
  • Mimicking a DAG run to test tasks that apply templating
  • Faking external system events with mocking
  • Testing behavior in external systems with containers

So far, we have focused on various aspects of developing with Airflow. But how do you ensure the code you’ve written is valid before deploying it to a production system? Testing is an integral part of software development, and nobody wants to write code, take it through a deployment process, and keep their fingers crossed that everything works. Developing this way is inefficient and provides no guarantee that the software behaves correctly, in either valid or invalid situations.

This chapter dives into the gray area of testing Airflow, which is often regarded as a tricky subject. The difficulty has two sources: Airflow communicates with many external systems, and it is an orchestration system that starts and stops tasks performing logic, while Airflow itself (often) performs none. Despite these challenges, you can test your Airflow code, as we’ll see.

10.1 Getting started with testing

10.1.1 Integrity testing all DAGs

10.1.2 Setting up a CI/CD pipeline

10.1.3 Writing unit tests

10.1.4 Pytest project structure

10.1.5 Testing with files on disk

10.2 Working with DAGs and task context in tests

10.2.1 Working with external systems

10.3 Using tests for development

10.4 Testing complete DAGs

10.4.1 Using dag.test() to test your whole DAG

10.4.2 Emulate production environments with Whirl

10.4.3 Create DTAP environments

10.5 Summary