concept operator in category apache airflow

appears as: operator, operators, Operator, Operators, An operator, The operator
Data Pipelines with Apache Airflow MEAP V05

This is an excerpt from Manning's book Data Pipelines with Apache Airflow MEAP V05.

Note that (lowercase) dag is the name of the variable holding the instance of the (uppercase) DAG class. The variable could have any name; you can call it rocket_dag or whatever_name_you_like, for example. We will reference this variable (lowercase dag) in all operators, which tells Airflow which DAG each operator belongs to.
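To make this relationship concrete, here is a minimal pure-Python sketch of the idea. It is not Airflow's actual implementation: ToyDAG and ToyOperator are hypothetical stand-ins for Airflow's DAG class and its operators, and the dag_id and task_id values are made up for illustration. The sketch only shows that the variable name is arbitrary and that passing the DAG object to each operator is what ties the operator to that DAG.

```python
class ToyDAG:
    """Hypothetical stand-in for Airflow's DAG class (illustration only)."""

    def __init__(self, dag_id):
        self.dag_id = dag_id
        self.tasks = {}  # task_id -> operator, filled in as operators register


class ToyOperator:
    """Hypothetical stand-in for an Airflow operator (illustration only)."""

    def __init__(self, task_id, dag):
        self.task_id = task_id
        self.dag = dag
        # Passing the DAG object is what records this operator as part of that DAG.
        dag.tasks[task_id] = self


# The variable name is arbitrary: `rocket_dag` works just as well as `dag`.
rocket_dag = ToyDAG(dag_id="rocket_launches")  # assumed dag_id

download = ToyOperator(task_id="download_launches", dag=rocket_dag)
notify = ToyOperator(task_id="notify", dag=rocket_dag)

# List the task ids that registered themselves with the DAG.
print(sorted(rocket_dag.tasks))
```

Renaming rocket_dag to anything else changes nothing, as long as the same object is passed to every operator that should belong to it.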

Figure 2.4 DAGs and Operators are used by Airflow users. Tasks are internal components to manage operator state and display state changes (e.g., started/finished) to the user.

13.2.2    AWS-specific hooks and operators

Airflow provides a considerable number of built-in hooks and operators for interacting with many of the AWS services. These allow you to (for example) coordinate processes that move and transform data across the different services, as well as deploy any required resources. For an overview of all the available hooks and operators, see the Amazon/AWS provider package[106].

Due to their large number, we won’t go into any details of the AWS-specific hooks and operators but would rather refer you to their documentation. However, tables 13.1 and 13.2 provide a brief overview of several hooks and operators, together with the AWS services they tie into and their respective applications. A demonstration of some of these hooks and operators is also provided in the next section.

Listing 13.7
import datetime as dt
import os
from os import path
import tempfile

import pandas as pd

from airflow import DAG, utils as airflow_utils
from airflow.providers.amazon.aws.hooks.s3 import S3Hook
from airflow.providers.amazon.aws.operators.athena import AWSAthenaOperator
from airflow.operators.dummy_operator import DummyOperator
from airflow.operators.python_operator import PythonOperator

# Project-local modules (not part of Airflow).
from custom.operators import GlueTriggerCrawlerOperator
from custom.ratings import fetch_ratings

with DAG(
    dag_id="chapter13_aws_usecase",
    description="DAG demonstrating some AWS-specific hooks and operators.",
    start_date=dt.datetime(year=2015, month=1, day=1),
    end_date=dt.datetime(year=2015, month=3, day=1),
    schedule_interval="@monthly",
    default_args={
        # Each task run waits for the same task's previous run to succeed.
        "depends_on_past": True
    }
) as dag:
    # Operator arguments are elided in this excerpt.
    upload_ratings = PythonOperator(...)
    trigger_crawler = GlueTriggerCrawlerOperator(...)
    rank_movies = AWSAthenaOperator(...)

    # Run the three tasks in sequence.
    upload_ratings >> trigger_crawler >> rank_movies
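
The last line of the listing chains the three operators with >>. As a rough illustration of how such chaining can work in Python, the sketch below overloads __rshift__ on a hypothetical ToyOperator class. It is a simplified stand-in and not Airflow's implementation. The point it shows is that a >> b returns b, so a >> b >> c places a before b and b before c.

```python
class ToyOperator:
    """Hypothetical stand-in for an Airflow operator (illustration only)."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.downstream = []  # operators that must run after this one

    def __rshift__(self, other):
        # `self >> other` records that `other` runs after `self` ...
        self.downstream.append(other)
        # ... and returns `other`, so a chain continues from the right-hand side.
        return other


upload_ratings = ToyOperator("upload_ratings")
trigger_crawler = ToyOperator("trigger_crawler")
rank_movies = ToyOperator("rank_movies")

# Evaluated left to right: (upload_ratings >> trigger_crawler) >> rank_movies
upload_ratings >> trigger_crawler >> rank_movies

# Show each operator together with the operators that directly follow it.
for op in (upload_ratings, trigger_crawler, rank_movies):
    print(op.task_id, "->", [d.task_id for d in op.downstream])
```

Because each >> only links two neighbours, upload_ratings ends up with trigger_crawler as its sole direct successor, and rank_movies follows it only indirectly through trigger_crawler.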