18 Airflow in GCP
This chapter covers:
- Designing a deployment strategy for GCP using GKE, Cloud Storage, and Google BigQuery.
- An overview of several GCP-specific hooks and operators that allow you to integrate with commonly used GCP services.
- Demonstrating how to use GCP-specific hooks and operators to build a simple serverless recommender system.
The last of the major cloud providers, Google Cloud Platform (GCP), is the best-supported cloud platform in Airflow in terms of the number of hooks and operators. Almost all Google services can be controlled with Airflow. In this chapter, we’ll dive into setting up Airflow on GCP (18.1), operators and hooks for GCP services (18.2), and the same use case as demonstrated on AWS and Azure, applied to GCP (18.3).
Note that GCP also offers a managed Airflow service named “Cloud Composer”, which is discussed in more detail in Section 15.3.2. This chapter covers a DIY Airflow setup on GCP, not Cloud Composer.
GCP provides various services for running software. No single service fits every use case, which is why Google (like all other cloud vendors) offers several of them.
These services can be placed on a scale, ranging from fully self-managed with the most flexibility to completely managed by GCP with no maintenance required: