
11 Building an ML pipeline


This chapter covers

  • Overview of ML pipelines
  • Prerequisites for running an ML pipeline in Vertex AI
  • Model training and deployment: local implementation vs. ML pipeline implementation
  • Defining an ML pipeline to train and deploy a model
  • Updating the model training code to work with an ML pipeline
  • Using generative AI to help create the ML pipeline

In chapter 10, we went through the steps to deploy a deep learning model trained on tabular data in a web application, first with the model running entirely on our local system and then with the model deployed to a Vertex AI endpoint. In this chapter, we will go through the further steps to automate the training and deployment process by using an ML pipeline in Vertex AI. We will start with the setup steps an ML pipeline requires, including defining a Vertex AI dataset. Next, we will contrast the local model training and deployment from chapter 10 with model training and deployment using an ML pipeline. We will then review the code for the ML pipeline itself, along with the updates the existing model training code needs in order to run in the context of an ML pipeline. Finally, we will examine some of the ways we can apply generative AI, and get useful help from its outputs, in the workflow for creating an ML pipeline.

11.1 Introduction to ML pipelines

11.1.1 Three kinds of pipelines

11.1.2 Overview of Vertex AI ML pipelines

11.2 ML pipeline preparation steps

11.2.1 Create a service account for the ML pipeline

11.2.2 Create a service account key

11.2.3 Grant the service account access to the Compute Engine default service account

11.2.4 Introduction to Cloud Shell

11.2.5 Upload the service account key

11.2.6 Upload the cleaned-up dataset to a Google Cloud Storage bucket

11.2.7 Create a Vertex AI managed dataset

11.3 Defining the ML pipeline

11.3.1 Local implementation vs. ML pipeline

11.3.2 Introduction to containers

11.3.3 Benefits of using containers in an ML pipeline

11.3.4 Introduction to adapting code to run in a container

11.3.5 Updating the training code to work in a container

11.3.6 The pipeline script

11.3.7 Testing the model trained in the pipeline

11.4 Using generative AI to help create the ML pipeline

11.4.1 Using Gemini for Google Cloud to answer questions about the ML pipeline

11.4.2 Using Gemini for Google Cloud to generate code for the ML pipeline

11.4.3 Using Gemini for Google Cloud to explain code for the ML pipeline

11.4.4 Using Gemini for Google Cloud to summarize log entries

11.4.5 Tuning a foundation model in Vertex AI

11.5 Summary