10 Model deployment

This chapter covers

Deploying a deep learning model in a simple web application on our local system
An introduction to key Google Cloud concepts
An introduction to Vertex AI, the machine learning environment in Google Cloud
Deploying a deep learning model with a Vertex AI endpoint
Adapting the web application to use a Vertex AI endpoint
Getting generative AI assistance with Gemini for Google Cloud

In chapter 9, we reviewed a set of best practices for training a deep learning model with tabular data and introduced the Kuala Lumpur real estate price prediction problem, which is challenging because of its mixed-type features. In this chapter, we will take the model we trained in chapter 9 and deploy it in a simple web application. First, we will deploy it locally, with both the web server and the trained model running on our local system. Next, we will introduce Google Cloud as an alternative way to deploy our model. Specifically, we will take the trained model and deploy it to an endpoint in Vertex AI, the machine learning environment in Google Cloud, and then adapt the web application to use that endpoint. Finally, we will examine how to use Gemini, Google’s generative AI assistant, in Google Cloud. The code described in this chapter is available at https://mng.bz/6e1A.

10.1 A simple web deployment

10.1.1 Overview of web deployment

10.1.2 The Flask server module

10.1.3 The home.html page

10.1.4 The show-prediction.html page

10.1.5 Exercising the web deployment

10.2 Public clouds and machine learning operations

10.3 Getting started with Google Cloud

10.3.1 Accessing Google Cloud for the first time

10.3.2 Creating a Google Cloud project

10.3.3 Creating a Google Cloud Storage bucket

10.4 Deploying a model in Vertex AI

10.4.1 Uploading the model to a Cloud Storage bucket

10.4.2 Importing the model to Vertex AI

10.4.3 Deploying the model to an endpoint

10.4.4 Initial test of the model deployment

10.5 Using the Vertex AI deployment with Flask

10.5.1 Setting up the Vertex AI SDK