8 Serverless deep learning

This chapter covers

  • Serving models with TensorFlow Lite—a lightweight environment for applying TensorFlow models
  • Deploying deep learning models with AWS Lambda
  • Exposing the lambda function as a web service via API Gateway

In the previous chapter, we trained a deep learning model for categorizing images of clothing. Now we need to deploy it and make the model available to other services.

There are many ways to do this. We already covered the basics of model deployment in chapter 5, where we used Flask, Docker, and AWS Elastic Beanstalk to deploy a logistic regression model.

In this chapter, we’ll talk about the serverless approach for deploying models—we’ll use AWS Lambda.

8.1 Serverless: AWS Lambda

AWS Lambda is a service from Amazon. Its main promise is that you can “run code without thinking about servers.”

It lives up to that promise: with AWS Lambda, we just upload some code, and the service takes care of running it, scaling it up and down with the load.

Additionally, you pay only for the time the function actually runs: when nobody invokes the service, you pay nothing.

In this chapter, we use AWS Lambda to deploy the model we trained previously. To do that, we'll also use TensorFlow Lite, a lightweight version of TensorFlow that contains only the most essential functionality.

8.1.1 TensorFlow Lite
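
The full TensorFlow package is large, which makes it a poor fit for Lambda's deployment limits. TensorFlow Lite addresses this: it focuses on inference, so it can load a trained model and apply it to data, and the package stays small. For Python, the interpreter is even available as a standalone package, so we don't need to install full TensorFlow at all. A minimal install sketch (the package is tflite-runtime on PyPI; availability depends on the platform and Python version):

    pip install tflite-runtime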

8.1.2 Converting the model to TF Lite format
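
To make the idea concrete, here's a minimal conversion sketch using the TFLiteConverter API from TensorFlow 2; the file names are assumptions for illustration:

    import tensorflow as tf
    from tensorflow import keras

    # Load the Keras model trained in the previous chapter
    model = keras.models.load_model('clothing-model.h5')

    # Convert it to TF Lite format and save the result
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    tflite_model = converter.convert()

    with open('clothing-model.tflite', 'wb') as f:
        f.write(tflite_model)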

8.1.3 Preparing the images
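
The model can't consume a raw image directly: it expects a correctly sized, preprocessed numeric array. A sketch with Pillow and NumPy, assuming the 299x299 input size and the scale-to-[-1, 1] preprocessing of Xception-style models like the one from the previous chapter:

    from io import BytesIO
    from urllib import request

    import numpy as np
    from PIL import Image

    def download_image(url):
        # Fetch the image bytes and open them with Pillow
        with request.urlopen(url) as resp:
            buffer = resp.read()
        return Image.open(BytesIO(buffer))

    def prepare_input(img, target_size=(299, 299)):
        if img.mode != 'RGB':
            img = img.convert('RGB')
        img = img.resize(target_size, Image.NEAREST)
        x = np.array(img, dtype='float32')
        x = (x / 127.5) - 1.0             # scale pixels to [-1, 1]
        return np.expand_dims(x, axis=0)  # add the batch dimension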

8.1.4 Using the TensorFlow Lite model
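
Applying the converted model means loading it into the TF Lite interpreter, writing the prepared array into the input tensor, invoking the interpreter, and reading the output tensor. A sketch, reusing the file name and the prepared array X from the previous sections:

    import tflite_runtime.interpreter as tflite

    interpreter = tflite.Interpreter(model_path='clothing-model.tflite')
    interpreter.allocate_tensors()

    input_index = interpreter.get_input_details()[0]['index']
    output_index = interpreter.get_output_details()[0]['index']

    interpreter.set_tensor(input_index, X)
    interpreter.invoke()
    preds = interpreter.get_tensor(output_index)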

8.1.5 Code for the lambda function
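
A Lambda function in Python is just a function with a specific signature: it receives an event with the request payload and a context object, and returns a JSON-serializable result. A sketch that ties together the pieces from the previous sections; the class names are placeholders for the labels from your training data:

    # lambda_function.py
    # download_image, prepare_input, and the interpreter setup
    # come from the snippets above

    classes = ['dress', 'hat', 'longsleeve', 'outwear', 'pants',
               'shirt', 'shoes', 'shorts', 'skirt', 't-shirt']

    def predict(url):
        img = download_image(url)
        X = prepare_input(img)
        interpreter.set_tensor(input_index, X)
        interpreter.invoke()
        preds = interpreter.get_tensor(output_index)
        return dict(zip(classes, preds[0].tolist()))

    def lambda_handler(event, context):
        url = event['url']
        return predict(url)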

8.1.6 Preparing the Docker image
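
Lambda can run functions packaged as container images, which is convenient here: the image bundles the code, the model file, and the dependencies. A minimal Dockerfile sketch on top of the public AWS Lambda Python base image; the dependency list and file names follow the assumptions above:

    FROM public.ecr.aws/lambda/python:3.9

    RUN pip install numpy pillow tflite-runtime

    COPY clothing-model.tflite .
    COPY lambda_function.py .

    CMD ["lambda_function.lambda_handler"]

Build it locally (for example, docker build -t clothing-model .) and test it before pushing it anywhere.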

8.1.7 Pushing the image to AWS ECR
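
Lambda pulls container images from Amazon ECR, so we need a repository there, a Docker login, and a push. A sketch with the AWS CLI; the repository name, tag, account ID, and region are placeholders:

    ACCOUNT=<your-account-id>
    REGION=<your-region>
    REMOTE=${ACCOUNT}.dkr.ecr.${REGION}.amazonaws.com/clothing-tflite-images:clothing-model-v1

    aws ecr create-repository --repository-name clothing-tflite-images

    aws ecr get-login-password \
        | docker login --username AWS --password-stdin \
          ${ACCOUNT}.dkr.ecr.${REGION}.amazonaws.com

    docker tag clothing-model:latest ${REMOTE}
    docker push ${REMOTE}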

8.1.8 Creating the lambda function
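
With the image in ECR, we can create the function from it, either in the AWS console or with the CLI. A CLI sketch; the function name, role, memory size, and timeout are assumptions (image models typically need more memory and time than the defaults):

    aws lambda create-function \
        --function-name clothing-classification \
        --package-type Image \
        --code ImageUri=${REMOTE} \
        --role arn:aws:iam::${ACCOUNT}:role/<your-lambda-role> \
        --memory-size 1024 \
        --timeout 30

Once created, the function can be tested with aws lambda invoke or from the console before exposing it as a web service.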

8.1.9 Creating the API Gateway
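
A Lambda function isn't a web service by itself: API Gateway sits in front of it and turns HTTP requests into Lambda invocations. One possible CLI sketch, using the HTTP API "quick create" plus the permission that lets API Gateway invoke the function (all names are placeholders):

    aws apigatewayv2 create-api \
        --name clothing-classification-api \
        --protocol-type HTTP \
        --target arn:aws:lambda:${REGION}:${ACCOUNT}:function:clothing-classification

    aws lambda add-permission \
        --function-name clothing-classification \
        --statement-id apigateway-invoke \
        --action lambda:InvokeFunction \
        --principal apigateway.amazonaws.com

The create-api call returns an endpoint URL. Note that with this proxy-style integration, the request arrives wrapped in a larger event object (the JSON body sits in event['body'] as a string), so the handler needs to parse it accordingly.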

8.2 Next steps

8.2.1 Exercises

8.2.2 Other projects

Summary
