This chapter covers:
- Various options to deploy PyTorch models.
- Deploying a server for our models.
- Exporting our models.
- Making good use of the PyTorch JIT throughout.
- Running exported models from C++.
- Natively implementing models in C++.
- Running our models on mobile.
In Part I we learned a lot about models, and Part II left us with a detailed path toward good models for a particular problem. Now that we have these great models, we need to take them where they can be useful. Maintaining the infrastructure to run inference on deep learning models at scale has significant implications, both architecturally and in terms of cost. While PyTorch started out as a research-focused framework, beginning with the 1.0 release it gained a set of production-oriented features that today make it an ideal end-to-end platform from research to large-scale production.
What deploying to production means to us will vary with the use case:
· Perhaps the most natural deployment for the models we developed in Part II is to set up a network service providing access to them. We do this in two versions, using the lightweight Python web frameworks Flask [1: http://flask.pocoo.org] and Sanic [2: https://sanicframework.org]. The former is arguably one of the most popular of these frameworks; the latter is similar in spirit but leverages Python’s async/await support for asynchronous operations to gain efficiency.