4 Model serving patterns

 

This chapter covers

  • Using model serving to generate predictions or make inferences on new data with previously trained machine learning models.
  • Handling the growing number of model serving requests and achieving horizontal scaling with the help of replicated model serving services.
  • Processing large model serving requests by leveraging the sharded services pattern.
  • Assessing model serving systems and determining whether event-driven design would be beneficial for improving resource efficiency.
 
 
 

4.1 What is model serving?

 
 

4.2 Replicated services pattern: Handling a growing number of serving requests

 
 

4.2.1 Problem

 

4.2.2 Solution

 
 
 

4.2.3 Discussion

 
 

4.2.4 Exercises

 
 

4.3 Sharded services pattern: Processing large model serving requests with high-resolution videos

 
 
 

4.3.1 Problem

 
 
 

4.3.2 Solution

 
 
 

4.3.3 Discussion

 
 
 

4.3.4 Exercises

 
 

4.4 Event-driven processing pattern: Responding to model serving requests based on events

 
 
 

4.4.1 Problem

 
 

4.4.2 Solution

 
 
 
 

4.4.3 Discussion

 
 

4.4.4 Exercises

 
 

4.5 References

 
 

4.6 Summary

 
 
 
 