4 Model serving patterns
This chapter covers
- Using model serving to generate predictions (inferences) on new data with previously trained machine learning models.
- Handling a growing number of model serving requests by scaling horizontally with replicated model serving services.
- Processing large model serving requests by leveraging the sharded services pattern.
- Assessing model serving systems to determine whether an event-driven design would improve resource efficiency.