16 Production infrastructure

This chapter covers

  • Implementing passive retraining with a model registry
  • Utilizing a feature store for model training and inference
  • Selecting an appropriate serving architecture for ML solutions

Using ML to solve a complex real-world problem is challenging. The sheer number of skills needed to take a company’s data (frequently messy, partially complete, and rife with quality issues), select an appropriate algorithm, tune a pipeline, and validate that the predictions of a model (or an ensemble of models) solve the problem to the satisfaction of the business is daunting. The complexity of an ML-backed project does not end with the creation of an acceptably performing model, though. Architectural and implementation choices, if made poorly, can add significant challenges to a project.

Every day there seems to be a new open source tech stack promising an easier deployment strategy, or a magical automated solution claiming to meet everyone’s needs. With this constant deluge of tools and platforms, deciding which one fits the needs of a particular project can be intimidating.

16.1 Artifact management

16.1.1 MLflow’s model registry

16.1.2 Interfacing with the model registry

16.2 Feature stores

16.2.1 What a feature store is used for

16.2.2 Using a feature store

16.2.3 Evaluating a feature store

16.3 Prediction serving architecture

16.3.1 Determining serving needs

16.3.2 Bulk external delivery

16.3.3 Microbatch streaming

16.3.4 Real-time server-side

16.3.5 Integrated models (edge deployment)

Summary