4 Designing reliable ML systems
This chapter covers
- Tooling for an ML platform
- Tracking ML experiments with the MLflow experiment tracker
- Storing trained models in the MLflow model registry
- Registering model features in the Feast feature store
As we move deeper into ML engineering, we tackle a critical challenge: how to reliably track, reproduce, and deploy ML experiments. This chapter introduces essential tools that turn ad-hoc experimentation into production-ready ML workflows. We'll build a practical ML platform that improves reliability while remaining flexible enough for real-world applications.
In particular, we explore the individual components of the ML platform discussed in Chapter 1 (Section 1.3). We will learn about the different tools that help us track our data science experiments, store model features, orchestrate pipelines, and deploy models. Our goal is to build a fully functional mini ML platform with these tools while highlighting the interactions between them.
We will start our ML journey the way most data scientists do: by understanding the data. We will perform some EDA, split our dataset into training and testing sets, and run multiple models to find the one that performs best. The initial stages of a data science project are mostly exploratory; we experiment with different features, model hyperparameters, and frameworks.
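As a preview of this workflow, here is a minimal sketch that splits a toy dataset, trains a couple of candidate models, and logs each attempt as an MLflow run. The dataset, candidate models, and experiment name are illustrative placeholders, not the chapter's actual example, and it assumes MLflow and scikit-learn are installed.

```python
# A minimal sketch of the exploratory workflow, assuming scikit-learn and a
# default local MLflow setup. The dataset and models are placeholders.
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Load a toy dataset and create the train/test split.
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Candidate models to compare during exploration (hypothetical choices).
candidates = {
    "linear_regression": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=42),
}

mlflow.set_experiment("model-selection")
for name, model in candidates.items():
    # Each candidate gets its own MLflow run, so attempts stay comparable.
    with mlflow.start_run(run_name=name):
        model.fit(X_train, y_train)
        mse = mean_squared_error(y_test, model.predict(X_test))
        mlflow.log_param("model_type", name)
        mlflow.log_metric("test_mse", mse)
        mlflow.sklearn.log_model(model, "model")
```

Each run then appears in the MLflow UI, where the metrics of the candidates can be compared side by side; we walk through this in detail later in the chapter.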