chapter two

2 What is MLOps?

This chapter covers

Understanding machine learning operations (MLOps) and its role in production ML
Key challenges in building reliable ML systems
How MLOps differs from traditional DevOps
Building confidence through structured ML processes

In chapter 1, we introduced the ML life cycle and the foundational skills needed to become an effective ML engineer. Now, let’s dig deeper into the machine learning operations (MLOps) practices and principles that will help you reliably deliver value through ML systems. ML and ML models are often not the end product of an organization, but rather a means to an end.

The gap between business value generation, requirements, and the necessary infrastructure is the primary reason ML, and by extension MLOps, is hard. Very few companies do original research on model development; most instead reuse existing architectures and train or adapt off-the-shelf models for specific domains and problem sets. The availability of comprehensive open source libraries, such as those from Hugging Face, can also make the modeling step itself close to trivial. After defining a problem and identifying an architecture to solve it, the hard questions come into focus:

How will the model be trained?
How will data get to the model?
How will the model interact with the other services?
Where will the model be run?
How do we make sure the model stays accurate over time?

2.1 The iterative MLOps life cycle

2.1.1 Data collection

2.1.2 Exploratory data analysis

2.1.3 Modeling and training

2.1.4 Model evaluation

2.1.5 Deployment

2.1.6 Monitoring

2.1.7 Maintenance, updates, and review

2.2 Why is robust MLOps important?

2.3 Role of MLOps in a mature organization

2.4 DevOps vs. MLOps

2.5 Levels of MLOps maturity

2.5.1 Level 0: Basic

2.5.2 Level 1: Intermediate

2.5.3 Level 2: Advanced