chapter eleven
Moving models to production is only the first step—keeping them performing reliably over time requires robust monitoring and understanding of their behavior. In this chapter, we’ll explore how to implement comprehensive monitoring for ML systems and gain insights into their decision-making processes (figure 11.1).
We’ll tackle monitoring from two critical angles. First, we’ll set up basic operational monitoring to ensure our services meet performance and reliability requirements.
Then, we’ll implement ML-specific monitoring to detect data drift and track model behavior. Model monitoring can be split up into two main components.
- Basic monitoring
- Data drift monitoring