chapter fourteen

14 Writing production code

This chapter covers

Validating feature data before attempting to use it for a model
Monitoring features in production
Monitoring all aspects of a production model life cycle
Approaching projects with the goal of solving them in the simplest manner possible
Defining a standard code architecture for ML projects
Avoiding cargo cult behavior in ML

We spent the entirety of part 2 of this book on the more technician-focused aspects of building ML software. In this chapter, we’ll begin looking at ML project work through the eyes of an architect.

We’ll focus on the theory and philosophy of solving problems with ML, taking the highly interconnected, intensely complex, and altogether holistic view of how our profession functions. We’ll look at case studies of production ML (all based, in one way or another, on things that I’ve messed up or have seen others mess up) to give insight into elements of ML development that aren’t frequently talked about. These are the lessons learned (usually the hard way) when we, as a profession, focus more on the algorithmic aspects of solving problems than on where our focus should be:

The data—How it’s generated, where it is, and what it fundamentally is
The complexity—Of the solution and of the code
The problem—How to solve it in the easiest way possible

14.1 Have you met your data?

14.1.1 Make sure you have the data

14.1.2 Check your data provenance

14.1.3 Find a source of truth and align on it

14.1.4 Don’t embed data cleansing into your production code

14.2 Monitoring your features

14.3 Monitoring everything else in the model life cycle

14.4 Keeping things as simple as possible

14.4.1 Simplicity in problem definitions

14.4.2 Simplicity in implementation