chapter two

2 Machine learning lifecycle

Imagine that you are a project manager charged with creating and deploying a new intelligent system to address a problem your organization is facing. Say you are on the innovation team of m-Udhār Solar, a (fictional) pay-as-you-go solar energy provider to poor rural villages that is struggling to handle a growing load of applications. The company is poised to expand from installing solar panels in a handful of pilot districts to all the districts in the state, but only if it can make loan decisions for 25 times as many applications per day with the same number of loan officers. You think machine learning may be able to help. Will the children in new districts be able to study at night to prepare for the fast-approaching state exam?

Is this really a problem to address with machine learning? How would you even begin the project? What steps would you follow? What roles would be involved in carrying out the steps? Which stakeholders’ buy-in would you need to win? And importantly, what would you need to do to ensure that the system is trustworthy? Making a machine learning system trustworthy should not be an afterthought or add-on, but should be part of the plan from the beginning.

The end-to-end development process, or lifecycle, involves six steps:

problem specification,
data understanding,
data preparation,
modeling,
evaluation, and
deployment and monitoring.
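To make the ordering concrete, the six steps listed above can be written down as an ordered enumeration. This is only a minimal sketch: the names `LifecycleStage` and `next_stage` are hypothetical, and treating the steps as a strictly linear sequence is an assumption of the sketch rather than a claim about how a real project moves between them.

```python
from enum import IntEnum
from typing import Optional


class LifecycleStage(IntEnum):
    """The six lifecycle steps, numbered in the order they are listed."""
    PROBLEM_SPECIFICATION = 1
    DATA_UNDERSTANDING = 2
    DATA_PREPARATION = 3
    MODELING = 4
    EVALUATION = 5
    DEPLOYMENT_AND_MONITORING = 6


def next_stage(stage: LifecycleStage) -> Optional[LifecycleStage]:
    """Return the step that follows `stage`, or None after the last step.

    Assumption: the steps are walked through once, in order.
    """
    if stage is LifecycleStage.DEPLOYMENT_AND_MONITORING:
        return None
    return LifecycleStage(stage + 1)


if __name__ == "__main__":
    # Print the steps in order, e.g. as a checklist for a project plan.
    for stage in LifecycleStage:
        print(stage.value, stage.name.replace("_", " ").lower())
```

A project plan for the m-Udhār Solar loan application system could use such an enumeration to tag each task, role, or stakeholder sign-off with the step it belongs to.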

2.1 A mental model for the machine learning lifecycle

2.2 Problem specification

2.3 Data understanding

2.4 Data preparation

2.5 Modeling

2.6 Evaluation

2.7 Deployment and monitoring

2.8 Summary