6 The universal workflow of machine learning
This chapter covers
- Framing a machine learning problem
- Developing a working model
- Deploying your model in production and maintaining it
Our previous examples have assumed that we already had a labeled dataset to start from and that we could immediately start training a model. In the real world, this is often not the case. You don’t start from a dataset; you start from a problem.
Imagine that you’re launching your own machine learning consulting shop. You incorporate, you put up a fancy website, you notify your network. The projects start rolling in:
- A personalized photo search engine for a picture-sharing social network: type in “wedding” and retrieve all the pictures you took at weddings, without any manual tagging needed.
- Flagging spam and offensive text content among the posts of a budding chat app.
- Building a music recommendation system for users of an online radio service.
- Detecting credit card fraud for an e-commerce website.
- Predicting display ad click-through rate to decide which ad to serve to a given user at a given time.
- Flagging anomalous cookies on the conveyor belt of a cookie-manufacturing line.
- Using satellite images to predict the location of as-yet-unknown archaeological sites.
It would be very convenient if you could load the right dataset by calling something like keras3::dataset_mydataset() and start fitting deep learning models right away. Unfortunately, in the real world, you’ll have to start from scratch.