7 Machine learning

 

This chapter covers

  • Training a machine learning model
  • Using Azure Machine Learning
  • DevOps for machine learning
  • Orchestrating machine learning pipelines

This chapter focuses on the final major workload of a data platform: machine learning (ML). ML is growing in importance as more scenarios come to rely on artificial intelligence. In this chapter, we’ll talk about running ML in production, reliably and at scale. Figure 7.1 highlights our current focus area.

Figure 7.1 Running ML at scale is another major workload any data platform needs to support, along with data processing and analytics.

We’ll start with an ML model that a data scientist might develop on their laptop: a model that predicts whether a user is going to be a high spender, based on their web telemetry. The model is simple because our main focus is not its implementation but how we can take it and run it in the cloud.
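To make the shape of such a model concrete, here is a minimal sketch of a binary classifier that labels a user as a high spender or not. It is not the chapter's implementation: the chapter trains its model with scikit-learn, while this sketch substitutes a hand-rolled logistic regression so it runs with only the Python standard library. The feature names (sessions, page views, duration) and all data values are assumptions made up for illustration.

```python
import math

# Hypothetical telemetry rows: (sessions, page_views, duration_minutes),
# each paired with a label: 1 = high spender, 0 = not. Values are invented.
TRAINING_DATA = [
    ((1, 3, 2.0), 0),
    ((2, 5, 4.5), 0),
    ((1, 2, 1.0), 0),
    ((3, 8, 6.0), 0),
    ((8, 40, 55.0), 1),
    ((10, 52, 70.0), 1),
    ((7, 35, 48.0), 1),
    ((9, 45, 60.0), 1),
]


def fit_scaler(rows):
    """Return the per-feature maximum, used to scale features toward [0, 1]."""
    return [max(column) for column in zip(*rows)]


def scale(row, maxima):
    """Divide each feature by its training maximum."""
    return [value / maximum for value, maximum in zip(row, maxima)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(data, epochs=2000, learning_rate=0.5):
    """Fit logistic regression weights with plain stochastic gradient descent."""
    features = [row for row, _ in data]
    labels = [label for _, label in data]
    maxima = fit_scaler(features)
    scaled = [scale(row, maxima) for row in features]
    weights = [0.0] * len(maxima)
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(scaled, labels):
            prediction = sigmoid(bias + sum(w * v for w, v in zip(weights, x)))
            error = prediction - y  # gradient of the log loss w.r.t. the logit
            weights = [w - learning_rate * error * v for w, v in zip(weights, x)]
            bias -= learning_rate * error
    return weights, bias, maxima


def predict(model, row):
    """Return 1 if the user is predicted to be a high spender, else 0."""
    weights, bias, maxima = model
    x = scale(row, maxima)
    return int(sigmoid(bias + sum(w * v for w, v in zip(weights, x))) >= 0.5)


model = train(TRAINING_DATA)
print(predict(model, (9, 50, 65.0)))  # a heavy user
print(predict(model, (1, 2, 1.5)))    # a light user
```

The point of the sketch is the workflow rather than the algorithm: load telemetry, fit a model, and ask it for a yes/no prediction. That same workflow is what gets moved from a laptop to the cloud.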

The next section introduces Azure Machine Learning (AML), an Azure service for running ML workloads. We’ll spin up an instance, configure it, then take our model and run it in this environment. We’ll talk about the benefits of using Azure Machine Learning for training models.

Next, we’ll implement DevOps for this workload, as we did for all other components of our platform. We’ll see how we can track everything in Git and deploy our model using Azure DevOps Pipelines. Machine learning combined with DevOps is also known as MLOps.

7.1 Training a machine learning model

7.1.1 Training a model using scikit-learn

7.1.2 High spender model implementation

7.2 Introducing Azure Machine Learning

7.2.1 Creating a workspace

7.2.2 Creating an Azure Machine Learning compute target

7.2.3 Setting up Azure Machine Learning storage

7.2.4 Running ML in the cloud

7.2.5 Azure Machine Learning recap

7.3 MLOps

7.3.1 Deploying from Git
