7 Machine learning

This chapter covers

Training a machine learning model
Using Azure Machine Learning
DevOps for machine learning
Orchestrating machine learning pipelines

This chapter focuses on the final major workload of a data platform: machine learning (ML). ML is becoming increasingly important as more scenarios come to rely on artificial intelligence. In this chapter, we’ll talk about running ML in production, reliably and at scale. Figure 7.1 highlights our current focus area.

Figure 7.1 Running ML at scale is another major workload any data platform needs to support, along with data processing and analytics.

We’ll start with an ML model that a data scientist might develop on their laptop. The model predicts whether a user is going to be a high spender, based on their web telemetry. It is deliberately simple, because our main focus is not its implementation but how we can take it and run it in the cloud.
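To make the starting point concrete, here is a minimal sketch of what such a laptop-scale model could look like in scikit-learn. It is not the chapter's implementation: the telemetry features (sessions and page views), the synthetic data, the labeling rule, and the choice of a k-nearest neighbors classifier are all assumptions made for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for web telemetry (hypothetical features):
# column 0 = sessions per user, column 1 = page views per user.
rng = np.random.default_rng(seed=0)
sessions = rng.integers(1, 50, size=200)
page_views = sessions * rng.integers(1, 10, size=200)
X = np.column_stack([sessions, page_views])

# Hypothetical labeling rule, used only to make the sketch runnable:
# users with many page views are marked as high spenders (1), others as 0.
y = (page_views > 100).astype(int)

# Hold out 20% of the users to evaluate the trained model.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# The classifier is an arbitrary choice; any scikit-learn classifier
# exposes the same fit/predict/score interface.
model = KNeighborsClassifier(n_neighbors=5)
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```

A script of this shape runs comfortably on a single machine, which is exactly the kind of workload the rest of the chapter moves to the cloud.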

The next section introduces Azure Machine Learning (AML), an Azure service for running ML workloads. We’ll spin up an instance, configure it, then take our model and run it in this environment. We’ll talk about the benefits of using Azure Machine Learning for training models.

Next, we’ll implement DevOps for this workload, as we did for the other components of our platform. We’ll see how we can track everything in Git and deploy our model using Azure DevOps Pipelines. Machine learning combined with DevOps is also known as MLOps.

7.1 Training a machine learning model

7.1.1 Training a model using scikit-learn

7.1.2 High spender model implementation

7.2 Introducing Azure Machine Learning

7.2.1 Creating a workspace

7.2.2 Creating an Azure Machine Learning compute target

7.2.3 Setting up Azure Machine Learning storage

7.2.4 Running ML in the cloud

7.2.5 Azure Machine Learning recap

7.3 MLOps

7.3.1 Deploying from Git
