11 Scaling and deploying Dask

 

This chapter covers

  • Creating a Dask Distributed cluster on Amazon AWS using Docker and Elastic Container Service
  • Using a Jupyter Notebook server and Elastic File System to store and access data science notebooks and shared datasets in Amazon AWS
  • Using the Distributed client object to submit jobs to a Dask cluster
  • Monitoring execution of jobs on the cluster using the Distributed monitoring dashboard

Up to this point, we’ve been working with Dask in local mode, meaning that everything we’ve asked Dask to do has been executed on a single computer. Running Dask in local mode is very useful for prototyping, development, and ad hoc exploration, but we can quickly reach the performance limits of a single computer. Just as our hypothetical chef in chapter 1 needed to call in reinforcements to get her kitchen prepped in time for dinner service, we can configure Dask to spread the work across many computers to process large jobs more quickly. This becomes especially important in production systems, where time constraints apply, so it’s typical to scale out and run Dask in cluster mode in production.

Figure 11.1 This chapter will cover the last elements of the workflow: deployment and monitoring.


11.1 Building a Dask cluster on Amazon AWS with Docker

 
 
 
 

11.1.1 Getting started

 

11.1.2 Creating a security key

 

11.1.3 Creating the ECS cluster

 
 

11.1.4 Configuring the cluster’s networking

 
 

11.1.5 Creating a shared data drive in Elastic File System

 
 
 
 

11.1.6 Allocating space for Docker images in Elastic Container Repository

 
 
 

11.1.7 Building and deploying images for scheduler, worker, and notebook

 
 
 

11.1.8 Connecting to the cluster

 
 
 

11.2 Running and monitoring Dask jobs on a cluster

 
 

11.3 Cleaning up the Dask cluster on AWS

 
 

Summary

 
 
 