Chapter 3. Function pipelines for mapping complex transformations

 

This chapter covers

  • Using map to do complex data transformations
  • Chaining together small functions into pipelines
  • Applying these pipelines in parallel on large datasets

In the last chapter, we saw how map can replace for loops and how it makes parallel computing straightforward: make a small modification to map, and Python takes care of the rest. So far, though, we’ve only used map with simple functions. Even in the Wikipedia scraping example from chapter 2, our hardest-working function only pulled text off the internet. To make parallel programming really useful, we’ll want to use map in more complex ways. This chapter shows how to perform complex transformations with map by introducing two new concepts:

  1. Helper functions
  2. Function chains (also known as pipelines)

We’ll tackle those topics by looking at two very different examples. In the first, we’ll decode the secret messages of a malicious group of hackers. In the second, we’ll help our company do demographic profiling on its social media followers. Ultimately, though, we’ll solve both of these problems the same way: by creating function chains out of small helper functions.
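Before turning to those examples, here is a minimal sketch of the general idea, not the chapter's own code. The helper names (strip_whitespace, lowercase, split_words), the compose function, and the sample messages are all hypothetical, chosen only for illustration. Each helper does one small job, compose chains them into a single pipeline function, and map applies that pipeline to every item.

```python
from functools import reduce


# Hypothetical helper functions: each one performs a single small step.
def strip_whitespace(text):
    return text.strip()


def lowercase(text):
    return text.lower()


def split_words(text):
    return text.split()


def compose(*functions):
    """Chain functions so they are applied left to right."""
    def pipeline(value):
        # Feed the output of each function into the next one.
        return reduce(lambda acc, func: func(acc), functions, value)
    return pipeline


# Build one pipeline out of the three helpers.
clean_and_tokenize = compose(strip_whitespace, lowercase, split_words)

# Illustrative input data (an assumption, not from the chapter).
messages = ["  Hello World  ", "MAP replaces FOR loops", ""]

# map applies the whole chain to every message.
tokenized = list(map(clean_and_tokenize, messages))
print(tokenized)
```

Because the whole chain is a single function of one argument, it can be handed to map exactly like the simple functions from chapter 2. One caveat for the parallel case: a nested function such as the one compose returns cannot be pickled by Python's standard pickle module, so for a process-based parallel map the chain would likely need to be written as an ordinary top-level function instead.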

3.1. Helper functions and function chains


3.2. Unmasking hacker communications


3.3. Twitter demographic projections


3.4. Exercises


Summary

 
 
 