This chapter covers
- Writing lazy workflows for processing large datasets locally
- Understanding the lazy behavior of map
- Writing classes with generators for lazy simulations
In chapter 2 (section 2.1.2, to be exact), I introduced the idea that our beloved map function is lazy by default; that is, it only evaluates when the value is needed downstream. In this chapter, we’ll look at a few of the benefits of laziness, including how we can use laziness to process big data on our laptop. We’ll focus on the benefits of laziness in two contexts:
- File processing
- Simulations
With file processing, we’ll see that laziness allows us to process much more data than could fit in memory without laziness. With simulations, we’ll see how we can use laziness to run “infinite” simulations. Indeed, lazy functions allow us to work with an infinite amount of data just as easily as we could if we were working with a limited amount of data.