This chapter covers
- Approaches to vectorizing your code
- Controlling vmap() behavior using its parameters
- Analyzing typical cases where you might benefit from auto-vectorization
In chapter 3, you learned how to speed up your calculations by running them on GPUs and TPUs. Then, in chapter 5, you learned another way to speed up your code: compilation with XLA. Now it’s time to learn two more ways to make computations faster: automatic vectorization and parallelization. This chapter is dedicated to auto-vectorization, while chapters 7 and 8 look at parallelizing your computations.
Auto-vectorization gives you several benefits. First, it simplifies programming: you write a simple function that processes a single element, and that function is automatically transformed into a more complex one that works on batches (or arrays) of elements. Second, it can speed up your computations if your hardware resources and program logic allow you to process many items simultaneously, which is typically much faster than processing the same array item by item. An auto-vectorized function usually won’t be faster than a manually vectorized version (though it won’t be significantly slower either). Still, it is much faster along another dimension: developer productivity, because you save the time it would take to vectorize the function by hand.
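To make the idea concrete, here is a minimal sketch that compares the three approaches mentioned above: an item-by-item loop, auto-vectorization with `jax.vmap()`, and a manually vectorized version. The function `dot()` and the arrays `x` and `y` are illustrative names chosen for this sketch, not part of any library.

```python
import jax
import jax.numpy as jnp

def dot(v1, v2):
    """Dot product of two single vectors; knows nothing about batches."""
    return jnp.dot(v1, v2)

# Illustrative data: a batch of 4 pairs of 3-dimensional vectors.
x = jnp.arange(12.0).reshape(4, 3)
y = jnp.ones((4, 3))

# Approach 1: process the batch item by item in a Python loop.
looped = jnp.stack([dot(a, b) for a, b in zip(x, y)])

# Approach 2: auto-vectorization. vmap() maps dot() over the leading
# axis of both arguments, producing a function that handles the batch.
batched_dot = jax.vmap(dot)
auto_vectorized = batched_dot(x, y)

# Approach 3: manual vectorization. The single-element function has to
# be rewritten by hand so that it handles the batch dimension itself.
manually_vectorized = jnp.einsum("ij,ij->i", x, y)

print(looped)
print(auto_vectorized)
print(manually_vectorized)
```

All three approaches produce the same four values. The difference is that `dot()` stays as simple as the single-element case, while the manual version requires you to reason about the batch axis yourself.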