9 Data analysis using GPU computing

This chapter covers

Using GPU architectures to improve many data analysis algorithms
Using Numba to convert Python code to efficient low-level GPU code
Writing highly parallel GPU code to work on matrices
Using GPU-native data analysis libraries from Python

Graphics processing units (GPUs) were originally designed to make graphics applications more efficient: drawing and animation software, computer-aided design, and, of course, games!

Over time, it became clear that GPUs could handle far more than graphics and could be applied to many kinds of computation, which gave rise to general-purpose computing on graphics processing units (GPGPU). GPUs are attractive because, for workloads that can be split into many independent parallel operations, they offer substantially more computing throughput than CPUs. They have been used successfully in many fields, such as scientific computing and artificial intelligence, and they are widely applicable in data science and in making computation more efficient in general.

9.1 Making sense of GPU computing power

9.1.1 Understanding the advantages of GPUs

9.1.2 The relationship between CPUs and GPUs

9.1.3 The internal architecture of GPUs

9.1.4 Software architecture considerations

9.2 Using Numba to generate GPU code

9.2.1 Installation of GPU software for Python

9.2.2 The basics of GPU programming with Numba

9.2.3 Revisiting the Mandelbrot example using GPUs

9.2.4 A NumPy version of the Mandelbrot code

9.3 Performance analysis of GPU code: The case of a CuPy application

9.3.1 GPU-based data analysis libraries

9.3.2 Using CuPy: A GPU-based version of NumPy

9.3.3 A basic interaction with CuPy

9.3.4 Writing a Mandelbrot generator using Numba

9.3.5 Writing a Mandelbrot generator using CUDA C