12 GPU languages: getting down to basics

This chapter covers

  • Understanding the current landscape of native GPU languages
  • Creating simple GPU programs in each language
  • Tackling more complex multi-kernel operations
  • Porting between various GPU languages

This chapter covers lower-level languages for programming GPUs. We call these native languages because they directly reflect features of the target GPU hardware. We cover two of these languages, CUDA and OpenCL, that have become widely used, as well as HIP, a newer variant for AMD GPUs. In contrast to the pragma-based implementations, these GPU languages rely less on the compiler. You should use these languages when you want finer-grained control over your program’s performance. How are these languages different from those presented in chapter 11? Our distinction is that these languages have grown up from the characteristics of the GPU and CPU hardware, whereas OpenACC and OpenMP started with high-level abstractions and rely on a compiler to map those abstractions to different hardware.
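To make that contrast concrete, here is a minimal sketch of what a native GPU language looks like (an illustrative example of our own, not code from a later section; the kernel and variable names are hypothetical). Where a pragma-based approach lets the compiler decide how loop iterations map onto GPU threads, in CUDA the programmer writes that mapping explicitly.

```cuda
// Sketch of a CUDA vector-add kernel: each GPU thread handles one element.
// The explicit thread/block indexing is the hallmark of a native GPU language.
__global__ void vector_add(const double *a, const double *b, double *c, int n)
{
   int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
   if (i < n) c[i] = a[i] + b[i];                  // guard against running past n
}

// Host-side launch: enough 256-thread blocks to cover all n elements
// vector_add<<<(n + 255) / 256, 256>>>(d_a, d_b, d_c, n);
```

Notice that the hardware's thread blocks and per-thread indices appear directly in the source. This is the fine-tuned control, and the extra burden, that the native languages in this chapter give you.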

12.1  Features of a native GPU programming language

12.2  CUDA and HIP GPU languages: the low-level performance option

12.2.1    Writing and building your first CUDA application

12.2.2    A reduction kernel in CUDA: life gets complicated

12.2.3    Hipifying the CUDA code

12.3  OpenCL for a portable open source GPU language

12.3.1    Writing and building your first OpenCL application

12.3.2    Reductions in OpenCL

12.4  SYCL: an experimental C++ implementation goes mainstream

12.5  Higher-level languages for performance portability

12.5.1    Kokkos: a performance portability ecosystem

12.5.2    RAJA for a more adaptable performance portability layer

12.6  Further explorations

12.6.1    Additional reading

12.6.2    Exercises

12.7  Summary