Chapter 11. Reduction and sorting
This chapter covers
At long last, we’re going to stop talking about OpenCL’s structures and functions and start putting them to use. In particular, this chapter focuses on three practical applications of OpenCL: MapReduce, the bitonic sort, and the radix sort. These applications all use a divide-and-conquer methodology to process data in parallel, and they partition data so that each parallel process can operate independently.
The goal of MapReduce isn’t to solve a particular problem, but to provide a framework for solving a class of problems that involve distributed processing. A basic MapReduce implementation consists of a mapping stage, which produces key-value pairs from input data, and a reduction stage, which processes the key-value pairs to produce output data. MapReduce is usually associated with cluster computing, but this chapter will examine how it can be executed on OpenCL-compliant hardware.
There are countless sorting algorithms available in computer science, but few are as well suited to parallel implementation as the bitonic sort and the radix sort. Both algorithms divide unsorted data into groups and then subdivide the groups further for simpler processing. The bitonic sort groups elements according to how their values relate to adjacent values. The radix sort groups elements that have similar digits.