Today, every developer should understand the growing parallelism available within modern CPUs. Unlocking the untapped performance of CPUs is a critical skill for parallel and high performance computing applications. To show how to take advantage of CPU parallelism, we cover:
- Using vector hardware
- Using threads for parallel work across multi-core processors
- Coordinating work on multiple CPUs and multi-core processors with message passing
The CPU’s parallel capabilities need to be at the core of your parallel strategy. Because it’s the central workhorse, the CPU controls all the memory allocations, memory movement, and communication. The application developer’s knowledge and skill are the most important factors for fully using the CPU’s parallelism. No magic compiler optimizes for the CPU automatically, and many applications leave much of the CPU’s parallel resources untapped. We can break down the available CPU parallelism into three components in increasing order of effort. These are