concept OpenMP in category parallel computing

This is an excerpt from Manning's book Parallel and High Performance Computing MEAP V11 epub.
Your wave simulation application team has gotten the first increment of work for adding OpenMP to the application. But now it seems that the application is occasionally crashing with no explanation. One of your team members realizes that it might be due to thread race conditions. Your team implements an additional step to check for these thread race conditions as part of the commit process.
OpenMP (Open Multi-Processing) is one of the most widely supported open standards for threads and shared memory parallel programming. In this section, we will explain the standard, ease of use, expected gains, difficulties, and the memory models. The version of OpenMP that you see today took some time to develop and is still evolving.
This particular implementation style produces modest parallel performance on a single node, but it could be better. All the array memory is first touched by the master thread during initialization, before the main loop, as shown on the left in figure 7.5. This can place the memory in a memory region far from the threads that later use it, where the memory access time is greater.
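The situation described above can be sketched in a few lines of C. This is a minimal illustration, not the book's listing: the function names `init_serial` and `vector_add`, the initial values, and the use of `long` for the loop index are assumptions made for the example. The initialization runs on the calling thread only, so that thread touches `a` and `b` first, while the computation loop carries the single OpenMP pragma.

```c
#include <stdlib.h>

/* Hypothetical helper: serial initialization. Only the calling (master)
   thread writes a and b, so it is the first to touch their memory. */
void init_serial(double *a, double *b, long n)
{
    for (long i = 0; i < n; i++) {
        a[i] = 1.0;
        b[i] = 2.0;
    }
}

/* Hypothetical helper: the main vector add loop with a single OpenMP
   pragma. Each thread writes its own part of c, so c is first touched
   by the thread that computes it. */
void vector_add(double *c, const double *a, const double *b, long n)
{
    #pragma omp parallel for
    for (long i = 0; i < n; i++) {
        c[i] = a[i] + b[i];
    }
}
```

With GCC or Clang, the pragma takes effect when the code is compiled with `-fopenmp`; without that flag the pragma is ignored and the loop runs serially with the same result.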
Figure 7.5 Adding a single OpenMP pragma on the main vector add computation loop (left) results in the a and b arrays being touched first by the master thread, so their data is allocated near thread zero. The c array is first touched during the computation loop, so the memory for the c array is close to each thread. On the right, adding an OpenMP pragma on the initialization loop results in the memory for the a and b arrays being placed near the thread where the work is done.
Now, to improve the OpenMP performance, we insert pragmas in the initialization loops as shown in listing 7.8. The initialization and computation loops are distributed across threads with the same static partition, so each thread first touches the memory it will later work on, and the operating system places that memory near the thread, as shown on the right side of figure 7.5.
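As a rough sketch of this first-touch idea (not a reproduction of listing 7.8), the example below puts the same `schedule(static)` pragma on both the initialization loop and the computation loop. The function names `init_first_touch` and `vector_add_static` and the initial values are hypothetical. The sketch assumes the operating system places a page near the thread that first writes to it, as Linux does by default, and that both loops run with the same thread count and loop bounds, so that in practice each thread gets the same range of iterations in both loops.

```c
#include <stdlib.h>

/* Hypothetical helper: parallel initialization. Each thread writes its own
   static chunk of a, b, and c, so under a first-touch placement policy
   those pages end up near that thread. */
void init_first_touch(double *a, double *b, double *c, long n)
{
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < n; i++) {
        a[i] = 1.0;
        b[i] = 2.0;
        c[i] = 0.0;
    }
}

/* Hypothetical helper: the computation loop uses the same static schedule,
   so each thread reads and writes the chunk it initialized. */
void vector_add_static(double *c, const double *a, const double *b, long n)
{
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < n; i++) {
        c[i] = a[i] + b[i];
    }
}
```

The arrays should be allocated with `malloc` and left unwritten until `init_first_touch` runs; writing to them earlier from a single thread would fix their placement before the parallel loop touches them.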