In this chapter, we cover the tools and workflows that you can use to accelerate your application development. We’ll show you how profiling tools for the GPU can be helpful. In addition, we’ll discuss how to deal with the challenges of using profiling tools when working on a remote HPC cluster. Because the profiling tools continue to change and improve, we’ll focus on the methodology rather than the details of any one tool. The main takeaway of this chapter is how to create a productive workflow with the powerful GPU profiling tools.
Profiling tools enable quicker optimization, better hardware utilization, and a clearer understanding of application performance and hotspots. We’ll discuss how profiling tools expose bottlenecks and help you attain better hardware usage. The following bulleted list highlights the commonly used tools in GPU profiling. We specifically show the NVIDIA tools for use with NVIDIA GPUs because these tools have been around the longest. If you have a different vendor’s GPU on your system, substitute that vendor’s tools in the workflow. Don’t forget about the standard Unix profiling tools such as gprof, which we’ll use later in section 13.4.2.