3 Performance limits and profiling
This chapter covers:
- How to determine what limits the performance of your application. Is it flops, memory bandwidth, or reading data from disk?
- How to evaluate the performance of the hardware that your next set of changes will target. For example, if you plan to add vectorization to the code, a vectorized benchmark and the theoretical performance for vectorized code might be helpful.
- How to measure the current performance of your application.
Programmer resources are scarce, so you need to direct them where they will have the most impact. How can you do that if you don’t know the performance characteristics of your application and of the hardware you plan to run on? This chapter addresses that question. By measuring the performance of both your hardware and your application, you can determine where your development time is most effectively spent.
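To make the "flops or memory bandwidth?" question concrete, here is a minimal sketch that compares a kernel's flops-per-byte ratio with the ratio the hardware can sustain. The function name `limiting_resource`, the two-resource model, and all hardware numbers are assumptions chosen for illustration; they are not measurements of any real machine, and the model ignores disk I/O, caches, and latency.

```python
def limiting_resource(flops_per_byte, peak_gflops, peak_gbytes_per_s):
    """Estimate the likely limit under a simple two-resource model.

    flops_per_byte    -- floating-point operations the kernel performs
                         per byte it moves to or from main memory
    peak_gflops       -- hypothetical peak floating-point rate (GFLOP/s)
    peak_gbytes_per_s -- hypothetical peak memory bandwidth (GB/s)
    """
    # Flops the hardware can perform for each byte it can move.
    machine_balance = peak_gflops / peak_gbytes_per_s
    # The kernel can go no faster than either resource allows.
    attainable_gflops = min(peak_gflops, peak_gbytes_per_s * flops_per_byte)
    if flops_per_byte < machine_balance:
        label = "memory bandwidth"
    else:
        label = "flops"
    return label, attainable_gflops


# Hypothetical kernel a[i] = b[i] + s * c[i] in double precision:
# 2 flops per element, 3 arrays * 8 bytes = 24 bytes per element
# (this simple count assumes each array element moves exactly once).
label, rate = limiting_resource(2 / 24, peak_gflops=1000.0,
                                peak_gbytes_per_s=100.0)
print(f"Likely limit: {label}, attainable about {rate:.1f} GFLOP/s")
```

With these made-up numbers the kernel is limited by memory bandwidth, which suggests that adding vectorization alone would bring little benefit. Real decisions should rest on measured values for your own hardware and application.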