4 Data design and performance models

 

This chapter covers

  • Why real applications struggle to achieve performance
  • Addressing kernels and loops that significantly underperform
  • Choosing data structures for your application
  • Assessing different programming approaches before writing code
  • Understanding how the cache hierarchy delivers data to the processor

This chapter covers two closely coupled topics: (1) performance models, which are increasingly dominated by data movement, and therefore (2) the underlying design and structure of data. Although the design of the data structure may seem secondary to performance, it is critical. It must be settled in advance because it dictates the entire form of the algorithms, the code, and later, the parallel implementation.
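As a small illustration of how the data layout shapes the code, the following C sketch sums one field over the same cells stored two ways: as an Array of Structures (AoS) and as a Structure of Arrays (SoA), both covered in section 4.1.2. The struct names, field names, and `NCELLS` are hypothetical, chosen only for this example.

```c
#include <stddef.h>

#define NCELLS 4  /* hypothetical, tiny size for illustration only */

/* Array of Structures (AoS): the fields of one cell sit next to
 * each other in memory. */
struct cell_aos {
    double density;
    double pressure;
    double energy;
};

/* Structure of Arrays (SoA): each field is its own contiguous array. */
struct cells_soa {
    double density[NCELLS];
    double pressure[NCELLS];
    double energy[NCELLS];
};

/* Sum one field from the AoS layout. Consecutive density values are
 * sizeof(struct cell_aos) bytes apart, so the unused pressure and
 * energy fields are loaded along with them. */
double total_density_aos(const struct cell_aos *cells, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum += cells[i].density;
    }
    return sum;
}

/* Sum the same field from the SoA layout. The density values are
 * contiguous, so the loop walks memory with unit stride. */
double total_density_soa(const struct cells_soa *cells, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum += cells->density[i];
    }
    return sum;
}
```

Both functions return the same total, but the loop that touches only one field reads less memory with the SoA layout. Which layout is better depends on the access pattern of the real application, so treat this as a sketch rather than a recommendation.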

The choice of data structures, and thereby the data layout, often determines the performance you can achieve, in ways that are not always obvious when the design decisions are made. Thinking about the data layout and its performance impact is at the core of a new and growing programming approach called data-oriented design. This approach considers the patterns in which the program will use its data and proactively designs around them. Data-oriented design gives us a data-centric view of the world, which is consistent with our focus on memory bandwidth rather than floating-point operations (flops). In summary, for performance, our approach is to think about

4.1 Performance data structures: Data-oriented design

4.1.1 Multidimensional arrays

4.1.2 Array of Structures (AoS) versus Structures of Arrays (SoA)

4.2 Three Cs of cache misses: Compulsory, capacity, conflict

4.3 Simple performance models: A case study

4.3.2 Compressed sparse storage representations

4.4 Advanced performance models

4.5 Network messages

4.6 Further explorations

Summary