6 Memory hierarchy, storage, and networking

 

This chapter covers

  • Making efficient use of CPU cache and main memory
  • Using Blosc to access compressed array data
  • Using NumExpr to accelerate NumPy expressions
  • Designing client/server architectures for very fast networks

It goes without saying that hardware affects performance, but how it does so is not always obvious. The goal of this chapter is to help you get a better grasp of how, exactly, your hardware affects the speed of your code and what you can do at the hardware level to improve performance. To that end, we will take a close look at the effects of modern hardware and network architectures on efficient data processing with Python.

Hardware considerations have many counterintuitive implications for software development. For example, there are quite a few cases where working with compressed data is faster than working with uncompressed data. Conventional wisdom suggests that decompressing and then analyzing data should cost much more than just analyzing it. After all, when we decompress, we are adding more computations. So how can this be faster overall? It turns out that modern hardware architectures can play tricks with “obvious” observations.
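One way to see how this can happen is a back-of-the-envelope model: if moving bytes out of storage is slower than decompressing them, reading fewer bytes can more than pay for the extra CPU work. The sketch below is only an illustration. The function names and every number in it (data size, storage bandwidth, compression ratio, decompression throughput) are assumptions chosen for round arithmetic, not measurements.

```python
def uncompressed_read_time(nbytes, storage_bandwidth):
    """Seconds to read nbytes of raw data at storage_bandwidth bytes/s."""
    return nbytes / storage_bandwidth


def compressed_read_time(nbytes, ratio, storage_bandwidth, decompress_throughput):
    """Seconds to read the same data stored compressed, then decompress it.

    ratio is uncompressed size / compressed size; decompress_throughput is
    measured in bytes of decompressed output per second.
    """
    transfer = (nbytes / ratio) / storage_bandwidth  # fewer bytes to move
    decompress = nbytes / decompress_throughput      # extra CPU work
    return transfer + decompress


# Hypothetical figures, for illustration only.
NBYTES = 1e9                   # 1 GB of array data
STORAGE_BANDWIDTH = 200e6      # assumed 200 MB/s storage
RATIO = 4                      # assumed 4:1 compression
DECOMPRESS_THROUGHPUT = 2e9    # assumed 2 GB/s decompression

raw_seconds = uncompressed_read_time(NBYTES, STORAGE_BANDWIDTH)
packed_seconds = compressed_read_time(
    NBYTES, RATIO, STORAGE_BANDWIDTH, DECOMPRESS_THROUGHPUT
)

if __name__ == "__main__":
    print(f"uncompressed: {raw_seconds:.2f} s")
    print(f"compressed:   {packed_seconds:.2f} s")
```

Under these assumed figures the compressed path wins (1.75 s against 5 s) because the slow step, moving bytes, shrinks by the compression ratio. With faster storage or a slower decompressor the balance can tip the other way, which is why real measurements matter.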

6.1 How modern hardware architectures affect Python performance

6.1.1 The counterintuitive effect of modern architectures on performance

6.1.2 How CPU caching affects algorithm efficiency

6.1.3 Modern persistent storage

6.2 Efficient data storage with Blosc

6.2.1 Compress data; save time

6.2.2 Read speeds (and memory buffers)

6.2.3 The effect of different compression algorithms on storage performance

6.2.4 Using insights about data representation to increase compression

6.3 Accelerating NumPy with NumExpr

6.3.1 Fast expression processing

6.3.2 How hardware architecture affects our results

6.3.3 When NumExpr is not appropriate

6.4 The performance implications of using the local network

6.4.1 The sources of inefficiency with REST calls