chapter eight

8 Systems

This chapter covers

  • Optimizing instruction throughput with pipelining and branch prediction.
  • Maximizing core utilization with Simultaneous Multithreading (SMT).
  • Understanding when SMT helps and when it doesn't.
  • Diagnosing system unresponsiveness by telling soft lockups from hard lockups.

Software does not run on magic; it runs on hardware. The racing driver Jackie Stewart once said, "You don't have to be an engineer to be a racing driver, but you do have to have Mechanical Sympathy." The same applies to programming. You don't need to be a chip designer to write efficient code, but understanding how the CPU executes instructions and how the kernel schedules tasks lets you work with the machine rather than against it.

The chapter is in two halves. The first looks at modern CPU architecture: instruction pipelining, branch prediction, and simultaneous multithreading. Hardware uses these three techniques to extract more work from each clock cycle, and they determine which kinds of code run fast and which don't. The second half moves into the Linux kernel and one of the most useful diagnostic concepts in low-level production debugging: the difference between a soft lockup and a hard lockup, and what each tells you about the underlying problem. Both halves are about the same thing: knowing enough about what's happening under the abstractions to make better decisions on top of them.
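As a small taste of both halves, the sketch below reads two settings that Linux exposes as plain files: whether SMT is currently active, and the watchdog threshold that governs lockup detection. It assumes a reasonably recent Linux kernel that provides `/sys/devices/system/cpu/smt/active` and `/proc/sys/kernel/watchdog_thresh`. The names `read_value` and `machine_snapshot`, and the `root` parameter, are hypothetical helpers invented here for illustration, not part of any library.

```python
from pathlib import Path
from typing import Optional


def read_value(path: Path) -> Optional[str]:
    """Return the stripped contents of a kernel pseudo-file, or None if unreadable."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def machine_snapshot(root: Path = Path("/")) -> dict:
    """Collect SMT and lockup-watchdog settings from sysfs and procfs.

    `root` is a hypothetical parameter that lets the function be pointed
    at a fake directory tree; on a real system leave it as "/".
    """
    smt_active = read_value(root / "sys/devices/system/cpu/smt/active")
    thresh = read_value(root / "proc/sys/kernel/watchdog_thresh")
    return {
        # "1" means sibling hardware threads are online, "0" means they are not.
        "smt_active": None if smt_active is None else smt_active == "1",
        # Hard-lockup threshold in seconds, as configured in the kernel.
        "watchdog_thresh_s": None if thresh is None else int(thresh),
        # The kernel documents the soft-lockup threshold as twice watchdog_thresh.
        "soft_lockup_thresh_s": None if thresh is None else 2 * int(thresh),
    }


if __name__ == "__main__":
    for key, value in machine_snapshot().items():
        print(f"{key}: {value}")
```

On a machine without these files (a non-Linux host, or an older kernel), every field comes back as `None` rather than raising, which keeps the sketch safe to run anywhere.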

8.1 Computer Architecture

8.1.1 Instruction Pipelining

8.2 Simultaneous Multithreading

8.2.1 Understanding SMT

8.2.2 Origins of SMT

8.2.3 How Does SMT Work?

8.2.4 Performance

8.2.5 Linux Kernel Scheduling

8.2.6 The End of the SMT Era?

8.2.7 Linux

8.3 Summary