8 Wait-free synchronization


This chapter covers

  • Understanding synchronization and mutual exclusion
  • Working with atomics and memory barriers
  • Building your own wait-free data structures

In the previous chapter, we explored typical sources of redundant work and strategies for eliminating them to reduce latency. However, optimizing CPU usage alone may not always be enough to meet stringent latency requirements. In such cases, leveraging the parallelism offered by multiple CPUs becomes crucial. If your application allows for data partitioning (a technique discussed in Chapter 5 that divides data into independent chunks), you can scale performance by adding more CPUs. This approach can significantly reduce latency across many use cases and workloads.
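
Before we get into the sections below, here is a minimal sketch of that idea in C++. The workload (a parallel sum) and the chunking scheme are illustrative assumptions, not the chapter's own example. Each thread owns a disjoint slice of the input, so no synchronization is needed until the per-thread results are combined:

#include <algorithm>
#include <cstdint>
#include <iostream>
#include <numeric>
#include <thread>
#include <vector>

int main() {
    // Fall back to one thread if the hardware concurrency is unknown.
    const std::size_t num_threads =
        std::max<std::size_t>(1, std::thread::hardware_concurrency());

    std::vector<std::uint64_t> data(1'000'000, 1);
    std::vector<std::uint64_t> partial_sums(num_threads, 0);
    std::vector<std::thread> threads;
    const std::size_t chunk = data.size() / num_threads;

    for (std::size_t i = 0; i < num_threads; ++i) {
        // Each thread works on a disjoint [begin, end) range, so the
        // threads never read or write the same elements.
        const std::size_t begin = i * chunk;
        const std::size_t end =
            (i + 1 == num_threads) ? data.size() : begin + chunk;
        threads.emplace_back([&partial_sums, &data, i, begin, end] {
            partial_sums[i] = std::accumulate(
                data.begin() + begin, data.begin() + end, std::uint64_t{0});
        });
    }
    for (auto& t : threads) t.join();

    // join() orders the threads' writes before these reads, so summing
    // the per-thread results here needs no extra synchronization.
    const std::uint64_t total = std::accumulate(
        partial_sums.begin(), partial_sums.end(), std::uint64_t{0});
    std::cout << "total = " << total << '\n';
}

Because each partition is independent, this approach scales with the core count. The rest of this chapter deals with the harder case, where threads must coordinate on shared mutable state.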

8.1 Mutual exclusion

8.1.1 Mutexes

8.1.2 Read-write locks

8.1.3 Spinlocks

8.2 Problems with mutual exclusion

8.2.1 Inefficiency

8.2.2 Priority inversion

8.2.3 Convoying

8.2.4 Deadlocks

8.3 Atomics

8.3.1 Atomic operations

8.3.2 Anatomy of a spinlock

8.4 Memory barriers

8.4.1 Types of memory barriers

8.4.2 Compiler barriers

8.4.3 Memory reordering example

8.5 Wait-free synchronization

8.5.1 Progress conditions

8.5.2 Consensus number

8.5.3 Wait-free queues

8.5.4 Wait-free stacks

8.5.5 Wait-free linked lists

8.6 Putting it together: Building a single-producer, single-consumer (SPSC) queue

8.7 Summary