8 Wait-free synchronization
This chapter covers
- Understanding synchronization and mutual exclusion
- Working with atomics and memory barriers
- Building your own wait-free data structures
In the previous chapter, we explored typical sources of redundant work and strategies to eliminate them, thereby reducing latency. However, optimizing CPU usage alone may not always suffice to meet stringent latency requirements. In such cases, leveraging the parallelism offered by multiple CPUs becomes crucial. If your application allows for data partitioning—a technique discussed in chapter 5 that involves dividing data into independent chunks—you can scale your performance by adding more CPUs. This approach can significantly reduce latency in many use cases and workloads.