Most websites out there tell you to use the concurrency primitives provided by your OS because this stuff is hard to get right. That's just not useful advice for RAMCloud: we care so much about performance that we're willing to go through as much pain as necessary to get it.

Main Issues

We assume the compiler may reorder instructions or remove them altogether for efficiency unless told otherwise.
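To make the compiler half of the problem concrete, here is a minimal C++11 sketch of telling the compiler not to reorder two stores. The names payload, ready, and publishCompilerOnly are made up for illustration and are not RAMCloud code. Note that std::atomic_signal_fence only constrains the compiler; on its own it is not enough to publish data safely to another thread.

```cpp
#include <atomic>

// Hypothetical names for illustration only; none of this is RAMCloud code.
int payload = 0;                  // ordinary data, written before the flag
std::atomic<bool> ready{false};   // flag announcing that payload has been set

void publishCompilerOnly(int value) {
    payload = value;
    // Compiler-only barrier: the compiler may not sink the payload store
    // below this point. It places no ordering constraint on the processor.
    std::atomic_signal_fence(std::memory_order_release);
    // Relaxed atomic store: requests no processor-level ordering by itself.
    ready.store(true, std::memory_order_relaxed);
}
```

This keeps the two stores in program order in the generated code, which is all a compiler barrier can promise. Whether another core observes them in that order is a separate question about the processor.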

We assume the processor may reorder instructions and delay making stores visible to other cores indefinitely unless told otherwise.
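The delayed-store case can be sketched with the classic two-flag example, assuming C++11 atomics. The names flagA, flagB, enterA, and enterB are hypothetical. If one thread runs enterA while another runs enterB, then without the fences both calls could return 0, because each store may still be sitting in a store buffer when the other thread's load executes. With a sequentially consistent fence between the store and the load in both functions, at least one call is guaranteed to return 1.

```cpp
#include <atomic>

// Hypothetical names for illustration only.
std::atomic<int> flagA{0};
std::atomic<int> flagB{0};

// Intended to run on one thread: announce A, then check for B.
int enterA() {
    flagA.store(1, std::memory_order_relaxed);
    // Full fence: orders the store above before the load below.
    std::atomic_thread_fence(std::memory_order_seq_cst);
    return flagB.load(std::memory_order_relaxed);
}

// Intended to run on another thread: announce B, then check for A.
int enterB() {
    flagB.store(1, std::memory_order_relaxed);
    // Matching full fence on this side.
    std::atomic_thread_fence(std::memory_order_seq_cst);
    return flagA.load(std::memory_order_relaxed);
}
```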

A correct concurrency primitive must account for both of these issues.
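As a rough illustration of a primitive that accounts for both issues, here is a minimal spin lock sketch built on C++11 std::atomic. The class name SketchSpinLock is hypothetical and this is not RAMCloud's implementation. The acquire and release orderings constrain both the compiler and the processor, so accesses inside the critical section cannot move outside it.

```cpp
#include <atomic>

// Hypothetical sketch for illustration; not RAMCloud's lock.
class SketchSpinLock {
  public:
    // Returns true if the lock was free and is now held by the caller.
    // Acquire ordering keeps critical-section accesses from moving above it.
    bool tryLock() {
        return !locked.exchange(true, std::memory_order_acquire);
    }

    // Spins until the lock is obtained.
    void lock() {
        while (!tryLock()) {
            // Busy-wait; a real lock would likely add a pause or backoff.
        }
    }

    // Release ordering keeps critical-section accesses from moving below it.
    void unlock() {
        locked.store(false, std::memory_order_release);
    }

  private:
    std::atomic<bool> locked{false};
};
```

The atomic exchange is what makes acquisition a single indivisible step; the memory orderings are what stop the compiler and the processor from moving protected accesses out of the critical section.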

Compiler Tools

Processor Tools

Survey of Existing Concurrency Primitives