Load-Link/Store-Conditional
From Wikipedia, the free encyclopedia
In computer science, load-link (LL, also known as "load-linked" or "load and reserve") and store-conditional (SC) are a pair of instructions that together implement a lock-free atomic read-modify-write operation.
Load-link returns the current value of a memory location. A subsequent store-conditional to the same memory location will store a new value only if no updates have occurred to that location since the load-link. If any updates have occurred, the store-conditional is guaranteed to fail, even if the value read by the load-link has since been restored. As such, a LL/SC pair is stronger than a read followed by a compare-and-swap (CAS), which will not detect updates if the old value has been restored.
Contents |
[edit] Weak LL/SC
Real implementations of LL/SC do not always succeed if there are no concurrent updates to the memory location in question. Any exceptional events between the two operations, such as a context switch, another load-link, or even (on many platforms) another load or store operation, will cause the store-conditional to spuriously fail. Older implementations will fail if there are any updates broadcast over the memory bus.
This is often called weak LL/SC by researchers, as it breaks many theoretical LL/SC algorithms. Weakness is relative, and some weak implementations can be used for some algorithms.
The only limitation of weak LL/SC versus CAS is that the latter is usually guaranteed to be fair, while the former is not. This is also true of strong LL/SC.
[edit] Implementations
All of Alpha, PowerPC, MIPS, and ARM have LL/SC instructions: ldl_l/stl_c and ldq_l/stq_c (Alpha), lwarx/stwcx (PowerPC), ll/sc (MIPS), and ldrex/strex (ARM version 6 and above).
Most platforms provide multiple sets of instructions for different data sizes, e.g. ldarx/stdcx for doubleword on the PowerPC.
Some CPUs require the address being accessed exclusively to be configured in write-through mode.
All of these platforms provide weak LL/SC. The PowerPC implementation is the strongest, allowing an LL/SC pair to wrap loads and even stores to other cache lines. This allows it to implement, for example, lock-free reference counting in the face of changing object graphs with arbitrary counter reuse (which otherwise requires DCAS).