Load-Link/Store-Conditional

From Wikipedia, the free encyclopedia

In computer science, load-link (LL, also known as "load-linked" or "load and reserve") and store-conditional (SC) are a pair of instructions that together implement a lock-free atomic read-modify-write operation.

Load-link returns the current value of a memory location. A subsequent store-conditional to the same memory location will store a new value only if no updates have occurred to that location since the load-link. If any updates have occurred, the store-conditional is guaranteed to fail, even if the value read by the load-link has since been restored. As such, a LL/SC pair is stronger than a read followed by a compare-and-swap (CAS), which will not detect updates if the old value has been restored.

Contents

[edit] Weak LL/SC

Real implementations of LL/SC do not always succeed if there are no concurrent updates to the memory location in question. Any exceptional events between the two operations, such as a context switch, another load-link, or even (on many platforms) another load or store operation, will cause the store-conditional to spuriously fail. Older implementations will fail if there are any updates broadcast over the memory bus.

This is often called weak LL/SC by researchers, as it breaks many theoretical LL/SC algorithms. Weakness is relative, and some weak implementations can be used for some algorithms.

The only limitation of weak LL/SC versus CAS is that the latter is usually guaranteed to be fair, while the former is not. This is also true of strong LL/SC.

[edit] Implementations

All of Alpha, PowerPC, MIPS, and ARM have LL/SC instructions: ldl_l/stl_c and ldq_l/stq_c (Alpha), lwarx/stwcx (PowerPC), ll/sc (MIPS), and ldrex/strex (ARM version 6 and above).

Most platforms provide multiple sets of instructions for different data sizes, e.g. ldarx/stdcx for doubleword on the PowerPC.

Some CPUs require the address being accessed exclusively to be configured in write-through mode.

All of these platforms provide weak LL/SC. The PowerPC implementation is the strongest, allowing an LL/SC pair to wrap loads and even stores to other cache lines. This allows it to implement, for example, lock-free reference counting in the face of changing object graphs with arbitrary counter reuse (which otherwise requires DCAS).

[edit] See also

[edit] References

  • D. Detlefs, P. Martin, M. Moir and Guy L. Steele, Jr. "Lock-free reference counting", 20th annual ACM symposium on Principles of Distributed Computing, 2001, pp. 190-199 [1]
  • Kirk Reinholtz. "Atomic Reference Counting Pointers", C/C++ Users Journal, December, 2004 [2]
In other languages