Compare-and-swap
From Wikipedia, the free encyclopedia
In computer science, the compare-and-swap CPU instruction ("CAS") (or the CMPXCHG instruction in the x86 and Itanium architectures) is a special instruction that atomically compares the contents of a memory location to a given value and, if they are the same, modifies the contents of that memory location to a given new value. The result of the operation must indicate whether it performed the substitution; this can be done either with a simple boolean response (this variant is often called compare-and-set), or by returning the value read from the memory location (not the value written to it).
CAS is used to implement synchronization primitives like semaphores and mutexes, as well as more sophisticated lock-free and wait-free algorithms. Maurice Herlihy (1991) proved that CAS can implement more of these algorithms than atomic read, write, and fetch-and-add, and that, assuming a fairly large amount of memory, it can implement all of them [1].
Algorithms built around CAS typically read some key memory location and remember the old value. Based on that old value, they compute some new value. Then they try to swap in the new value using CAS, where the comparison checks for the location still being equal to the old value. If CAS indicates that the attempt has failed, it has to be repeated from the beginning: the location is re-read, a new value is computed and the CAS is tried again.
Some of these algorithms are affected by and must handle the problem of a false positive match, or the ABA problem. It's possible that between the time the old value is read and the time CAS is attempted, some other processors or threads change the memory location two or more times such that it acquires a bit pattern which matches the old value. The problem arises if this new bit pattern, which looks exactly like the old value, has a different meaning: for instance, it could be a recycled address, or a wrapped version counter.
CAS, and other atomic instructions, are unnecessary in uniprocessor systems, because the atomicity of any sequence of instructions can be achieved by disabling interrupts while executing it. However, often disabling interrupts is too expensive to be practical, so even programs only intended to run on uniprocessor machines will benefit by using them, as in the case of Linux's futexes.
In multiprocessor systems, it is impossible and undesirable to disable interrupts on all processors at the same time. Even with interrupts disabled, two or more processors could be attempting to access the same semaphore's memory at the same time. The compare-and-swap instruction allows any processor to atomically test and modify a memory location, preventing such multiple processor collisions.
Contents |
[edit] Extensions
Since CAS operates on a single pointer-sized memory location, while most lock-free and wait-free algorithms need to modify multiple locations, several extensions have been implemented.
- Double compare-and-swap compares two unrelated memory locations with two expected values, and if they're equal, sets both locations to new values. This operation only exists on a few Motorola processors[2], but is widely assumed in papers on lock-free data structures.
- Double-wide compare-and-swap operates on two adjacent pointer-sized locations (or, equivalently, one location twice as big as a pointer). On later x86 processors, the CMPXCHG8B and CMPXCHG16B instructions[3] serve this role.
- Single compare, double swap compares one pointer but writes two. The Itanium's cmp8xchg16 instruction[4] implements this, where the two written pointers are adjacent.
[edit] See also
[edit] References
- ^ Herlihy, Maurice (January, 1991). "Wait-free synchronization". ACM Trans. Program. Lang. Syst. 13 (1): 124-149.
- ^ CAS2. Retrieved on 2007-12-15.
- ^ Intel 64 and IA-32 Architectures Software Developer's Manual Volume 2A: Instruction Set Reference, A-M (PDF). Retrieved on 2007-12-15.
- ^ Intel Itanium Architecture Software Developer’s Manual Volume 3: Instruction Set Reference (PDF). Retrieved on 2007-12-15.
[edit] External links
- compare_and_swap Kernel Service
- "Lock-Free and Practical Deques using Single-Word Compare-And-Swap" by Håkan Sundell, Philippas Tsigas
- "Fast, Reactive and Lock-free: Multi-word Compare-and-swap Algorithms" by Phuong Ha-Hoai, Philippas Tsigas
- "A Practical Multi-Word Compare-And-Swap Operation" by Timothy L. Harris, Keir Fraser, Ian A. Pratt
- "Lock-Free Linked Lists Using Compare-and-Swap" by John D. Valois
- "A Practical Nonblocking Queue Algorithm Using Compare-and-Swap" by Chien-Hua Shann, Ting-Lu Huang, Cheng Chen
- "A Nonblocking Algorithm for Shared Queues Using Compare-and-Swap" by S. Prakash, Yann Hang Lee, T. Johnson
- Discussion "Lock-Free using cmpxchg8b..."
- Java method
AtomicReferenceFieldUpdater.compareAndSet(T, V, V)