Seqlock

From Wikipedia, the free encyclopedia

A seqlock (short for "sequential lock") is a special locking mechanism used in Linux to support fast writes of shared variables between two parallel operating system routines. The semantics stabilized as of version 2.5.59, and they are present in the 2.6.x stable kernel series. Seqlocks were developed by Stephen Hemminger and were originally called frlocks.

It is a reader-writer consistent mechanism which avoids the problem of writer starvation. A seqlock consists of storage for a sequence number in addition to a lock. The lock synchronises writers with one another, and the sequence number lets readers detect whether the data they read is consistent. In addition to updating the shared data, the writer increments the sequence number twice: once after acquiring the lock and once before releasing it. The number is therefore odd while a write is in progress and even otherwise. Readers read the sequence number before and after reading the shared data. If the sequence number is odd on either occasion, a writer held the lock while the data was being read, and the data may have changed. If the two sequence numbers differ, a writer has changed the data while it was being read. In either case, whenever a consistent set of information is required, the reader retries until it reads the same even sequence number before and after.
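The protocol described above can be sketched in portable C11. This is a minimal illustration under stated assumptions, not the Linux kernel implementation: the type and function names (`seqlock`, `seqlock_write_begin`, `seqlock_read_retry`, `shared_pair`, and so on) are hypothetical, a simple `atomic_flag` spinlock stands in for the writer lock, and the protected fields are declared atomic and accessed with relaxed ordering so that concurrent reads and writes are not data races under the C memory model.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical seqlock: a writer-only spinlock plus a sequence number. */
typedef struct {
    atomic_flag writer_lock; /* serialises writers; readers never take it */
    atomic_uint seq;         /* even: stable, odd: write in progress */
} seqlock;

static void seqlock_init(seqlock *sl) {
    atomic_flag_clear(&sl->writer_lock);
    atomic_init(&sl->seq, 0u);
}

static void seqlock_write_begin(seqlock *sl) {
    /* Spin until no other writer holds the lock. */
    while (atomic_flag_test_and_set_explicit(&sl->writer_lock,
                                             memory_order_acquire)) {
    }
    unsigned s = atomic_load_explicit(&sl->seq, memory_order_relaxed);
    assert((s & 1u) == 0); /* must be even while no writer is active */
    /* First increment: the number becomes odd. */
    atomic_store_explicit(&sl->seq, s + 1u, memory_order_relaxed);
    /* Keep the data stores that follow from moving before the increment. */
    atomic_thread_fence(memory_order_release);
}

static void seqlock_write_end(seqlock *sl) {
    unsigned s = atomic_load_explicit(&sl->seq, memory_order_relaxed);
    /* Second increment: the number becomes even again. */
    atomic_store_explicit(&sl->seq, s + 1u, memory_order_release);
    atomic_flag_clear_explicit(&sl->writer_lock, memory_order_release);
}

static unsigned seqlock_read_begin(seqlock *sl) {
    return atomic_load_explicit(&sl->seq, memory_order_acquire);
}

/* True if the read section must be repeated. */
static bool seqlock_read_retry(seqlock *sl, unsigned start) {
    /* Keep the data loads that precede this from moving after the re-read. */
    atomic_thread_fence(memory_order_acquire);
    unsigned end = atomic_load_explicit(&sl->seq, memory_order_relaxed);
    return (start & 1u) != 0 || end != start;
}

/* Example payload: two values that must always be observed together. */
typedef struct {
    seqlock sl;
    atomic_long a;
    atomic_long b;
} shared_pair;

static void pair_init(shared_pair *p) {
    seqlock_init(&p->sl);
    atomic_init(&p->a, 0);
    atomic_init(&p->b, 0);
}

static void pair_write(shared_pair *p, long a, long b) {
    seqlock_write_begin(&p->sl);
    atomic_store_explicit(&p->a, a, memory_order_relaxed);
    atomic_store_explicit(&p->b, b, memory_order_relaxed);
    seqlock_write_end(&p->sl);
}

static void pair_read(shared_pair *p, long *a, long *b) {
    unsigned start;
    do {
        start = seqlock_read_begin(&p->sl);
        *a = atomic_load_explicit(&p->a, memory_order_relaxed);
        *b = atomic_load_explicit(&p->b, memory_order_relaxed);
    } while (seqlock_read_retry(&p->sl, start));
}
```

In this sketch the reader touches only the sequence number and the data, never the lock, so a writer is never delayed by readers. A reader that overlaps a write sees either an odd number or two different numbers and repeats its loop.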

The reader never blocks, but it may have to retry if a write is in progress. This speeds up readers when the data has not been modified, since they do not have to acquire the lock as they would with a traditional read-write lock. Also, writers do not wait for readers, whereas with traditional read-write locks they do, which can lead to resource starvation when there are many readers (because the writer must wait until there are no readers). Because of these two factors, seqlocks are more efficient than traditional read-write locks when there are many readers and few writers.

The technique does not work for data that contains pointers, because any writer could invalidate a pointer that a reader has already followed.

The mechanism was first applied to updating the system time counter. Each timer interrupt updates the time of day; there may be many readers of the time, both for operating system internal use and in applications, but writes are relatively infrequent and occur only one at a time.
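A time counter of this kind might look like the following C11 sketch. It is an illustration only and does not reproduce the kernel's timekeeping code: the names `sys_clock`, `sys_clock_tick`, `sys_clock_read` and `time_snapshot` are hypothetical. It also assumes a single writer (standing in for the timer interrupt), so the writer lock is omitted and only the sequence number remains. All atomic operations use the default sequentially consistent ordering to keep the example short.

```c
#include <assert.h>
#include <stdatomic.h>

#define NSEC_PER_SEC 1000000000LL

/* Hypothetical time counter guarded by a sequence number only.
 * Assumption: sys_clock_tick() is never called concurrently with itself. */
typedef struct {
    atomic_uint seq;    /* odd while a tick is updating the fields */
    atomic_llong sec;
    atomic_llong nsec;
} sys_clock;

typedef struct {
    long long sec;
    long long nsec;
} time_snapshot;

static void sys_clock_init(sys_clock *c) {
    atomic_init(&c->seq, 0u);
    atomic_init(&c->sec, 0);
    atomic_init(&c->nsec, 0);
}

/* Writer side: advance the time by delta_ns nanoseconds. */
static void sys_clock_tick(sys_clock *c, long long delta_ns) {
    assert(delta_ns >= 0);
    long long sec = atomic_load(&c->sec);
    long long nsec = atomic_load(&c->nsec) + delta_ns;
    sec += nsec / NSEC_PER_SEC;   /* carry whole seconds */
    nsec %= NSEC_PER_SEC;

    atomic_fetch_add(&c->seq, 1u);  /* sequence becomes odd */
    atomic_store(&c->sec, sec);
    atomic_store(&c->nsec, nsec);
    atomic_fetch_add(&c->seq, 1u);  /* sequence becomes even */
}

/* Reader side: retry until sec and nsec come from the same update. */
static time_snapshot sys_clock_read(sys_clock *c) {
    time_snapshot t;
    unsigned before, after;
    do {
        before = atomic_load(&c->seq);
        t.sec = atomic_load(&c->sec);
        t.nsec = atomic_load(&c->nsec);
        after = atomic_load(&c->seq);
    } while ((before & 1u) != 0 || before != after);
    return t;
}
```

Without the sequence check, a reader that ran between the two stores in `sys_clock_tick` could pair a new seconds value with a stale nanoseconds value. With it, any number of readers can call `sys_clock_read` without ever delaying the tick.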