Deadlock

This article is about the computer science concept. For other uses, see Deadlock (disambiguation).
Both processes need resources to continue execution. P1 requires additional resource R1 and is in possession of resource R2, P2 requires additional resource R2 and is in possession of R1; neither process can continue.

In concurrent programming, a deadlock is a situation in which two or more competing actions are each waiting for the other to finish, and thus neither ever does.

In a transactional database, a deadlock happens when two processes each within its own transaction updates two rows of information but in the opposite order. For example, process A updates row 1 then row 2 in the exact timeframe that process B updates row 2 then row 1. Process A can't finish updating row 2 until process B is finished, but process B cannot finish updating row 1 until process A is finished. No matter how much time is allowed to pass, this situation will never resolve itself and because of this, database management systems will typically kill the transaction of the process that has done the least amount of work.

In an operating system, a deadlock is a situation which occurs when a process or thread enters a waiting state because a resource requested is being held by another waiting process, which in turn is waiting for another resource held by another waiting process. If a process is unable to change its state indefinitely because the resources requested by it are being used by another waiting process, then the system is said to be in a deadlock.[1]

Deadlock is a common problem in multiprocessing systems, parallel computing and distributed systems, where software and hardware locks are used to handle shared resources and implement process synchronization.[2]

In telecommunication systems, deadlocks occur mainly due to lost or corrupt signals instead of resource contention.[3]

Examples

Any deadlock situation can be compared to the classic "chicken or egg" problem.[4] It can also be considered a paradoxical "Catch-22" situation.[5]

Here's a real life example. Alice and Bob are poor roommates, and they can't buy a microphone and a pair of headphones each. They decide to share – Alice buys the headphones, Bob buys a microphone. They decide on using the simplest possible tiebreaking rule: "calling it" – whoever says first "I need the microphone" gets the microphone; the same for the headphones. One day, they both decide independently to record a song; both Alice and Bob need the microphone to record, and the headphones to hear what they're recording. They both race to get exclusive locks on both resources. As it happens, Alice gets to call it on the microphone, but Bob calls it first on the headphones. As soon as this happens, they're both in a deadlock: Alice has a lock on the microphone but waits for a lock on the headphones, but Bob doesn't want to give up on his lock on the headphones because he's waiting for a lock on the microphone.

Moving onto the source code level, a deadlock can occur even in the case of a single thread and one resource (protected by a mutex). Assume there is a function f1 which does some work on the resource, locking the mutex at the beginning and releasing it after it's done. Next, somebody creates a different function f2 following that pattern on the same resource (lock, do work, release) but decides to include a call to f1 to delegate a part of the job. What will happen is the mutex will be locked once when entering f2 and then again at the call to f1, resulting in a deadlock if the mutex is not reentrant (i.e. the plain "fast mutex" variety).

Necessary conditions

A deadlock situation can arise if all of the following conditions hold simultaneously in a system:[6]

  1. Mutual exclusion: at least one resource must be held in a non-shareable mode.[1] Only one process can use the resource at any given instant of time.
  2. Hold and wait or resource holding: a process is currently holding at least one resource and requesting additional resources which are being held by other processes.
  3. No preemption: a resource can be released only voluntarily by the process holding it.
  4. Circular wait: a process must be waiting for a resource which is being held by another process, which in turn is waiting for the first process to release the resource. In general, there is a set of waiting processes, P = {P1, P2, …, PN}, such that P1 is waiting for a resource held by P2, P2 is waiting for a resource held by P3 and so on until PN is waiting for a resource held by P1.[1][7]

These four conditions are known as the Coffman conditions from their first description in a 1971 article by Edward G. Coffman, Jr.[7] Unfulfillment of any of these conditions is enough to preclude a deadlock from occurring.

Avoiding database deadlocks

An effective way to avoid database deadlocks is to follow this approach from the Oracle Locking Survival Guide:

Application developers can eliminate all risk of enqueue deadlocks by ensuring that transactions requiring multiple resources always lock them in the same order.[8]

This single sentence needs some explanation:

Deadlocks offer a challenging problem to correct as they result in data loss, are difficult to isolate, cause unexpected problems, and are time-consuming to fix. Modifying every section of software code in a large database-oriented system in order to always lock resources in the same order when the order is inconsistent takes significant resources and testing to implement.

Deadlock handling

Most current operating systems cannot prevent a deadlock from occurring.[9] When a deadlock occurs, different operating systems respond to them in different non-standard manners. Most approaches work by preventing one of the four Coffman conditions from occurring, especially the fourth one.[10] Major approaches are as follows.

Ignoring deadlock

In this approach, it is assumed that a deadlock will never occur. This is also an application of the Ostrich algorithm.[10][11] This approach was initially used by MINIX and UNIX.[7] This is used when the time intervals between occurrences of deadlocks are large and the data loss incurred each time is tolerable.

Detection

Under deadlock detection, deadlocks are allowed to occur. Then the state of the system is examined to detect that a deadlock has occurred and subsequently it is corrected. An algorithm is employed that tracks resource allocation and process states, it rolls back and restarts one or more of the processes in order to remove the detected deadlock. Detecting a deadlock that has already occurred is easily possible since the resources that each process has locked and/or currently requested are known to the resource scheduler of the operating system.[11]

Deadlock detection techniques include, but are not limited to, model checking. This approach constructs a finite state-model on which it performs a progress analysis and finds all possible terminal sets in the model. These then each represent a deadlock.

After a deadlock is detected, it can be corrected by using one of the following methods:

  1. Process termination: one or more processes involved in the deadlock may be aborted. We can choose to abort all processes involved in the deadlock. This ensures that deadlock is resolved with certainty and speed. But the expense is high as partial computations will be lost. Or, we can choose to abort one process at a time until the deadlock is resolved. This approach has high overheads because after each abort an algorithm must determine whether the system is still in deadlock. Several factors must be considered while choosing a candidate for termination, such as priority and age of the process.
  2. Resource preemption: resources allocated to various processes may be successively preempted and allocated to other processes until the deadlock is broken.

Prevention

Deadlock prevention works by preventing one of the four Coffman conditions from occurring.

Avoidance

Deadlock can be avoided if certain information about processes are available to the operating system before allocation of resources, such as which resources a process will consume in its lifetime. For every resource request, the system sees whether granting the request will mean that the system will enter an unsafe state, meaning a state that could result in deadlock. The system then only grants requests that will lead to safe states.[13] In order for the system to be able to determine whether the next state will be safe or unsafe, it must know in advance at any time:

It is possible for a process to be in an unsafe state but for this not to result in a deadlock. The notion of safe/unsafe states only refers to the ability of the system to enter a deadlock state or not. For example, if a process requests A which would result in an unsafe state, but releases B which would prevent circular wait, then the state is unsafe but the system is not in deadlock.

One known algorithm that is used for deadlock avoidance is the Banker's algorithm, which requires resource usage limit to be known in advance.[1] However, for many systems it is impossible to know in advance what every process will request. This means that deadlock avoidance is often impossible.

Two other algorithms are Wait/Die and Wound/Wait, each of which uses a symmetry-breaking technique. In both these algorithms there exists an older process (O) and a younger process (Y). Process age can be determined by a timestamp at process creation time. Smaller timestamps are older processes, while larger timestamps represent younger processes.

Wait/Die Wound/Wait
O needs a resource held by Y O waits Y dies
Y needs a resource held by O Y dies Y waits

Another way to avoid deadlock is to avoid blocking, for example by using non-blocking synchronization or read-copy-update.

Livelock

A livelock is similar to a deadlock, except that the states of the processes involved in the livelock constantly change with regard to one another, none progressing. This term was defined formally at some time during the 1970s—an early sighting in the published literature is in Babich's 1979 article on program correctness.[14] Livelock is a special case of resource starvation; the general definition only states that a specific process is not progressing.[15]

A real-world example of livelock occurs when two people meet in a narrow corridor, and each tries to be polite by moving aside to let the other pass, but they end up swaying from side to side without making any progress because they both repeatedly move the same way at the same time.

Livelock is a risk with some algorithms that detect and recover from deadlock. If more than one process takes action, the deadlock detection algorithm can be repeatedly triggered. This can be avoided by ensuring that only one process (chosen arbitrarily or by priority) takes action.[16]

Distributed deadlock

Distributed deadlocks can occur in distributed systems when distributed transactions or concurrency control is being used. Distributed deadlocks can be detected either by constructing a global wait-for graph from local wait-for graphs at a deadlock detector or by a distributed algorithm like edge chasing.

Phantom deadlocks are deadlocks that are falsely detected in a distributed system due to system internal delays but don't actually exist. For example, if a process releases a resource R1 and issues a request for R2, and the first message is lost or delayed, a coordinator (detector of deadlocks) could falsely conclude a deadlock (if the request for R2 while having R1 would cause a deadlock).

See also

References

  1. 1 2 3 4 5 Silberschatz, Abraham (2006). Operating System Principles (7 ed.). Wiley-India. p. 237. ISBN 9788126509621. Retrieved 29 January 2012.
  2. Padua, David (2011). Encyclopedia of Parallel Computing. Springer. p. 524. ISBN 9780387097657. Retrieved 28 January 2012.
  3. Schneider, G. Michael (2009). Invitation to Computer Science. Cengage Learning. p. 271. ISBN 0324788592. Retrieved 28 January 2012.
  4. Rolling, Andrew (2009). Andrew Rollings and Ernest Adams on game design. New Riders. p. 421. ISBN 9781592730018. Retrieved 28 January 2012.
  5. Oaks, Scott (2004). Java Threads. O'Reilly. p. 64. ISBN 9780596007829. Retrieved 28 January 2012.
  6. Silberschatz, Abraham (2006). Operating System Principles (7 ed.). Wiley-India. p. 239. ISBN 9788126509621. Retrieved 29 January 2012.
  7. 1 2 3 Shibu, K. (2009). Intro To Embedded Systems (1st ed.). Tata McGraw-Hill Education. p. 446. ISBN 9780070145894. Retrieved 28 January 2012.
  8. "Oracle Locking Survival Guide". Archived from the original on 15 January 2008.
  9. Silberschatz, Abraham (2006). Operating System Principles (7 ed.). Wiley-India. p. 237. ISBN 9788126509621. Retrieved 29 January 2012.
  10. 1 2 Stuart, Brian L. (2008). Principles of operating systems (1st ed.). Cengage Learning. p. 446. Retrieved 28 January 2012.
  11. 1 2 Tanenbaum, Andrew S. (1995). Distributed Operating Systems (1st ed.). Pearson Education. p. 117. Retrieved 28 January 2012.
  12. Silberschatz, Abraham (2006). Operating System Principles (7 ed.). Wiley-India. p. 244. Retrieved 29 January 2012.
  13. Silberschatz, Abraham (2006). Operating System Principles (7 ed.). Wiley-India. p. 248. Retrieved 29 January 2012.
  14. Babich, A.F. (1979). "Proving Total Correctness of Parallel Programs". IEEE Transactions on Software Engineering SE–5 (6): 558–574. doi:10.1109/tse.1979.230192.
  15. Anderson, James H.; Yong-jik Kim (2001). "Shared-memory mutual exclusion: Major research trends since 1986".
  16. Zöbel, Dieter (October 1983). "The Deadlock problem: a classifying bibliography". ACM SIGOPS Operating Systems Review 17 (4): 6–15. doi:10.1145/850752.850753. ISSN 0163-5980.

Further reading

External links

This article is issued from Wikipedia - version of the Tuesday, February 02, 2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.