Talk:Deadlock

From Wikipedia, the free encyclopedia

Deadly embrace?

Shouldn't this article have the term "Deadly Embrace" somewhere in it as an alternate term for Deadlock? --grr 23:12, 12 December 2006 (UTC)

Poor examples

An example of a deadlock occurs frequently in database products. Client applications using the database may require exclusive access to a given table, and in order to gain exclusive access they ask for a lock. If two client applications both attempt to lock the same table at the same time, neither will receive it, there is no general way to decide who to give the lock to. In this case both clients will wait for the lock forever.

Um, no. Just give it to one of the clients, which will run and finish; then the other one gets a chance to run. I'm going to replace this with a better example. MatthewWilcox 12:55, 15 Dec 2004 (UTC)

Agreed; deadlock is impossible with only one resource, which is what that text describes. I'd guess the original writer had encountered a deadlock with row-level locking and misunderstood. PostgreSQL's documentation describes this.[1] I'd say this is a better example than two tables, because it's best to avoid contributing to the perception that databases use table-level locks internally. Specifically, by your text here:

(But this particular type of deadlock is easily prevented, e.g., by using an all-or-none resource allocation algorithm.)

are you proposing something like LOCK table_1, table_2 IN EXCLUSIVE MODE; at the beginning of the transaction? That severely limits concurrency. I'd say it's better to use a per-row example like the PostgreSQL one and suggest resolving it application-side in one of two ways:

  • using a consistent ordering
  • catching the "would deadlock" exception generated by the database server and replaying the transaction

- Slamb 20:37, 20 August 2006 (UTC)
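A minimal sketch of the consistent-ordering strategy described above, using Python threading locks as stand-ins for row locks (the lock names and the `transfer` function are purely illustrative, not from any cited source):

```python
import threading

# Two locks standing in for two database rows; the names are hypothetical.
row_a = threading.Lock()
row_b = threading.Lock()

def transfer(first, second):
    """Acquire both locks in one globally consistent order (here, by id()),
    so two concurrent callers can never wait on each other in a cycle."""
    lock1, lock2 = sorted((first, second), key=id)
    with lock1:
        with lock2:
            pass  # ... do the actual two-row work while holding both ...

# The two threads name the same pair of locks in opposite order; the
# consistent internal ordering is what prevents the classic deadlock.
t1 = threading.Thread(target=transfer, args=(row_a, row_b))
t2 = threading.Thread(target=transfer, args=(row_b, row_a))
t1.start(); t2.start()
t1.join(timeout=5); t2.join(timeout=5)
print(t1.is_alive() or t2.is_alive())  # False: neither thread is stuck
```

The second strategy (catching the server's deadlock error and replaying the transaction) has no equally short lock-level analogue, but amounts to the same acquire-with-timeout-and-retry idea.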

Disagree w/ Matthew & Slamb; In practice, most database systems can deadlock on what seems to be a single resource, independent of granularity (table, row, etc.). This is due to the use of a reader-writer lock, used to improve concurrency. Unfortunately, reader-writer locks can introduce a deadlock condition sometimes called a conversion deadlock, where two threads both obtain a read lock on the resource and then attempt to convert it to a write lock. In my experience, at least for databases, this form of deadlock is far more common in practice than a "classic" deadlock involving two separate resources. I can come up with an example... --Burt Harris 15:42, 24 August 2006 (UTC)
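A sketch of the conversion deadlock Burt describes, using a toy reader-writer lock in Python (the `RWLock` class is illustrative only, not any real database's implementation). Both threads hold a read lock, and each waits for the other to release before its upgrade can proceed; a timeout makes the deadlock observable instead of hanging:

```python
import threading

class RWLock:
    """A toy reader-writer lock, just enough to show a conversion
    deadlock (illustrative sketch, not production code)."""
    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0

    def acquire_read(self):
        with self._cond:
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            self._cond.notify_all()

    def upgrade_to_write(self, timeout):
        """Convert a held read lock to a write lock: wait until we are
        the only reader. Returns False on timeout (conversion deadlock)."""
        with self._cond:
            return self._cond.wait_for(lambda: self._readers == 1, timeout)

lock = RWLock()
barrier = threading.Barrier(2)
results = []

def read_then_convert():
    lock.acquire_read()
    barrier.wait()   # both threads now hold a read lock
    # Each waits for the other to release its read lock; neither will.
    results.append(lock.upgrade_to_write(timeout=1.0))
    barrier.wait()   # hold both read locks until both upgrades time out
    lock.release_read()

threads = [threading.Thread(target=read_then_convert) for _ in range(2)]
for t in threads: t.start()
for t in threads: t.join()
print(results)  # [False, False]: both conversions deadlocked
```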

Burt: I'm sold; I think your explanation belongs in the article. In hindsight, I think I've only avoided getting bitten by this because:

  • in general, I use fancy MVCC databases in the read committed isolation level. I'm cautious about changing things based on a non-repeatable read, so if I do a select then update, I always do "select ... for update". As a side effect, I never upgrade the lock.
  • the homegrown database I use at work doesn't even allow upgrading a lock from shared to exclusive. I should ask if this is why.

Maybe this should go in a new section? Even as common as you say it is, it's a little more complex and so maybe shouldn't be the first example. -Slamb 03:08, 25 August 2006 (UTC)

Link to Henry Wong

In the "External links" section, the link to Henry Wong, author of "Advanced Synchronization in Java Threads", leads to the wikipedia page of a fictional character from the Digimon series.

Definition

"In the computing world deadlock refers to a specific problem when two or more processes are waiting on the same shared resource."

Is this correct? AFAIK, two or more processes are not waiting on the same resource in a deadlock. If I'm correct, how should the correct definition be worded?

LjL 00:19, 19 Apr 2005 (UTC)

Yes you are right. I tried to rephrase it. Matteo
Thanks. Sounds quite OK now, although it might not be 100% correct in the case of more than two processes... but I doubt there is a short definition for that case. Hmm... what about stating the definition for two processes only, and then mentioning that it has to be extended to treat cases when more are involved? LjL 10:50, 19 Apr 2005 (UTC)
What do you say if we put a link to the section with the Coffman conditions? Something like: In the computing world deadlock refers to a state of the system where two or more processes are blocked waiting for a resource (see Necessary conditions for a detailed description). This is not perfect but maybe better than the previous one. What do you say? Matteo 11:04, 2005 Apr 19 (UTC)
No, I like your current definition better. I'd mix the two like this, "... when two processes are each waiting for the other to release a resource, or more than two processes are waiting for resources in a circular chain (see Necessary conditions)". Besides, the Coffman conditions are only necessary, not sufficient, and so do not make a definition, do they? LjL 11:54, 19 Apr 2005 (UTC)
Nice! I changed it ... Matteo 12:21, 2005 Apr 19 (UTC)

Article naming

Seems to me the term "deadlock" doesn't apply primarily to computing. The more general term 'deadlock' meaning any stalemate situation is probably more deserving of the article title, as long as it doesn't end up being just a dicdef. The other meaning of deadlock - a type of mechanical lock mechanism - should also go somewhere, I think. Should this be moved to Deadlock (computing) leaving the main title for a disambiguation page? Graham 11:52, 7 October 2005 (UTC)

Decidability of Deadlock Detection

The article said:

Instead deadlock detection and clean up is used by employing an algorithm that tracks the circular waiting and kills one or more of the processes such that the deadlock is removed.

This problem is undecidable in general, because the halting problem can be rephrased as a deadlock scenario.

This is not true, or at least confusing. Detecting a deadlock that has already happened is simple, given sufficient information about the actors (threads, clients, etc.) and the resources they hold and are blocked waiting for. Telling whether a given program will or may cause deadlocks, on the other hand, is very hard, and in general undecidable.

Or, if that explanation doesn't suffice: Imagine a program that controls train traffic. When it contains a bad bug, trains may crash into one another. When a train wreck happens in this way, it's easy to detect: Just follow the smoke and the trail of ambulances. ;-) However, telling whether execution of the program may lead to a train crash before a train crash actually occurred is difficult (undecidable) in the general case. What people usually do in order to be sure is to write the program in such a way that its correctness (impossibility of train crashes) can be proven. This of course requires adherence to a subset of possible programs, since not all correct programs can be proven to be correct (-> Gödel's incompleteness theorem). Aragorn2 18:36, 16 March 2006 (UTC)
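Aragorn2's point can be made concrete: once the wait-for information is known, detecting an existing deadlock is just cycle detection in a graph. A small sketch (process names hypothetical):

```python
# Each key waits for the value(s): a "wait-for" graph among processes.
# Detecting a deadlock that has already happened is cycle detection.

def find_deadlock(wait_for):
    """Return the set of processes that sit on a cycle (deadlocked)."""
    def on_cycle(start):
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for nxt in wait_for.get(node, ()):
                if nxt == start:
                    return True          # we can reach ourselves: a cycle
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return False
    return {p for p in wait_for if on_cycle(p)}

# P1 waits for P2 and P2 waits for P1: a two-process deadlock.
# P3 waits for P1 but is not itself part of the cycle.
graph = {"P1": ["P2"], "P2": ["P1"], "P3": ["P1"]}
print(sorted(find_deadlock(graph)))  # ['P1', 'P2']
```

Deciding ahead of time whether some execution of a program can ever produce such a cycle is the undecidable problem; inspecting the graph after the fact is not.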

Another method for preventing deadlocks?

I'm not really an expert on concurrency and I've only skimmed over the current article, but it seems to me as though one of the methods the Linux kernel uses for avoiding deadlocks is missing from the article. My understanding is that for fine-grained locking, the Linux kernel requires that locks on resources always be obtained in a fixed order. For example, assuming that locks with a lower order need to be locked first, this means that a thread with a lock on resource 5 is not allowed to request a lock on resource 3 without relinquishing its lock on resource 5. Perhaps someone with a greater knowledge and better understanding of concurrency would like to correct me about this or add it to the article? - James Foster 14:43, 2 May 2006 (UTC)

The article (as of today's viewing) does mention this approach under "Deadlock prevention":

The circular wait condition: A process ranking may be imposed such that the highest ranking process will always obtain its resources, by preemption if necessary. A hierarchy may be used to determine a partial ordering of resources; where no obvious hierarchy exists, even the memory address of resources has been used to determine ordering.

though I'd say it's worthy of expansion. Specifically with regard to the Linux kernel, LWN has a good article.[2] I'm also not sure why this text says "partial ordering" - to avoid all deadlock in this way, there must be a total ordering over every set of simultaneously-held locks. You can't just know class A locks come before class B locks; you need to know that instance A1 comes before instance A2. (The writer might have been trying to say that if locks A and B will never be held at the same time, you don't need to order them. But that's a different statement.) If I find some time, I'll try to dig up some papers on CiteSeer to use as references.

Also, I don't think the preemption sentence really belongs here. Giving priority to one process is a strategy to take after detection of circular wait requests, not prevention of them. And it certainly needs more explanation - if process A's already-held locks are released, there needs to be a strategy for rolling back the work it's already done, waiting for the other process to finish, then trying to replay process A's work later. -Slamb 21:13, 20 August 2006 (UTC)

You can't just know class A locks come before class B locks -- and why not?
Imagine we extend the pencil-and-ruler example in the article to a bunch of people trying to draw straight lines. As long as everyone agrees to grab class A locks (the rulers) before grabbing class B locks (the pencils), then there will be no deadlock, right? The people do *not* need to know that instance "red pencil" comes before instance "black pencil", right? --70.189.73.224 20:15, 27 September 2006 (UTC)
The people do need to know "red pencil" comes before "black pencil" if they will be grabbing both specifically - "I need the red pencil", then "I need the black pencil" rather than "I need a pencil", then "I need another pencil". The former is the situation the LWN article is describing with its "locks that are allocated dynamically". The latter is the situation described in, say, the Banker's algorithm. (I think the LWN article writer would say that each resource type in the Banker's algorithm is a statically-allocated N-ary semaphore, and each lock type in Linux is N dynamically-allocated binary semaphores.) -Slamb 22:09, 27 September 2006 (UTC)
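For reference, here is a minimal sketch of the Banker's-algorithm safety check mentioned above, restricted to a single resource type for brevity (the numbers are made up):

```python
def is_safe(available, allocated, maximum):
    """Banker's-algorithm safety check, single resource type.
    allocated[i] and maximum[i] are process i's current holdings and
    declared maximum; the state is safe if some order exists in which
    every process can still obtain its maximum and finish."""
    need = [m - a for m, a in zip(maximum, allocated)]
    finished = [False] * len(allocated)
    free = available
    progress = True
    while progress:
        progress = False
        for i, done in enumerate(finished):
            if not done and need[i] <= free:
                free += allocated[i]    # process i finishes and releases
                finished[i] = True
                progress = True
    return all(finished)

# 3 units free; remaining needs are 2, 1, 7: order P0, P1, P2 works.
print(is_safe(3, [1, 4, 2], [3, 5, 9]))   # True
# 1 unit free and every process still needs at least 2: unsafe.
print(is_safe(1, [1, 4, 2], [3, 6, 9]))   # False
```

The real algorithm tracks a vector of resource types per process, but the search for a completion order is the same; this is the "N-ary semaphore" case, in contrast to the instance-level lock ordering discussed above.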

Merge?

Should this article be merged with the preemption (computing) and circular wait articles? (There is no hold-and-wait page, and the mutual_exclusion page probably contains enough information to stand on its own.) Another possibility would be to make a separate page for Coffman conditions and merge preemption, circular wait and mutual exclusion with that. MagiMaster 08:47, 10 July 2006 (UTC)

  • I agree for circular wait but not for preemption. Preemption is a general concept that should be defined on its own without necessarily speaking about deadlocks. Matteo 09:19, 10 July 2006 (UTC)
  • Perfect. Circular wait (one of the methods of deadlock prevention) is quite relevant. Definitely deserves to be merged with this article.
  • I'm OK with merging this with circular wait, but as Matteo has already stated, preemption isn't exclusively linked to deadlocks and most people would expect it to have an article of its own when searching for it. Denis Kasak 16:34, 15 August 2006 (UTC)
  • I'm with Matteo; agree on circular wait, since it's only ever discussed as a precondition to deadlock, but not preemption (computing). The latter is a topic in its own right, relevant to lockless systems. The usage of "preemption" here is talking about lock-breaking and rollback, which is a more complex, deadlock-specific subject. Slamb 21:20, 20 August 2006 (UTC)
  • Yes, I agree as well on both points. Circular wait does not belong as an independent topic. JWHPryor 14:45, 26 August 2006 (UTC)
  • Circular wait should be merged, no question. Preemption, though, should remain as is. « SCHLAGWERKTalk to me! 17:59, 10 November 2006 (UTC)
  • Agree--Yxen 22:32, 22 November 2006 (UTC)

Organization

I'm quite confused by the following sections, which seem to be primarily discussing:

  • Deadlock avoidance - resource manager detecting potential deadlock and raising error
  • Deadlock prevention - resource consumer avoiding deadlock in the first place
  • Deadlock detection - (1) resource manager/consumer post-detection rollback process, (2) static verification that a consumer is deadlock-free

I'd like to explicitly mention that there are often two players involved - the manager (perhaps a kernel or RDBMS) and the consumer (application), and then break them apart into:

  • Deadlock avoidance - consumer avoiding deadlock in the first place, passing mention of static verification
  • Deadlock detection - manager detecting deadlock as it happens
  • Deadlock recovery - consumer/manager rollback process

and keep the roles clear in the writing of each section. Comments? Agree/disagree? -Slamb 21:37, 20 August 2006 (UTC)

Hmm. This paper talks about the "prevention" vs. "avoidance" distinction and argues that it's erroneous. Apparently, algorithms that require each transaction to start by declaring all the resources it might later acquire (like the Banker's algorithm) are typically called "avoidance", except, for some reason, resource preallocation. Other algorithms (like a static ordering, which they argue doesn't actually work with semaphores) are called "prevention".

What actually seems to set apart the prevention schemes from the avoidance schemes is that the prevention ones involve just the consumer; the avoidance ones involve the consumer giving the manager information it needs to consider "safe states" when scheduling resource acquisitions. I don't (yet) see any material that really gives a good definition of the two terms, though. Unfortunately, I don't have easy access to a lot of the books these papers reference. -Slamb 02:08, 28 August 2006 (UTC)

Neglected deadlock situations

I think many developers have the misconception that "resource" = "mutex", and that's not necessarily true. I'd like to add counterexamples to the page. I'm trying to avoid original research, though. Are there good citable examples around?

Here's a situation that came up in my own work:

  1. My Python graphing program writes to a gnuplot control pipe.
  2. gnuplot writes a bunch of stupid warnings to stderr, a pipe back to my Python graphing program.
  3. gnuplot blocks at PIPEBUF bytes, waiting for my program to consume them.
  4. My program blocks at PIPEBUF bytes, waiting for gnuplot to consume them.
  5. Deadlock.

I think it's a good counterexample. No locks are involved, and furthermore, the program that acquires the resource isn't even the one expected to release it! The solution is also interesting - I essentially removed one of the shared resources by making gnuplot write to a temporary file, which my program examines after the fact.
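The pattern generalizes to any parent that reads two pipes from a child one at a time. A sketch in Python, where a tiny child program stands in for gnuplot (the byte counts are arbitrary, chosen to exceed any common pipe buffer):

```python
import subprocess
import sys

# A child that writes far more than a pipe buffer to BOTH stdout and
# stderr, standing in for gnuplot and its warnings.
child_code = (
    "import sys\n"
    "sys.stdout.write('x' * 200000)\n"
    "sys.stderr.write('w' * 200000)\n"
)

proc = subprocess.Popen([sys.executable, "-c", child_code],
                        stdout=subprocess.PIPE, stderr=subprocess.PIPE)
# Reading proc.stdout to the end and only then proc.stderr (or vice
# versa) can deadlock exactly as described above. communicate() drains
# both pipes concurrently, so neither side ever blocks the other.
out, err = proc.communicate(timeout=30)
print(len(out), len(err))  # 200000 200000
```

Redirecting stderr to a temporary file, the fix described above, avoids the problem equally well, because a file never exerts back-pressure on the child.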

Another neglected situation is a deadlock where a single context of execution has both resources and is waiting for itself. This can sometimes happen in asynchronous designs. I can come up with an example of this, but again it'd be from my own work, not something citable. -Slamb 21:56, 20 August 2006 (UTC)
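The single-context case is easy to demonstrate with a non-reentrant lock in Python (a deliberately minimal sketch, not drawn from any cited source):

```python
import threading

# A single thread deadlocking on itself: a non-reentrant lock acquired
# twice by the same context of execution, as can happen when a callback
# re-enters code that already holds the lock in an asynchronous design.
lock = threading.Lock()
lock.acquire()
# This second acquire can never succeed; a timeout makes the deadlock
# observable instead of hanging the program.
self_deadlock = not lock.acquire(timeout=0.5)
lock.release()
print(self_deadlock)  # True: the thread was waiting on a lock it held

# A reentrant lock (RLock) lets the same thread re-acquire freely.
rlock = threading.RLock()
rlock.acquire()
reentrant_ok = rlock.acquire(blocking=False)
rlock.release()
rlock.release()
print(reentrant_ok)  # True
```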