Talk:Command-query separation

From Wikipedia, the free encyclopedia

Command-query separation is a former featured article candidate. Please view the links under Article milestones below to see why the nomination failed. For older candidates, please check the archive.
March 1, 2004 Featured article candidate Not promoted

[edit] older entries

I'm not all that comfortable with the discussion of race conditions. I don't see how CQS reduces race conditions; if anything, it adds new ones. Avoiding race conditions wasn't one of the goals of CQS, and I don't see how it achieves that. At the very least, I think an example is warranted. --P3d0 00:23, 12 Aug 2003 (UTC)

No response in over a week, so I have removed the paragraph in question. For the record, here it is:
For example, eliminating value returns from state modifications avoids the possiblity of race conditions between old and new states. This may at first seem to over-complicate the program by requiring two methods to be used (with explicit synchronization in threaded systems) where before a single method would suffice, but in practice this actually makes explicit hidden assumptions about data dependencies which would otherwise be invisible to the programmer.
Anyone who wants to re-insert it ought at least to provide an example, but ideally should describe this in more detail. --P3d0 14:15, 21 Aug 2003 (UTC)

I just took a look at the current race condition coverage and I am not convinced that it is accurate. The current coverage states that breaking CQS by having one method change state and return a value will prevent a race condition. I don't believe that is the case, at least not with most programming languages. In the way most languages are interpreted or compiled and run in a multi-threaded environment, the thread scheduler can interrupt one thread at any point, even if the interrupt happens in the middle of a function call. My example is that your non-CQS function could get interrupted after it changed the state but before it returned the value. (In Java, the synchronized function modifier may ameliorate this partially, but that is a feature of one language, and even it is not fully safe.) Thus, your non-CQS function is just as susceptible to a race condition as a CQS function. Therefore, I believe that the original paragraph was correct in saying that CQS was superior by forcing you to handle all assumptions about data dependencies and synchronization explicitly. 66.92.232.102 17:07, 27 August 2007 (UTC)

Here's my thinking. Local variables are generally not susceptible to race conditions, because other threads can't access your stack frame. That makes them a handy way to avoid a race conditions:

set_field( new_value ):
  old_value = this.field;
  while compare_and_swap( this.field, old_value, new_value ) == false;
  return old_value;

The caller of this function is guaranteed that the returned value is what was in the field immediately before new_value was put there. I can think of no way to write this function in a CQS-compliant manner without locks. Can you? --P3d0 22:48, 27 August 2007 (UTC)

The example feels contrived. Nevertheless, I take your point is that the compare-and-swap atomic operation is a non-CQS routine, but it is a well-known hardware and operating system routine. I suppose that the answer is that all engineering is a trade-off. For the sake of code correctness and maintenance, you should strive to make all assumptions in your code explicit, rather than hiding them behind layer after layer of semantics. In general, if there is a data dependency between two or more values, this should be expressed with a lock. However, in some relatively rare circumstances the need for better performance justifies sacrificing some maintainability of the code. Hint: if you are not writing an operating system microkernel, enterprise-grade database engine, video game graphics engine, a massive scientific simulation, or a very resource-constained embedded program, you probably do not need to be doing this.

Furthermore, it is important to note that the atomicity of the non-CQS compare-and-swap function is only guaranteed by a hardware instruction. If we make exceptions for the handful of operating system routines that are atomic and non-CQS, we find that writing your own non-CQS functions does not guarantee atomicity. Using your example, your functional requirement is to read an old value from a shared variable, and replace it with a new value, and you have a data dependency between the old value and the new current value of the shared variable. To meet these functional requirements, two post conditions must hold true at the end of your function for all possible interleaving of threads: 1) the return value is the last value of the shared variable before the update occurred, and; 2) the value of the shared variable equals the value of new_value. Without a lock over the function call, your function still has a race condition! True, the compare_and_swap operation is atomic and the surrounding loop is guaranteed to replace the most recent old_value with new_value. (Except for a pathological race condition in which the thread is interrupted every time at the beginning or end of the loop and the shared variable is always changed, causing an infinite loop; such an interleaving feels contrived so I am giving you the benefit of the doubt.) However, what if the thread is interrupted immediately after the compare-and-swap has returned true but before the function returns? In that case, the value of the shared variable will not be new_value, but something unexpected.

Solutions: 1) Use CQS and explicit locks. Your data dependencies are explicit for all developers to see, or; 2) decide that performance trumps maintainability and use compare-and-swap directly, without wrapping it in a function. Note: It seems likely that if you truly have a data dependency between old_value and the new current value of the shared variable, you are going to end up performing several more operations on them, which usually entails locks. In conclusion, I still feel that the assertion that CQS reduces race conditions by forcing programmers to think explicitly about data dependencies and locks is warranted, with a few performance-related exceptions. 66.92.232.102 10:47, 28 August 2007 (UTC)

[edit] "devised by"

While it may be true that "it was devised by Bertrand Meyer as part of his pioneering work on the Eiffel programming language" (citation would be nice), that makes it sound like a great whole new idea. I doubt that it was so "pioneering" as that suggests. Even Pascal had the notion and separation of procedure (command) and function (query), though it likely didn't spell out that functions shall not touch the inner state (or maybe it did, don't have formal docs), and the third paragraph of the texts outlines nicely that it's quite analogical to the goto/control flow issue.

[edit] multi-threading issue

The examples fail to address what the issue is. It would be helpful if it stated explicitly what the pattern tries and why it fails.

AFAICS the pattern is (trying to be) a unique-key generator, i.e. subsequent calls of the function "value()" shall return unique values. The first attempt "increase_x_and_return_it" is no more or less wrong in a multithreaded case than the last one (which is identical apart from syntax).

More specific, there are four operations:

  • 1. Get global key X.
  • 2. Increase local copy of X' to (X + 1)'.
  • 3. Store back (X + 1)' to X.
  • 4. Return (X + 1)'.

The article is more confusing than clarifying on the topic that steps 1-3 must be executed without interference, on the contrary it somehow gives the impression that step 4 matters - which doesn't.

The above comments in 'older entries' don't help much either:

  • 1. The compare_and_swap is IMHO rather pointless, because it actually just does the same thing the other examples do, but hidden behind a fancy name and working "correctly" (miraculously without explanation that it has to do the same amount of locking steps 1-3, programmers will know that it's *meant-to-be* an atomic operation on CPU level, but this is not an article for programmers who already know...)
  • 2. The compare_and_swap is even more wrong if you see it from the keygen angle: It again focuses on the notion that step 4 is the problem which it isn't. (Aside from that, it behaves unspecified if old_value is not equal to this.field (because another thread B executed the whole operation between thread A going from old_value := this.field to while compare_and_swap.)