Talk:Quicksort

From Wikipedia, the free encyclopedia

Archive: /core implementations


(Older discussions)

Quicksort can actually be done in O(n log n) time worst case, by carefully choosing the pivot - the algorithm to do so is a bit complex though. See http://www.comp.mq.edu.au/courses/comp333/Lecture/divide-and-conquer_4.pdf for details. -- SJK

Just choose the true median as the pivot. You can find the true median in Θ(n) time, and since partitioning takes Θ(n) time too, you are not increasing the complexity. Since the median guarantees perfect halving, you have O(n log n). You definitely increase the constant factor, though. Also, finding the median in Θ(n) time is quite complicated. -- Hari
For the record, this is mentioned in the Relationship to selection section. Derrick Coetzee 17:11, 17 Sep 2004 (UTC)
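The median-pivot idea above can be sketched in C as follows. This is an illustrative sketch (all names are mine, not from the article), and for brevity it places the median with a simple quickselect; the Θ(n) worst-case guarantee Hari describes actually requires the more involved median-of-medians selection.

```c
static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Partition around the value at index p; return the pivot's final index. */
static int partition(int a[], int lo, int hi, int p) {
    swap(&a[p], &a[hi]);
    int pivot = a[hi], i = lo;
    for (int j = lo; j < hi; j++)
        if (a[j] < pivot) swap(&a[i++], &a[j]);
    swap(&a[i], &a[hi]);
    return i;
}

/* Quickselect: after this call, a[k] holds the k-th smallest element
 * of a[lo..hi], with smaller elements to its left, larger to its right.
 * (A true worst-case guarantee needs median-of-medians instead.) */
static int select_kth(int a[], int lo, int hi, int k) {
    while (lo < hi) {
        int p = partition(a, lo, hi, lo + (hi - lo) / 2);
        if (p == k) return p;
        if (p < k) lo = p + 1; else hi = p - 1;
    }
    return lo;
}

/* Quicksort with the true median as pivot: perfect halving. */
void median_quicksort(int a[], int lo, int hi) {
    if (lo >= hi) return;
    int mid = lo + (hi - lo) / 2;
    select_kth(a, lo, hi, mid);        /* median now sits at index mid */
    int p = partition(a, lo, hi, mid); /* partition around that median */
    median_quicksort(a, lo, p - 1);
    median_quicksort(a, p + 1, hi);
}
```

As Hari notes, the constant factor grows: every level now pays for a selection pass on top of the partition pass.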

One simple way to avoid the worst case on already-sorted input is to add a loop before quicksort that checks whether the elements are already sorted. This check takes only O(n) time and does not change the overall O(n log n) complexity. Ashvin
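Ashvin's pre-check could look like the minimal C sketch below (names are illustrative). Note that it only guards the fully-sorted case; other orderings that provoke the worst case are unaffected.

```c
#include <stdbool.h>
#include <stddef.h>

/* O(n) scan: true if a[0..n-1] is already in nondecreasing order.
 * Call this before quicksort and skip the sort when it returns true. */
static bool already_sorted(const int a[], size_t n) {
    for (size_t i = 1; i < n; i++)
        if (a[i - 1] > a[i])
            return false;
    return true;
}
```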


The article uses list to mean sequence of elements. But the terminology is confusing because quicksort does not work on linked lists. I think list should be changed to array everywhere in the article. -- Arvindn

Er, QuickSort works just fine on linked lists, as long as you write your implementation using linked lists and not arrays... Or so I'd think Chinju

Yep. The key operation is "partition", which is essentially an operation to split a queue/stack/bag into two queues/stacks/bags. You can define a nice partition function for linked lists. -- Hari
Quicksort actually is particularly suitable for doubly linked lists, because it only accesses elements sequentially (although in both orders). It doesn't even require any extra storage. For singly linked lists though mergesort is probably the way to go. Heapsort doesn't work on linked lists at all. Perhaps this is worth adding to some comparison sections. Derrick Coetzee 17:11, 17 Sep 2004 (UTC)
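The linked-list point can be illustrated with a hedged C sketch of quicksort on a singly linked list (the node type and all names are mine, not from the article): "partition" just splits the list into two bags, so no backwards traversal is needed.

```c
#include <stddef.h>

struct node { int key; struct node *next; };

/* Distribute every node of src into the lt or ge bag by comparing with
 * pivot. Nodes are prepended, so order within each bag is not preserved
 * (quicksort is unstable anyway). */
static void split(struct node *src, int pivot,
                  struct node **lt, struct node **ge) {
    while (src) {
        struct node *next = src->next;
        if (src->key < pivot) { src->next = *lt; *lt = src; }
        else                  { src->next = *ge; *ge = src; }
        src = next;
    }
}

/* Sort the list at head, then append tail; returns the new head. */
static struct node *qsort_list(struct node *head, struct node *tail) {
    if (!head) return tail;
    struct node *lt = NULL, *ge = NULL;
    struct node *pivot = head;           /* first node as pivot */
    split(head->next, pivot->key, &lt, &ge);
    pivot->next = qsort_list(ge, tail);  /* sorted >= side, then tail */
    return qsort_list(lt, pivot);        /* sorted < side, then pivot */
}

struct node *quicksort_list(struct node *head) {
    return qsort_list(head, NULL);
}
```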

This so-called "realistic data" in which one needs to use a random number generator to ensure proper performance... What kind of data would that be? Data that's already been pre-sorted to make things hard for you? It seems to me this business about using a random number generator is the result of inaccurate thinking about probability rather than a realistic concern (except, of course, in the case mentioned in the article where the data is fed to the algorithm by an adversary). Does anyone mind if I remove most of the discussion about random number generator usage?

Also, the article mentions that Quicksort can be optimized since every element needs only to be compared against the (mostly unchanging) pivot element ("The inner loop which performs the partition is often very amenable to optimization as all comparisons are being done with a single pivot element"). What exactly is meant by this? How exactly can one take advantage of that situation for optimization? Just wondering...Chinju 21:53 Mar 6, 2003 (UTC)

I adjusted the random pivot wording based on real-world knowledge that lists are often partially sorted and random isn't the best for that situation. Random is fine if the data isn't partially sorted, but humans don't have to be purists when optimizing algorithms. :) The seldom-changing pivot element may let it be held in a register or improve the locality of memory references. JamesDay 15:46, 22 Oct 2003 (UTC)



"This works well in practice but chance can still place the lowest element in the middle and cause worst case performance."

Unclear. Putting the list's lowest element in the middle would not cause worst case performance, or even make it worse than O(n log (n)). If it kept happening at every level of the recursion, that would cause worst case performance all right, but that's very unlikely, and is a possibility for any quick and simple way of choosing a pivot.

I agree. I've reworded that and added a bit to the one I described as "best", since that does have higher overhead and can't always be best. In the rewriting I did I was trying to get away from the over-emphasis of the performance with attack data and left in a criticism which middle element choice didn't deserve so much in the general case. JamesDay 12:47, 28 Oct 2003 (UTC)



Sorry about the unnecessary alphabetizing. I must be going blind, since I completely overlooked the intro to the section that explained the ordering of algorithms. Richss 03:17, Sep 8, 2004 (UTC)

Frankly, I found the ordering by length of code peculiar, considering most of the shorter examples are in non-mainstream languages and aren't efficient in practice. In particular, 5 different functional language examples may be overkill. Perhaps alphabetical would be less biased. Derrick Coetzee 03:49, 8 Sep 2004 (UTC)
  • 5 different functional language examples may be overkill —actually, the first eight samples are written in functional languages... nevertheless, the first four are all fundamentally different from each other (even the Miranda and NGL entries, which look somewhat alike, are really different: the former is value-level, the latter function-level). The Erlang, Haskell, and Miranda entries are indeed completely analogous, though (however, which to leave out and how to justify the choice? for then we should also choose between the Java and C# implementations) The OCaml and CL entries are closely analogous in spirit to those three, but expressed very differently. Regarding efficiency, some of those first eight implementations are not only clear, but very fast. If speed were the only concern, then the right example would be in assembler.
Anyway, the article is about the algorithm, not about popular or mainstream prog langs, and this provides some support to having the shortest implementations come first. Alphabetical order is best when political correctness or the like are in order. IMO, this is not the case: it is not a matter of giving preference to one or another prog lang, it is a matter of keeping the interests of the reader of the article above everything else. This is, in fact, one reason for moving the samples section to the end of the article, right past the External Links, See Also, and References sections. With some of the longest samples among the first, an alphabetical ordering makes it very hard for the article reader to realize that there are vastly different ways of expressing the quicksort algorithm in different prog langs. Just imagine if someone comes along with an assembler implementation that takes 200 lines... Furthermore, the shortest-first arrangement is more informative than the alphabetical one in the sense that with the latter it is very hard for the reader to find out several things that can be readily gleaned from the former: which implementations are alike, which is the shortest one, etc. If there were dozens of implementations, then the alphabetical listing would indeed come in handy in order to locate a particular one, but this doesn't apply for small numbers. — danakil 05:06, Sep 8, 2004 (UTC)
My thought had not been about somebody reading the article for pleasure (whom you wish to keep interested). Rather, I was thinking about what someone like myself would be doing, which is using it as a reference (encyclopedia) to see an example in whatever language I am using. Richss 12:04, Sep 8, 2004 (UTC)
  • I'm sorry, Richss. I hadn't intended to convey the idea of reading for pleasure (although now that you mention it, it doesn't seem to be a goal to strive to keep the reader entertained). By keeping the interests of the reader [above all] I certainly meant to tend to the interests of those who use Wikipedia as a reference tool.—danakil 16:24, Sep 8, 2004 (UTC)
It's true that efficiency is not the main concern, and I concede that the functional examples are sufficiently different, but can you really call a sort which isn't O(n log n) at all quicksort, even if it makes the same exchanges in the same order? (referring here to the example using list comprehensions - I'm not sure how the others work out) I agree that the number of examples is so far small enough that it's easy enough to spot your favourite language in the listing. Derrick Coetzee 14:25, 8 Sep 2004 (UTC)
  • I agree with you: it seems reasonable to require that a sort be actually O(n log n) in order for it to be called quicksort. Nevertheless, I'm sure there would be quite a few moors entrenched in that coast. Furthermore, there are quite a few different Haskell implementations, I wonder if all of them suffer from the same problem (I've heard several people claim that Miranda is much faster than the common Haskell implementations—Miranda is a commercial product while Haskell's use is mainly academic). On the other hand, the J,NGL,OCaml entries are, IME, certainly O(n log n), though. And I would be surprised if the Erlang and CL ones weren't. One final remark about performance: if someone was implementing production quicksort using Java, C, C++ or C#, he/she would likely go the non-recursive route. It would be nice to have an example of that. — danakil 16:15, Sep 8, 2004 (UTC)


Just out of curiosity, can somebody provide an explanation (possibly just a link to an explanation) of why the list comprehensions make the Haskell implementation of quicksort have an average case complexity of Theta(n^2) instead of Theta(n log n)? Thanks. Chinju 05:28, 26 Sep 2004 (UTC)

Doesn't seem Theta(n^2) to me. It is inefficient since partitioning that way means traversing the list twice, but it's still a Theta(n) operation, as well as the performed list concatenation (which could also be avoided). So the final complexity just depends on the number of times you'll perform those operations (which depends on how the partitions turn out). --jadrian 00:35, 29 Sep 2004 (UTC)


Yeah, it seems Theta(n log n) in the average case to me, despite the inefficiencies you pointed out, which is why I was curious as to how the quadratic complexity was derived. If I can't find any explanations soon, I think I'll remove the notice about being "O(n^2) on average". --Chinju 19:38, 30 Sep 2004 (UTC)

I don't oppose this change. The original statement probably had to do with excessive delay of computation in Haskell implementations without memoization, in that if you fully expand out the quicksort expression for a given Haskell list before evaluating it, you end up with a very large expression. Just a guess though. Derrick Coetzee 20:32, 6 Oct 2004 (UTC)

"Note that the partition procedure only requires the ability to traverse the list sequentially in both directions; therefore, quicksort is not confined to operating on arrays (it can be used, for example, on doubly-linked lists)."

No, it does not require that. Doing it with singly linked lists is trivial. --jadrian 23:34, 16 Oct 2004 (UTC)

You're right, this isn't hard, since the order of elements in the resulting sublists is unimportant, allowing you to build them both backwards. I'll update the comment. Derrick Coetzee 04:40, 17 Oct 2004 (UTC)

Capitalization

Perhaps this is a stupid question, but is quicksort capitalized (Quicksort) or not (quicksort)? What does the original paper do? What's common practice? Derrick Coetzee 04:48, 17 Oct 2004 (UTC)

A well-implemented quicksort requires at most lg(N) partition boundaries on the stack. Instead of recursing for both left and right partitions, recurse for the *smaller* partition, then "adjust" the partition boundaries at the current call level and loop. For N, say, one billion, the depth of recursion will not exceed 30 (as at each level of recursion the number of elements is at most half what it was at the preceding level). One-sided recursion is an *old* idea.

Pseudo-C pseudocode to illustrate the point:

quicksort(array, left, right)  // say right-left+1 = Z, then...
{
 while (right - left > threshold)
 {
   pivotPosition = partition(array, left, right);
   if (pivotPosition - left < right - pivotPosition)
   {
     quicksort(array, left, pivotPosition - 1);  // at most Z/2 elements
     left = pivotPosition + 1;
   }
   else
   {
     quicksort(array, pivotPosition + 1, right); // at most Z/2 elements
     right = pivotPosition - 1;
   }
 }
 insertionSort(array, left, right);
}

-- james_barbetti@yahoo.com

If you read a little further in the article you'll find a description of partially tail-recursive quicksort (which is what you describe) including pseudocode. Deco 23:59, 20 Nov 2004 (UTC)

My recent drastic changes

I moved almost all of the examples from this article to quicksort implementations; I think they were getting far too numerous, exceeding the rest of the article in size by multiple times. I attempted to leave behind a small set such that no two were too similar, the popular languages were mostly covered, and most of the basic implementation techniques were covered. I also extracted some of the introsort discussion to an article on introsort, not because it was too long but because I'd like to see introsort expanded in the future and it wasn't poised to do that as a section in this article. I've probably upset some people whose favourite language was moved out, and I removed some I rather like myself, but I think it benefits our readers to keep the focus on the quicksort algorithm itself. Deco 00:54, 21 Nov 2004 (UTC)

changing Θ() to O() ?

I see someone changed a bunch of Θ() to O(). Is this correct, or a misunderstanding of the debate at Talk:Big_O_notation ? --DavidCary 07:15, 15 Dec 2004 (UTC)

It's called Big "O" Notation, but I thought it was technically written as Θ().--НиколайTalk 00:55, 27 September 2006 (UTC)

C implementation

Has anyone tested the C implementation? There seems to be a bug in the algorithm, and the swap function won't work, as C uses call by value.

Swap could be implemented as a macro — it's really acting as pseudocode there. I'll have a look at the rest of it later. Deco 17:52, 28 Dec 2004 (UTC)

The C implementation does not distinguish between pivot keys and pivot indices. The implementation shown works, but only because both keys and indices are integers.

numbers[left] = pivot // pivot key
pivot = left          // pivot index

To clarify and ease adaptation to other key types, I think the pivot index should be named pivotIndex, i.e. the algorithm should read

void q_sort(int numbers[], int left, int right)
{
  int l_hold = left;
  int r_hold = right;
  int pivot = numbers[left];
  int pivotIndex;
  int t;
...
  numbers[left] = pivot;
  pivotIndex = left;
  left = l_hold;
  right = r_hold;
  if (left < pivotIndex)
    q_sort(numbers, left, pivotIndex-1);
  if (right > pivotIndex)
    q_sort(numbers, pivotIndex+1, right);
...

--Schmid 09:42, 13 March 2007 (UTC)

C++ Code

The C++ code was taken from [link removed; blocked by spamfilter —Ruud 20:22, 23 April 2006 (UTC)] without notification of the original author. That's very bad style and shouldn't happen on wikipedia. I removed it for now, hoping someone comes up with another version.

That partition algorithm seems wrong

The partition algorithm given seems wrong to me. If the first item is larger than the pivot, for example, it swaps it with itself and then moves on. Am I missing something, or is this just a mistake? RSpeer 20:57, Feb 21, 2005 (UTC)

I added some statements to try to help address your confusion. The swap is only done if the element is less than or equal to the pivot (not greater), but it might so happen that an element larger than the pivot is swapped into a position preceding the pivot's final position. The trick is that an element may be moved multiple times before reaching its final place, so this is okay. I hope that makes some sense. Deco 02:24, 24 Feb 2005 (UTC)

It seems that other people are confused by the partitioning algorithm as well. (I guess the frustrated anon defacing the page would be one example.) What is the advantage to this algorithm, and why don't we use the one from CLRS? RSpeer 03:07, Mar 21, 2005 (UTC)
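For reference, the partition scheme from CLRS that RSpeer mentions is roughly the following C sketch (names are mine): the last element is the pivot, and i tracks the boundary of the "≤ pivot" region, which many people find easier to reason about than the bidirectional scan.

```c
static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* CLRS-style (Lomuto) partition of a[lo..hi]: a[hi] is the pivot.
 * Invariant: a[lo..i-1] <= pivot, a[i..j-1] > pivot.
 * Returns the pivot's final index. */
int lomuto_partition(int a[], int lo, int hi) {
    int pivot = a[hi];
    int i = lo;
    for (int j = lo; j < hi; j++)
        if (a[j] <= pivot)
            swap(&a[i++], &a[j]);
    swap(&a[i], &a[hi]);   /* move the pivot into its final place */
    return i;
}
```

Its cost is one extra swap here and there compared with the bidirectional scheme, which is presumably why the article's version is preferred for performance.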

updated the C code

I added a little more code to the C example.

I felt that the original example provided on the wiki was elegant and good for explanatory purposes; however, I felt it would be prudent to include a C example that could be copy-pasted into a source file.

I added an optimization to the original code's partitioning logic because I noticed it copied a ton of elements unnecessarily.

I also pulled a variable out of the if() block and made it static to save some memory.

Using the first element as the pivot is a clever, elegant solution to the pivot selection problem; however, I wanted to demonstrate that this solution could be adapted to an arbitrarily selected pivot very easily. Hence, I used a bit of 'forbidden' preprocessor magic to keep the code usable while still conveying this point.

Hope my edits are okay with everyone, and thank you all for this page; it's very informative!

I cleaned up the C++ code (made the pivot more clear) and added an optimization from the C code.


Partitioning statement seems wrong

This statement looks wrong to me:

Because the above pseudocode procedure will sometimes swap elements which are equal, it is slightly unstable.

To my knowledge, there is no known in-place stable partitioning algorithm (and I've looked). The issue is not merely that it sometimes swaps equal elements (which is easy to fix), but that it can swap an element over an element which is equal to it. Consider this list:

7, 7, 3, 1, 4

The very first exchange would exchange the 4 with the first 7. Now, the first 7 is following the second 7, where before it was preceding it. This is an unstable exchange.

Of course, if you partition not-in-place, by building a list of those less-or-equal and those greater, each preserving the original list order, then the sort is stable.

In short, I'm deleting it, because it's wrong. :-) Deco 06:11, 4 Mar 2005 (UTC)
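The stable, not-in-place partition Deco describes can be sketched in C like this (first element as pivot, O(n) scratch space; all names are mine). On the 7, 7, 3, 1, 4 example above, the two 7s keep their original order.

```c
#include <stdlib.h>

/* Stable partition of a[lo..hi] with a[lo] as pivot: smaller keys,
 * then the pivot, then greater-or-equal keys, each side preserving
 * original relative order. Uses O(n) scratch space, which is exactly
 * what the in-place scheme cannot afford. Returns pivot's index. */
int stable_partition(int a[], int lo, int hi) {
    int pivot = a[lo];
    int n = hi - lo + 1;
    int *tmp = malloc(n * sizeof *tmp);
    int k = 0;
    for (int i = lo + 1; i <= hi; i++)   /* smaller keys, in order */
        if (a[i] < pivot) tmp[k++] = a[i];
    int p = lo + k;                      /* pivot's final position */
    tmp[k++] = pivot;
    for (int i = lo + 1; i <= hi; i++)   /* >= keys, in order */
        if (a[i] >= pivot) tmp[k++] = a[i];
    for (int i = 0; i < n; i++) a[lo + i] = tmp[i];
    free(tmp);
    return p;
}
```

Applying quicksort recursively with this partition yields a stable sort, at the price of linear extra memory per partitioning pass.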

in-place sort

For the edit that was just made (deleting "it's an in-place sort"): the definition of in-place sort was wrong. Whoever wrote it said that it meant that elements already in their proper place would not be moved. That is completely false: in-place sort means that the algorithm has O(1) (constant) memory requirement. That's on the page now; everything is now as it should be regarding that. Decrypt3 18:56 EDT, 4 April 2004

When Deco claims "there is no known in-place stable partitioning algorithm", I'm pretty sure that *includes* Quicksort. If Wikipedia is going to standardize on a definition of the term In-place algorithm that excludes Quicksort, we need some other term to describe algorithms like Quicksort and Tree traversal. What term would you suggest? What do you call algorithms that are not in-place" ? --DavidCary 05:03, 13 Mar 2005 (UTC)
The partitioning step of quicksort is in-place. It's the recursive step that isn't. RSpeer 06:56, Mar 13, 2005 (UTC)
Yes, my comment was meant to emphasize that most in-place partitioning algorithms are unstable, and most stable ones are not in place. Most variants of quicksort, however, are not in-place because they use a stack of depth at least log n (in fact, if the values being sorted may be arbitrarily large, it requires log² n space, where n is the total number of bits). The distinction between additional storage on the call stack and explicit additional storage is really a purely artificial one created by hardware and programming languages. You can easily write a version of quicksort that makes no recursive calls but manages an explicit stack of O(log n) size.
The closest thing I can think of for describing the class of algorithms you refer to is DSPACE(log² n), which is the class of problems that can be solved in space log² n, where n is the total number of bits in the input. Any algorithm that has log n recursive depth and log n bits in each stack frame lies in it, including quicksort and recursive tree traversal. Tree traversal has a number of simple in-place versions, though. Deco 07:59, 13 Mar 2005 (UTC)
To clarify, I refer here to the partially tail-recursive variant of quicksort, and to tree traversal of a complete binary tree. Other cases can require more space. Deco 08:10, 13 Mar 2005 (UTC)
Some additional comments: quicksort is not a partitioning algorithm. It is a sorting algorithm. It uses partitioning as a subroutine. Algorithms that aren't in-place are sometimes called not-in-place, and I don't know of any briefer name. Deco 20:35, 13 Mar 2005 (UTC)

Capitalization

This article uses both "Quicksort" and "quicksort", in a hideously inconsistent way. Which way should it be? - Fredrik | talk 09:47, 21 Mar 2005 (UTC)

I've asked before — no one can seem to decide. I would vote we do it however Hoare did it, but unfortunately I have no access to his paper. Mathworld seems to use lowercase quicksort however, and this seems nicer to me and more consistent with other sort names. I'm leaning towards quicksort if there is some consent. Deco 04:54, 29 Mar 2005 (UTC)

Average complexity

The average complexity given can't be right. Quicksort is O(n log n), which makes Θ(1.39 log n) impossible. --P3d0 15:27, May 12, 2005 (UTC)

Could anyone write this addition on the article?

[This information has been removed]

I'm afraid this violates our Wikipedia: No original research policy. Deco 19:27, 20 May 2005 (UTC)
You could have informed me before. It took me two hours writing that explanation in English. Oh, well. I thought this would be the perfect page to release the optimization for everyone. I'm going to remove the C source code, too. Thank you, anyways.
Who could have informed you before? --P3d0 23:51, May 20, 2005 (UTC)
Hey people, isn't it kind of harsh to delete the post and just tell a newbie that Wikipedia is not a place for original research? While this cannot be put in the article, we can suggest that he (or possibly she) post it to Usenet or something else. Also, I don't think we have to remove the post. We tend to keep these kinds of somewhat irrelevant posts on the talk page. See Japanese war crimes for example. Finally, I think, P3d0, your comment is a little too unhelpful or unfriendly; apparently, this guy is upset. I mean, aren't we allowed to have a little more fun and friendly yet irrelevant conversation on the talk page?
Since his method appears to be neat, I restored it to my user page. See User:TakuyaMurata/Barcelo's method. I am very curious if this really results in actual improvement. -- Taku 00:27, May 21, 2005 (UTC)
He removed the information himself after being told that it violates the NOR policy. Fredrik | talk 07:13, 21 May 2005 (UTC)
I don't get where he stores his 'blank' space. Furthermore, I don't see how it's any improvement over swapping. -- AllyUnion (talk) 07:09, 21 May 2005 (UTC)
I'll try to explain it a bit better. As you have to take one element from the list - the pivot - to store it in a variable/register in order to make the comparisons, you can use the position it occupied as a 'blank space'. The pivot is already stored in a variable, so you only have to return the pivot to the 'blank' space at the end of the algorithm, and use it meanwhile.
This is very helpful if the 'blank' space is at the start (or end) of the list, because then you can search the list from the other side, and when you find an element which is lower than the pivot, copy it to the 'blank space' directly, overwriting it. After you copy an element, it becomes the new 'blank space', so you can start searching from the other side.
It's important to note that the 'blank space' stores old info until it's overwritten, so this might cause that an index moves too far, even outside the limits of the array. However, this problem can be fixed without losing efficiency.
This requires much less reading/writing than the swapping function, which must preserve the values of both elements, and thus needs to store one of them in a 'temp' variable. This improvement should be more noticeable as the size of the elements increases.
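As I understand the description above, the partition step would look something like this C sketch (my naming, not the original author's): the pivot is lifted out into a variable, its slot becomes a movable hole, and elements are copied into the hole rather than swapped, scanning alternately from the right and the left.

```c
/* "Blank space" (hole) partition of a[lo..hi]: a[lo] is the pivot.
 * Each step copies one element into the current hole instead of doing
 * a three-assignment swap, so roughly half the writes are saved.
 * Returns the pivot's final index. */
int hole_partition(int a[], int lo, int hi) {
    int pivot = a[lo];            /* slot lo is now the hole */
    while (lo < hi) {
        while (lo < hi && a[hi] >= pivot) hi--;
        a[lo] = a[hi];            /* fill hole from the right; hi is the new hole */
        while (lo < hi && a[lo] <= pivot) lo++;
        a[hi] = a[lo];            /* fill hole from the left; lo is the hole again */
    }
    a[lo] = pivot;                /* refill the last hole with the pivot */
    return lo;
}
```

Note the hole still holds its stale value until overwritten, which matches the caveat above about indices running too far if the bounds checks are dropped.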
Hmm... my understanding of quicksort was that the pivot was stored in the list, and that the only temporary variables you had were the index positions which were passed recursively. -- AllyUnion (talk) 07:52, 25 May 2005 (UTC)
Sorry if it looks confusing, but it's already confusing to explain it in my own language. -- DrJones 10:28, 21 May 2005 (UTC)
Could you please change the name to Hoyos' method? Barcelo is my second surname, which works much like the middle initial in English. -- DrJones 10:28, 21 May 2005 (UTC)
I already put an encouraging comment and suggested he post it to Usenet at Talk: Quicksort implementations. I have nothing against the guy, but of course we're not putting original research in the article. It's still cool that he wrote it up though and I'd encourage him to pursue these ideas and show them to people. If it does end up in a paper somewhere, we really could add it then. Deco 23:33, 21 May 2005 (UTC)
My bad :) As Fredrik pointed out, that was all my misunderstanding. Deco, I should have known what kind of person you are by now. Also I missed Talk:Quicksort implementations. I was unaware of that article. -- Taku 23:49, May 21, 2005 (UTC)

Just a detail. For test purposes, sort routines are shown sorting an array of simple integers or the like. In usage, there is in general a data aggregate (a "record", perhaps, in other terminology) being sorted on the basis of some aspect of the aggregate, the key, which might be an integer (or a string of integers, or a text string, or whatever) which has its value stored in the pivot variable v (as distinct from using an index to the location in the array: the values in the array are going to be moved). Well, if there is a plan to take advantage of the "space" available by copying out the pivot's key value, this space is only fully available if the associated data is copied out as well. Using a pivot key whose value is not necessarily one of the values in the array being sorted but one chosen via computation (if the keys are amenable to computation), such as PivotKey:=(A[Left].Key + A[Right].Key)/2, is a further variant. NickyMcLean 21:38, 9 October 2006 (UTC)

Introductory pseudocode change

I tried to rewrite the first pseudocode to be easier to understand, per Fredrik's comments at Wikipedia:Featured article candidates/Quicksort. I based this version on the Haskell version, but I avoided using list comprehensions or funny things like that, expressing it using more familiar imperative constructs and some English. I moved the other version into a new section and added explanation for the in-place partition algorithm, which is really rather subtle. Deco 03:09, 25 May 2005 (UTC)

I have to say, I don't find the pseudocode any easier to read than the C version; in particular, it seems extremely verbose for what is really a very simple algorithm. Further, I would imagine that most people have more experience with a C-like language. Is the pseudocode merely designed to mimic that found in Cormen and Rivest (which benefits from much nicer formatting than here), or is there some other reason for having it? --Shashmik11 12:52, 13 Jun 2005 (UTC)

The idea of pseudocode is that you can understand it without knowing what language-specific symbols like & mean. Sure, it's perfectly natural to you, because you know C. But actually, lots of people are now learning to program in languages other than C. RSpeer 13:28, Jun 13, 2005 (UTC)

I accept that some people might not know C, but the idea that the pseudo-code shown is free from language-specific details is wrong; it seems to be some odd variant of Algol. I still contend that code written in this way isn't actually any more readable, especially things like select a pivot value, which should really be typeset differently from the rest of the code. All in all, I think the pseudocode would be much better if it wasn't typeset using a fixed-width font, but maybe that's just me. Shashmik11 14:45, 13 Jun 2005 (UTC)

I agree it should ideally not be typeset in this way. However, typesetting it properly makes it impossible to edit, which I think would be against the wiki spirit. The point of the change was not to make it easier to read but easier to understand - the typical C version uses an in-place partition algorithm that is, while simple, not as obvious as the out-of-place partition algorithm used in some functional implementations and the first pseudocode example. The language/syntax is largely irrelevant, as long as it doesn't rely on symbols that are unfamiliar to a large portion of programmers (like the typical functional language example). Incidentally, I've heard people compare my pseudocode to Pascal, ML, C, Python, and now Algol — I really wish they'd make up their mind which language I ripped off. Deco 04:18, 14 Jun 2005 (UTC)
Point taken. Maybe wikipedia should get some sort of pseudo-code renderer in a similar manner to the way equations are rendered. How would one campaign for something like this?--Shashmik11 11:41, 18 Jun 2005 (UTC)
That'd be cool. I'd start with Wikipedia: Village pump (technical), since this would be a feature request. You might want to talk somewhere on meta about it, since it affects all Wikimedia projects. Somehow I imagine, though, that you might have trouble garnering support for something with an "easy workaround" (doing it how we do it now). Deco 00:33, 19 Jun 2005 (UTC)
The pseudocode is not understandable. I'm a good C programmer, and I can't figure out what the hell the iterative one means. It's such an ugly mess, and it has smart-ass comments that don't help at all. It doesn't even attempt to explain what a two-argument Push() is supposed to do, though I think it loads all the values from a range. Fresheneesz 18:03, 4 November 2006 (UTC)
This may come as a surprise to some, but programmes can be written in computer languages other than C. The pseudocode is (was) a slightly edited version of a programme written in Pascal, and the special point is that being from an actual source file of a working programme, it is known to work, rather than being some hands-waving generalised description that alas proves vague to someone actually using it to prepare code. Removed from the source was stuff that related to the animation of this sort method on a screen and certain other details relating to accounting of compares and swaps. Yes, I erred in failing to fully describe exactly what I meant by Push(a,b) and Pop(a,b): given languages such as Haskell, intensely tricky things might be triggered thereby, but in fact they are simple enough, so simple that I failed to imagine the necessity for explication since I thought the notions of push and pop well-known. I agree that Haskell code is odd, due to unfamiliarity (hah, try APL!), but the author has explained its operation. Transliteration to a pseudo-C may soothe some fear of the different, at the usual price of drivel trailing around the page. It is beyond me why a simple statement such as if L2 = R2 then swap(a,b); is thought better shown sprawling across three lines. And one of the brilliances of C is the difficulty over using =, thus the introduction of "equals". Capitalisation of some variables but not others had better be done carefully as some compilers distinguish the two forms. Avoiding such trivial mistakes is one reason for starting with a working source file. Removing "smart-arse" (not ass: read Chaucer) comments may soothe frustration, but also removed are comments noting that a one-element span needs no further consideration, that the scan and swap process is striving to attain a certain condition, and hints that a few "go to" statements might reduce wasted repetition.
Comments encased in {...}, the odd Pascal style, may well cause cognitive dissonance to a C-monoglot, but //(space) is just flabby... I have a version of QuickSort that does not use a pivot element but the reference I have gives a source code (not Pascal, and shock awe, not C) featuring a number of "go to" statements that are difficult to finesse with concealments such as "while true do" and the like especially if wasteful repetition is to be avoided. Delicate minds might be upset at the sight.
I agree that many typefaces show damn all differences between the glyphs for Il1, but the courier typeface for the fixed-spacing components does well enough. Fixed-spacing typefaces allow indentation and alignment of code pieces to be presented without difficulty, in the absence of proper typesetting arrangements. NickyMcLean 21:01, 5 November 2006 (UTC)

I see that months after you C-favouring fellows insisted on converting the pseudocode of the iterative version to C there are still mistakes in your pseudocode being fixed (the latest being due to typography yet again: misreading l and 1 is a favourite), whereas the decried Pascal/Algolish pseudocode was taken from the source of an actual working programme that had been checked. But C is so wonderful that the rest of us have to endure recondite gibberish such as { and } scattered across the page. So is the new pseudocode correct yet? I don't know. Prior to the latest change, the enthusiasts would have answered yes to that, and they would have been wrong.

So here is a stripped-down version of a Pascal programme that does work. Being Pascal, with its rigid typing of parameters, it has the array A to be sorted (for elements first to last) as a global. Its type is not explicitly declared; it might be some sort of data aggregate, which aggregate contains a component Key of type SortStuffKeyType that defines the type upon which the ordering is to be based. Likewise assumed is a procedure Swap to interchange two elements of the array A, whose precise details surely need not be explicated.

Procedure QuickSort(First,Last: longint);
 var v: SortStuffKeyType; 
 var sp: integer;
 var l,l2,p,r,r2: longint;
 type pair = record l,r: longint; end;
 var stash: array[1..32] of pair;
 BEGIN
  sp:=1;                              
  stash[1].l:=first; stash[1].r:=last;
  while sp > 0 do                  
   begin                           
    l:=stash[sp].l; r:=stash[sp].r; sp:=sp - 1;
    While l < r do                 
     begin                         
      p:=(l + r) div 2;            
      v:=A[p].key;                  
      l2:=l; r2:=r;                   
      repeat                          
       while A[l2].key < v do l2:=l2 + 1;
       while A[r2].key > v do r2:=r2 - 1;
       if l2 <= r2 then               
        begin                         
         if l2 <> r2 then Swap(A[l2],A[r2]);
         l2:=l2 + 1; r2:=r2 - 1;      
        end;                          
      until l2 >= r2;                 
      if r2 - l > r - l2 then         
       begin                          
        if l < r2 then                
         begin                        
          sp:=sp + 1; stash[sp].l:=l; stash[sp].r:=r2;
         end;
        l:=l2;                        
       end                            
       else                           
        begin                         
         if l2 < r then               
          begin                       
           sp:=sp + 1; stash[sp].l:=l2; stash[sp].r:=r;
          end;
         r:=r2;                       
        end;                          
     end;              
   end;             
 END;

And to keep Fresheneesz happy, there are no comments. NickyMcLean 20:04, 12 February 2007 (UTC)

[edit] omega?

What is Ω(n)? (In the section "Version with in-place partition") -- Tarquin 07:51, 17 Jun 2005 (UTC)

It's asymptotically bounded from below by n, not from above. Ω(n) means it takes time at least on the order of n, where O(n) would mean at most on the order of n.

[edit] too complex

im an advanced cmputer user yet i didnt understand anything can somebody please explain it more simply

If you are indeed an 'advanced cmputer user' then you should be able to rephrase the explanation yourself, if you find it too complex. I believe it is sufficiently clear.
  1. Pick an element (generally referred to as the pivot) from the list, any element will do.
  2. Put everything bigger than the pivot to the right, and everything smaller to the left. (right and left are actually lists of their own)
  3. So long as there is more than one element in either of these lists, apply steps one and two again (to each list, left and right).

Eventually the entire list will be in order. Suppose you have a pouch of coins and you pick out a random coin: it's 20 cents. That means you put all the 1-, 2-, 5- and 10-cent pieces to the left and the 50-cent, $1 and $2 coins to the right. (If there are multiple copies of the pivot coin, then whether they go to the left, go to the right, or stay with the pivot 20-cent piece is up to the programmer.) Then, in your first pile you have 1-, 2-, 5- and 10-cent coins. Pick a random one: you get 2 cents. Put all of the 1-cent coins to the left and everything else to the right. Now you don't have to sort the 1-cent coins, because they are all the same (essentially), so you just sort the 2-, 5- and 10-cent pieces. When that is done, you put the piles back together, including the pivot. Then you go back to the 50-cent, $1 and $2 coins and sort them in the same way.
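The three steps above map directly onto code. Here is a minimal C sketch of that process (the names and the Lomuto-style "last element as pivot" choice are my own illustration, not code from the article):

```c
#include <stddef.h>

static void swap_int(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Steps 1 and 2: pick a[hi] as the pivot, move everything smaller to
   its left, and return the pivot's final index. */
static size_t partition_step(int a[], size_t lo, size_t hi) {
    int pivot = a[hi];
    size_t store = lo;
    for (size_t i = lo; i < hi; i++)
        if (a[i] < pivot)
            swap_int(&a[i], &a[store++]);
    swap_int(&a[store], &a[hi]);
    return store;
}

/* Step 3: repeat on each side while it holds more than one element. */
void quicksort_demo(int a[], size_t lo, size_t hi) {
    if (lo >= hi) return;
    size_t p = partition_step(a, lo, hi);
    if (p > lo) quicksort_demo(a, lo, p - 1);   /* the "left" pile  */
    quicksort_demo(a, p + 1, hi);               /* the "right" pile */
}
```

Equal elements go to the right of the pivot here, which is one of the valid choices mentioned in the coin analogy.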

[edit] Quicksort implementations overload

Once again the quicksort implementations section is growing too large. I moved several implementations to the quicksort implementations article.

To inhibit this growth and to prevent forking between this page and quicksort implementations, I have created a template called {{Quicksort core implementations}} containing the implementations listed on both pages. Any further edits will have to be made to Template:Quicksort_core_implementations or to quicksort implementations (hopefully most will be to the latter). I will put an HTML comment in both pages with a direct link to the template editing page. Deco 00:23, 29 November 2005 (UTC)

Update: I didn't realise this, but the section editing links for the sections transcluded from the template do work correctly and edit the correct sections of the template. This should keep editing them relatively easy while preventing divergence. Deco 00:30, 29 November 2005 (UTC)
Another update: If you have this article on your watchlist, please also keep Template:Quicksort_core_implementations on your watchlist to help prevent transcluded vandalism to this page. This occurred today. Thanks. Deco 20:47, 30 November 2005 (UTC)

[edit] Line Removed

", especially if a cryptographically secure pseudo-random number generator is used to reduce the chance of an attacker predicting the "random" elements".

Actually the cryptographic strength of the generator has little effect. Cryptographic strength only prevents your adversary from working out your seed from the generated numbers. Since the generated numbers are restricted to the computer using the generator, an attacker can just as easily find out the seed itself as the numbers that might reveal the seed.

The point of this comment was that the seed should not be chosen in a predictable fashion, such as a fixed value or the current time, such that the random numbers can be determined without access to the sorting program's state. Not sure how best to put this. Deco 01:38, 10 January 2006 (UTC)

Simply state that it helps if the seed is not predictable. Linking cryptographically secure generators is a little misleading because even the strongest generators won't help if they are seeded the same way every time.

Good idea. I added the following:
One defense against this is to use highly unpredictable random seeds, such as numbers generated from hardware random number generators or entropy pools, and to avoid exposing partial information that might reveal information about the random seed or sequence.
Edit if you like. Deco 22:53, 10 January 2006 (UTC)
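To make that defense concrete, a randomized quicksort might seed its pivot chooser from the operating system's entropy pool rather than the clock. A hedged C sketch (the /dev/urandom device is Unix-specific, the function names are my own, and the modulo bias in the pivot index is ignored for brevity):

```c
#include <stdio.h>
#include <stdlib.h>

/* Seed the PRNG from the OS entropy pool instead of time(NULL),
   so an attacker cannot predict the pivot sequence from the clock. */
static int seed_from_entropy(void) {
    unsigned int seed;
    FILE *f = fopen("/dev/urandom", "rb");
    if (f == NULL || fread(&seed, sizeof seed, 1, f) != 1) {
        if (f) fclose(f);
        return -1;   /* caller must fall back or fail loudly */
    }
    fclose(f);
    srand(seed);
    return 0;
}

/* Random pivot index in [lo, hi], for use by a randomized quicksort. */
static int random_pivot(int lo, int hi) {
    return lo + rand() % (hi - lo + 1);
}
```

Even this only helps against an attacker without access to the machine's state, which is the limitation discussed above.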

[edit] Changes to quicksort core implementations

I have removed all languages from this template, except for C, Haskell and Prolog. The C++ example offered nothing spectacularly new over the C example, J is an esoteric language, and Joy and SML do not offer much over Haskell. Also, however well meant, using templates to store encyclopedic content is a bad idea. Unless someone objects, I will substitute it. Cheers, —Ruud 12:57, 28 January 2006 (UTC)

All the other implementations would belong on WikiSource or WikiForge, in my humble opinion. —Ruud 12:59, 28 January 2006 (UTC)

Reducing number of languages sounds good to me. The purpose of the template, however, is to avoid duplicated content between two articles. This was done not on a whim but to fix an immediate problem where duplicated content was forking as people made "fixes" and "improvements" to one version or the other of the code. You haven't given any concrete reason for why "using templates to store encyclopedic contents is a bad idea", but I don't believe this to be true.
I honestly wouldn't mind moving the other impls to another wiki, but we need to provide an outlet for these people or they'll just keep adding them here. Either that or someone needs to take on the onerous burden of moving stuff out every time somebody adds something. And believe me, the HTML comments don't stop them.
Also, you should've moved these implementations to quicksort implementations, or if you believe some of them should've been deleted, at least removed textual references to them. It's not clear to me whether or not you intended to delete them instead of move them so I haven't taken any action. Deco 07:34, 2 February 2006 (UTC)
I did not move them because I did not see that the implementations at quicksort implementations differed from those in the core implementation. Wikipedia:Template_namespace states that templates should not be used to store article content. —Ruud 02:30, 3 February 2006 (UTC)
I don't believe the makers of that policy anticipated this specific circumstance. It's good advice in general, but here it solves a specific problem. The deletions of the other implementations is okay I guess, I can fix up the references. Deco 06:15, 3 February 2006 (UTC)

[edit] Changes to partition implementation

Recently a couple anonymous editors made some changes to the partition pseudocode which made it incorrect (and also really confusing). I was pretty sure the original was correct, but just to make sure I did a direct translation to C++ and tested it for a bit:

#include <algorithm>
using std::swap;

template <class T>
int partition(T a[], int left, int right, int pivotIndex) {
    T pivotValue = a[pivotIndex]; // T, not int, so the template works for any comparable type
    swap(a[pivotIndex], a[right]); // Move pivot to end                         
    int storeIndex = left;
    for (int i=left; i<=right-1; i++) {
        if (a[i] <= pivotValue) {
            swap(a[storeIndex], a[i]);
            storeIndex = storeIndex + 1;
        }
    }
    swap(a[right], a[storeIndex]); // Move pivot to its final place             
    return storeIndex;
}

It seems to check out. Thought someone else might find this useful. Deco 10:20, 2 February 2006 (UTC)

[edit] Check these recent changes

I don't have time to check these right now, but these recent changes by 213.228.191.42 should be verified. This user also made these changes to Quicksort implementations. The 'print' comments in the second set of changes makes me pretty sure these edits are not correct. If nobody gets to this in a couple days, I'll look at it myself. – Doug Bell talkcontrib 02:17, 3 February 2006 (UTC)

I already reverted them here. This is, as far as I can tell, intentional sneaky vandalism. Deco 06:16, 3 February 2006 (UTC)

[edit] Introsort

Should introsort be added to the "other optimizations" section of this article? --Deryck C. 06:29, 13 March 2006 (UTC)

Already there. Deco 00:00, 21 June 2006 (UTC)

[edit] "Beautiful" Haskell-version

Removed...

qsort []     = []
qsort (x:xs) = qsort (filter (< x) xs) ++ [x] ++ qsort (filter (>= x) xs)

Because, well... what the heck is this? Is it just a call to an existing qsort function in the Haskell API? I don't know Haskell. ~ Booya Bazooka 19:16, 1 June 2006 (UTC)

No, it's a definition for a function called "qsort". It's a recursive definition. Read it like this: "To qsort the empty list, return the empty list. To qsort a list whose first element is x (call the rest of the list xs), return the result of qsorting those elements of xs less than x, followed by x, followed by the result of qsorting those elements of xs greater than or equal to x." —Bkell (talk) 19:51, 1 June 2006 (UTC)
Well, if you say it works, I suppose I believe it. Transwikiing to the Wikibook. ~ Booya Bazooka 21:01, 1 June 2006 (UTC)

[edit] Animation

Would it be ok to replace the static image with something like this animation?
Image:Sorting quicksort anim.gif
The file size is (even after some tradeoffs) 273kiB; I'm not quite sure if that is too large for the article (or if it still needs to be improved). --RolandH 22:12, 20 June 2006 (UTC)

Cool animation, I like it, very well done. I'm not sure exactly where the best place in the article is for it; ideally everything that it illustrates should be relatively close to it in the text. Deco 23:58, 20 June 2006 (UTC)
Thanks for the acknowledgement. Hm, I've had a go and tried inserting the new image at various places, but it either overlaps and clashes with the non-wrapping source code boxes or doesn't fit well near the beginning, especially when your image (which in one respect is better as it shows the algorithm workings in a less specific way -- not necessarily the in-place version of quicksort) is kept in place. I've also tried to simply replace your image, but even then the layout of the leading section is less balanced afterwards. --RolandH 11:23, 22 June 2006 (UTC)
Update: I should note that although this image is cool, it shouldn't replace the static image, since we still need a static image for print. Deco 23:06, 9 October 2006 (UTC)

Something seems wrong with the second animation (the diamond-points one), at least to me: I'm assuming that the vertical offset is the criterion by which the array is sorted. If you look towards the end of the animation, you have points that are really low and yet they somehow become the highest points in the array. Did anyone else notice this?

I think the animation is great. It should be nominated as the picture of the day. Is there any way to slow it down? Epachamo 01:11, 31 January 2007 (UTC)

[edit] saving stack space

This is from Chuck Bradley, bradley at tiac.net. In the section about saving stack space by deferring the larger of the two portions identified by the partition function, I think the +1 and -1 can be eliminated in the IF statement. The code as written does not seem to cause the wrong branch to be taken, but it makes the reader work harder. The +1 and -1 in the recursive calls are needed. If more clarity is worth a few more lines, consider something like this:

  LEFTSIZE = Q - FIRST
  RIGHTSIZE = LAST - Q
  IF LEFTSIZE < RIGHTSIZE

  etc

Thanks for all the effort that went into the article.

Thanks for your comments, Chuck. I liked your idea of introducing a left size and right size, but I prefer a computation that is based on the inclusive bounds (upper - lower + 1), since this aligns more closely with the recursive call arguments. Deco 23:05, 9 October 2006 (UTC)
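For illustration, the two suggestions combine into something like the following C sketch (the names are my own; `partition_range` stands in for the article's partition function, here a simple Lomuto partition, and the sizes use the inclusive upper - lower + 1 form):

```c
static void swap_elems(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Stand-in Lomuto partition on a[first..last] (inclusive bounds);
   uses a[last] as pivot and returns the pivot's final index q. */
static int partition_range(int a[], int first, int last) {
    int pivot = a[last], store = first;
    for (int i = first; i < last; i++)
        if (a[i] <= pivot) swap_elems(&a[i], &a[store++]);
    swap_elems(&a[store], &a[last]);
    return store;
}

/* Recurse on the smaller partition and loop on the larger, keeping the
   stack depth O(log n).  Sizes are computed with inclusive bounds
   (upper - lower + 1), matching the recursive call arguments. */
void quicksort_small_first(int a[], int first, int last) {
    while (first < last) {
        int q = partition_range(a, first, last);
        int left_size  = (q - 1) - first + 1;  /* size of a[first..q-1] */
        int right_size = last - (q + 1) + 1;   /* size of a[q+1..last]  */
        if (left_size < right_size) {
            quicksort_small_first(a, first, q - 1);
            first = q + 1;   /* continue iterating on the larger side */
        } else {
            quicksort_small_first(a, q + 1, last);
            last = q - 1;
        }
    }
}
```

Note the comparison needs no +1 or -1 at all: adding 1 to both sizes cannot change which side is smaller, which is Chuck's original point.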

[edit] In-place partition

This article states that "The disadvantage of the simple version above is that it requires Ω(n) extra storage space". This comes immediately after the actual C code in the previous section, but that C code is in-place. What is not in-place is the pseudocode in the section before it (which appends each of the n items being partitioned to one of three list objects). I found this confusing; the C code seems misplaced, because it doesn't match the pseudocode, and the text immediately following it makes it sound as though it were not in-place.

[edit] Iterative version

Iterative version pseudocode is a mess with many bugs (I don't dislike pseudocode, I mean line 6: " { Pop(L, r)/2;" and others [where "L" is written instead of "1"]).

The initial version was a slightly pseudocoded Pascal source file which in its original compiled and worked (tested against less than two elements, all elements equal, already sorted, reverse sorted, nearly sorted, as well as random) which for an algorithm notorious for mistypes (< for <=, etc) is a comfort. However someone disliked Pascal and transliterated it to pseudo-C, randomly using L and l for no apparent reason (just typing with or without shift, presumably) and someone since has mangled the pseudosource. What "Pop(L,r)/2;" is meant to mean is obscure to me. NickyMcLean 20:15, 30 November 2006 (UTC)

[edit] Second animation

Another example of quicksort sorting a list of random numbers.

I removed the animation on the right from the page. First of all it seems wrong, and second of all - it doesn't help understanding quicksort in the least. Fresheneesz 20:58, 15 December 2006 (UTC)

[edit] C code minor change

I've moved the q_sort() function above the function quickSort(), thus making it a bit more conformant C (q_sort() calls quickSort(), and the previous snippet wouldn't compile without a prior declaration, which is not present, of quickSort()).

Minor C code optimization:

 if (left < pivot)
   q_sort(numbers, left, pivot-1);
 if (right > pivot)
   q_sort(numbers, pivot+1, right);

should be

 if (left < pivot - 1)
   q_sort(numbers, left, pivot-1);
 if (right > pivot + 1)
   q_sort(numbers, pivot+1, right);

no need to run q_sort again for single items.

[edit] Basic C implementation is incorrect

Greetings;

I have tested the basic implementation and it is not correct. Just copy the code and give the function randomly generated number arrays: you will notice that some of the input values disappear in favor of others. Why? Simply because during the partition section we have this:

 if (left != right) {
   numbers[left] = numbers[right];  // numbers[left] disappears here; it should be swapped using a tmp
   left++;
 }

and the same for the right side of it. So the correct code is this:

 if (left != right) {
   tmp = numbers[left]; numbers[left] = numbers[right]; numbers[right] = tmp;  // swapping
   left++;
 }

hkernel 17:27, 1 February 2007 (UTC)

[edit] Indentation Style

The code indentation on this page is despicably inconsistent. Pick one method and stick with it!

Do you want to indent thus:

int function()
{     doStuffHere();
      return 1;
}

thus:

 int function() {
       doStuffHere();
       return 1;
 }

or thus:

 int function()
 {
      doStuffHere();
      return 1;
 }

I don't want a tabbing-style argument here, but do want consistency. —The preceding unsigned comment was added by 69.248.112.19 (talk) 18:55, 11 February 2007 (UTC).

[edit] Java??

What on earth is the Java pseudocode doing at the third code block? Not only is it entirely redundant, inconsistent, messy (check the spacing) and frustratingly hard to grasp, it also does a terrible job of getting the point across. Either change it to C (if it really needs to be there) or just remove it (much better). --124.120.189.96 13:07, 8 March 2007 (UTC)

[edit] Common Lisp

Hello, here is a simple implementation of the Quick Sort algorithm in Common Lisp [1]:

 (defun qsort (lst)
   (when lst
     (let* ((x  (car lst))
            (xs (cdr lst))
            (lt  (loop for y in xs when (< y x) collect y))
            (gte (loop for y in xs when (>= y x) collect y)))
       (append (qsort lt) (list x) (qsort gte)))))

- Chrisdone 18:57, 24 March 2007 (UTC)