Talk:Comb sort

From Wikipedia, the free encyclopedia

This article is within the scope of WikiProject Computer science, which aims to create a comprehensive computer science reference for Wikipedia. Visit the project page for more information and to join in on related discussions.
Start rated as start-Class on the assessment scale
Mid rated as mid-importance on the assessment scale

Contents

[edit] Rivals quicksort?

I think something must be wrong with the Python implementation if this truly is supposed to rival quicksort. Even a simple bubble sort in Python beats it.


Combsort is an improved version of bubblesort that can be almost as good as more complex algorithms like quicksort.

Combsort's useage varies with the comb gapsize offset, so I suspect some of those empty fields might need "varies" as opposed to O notation responses. The most optimized gapsize I've ever seen offered is at http://world.std.com/~jdveale/combsort.htm and uses a table. Alternatively, the original version ( http://cs.clackamas.cc.or.us/molatore/cs260Spr01/combsort.htm ) uses /=1.3 instead. I suspect that the other number, /='1.279604943109628' , is more precise.


How in depth should these encyclopedia articles be? Another entry could be Hybrid Sort which uses Quicksort on lists larger than 6 elements and switches to Bubblesort when the Quicksort divisions are small enough (6 or less elements), because on very small lists Bubblesort is actually faster. Then there are the many specialized methods for special data. Examples are Bentley Sedgewick, Sadakane's algorithm, and Forward Radix Sort for the sorting of all suffixes of a string as is required for the Burrows Wheeler Transform.

Also, the algorithms should be split into two groups: the comparison based ones that cannot be faster than n log n but do not need additional knowledge of the data, these being Quick Sort, Heap Sort, Merge Sort, Bubble Sort, etc., and the ones that do depend on the data being within a range of values or otherwise comparable to an external reference, these being Bucket Sort and Radix Sort.


Come to think of it, I think combsort might not be stable.


How is this different to shell sort? According to NIST, shell sort completely sorts at each iteration whereas comb sort does only one pass per iteration.


Rabbits and turtles are fine, but does anyone know what the asymptotic complexity is? Arvindn 08:04, 9 Oct 2004 (UTC)


Surprisingly the asymptotic complexity is O(N^2) (yes, the mean is O(N^2), not just the worst case!) but with a very, very low coefficient (it's something above 10^-80, for a gap factor of 1.3, but how much more I can't say). Combsort still has a Turtle problem.

The number of elements, M, that can be sorted to the last place in the array after the first P passes has an upper bound (which can be calculated from P and the gap factor G) regardless of N. For any reasonable gap factor G you can select a P such that the remaining ~ log(N)/log(G) passes cannot move an element from last place in the array to the median element of the array, as each pass moves each value downwards at most once. For G=1.3 for example, P=7 will do.

Say we know the median value in a randomly ordered array of distinct values, and we rewrite the array, writing 0 to every element containing a value less than the median, and 1 to every element containing a value more than the median. The probability that the value 0 is found in -all- the elements that can be sorted to the last place in the array after the first P passes is 2^(-M). So the probability of O(N^2) running time is at least 2^(-M), for some M, determined from G, regardless of N. So the asymptotic complexity is O(N^2).

Finishing off with an insertion sort (rather than doing any bubble passes with interval of 1) deals with the "bad news" input mentioned above, and probably improves the mean complexity. It may even be O(N*log(N)). The worst case running time, however, is probably still worse than order (N*log(N)). How much worse, I can't say.

Alternating between ascending and descending passes (as per the cocktail shaker sort) also addresses the problem (in fact, gap factors approaching 1.4 become workable if you do this) perhaps because all "turtles" are effectively "rabbits" half the time. It might still be possible to construct "bad news" input, but I don't have good enough math skills to figure out the asymptotic complexity of "cocktail shaker combsort" as N tends to infinity. Not by a long shot.

james_barbetti@yahoo.com

[edit] C++ Implimentation

I just wrote a version for C++

Hmm... I should template it, shouldn't I? I'll work on that.

Update: Done!

template <typename RandomAccessIterator>
void combSort11(RandomAccessIterator first, RandomAccessIterator last)
{
   bool flag = true;
   int leng = last - first;
   int offset = leng;
   RandomAccessIterator array = first;
   const double shrink = 1.247330950103979;
   while (offset!=1 && !flag)    // finished only if offset is 1 and no swaps were made
   {
       flag = true;
       if (offset>1)
           offset /= shrink;
       if (offset==9 || offset==10)    // makes it Combsort11
           offset = 11;
       for (int i=0; i<(leng-offset); i++)
           if (array[i]>array[i+offset])
           {
               std::swap(array[i], array[i+offset]);
               flag = false;
           }
   }
}

DevastatorIIC 01:14, 8 March 2006 (UTC)

I guess flag should be initialized to false instead. --Como (talk) 14:33, 9 April 2008 (UTC). No, no, no: the && should be ||. --Como (talk) 14:41, 9 April 2008 (UTC)

i wrote a java implementation example... see if its compact/efficient enough plz

 Pulveriser 16:24, 17 April 2006 (UTC)
The C++ implementation doesn't work. The Java implementation currently on the article page doesn't work for the array (1 20 30 40 50 6 70 80 90). I'm removing the Java implementation from the article for the moment; frankly, I don't think we need a Java implementation if the Python implementation is correct (which I'm not sure of, but it seems plausible). One imperative language is enough. --Quuxplusone 21:24, 11 May 2006 (UTC)

you're right.. doesn't work. did a comparison test between a mergesort algo and it doesnt work sometimes... will fix it. and i think the python implementation is insufficient since not everyone knows python and a pseudo code is absent. once a pseudo code is made then perhaps only one langugae will be sufficient. Pulveriser 09:16, 25 May 2006 (UTC)

UPDATE: fixed it... had forgot to add the "continue as bubblesort till no swapping" condition. am posting it here.

private static int newGap(int gap) {
                gap=gap*10/13;
                if(gap==9||gap==10)
                        gap=11;
                if(gap<1)
                        return 1;
                return gap;
        }
        
        private static void sort(int a[]) {
                int gap=(int)(a.length);
                int temp;
                boolean found=true;
                for(;gap>1||found;) {
                        found=false;
                        gap=newGap(gap);
                        for(int i=0;i<=a.length-gap;i++) {
                                found=false;
                                if(a[i]>a[i+gap-1]) {
                                        found=true;
                                        temp=a[i];
                                        a[i]=a[i+gap-1];
                                        a[i+gap-1]=temp;
                                }
                        }
                }
        }

Pulveriser 09:46, 25 May 2006 (UTC)

oh yeah, almost forgot... decide whether the Java implementatiopn should be put up on the page... if i don't hear any objections in a week, i'll put it up. Pulveriser 09:50, 25 May 2006 (UTC)

oops! forgot a line in the code! Pulveriser 10:35, 25 May 2006 (UTC)

damn, forgot another line... Pulveriser 10:37, 25 May 2006 (UTC)

[edit] Brick sort

Comb sort is a rediscovery and popularization of an algorithm that had been known for years in the theoretical CS community under the names "brick sort", or "Dobosiewicz sort". See, for instance:

Wlodzimierz Dobosiewicz: An Efficient Variation of Bubble Sort. Inf. Process. Lett. 11(1): 5-6 (1980)

I think this article should give proper credit to Dobosiewicz rather than solely mention the Byte paper.

I think Dobosiewicz's home page is the following: http://www.cis.uoguelph.ca/?q=user/6 hence, it might be a good idea to reach him and invite him to comment/contribute hist view on the history of comb/brick sort...

The "brick sort" with which I'm familiar is the one in Knuth, where you just compare each element to the one on its right, then the one on its left, then the one on its right..., so named because a diagram of its sorting network looks like a brick wall. Comb sort is nothing like brick sort, except that it's also a sorting network. Comb sort is also much more efficient; brick sort is even less efficient than bubble sort, unless you can do the comparisons in parallel. --Quuxplusone 21:24, 11 May 2006 (UTC)
Hmm, So much for my calling it brick sort, that's a mistake.

Sedgewick credits Dobosiewicz with having observed in 1980 that the insertion passes in Shell sort could be replaced with bubble sort passes, and then devised the variation in question. The author then proceeds with defining 3 sorts which improve 3 "naive" sort: brick, bubble and shaker by performing those algorithms in multi-pass decreasing-gap fashion.

So clearly, Dobosiewicz is the inventor, and the guys in the Byte paper merely rediscovered it.

I finally got a copy of Dobosiewicz's paper. It's true: he used a starting gap of n/2 an then he applied, after every bubble-like pass, a shrink factor of 1.3333... while gap>1. Then he applied bubblesort with a simple optimization consisting in stopping each pass at the position of the last swap of the previous pass. He compared the algorithm with quicksort, shellsort and bubblesort, with up to 10000 elements. Knuth proposed in "The Art of Computer Programming, vol3, Sorting and Searching", 1973, exercise 5.2.1-40, a variation of shellsort that made it very very similar to combsort Dobosiewiczsort. He proposed, for the passes with gap>1, just comparing one pair of elements ("Thus, the result of interchanges is not propagated..." as it would in the normal insertionsort applied in shellsort). The only difference is the final pass. Knuth: insertionsort. Dobosiewicz: optimized bubblesort. Lacey&Box: unoptimized (not even with triangular loops) bubblesort. Sincerely, I would call it Dobosiewiczsort. At least, Knuth and Dobosiewicz should be mentioned in the article. --Como (talk) 09:31, 23 May 2008 (UTC)
And, by the way, the final sort would be faster with insertionsort, as Knuth proposed. Note that insertionsort naturally favors the case of "almost sorted" data. It does this far more efficiently than bubblesort. --Como (talk) 09:41, 23 May 2008 (UTC)

[edit] Comparison sort?

Is comb sort a comparison sort? ~ Booyabazooka 22:31, 11 May 2006 (UTC)

(this comment was copied from my talk pageBooya Bazooka)
Yes, comb sort is a comparison sort. It compares the values to be sorted, and sorts them accordingly. A non-comparison sort uses keys, not the values, to sort. I'm not sure if comb sort belongs in the comparison sort article (as it only lists well-known ones (and comb sort isn't)), but you could mention that it is in its own article. DevastatorIIC 09:06, 25 May 2006 (UTC)
Thanks - I asked not with the intention of putting it in the Comparison sort article, but for putting it in the category. ~ Booya Bazooka 09:12, 25 May 2006 (UTC)

[edit] Implementations

The current pseudo code in the article is buggy. It doesn't use the swaps variable so loops endlessly. After fixing that, the loop conditions are wrong. Will fail for arrays containing 0, 1 or 2 elements. --John Corbett <wikipedia@pictographer.com>

I say someone writes a pseudocode and or C implementation, then move the others to Wikibooks, ala bubble sort. --DevastatorIIC 13:51, 31 May 2006 (UTC)

I concur on this point; althought it should unquestionably be pseudocode, not C (some of us, ahem, don't know C yet...). I'm hesitant to go ahead and transwiki these implementations now, because that would leave this article without any code at all. ~ Booya Bazooka 17:45, 31 May 2006 (UTC)
Is this anywhere close to good? Feel free to tear it apart. --DevastatorIIC 01:47, 1 June 2006 (UTC)
Done tearing apart ;)
Modified it to use a more rigid syntax. I think yours was a tad too close to literary prose. Jafet 07:51, 23 July 2006 (UTC)
function update_gap(gap)
    gap := gap / 1.3
    if gap = 9 or 10
        gap := 11
    endif
    if gap < 1
        gap := 1
    endif
    return gap
endfunction

function combsort(data)
    var list gap, swapped, index
    gap := length(data)
    swapped := true
    while gap > 1 or swapped = true
        gap := update_gap(gap)
        swapped := false
        for index := {0 to length(data) - gap}
            if data[index + gap] < data[index]
                swap(data[index + gap], data[index])
                swapped := true
            endif
        endfor
    endwhile
endfunction

[edit] Broken links

Both http://world.std.com/~jdveale/combsort.htm and http://cs.clackamas.cc.or.us/molatore/cs260Spr01/combsort.htm are gone from the web for reasons unknown. I noticed that many other webpages outside of Wikipedia also reference these nonexistent webpages. A brief search failed to relocate these articles. Jafet 07:51, 23 July 2006 (UTC)

[edit] What does this sentence mean...?

This sentence in the intro doesn't make sense to me:

"improves on bubble sort and rivals in speed more complex algorithms like Quicksort"

The fragment "more complex algorithems ..." seems misplaced. Is there a missing work or something? perhaps "more than"??? Fresheneesz 19:39, 20 September 2006 (UTC)

answer: "rivals" is a (tensed) verb.

[edit] False complexity at Sorting algorithm

This algorithm is not an O(n log n) worst case or average case algorithm, as noted above - but is listed as such at sorting algorithm. Did the original article include a complexity analysis? If not, it shouldn't list any complexity there - it's OR to do our own nontrivial algorithm analysis. If Dobosiewicz's algorithm is the same, I'm sure he included an analysis that we can use.

Also, er, the note claiming it has "small code size" in sorting algorithm is silly - shellsort has very similar code and better complexity and is accompanied by no such note.

Also, the phrasing "almost as good as [...] quicksort" is implausible - for both small and large lists I seriously doubt it could be competitive in any practical implementation, because I believe it does a lot more comparisons overall. Dcoetzee 03:43, 7 December 2007 (UTC)


I've tested it for a while. The "and swaps = 0" clause makes it look like... O(n2) worst case. My experiments showed that this clause is indeed necessary... as long as I use a shrink factor of 1.3. In some cases it required up to 6 combs with gap 1. Though, if I use the other number (by the way, that phi is the golden ratio!!) and round the gap in a conservative way (gap := (gap * 1000 + 999) / 1247), then no additional comb of gap 1 is required (in my tests). I tried all possible permutations of up to 11 elements (no repeated values) and some millions of random permutations of up to 100,000 elements.

Now the question is: can anybody prove it? (that a single last comb of gap 1 is enough, and therefore the algorithm is O(n log n) worst case with a shrink factor of 1.247...) Sadly, the links above in Talk:Comb_sort#Rivals_quicksort.3F seem to be broken. Where did all that come out from??? --Como (talk) 19:21, 3 April 2008 (UTC)

Update: I searched the critical shrink factor. The smallest bad shrink factor I found is 9/7. Every smaller shrink factor killed all turtles in my experiments. Here is my algorithm in non-cryptic C:

 void combsort_7_9 (unsigned int A[], unsigned int len)
 {
   unsigned int i, j, tmp, gap;
   
   gap = len;
                   // No "sorted" flag! The number of
   while (gap>1)   // iterations only depends on the size
   {
     if (gap<10)              // Below 10, descend one
       gap = gap - 1;         // by one until zero
     else
       gap = (gap*7)/9 + 1;   // Integer division!!
                              // +1: ensure shrink factor
     i = 0;                   //     strictly < 9/7
     j = gap;
     
     while (j<len)     // Make a pass with current gap
     {
       if (A[i]>A[j])    // If not sorted ...
       {
         tmp = A[i];
         A[i] = A[j];      // ... swap!
         A[j] = tmp;
       }
       
       i = i + 1;        // Advance
       j = j + 1;
     }
   }
 }

Can anybody find a killer sequence that remains unsorted after this? --Como (talk) 10:44, 17 April 2008 (UTC)

Update: Somebody did find a killer sequence! See "Combsort: shrink factor for guaranteed O(n log n) worst case time?" in comp.theory, thread started on Thu, 10 Apr 2008. --Como (talk) 15:20, 3 June 2008 (UTC)