Talk:Comb sort

From Wikipedia, the free encyclopedia

Contents

[edit] Rivals quicksort?

I think something must be wrong with the Python implementation if this truly is supposed to rival quicksort. Even a simple bubble sort in Python beats it.


Combsort is an improved version of bubblesort that can be almost as good as more complex algorithms like quicksort.

Combsort's useage varies with the comb gapsize offset, so I suspect some of those empty fields might need "varies" as opposed to O notation responses. The most optimized gapsize I've ever seen offered is at http://world.std.com/~jdveale/combsort.htm and uses a table. Alternatively, the original version ( http://cs.clackamas.cc.or.us/molatore/cs260Spr01/combsort.htm ) uses /=1.3 instead. I suspect that the other number, /='1.279604943109628' , is more precise.


How in depth should these encyclopedia articles be? Another entry could be Hybrid Sort which uses Quicksort on lists larger than 6 elements and switches to Bubblesort when the Quicksort divisions are small enough (6 or less elements), because on very small lists Bubblesort is actually faster. Then there are the many specialized methods for special data. Examples are Bentley Sedgewick, Sadakane's algorithm, and Forward Radix Sort for the sorting of all suffixes of a string as is required for the Burrows Wheeler Transform.

Also, the algorithms should be split into two groups: the comparison based ones that cannot be faster than n log n but do not need additional knowledge of the data, these being Quick Sort, Heap Sort, Merge Sort, Bubble Sort, etc., and the ones that do depend on the data being within a range of values or otherwise comparable to an external reference, these being Bucket Sort and Radix Sort.


Come to think of it, I think combsort might not be stable.


How is this different to shell sort? According to NIST, shell sort completely sorts at each iteration whereas comb sort does only one pass per iteration.


Rabbits and turtles are fine, but does anyone know what the asymptotic complexity is? Arvindn 08:04, 9 Oct 2004 (UTC)


Surprisingly the asymptotic complexity is O(N^2) (yes, the mean is O(N^2), not just the worst case!) but with a very, very low coefficient (it's something above 10^-80, for a gap factor of 1.3, but how much more I can't say). Combsort still has a Turtle problem.

The number of elements, M, that can be sorted to the last place in the array after the first P passes has an upper bound (which can be calculated from P and the gap factor G) regardless of N. For any reasonable gap factor G you can select a P such that the remaining ~ log(N)/log(G) passes cannot move an element from last place in the array to the median element of the array, as each pass moves each value downwards at most once. For G=1.3 for example, P=7 will do.

Say we know the median value in a randomly ordered array of distinct values, and we rewrite the array, writing 0 to every element containing a value less than the median, and 1 to every element containing a value more than the median. The probability that the value 0 is found in -all- the elements that can be sorted to the last place in the array after the first P passes is 2^(-M). So the probability of O(N^2) running time is at least 2^(-M), for some M, determined from G, regardless of N. So the asymptotic complexity is O(N^2).

Finishing off with an insertion sort (rather than doing any bubble passes with interval of 1) deals with the "bad news" input mentioned above, and probably improves the mean complexity. It may even be O(N*log(N)). The worst case running time, however, is probably still worse than order (N*log(N)). How much worse, I can't say.

Alternating between ascending and descending passes (as per the cocktail shaker sort) also addresses the problem (in fact, gap factors approaching 1.4 become workable if you do this) perhaps because all "turtles" are effectively "rabbits" half the time. It might still be possible to construct "bad news" input, but I don't have good enough math skills to figure out the asymptotic complexity of "cocktail shaker combsort" as N tends to infinity. Not by a long shot.

james_barbetti@yahoo.com

[edit] C++ Implimentation

I just wrote a version for C++

Hmm... I should template it, shouldn't I? I'll work on that.

Update: Done!

template <typename RandomAccessIterator>
void combSort11(RandomAccessIterator first, RandomAccessIterator last)
{
   bool flag = true;
   int leng = last - first;
   int offset = leng;
   RandomAccessIterator array = first;
   const double shrink = 1.247330950103979;
   while (offset!=1 && !flag)    // finished only if offset is 1 and no swaps were made
   {
       flag = true;
       if (offset>1)
           offset /= shrink;
       if (offset==9 || offset==10)    // makes it Combsort11
           offset = 11;
       for (int i=0; i<(leng-offset); i++)
           if (array[i]>array[i+offset])
           {
               std::swap(array[i], array[i+offset]);
               flag = false;
           }
   }
}

DevastatorIIC 01:14, 8 March 2006 (UTC)


i wrote a java implementation example... see if its compact/efficient enough plz

 Pulveriser 16:24, 17 April 2006 (UTC)
The C++ implementation doesn't work. The Java implementation currently on the article page doesn't work for the array (1 20 30 40 50 6 70 80 90). I'm removing the Java implementation from the article for the moment; frankly, I don't think we need a Java implementation if the Python implementation is correct (which I'm not sure of, but it seems plausible). One imperative language is enough. --Quuxplusone 21:24, 11 May 2006 (UTC)

you're right.. doesn't work. did a comparison test between a mergesort algo and it doesnt work sometimes... will fix it. and i think the python implementation is insufficient since not everyone knows python and a pseudo code is absent. once a pseudo code is made then perhaps only one langugae will be sufficient. Pulveriser 09:16, 25 May 2006 (UTC)

UPDATE: fixed it... had forgot to add the "continue as bubblesort till no swapping" condition. am posting it here.

private static int newGap(int gap) {
                gap=gap*10/13;
                if(gap==9||gap==10)
                        gap=11;
                if(gap<1)
                        return 1;
                return gap;
        }
        
        private static void sort(int a[]) {
                int gap=(int)(a.length);
                int temp;
                boolean found=true;
                for(;gap>1||found;) {
                        found=false;
                        gap=newGap(gap);
                        for(int i=0;i<=a.length-gap;i++) {
                                found=false;
                                if(a[i]>a[i+gap-1]) {
                                        found=true;
                                        temp=a[i];
                                        a[i]=a[i+gap-1];
                                        a[i+gap-1]=temp;
                                }
                        }
                }
        }

Pulveriser 09:46, 25 May 2006 (UTC)

oh yeah, almost forgot... decide whether the Java implementatiopn should be put up on the page... if i don't hear any objections in a week, i'll put it up. Pulveriser 09:50, 25 May 2006 (UTC)

oops! forgot a line in the code! Pulveriser 10:35, 25 May 2006 (UTC)

damn, forgot another line... Pulveriser 10:37, 25 May 2006 (UTC)

[edit] Brick sort

Comb sort is a rediscovery and popularization of an algorithm that had been known for years in the theoretical CS community under the names "brick sort", or "Dobosiewicz sort". See, for instance:

Wlodzimierz Dobosiewicz: An Efficient Variation of Bubble Sort. Inf. Process. Lett. 11(1): 5-6 (1980)

I think this article should give proper credit to Dobosiewicz rather than solely mention the Byte paper.

I think Dobosiewicz's home page is the following: http://www.cis.uoguelph.ca/?q=user/6 hence, it might be a good idea to reach him and invite him to comment/contribute hist view on the history of comb/brick sort...

The "brick sort" with which I'm familiar is the one in Knuth, where you just compare each element to the one on its right, then the one on its left, then the one on its right..., so named because a diagram of its sorting network looks like a brick wall. Comb sort is nothing like brick sort, except that it's also a sorting network. Comb sort is also much more efficient; brick sort is even less efficient than bubble sort, unless you can do the comparisons in parallel. --Quuxplusone 21:24, 11 May 2006 (UTC)
Hmm, So much for my calling it brick sort, that's a mistake.

Sedgewick credits Dobosiewicz with having observed in 1980 that the insertion passes in Shell sort could be replaced with bubble sort passes, and then devised the variation in question. The author then proceeds with defining 3 sorts which improve 3 "naive" sort: brick, bubble and shaker by performing those algorithms in multi-pass decreasing-gap fashion.

So clearly, Dobosiewicz is the inventor, and the guys in the Byte paper merely rediscovered it.

[edit] Comparison sort?

Is comb sort a comparison sort? ~ Booyabazooka 22:31, 11 May 2006 (UTC)

(this comment was copied from my talk page ~ Booya Bazooka)
Yes, comb sort is a comparison sort. It compares the values to be sorted, and sorts them accordingly. A non-comparison sort uses keys, not the values, to sort. I'm not sure if comb sort belongs in the comparison sort article (as it only lists well-known ones (and comb sort isn't)), but you could mention that it is in its own article. DevastatorIIC 09:06, 25 May 2006 (UTC)
Thanks - I asked not with the intention of putting it in the Comparison sort article, but for putting it in the category. ~ Booya Bazooka 09:12, 25 May 2006 (UTC)

[edit] Implementations

The current pseudo code in the article is buggy. It doesn't use the swaps variable so loops endlessly. After fixing that, the loop conditions are wrong. Will fail for arrays containing 0, 1 or 2 elements. --John Corbett <wikipedia@pictographer.com>

I say someone writes a pseudocode and or C implementation, then move the others to Wikibooks, ala bubble sort. --DevastatorIIC 13:51, 31 May 2006 (UTC)

I concur on this point; althought it should unquestionably be pseudocode, not C (some of us, ahem, don't know C yet...). I'm hesitant to go ahead and transwiki these implementations now, because that would leave this article without any code at all. ~ Booya Bazooka 17:45, 31 May 2006 (UTC)
Is this anywhere close to good? Feel free to tear it apart. --DevastatorIIC 01:47, 1 June 2006 (UTC)
Done tearing apart ;)
Modified it to use a more rigid syntax. I think yours was a tad too close to literary prose. Jafet 07:51, 23 July 2006 (UTC)
function update_gap(gap)
    gap := gap / 1.3
    if gap = 9 or 10
        gap := 11
    endif
    if gap < 1
        gap := 1
    endif
    return gap
endfunction

function combsort(data)
    var list gap, swapped, index
    gap := length(data)
    swapped := true
    while gap > 1 or swapped = true
        gap := update_gap(gap)
        swapped := false
        for index := {0 to length(data) - gap}
            if data[index + gap] < data[index]
                swap(data[index + gap], data[index])
                swapped := true
            endif
        endfor
    endwhile
endfunction

[edit] Broken links

Both http://world.std.com/~jdveale/combsort.htm and http://cs.clackamas.cc.or.us/molatore/cs260Spr01/combsort.htm are gone from the web for reasons unknown. I noticed that many other webpages outside of Wikipedia also reference these nonexistent webpages. A brief search failed to relocate these articles. Jafet 07:51, 23 July 2006 (UTC)

[edit] What does this sentence mean...?

This sentence in the intro doesn't make sense to me:

"improves on bubble sort and rivals in speed more complex algorithms like Quicksort"

The fragment "more complex algorithems ..." seems misplaced. Is there a missing work or something? perhaps "more than"??? Fresheneesz 19:39, 20 September 2006 (UTC)

answer: "rivals" is a (tensed) verb.