Talk:Merge sort

From Wikipedia, the free encyclopedia

On a boring weekend, I sat down with my computer trying to implement some sorting algorithms. Then I started working on a Mergesort implementation. I didn't have internet access, so all the help I had was foggy memories of this page. Everything went fine until I started working on the merge function (I was programming in Java, BTW). I tried to make a merge function that could merge the two ranges without using a temporary array, something like the above merge function in C, but without the v1 and v2 arrays. (I thought that I had seen such a method on this page - came back tonight and saw that I hadn't ;-)

I had limited success in making the function work - is it at all possible?

moved from article page -- John Owens 20:48 Apr 21, 2003 (UTC)
# (cur) (last) . . 15:33 Apr 21, 2003 . . 217.209.32.132 (The tale of Joe Random Hacker trying to make a merge function without using temporary arrays)

Contents

[edit] Replacing the C sample code

I think the current C sample code isn't very useful. The purpose of sample code on an algorithm page has to be to convey how the algorithm works, in a concise and readable manner. The point is not to provide an efficient implementation that users can copy&paste.

I propose replacing the lengthy C sample by the following Python code:

def merge(a, b):
  # Merging is trivial if one list is empty.
  if len(a) == 0: return b
  if len(b) == 0: return a
  
  # Merge non-empty lists by taking the smaller one of the two
  # head elements, and recursively merging the remaining lists.
  if a[0] <= b[0]: return a[0:1] + merge(a[1:], b)
  else:            return b[0:1] + merge(a, b[1:])

def mergesort(a):
  # Lists of 0 or 1 elements are already sorted.
  if len(a) <= 1: return a
  
  # Split list in half, sort halves separately and merge.
  m = len(a) // 2  # // is integer division
  return merge(mergesort(a[0:m]), mergesort(a[m:]))

I think Python is a good choice here because

  • the syntax is uncluttered and fairly easy to read, even if one isn't familiar with the details of Python
  • array slice and append operations make for a much more concise formulation of the algorithm, where the C code has to shuffle around single elements in loops.
  • While the algorithm could probably be expressed even more elegantly in a functional language ike Scheme, I think an imperative formulation will be easier to understand for the majority of readers.

If there are no objections, I will go ahead and make the changes. --K. Sperling 12:45, 17 Jul 2004 (UTC)

The definition given in this article is for a Binary merge sort. Higher orders of merge are possible (e.g. Ternary merge sort, divide unordered list into three lists of approximately equal size, sort those, then merge the three ordered lists). Well-written Mergesort and Heapsort implementations have approximately the same performance (if anything Mergesort usually wins by a whisker).

wait a moment... This code is erraneous!!!


I tested the mergesort code in float and int variations in C. The combinations of Sort and Merge functions here on Wikipedia don't work in a simple C compiler. After compilation the results of a run are sorted in the wrong order!

-- Glenn.L.Williams@nasa.gov   This paragraph is the opinion of the author only and not that of NASA!!!

[edit] MIPS? Miranda?

I removed the MIPS assembly version of the code, because this isn't the 99 Bottles of Beer page, and I don't think anybody is going to be enlightened about how the algorithm works by pages of assembly code.

I also suggest that the Miranda example should be removed. It's almost the same as the Haskell one - it seems to differ only in the syntax for taking the left half and right half of a list - and Haskell is more well known than Miranda, at least based on the one data point that I've never heard of Miranda.

RSpeer 17:29, Apr 21, 2005 (UTC)

[edit] Analysis of moves is suspect

The assertion that quicksort's best case behavior is N log N moves must depend on some very specific implementation. If the element in the middle is used as the pivot (not a good idea, but often seen), then input that is already in order will result in 0 moves. I'd favor removing the discussion of best case behavior.

I'm also suspicious about the assertion that merge sort does better on average than quicksort on moves. Merge sort will move all the elements, log N times. On average, quicksort will find half the elements on the proper side of the pivot, so it should only have to move 1/2 the elements, log N times.

Algorithmic analysis is not my field, so I won't attempt to edit the article. But it would be nice to have an expert check out the validity of the analysis. —The preceding unsigned comment was added by Jpl (talk • contribs) .

It's not talking about the number of moves; it's talking about the number of steps the algorithm has to take. Even if quicksort doesn't actually do any swaps, it still has to inspect elements in the list; and it's been proven that any comparison-based sorting algorithm must take at least O(N log N) time.
With respect to your statement "it should only have to move 1/2 the elements, log N times", well, this is O-notation we're talking about, so that's actually equivalent to saying O(N log N), since N/2 log(N) is also O(N log N). --Saforrest 04:26, 11 April 2006 (UTC)
The main page says In terms of moves, merge sort's worst case complexity is O(n log n)—the same complexity as quicksort's best case. We've already established that the best case number of compares is O(N log N), so of course the number of steps can be no better than O(N log N). I don't think we're in disagreement about the behavior, but the article asserts that quicksort best case moves is O(N log N), and that's simply false (and largely irrelevant). --jpl 22 June 2006

[edit] Question about the C implementation

I have one question... Imagine you have one array of integers, {40, 3} for instance. Using the current implementation of mergesort, begin == end -1 so the function returns. However, the array is not sorted. Or did I missunderstand anything? Thank you!

In C, conventionally (and also in this case), end pointers point past the end of the sequence, so begin == end signifies an empty seqence. --K. Sperling June 30, 2005 22:25 (UTC)

[edit] What about bottom-up merging?

It ain't necessarily true that Mergesort requires more recursive method calls than Quicksort. If Mergesort and Quicksort are delegating to another sort routine (e.g. Insertion Sort) for small subarrays, then comparing the number of recursive method calls is pointless. If Mergesort is implemented bottom-up (as per Knuth, "Art Of Programming", volume 3, 2nd Edition, Natural and Straight Merge algorithms) rather than top-down there need be *NO* recursive method calls at all. Recursive call overhead is "small change" when N is large anyway: it contributes at most an O(N) component to average running time.

The bottom-up merge formulation is typically something like this:

 1. divide the array into subarrays of size M (the last subarray may be <M elements); sort each such subarray
 2. merge the first two subarrays of size M, then the next two, and so on
    (if there is an odd number of subarrays, leave the last one untouched)
    (if there is an even number of subarrays, the last one may contain less than M elements)
 3. set M = M * 2
 4. if M<N, go back to step 2

I would not be at all surprised to find that the Mergesort Von Neumann proposed (way back in 1945) was bottom-up rather than top-down. The recursive top-down mergesorts (that are now so much more popular) were probably invented somewhat later.

Quicksort's real advantage over Mergesort (in terms of average running time) comes down to the amount of record movement. Quicksort typically requires a shade below half the record movement of (Binary) Mergesort (the ratio depends on how the Quicksort pivot values are selected).

Re: the article's assertion that Mergesort's best case only "takes about half as much time" as the worst case. That depends. For typical mergesort implementations the best case requires half as many comparisons but just as many record moves. For some implementations (e.g. Natural Merge - see Knuth) only O(N) comparisons and O(N) moves are required for the best case (such implementations tend to have a longer *average* running time, however).

-James Barbetti (08-Jul-2005)

[edit] Natural Mergesort

This type of merge sort I read about first in 'Scheme and the Art of Programming', George Springer and Daniel P. Friedman.
They call it a natural mergesort.

Here is a first stab (in python) at a bottom up , inplace merge sort:

def count_sorted( items, start=0 ):
  for x in range(start+1,len(items)):
    if items[x-1]>items[x]:
      return x - start ;
  return len(items) - start ;

def reinsert( items, val, start, end ):
  for x in range( start, end-1 ):
    if items[x+1] > val :
      items[x] = val
      return
    else:
      items[x] = items[x+1]
  items[end-1] = val


def merge_sublists( items, left, right ):
  for x in range(0,left):
    if items[x] > items[left]:
      val = items[x]
      items[x] = items[left]
      reinsert( items, val, left, left+right )

def merge( items ):
  left= count_sorted(items)
  while left < len(items):
    right = count_sorted( items, left )
    merge_sublists( items, left, right )
    left = left + right

A possible optimization is that when the longest ordered sublist is one, find the longest reverse ordered list and reverse it.

-David Medlock, Jan 13, 2006

[edit] minor edit to Optimizing

Comparing RAM to a tape drive is a bit of a stretch, but if "in some sense" means sequential access only, it's not too far-fetched. Still, even if RAM is the slowest of the various memory cache levels, it is much, much faster than a tape. A naive reader might have interpreted the original text to mean that RAM is comparable in speed to a slow tape drive. So I changed "slow" to "fast" (which still preserves the relative speeds of the various caches mentioned, but also preserves the speed relative to tape drives).

---

Perhaps it would be worth mentioning when this matters. These issues are only relevant with large sets of elements. In particular, I believe databases (e.g. SQL and whatnot) tend to use mergesort.

[edit] Could use a Lisp example

I think a Lisp example of the code for merge sort would be illuminating. Due to the nature of Lisp this variation of merge sort is going to have to be bottom-up rather than top-down (to preserve O(nlogn) efficiency, anyway). I'm working on figuring it out but in the mean time if someone else already knows it that'd be great too. --Cyde 05:22, 11 November 2005 (UTC)

I don't think we need more example code, in fact I think we already have way too much. These example code sections seem to be used more as advertisements for everybody's respective favorite language. For example the Ruby and Python samples differ in nothing but syntax details. The Haskell and Miranda ones are also practically identical. I'd actually like to remove the samples completely or at least move them into a separate article or onto wikisource. The question to ask is, are the samples illustrating the algorithm, or are they just there to illustrate how elegant and superior language X is? It seems to me that it's mostly the latter.
And don't get me wrong, I'm not accusing you of lobbying for Lisp. Just the fact that you mentioned adding more samples reminded me that I'd wanted to say something on that subject for some time now :-)
And a section on the bottom-up version of merge sort WOULD be very good to have. --K. Sperling (talk) 01:00, 12 November 2005 (UTC)
This happens on every sorting algorithm page. My approach has been to break these out into a separate article, just keeping a few of the most illustrative ones in the original. See quicksort implementations and insertion sort implementations. Deco 03:09, 21 November 2005 (UTC)

[edit] Removal of implementations

I intend to remove the implementations from this page if Wikipedia:Articles for deletion/Selection sort implementations is successful. They will be transwikied to the existing implementation page at WikiSource. —donhalcon 06:00, 5 March 2006 (UTC)

Already done :) —Ruud 23:19, 5 March 2006 (UTC)

[edit] C code

I have a working implementation in C. Not optimizations done to make it clear what is happening...

/* Inital call to mergeSort should be arraysize - 1 */
void mergeSort(int a[], int low, int high)
{
        int length, mid;
        int *temp1, *temp2;
        int tmp_pos1, tmp_pos2;
        int i;

        /* Array is 1 element and sorted so return */
        if (low == high)
                return;

        /* Divide the array in half and sort */
        length = high - low + 1;
        mid = (low + high) / 2;

        mergeSort(a, low, mid);
        mergeSort(a, mid+1, high);

        /* Put half the array in 1 temp array */
        temp1 = malloc(sizeof(int)*(mid-low+1));
        temp2 = malloc(sizeof(int)*(high-mid));
        i = low;
        for (tmp_pos1 = 0; tmp_pos1 < (mid-low+1); tmp_pos1++)
        {
                temp1[tmp_pos1] = a[i];
                i++;
        }
        /* and the other half in other temp array */
        i = mid+1;
        for (tmp_pos2 = 0; tmp_pos2 < (high-mid); tmp_pos2++)
        {
                temp2[tmp_pos2] = a[i];
                i++;
        }
        
        /* Merge the two temp array back into the original array */
        tmp_pos1 = 0;
        tmp_pos2 = 0;
        i = low;
        while ((tmp_pos1 < (mid-low+1)) && (tmp_pos2 < (high-mid)))
        {
                if (temp1[tmp_pos1] < temp2[tmp_pos2]) 
                {
                        a[i] = temp1[tmp_pos1];
                        tmp_pos1++;
                        i++;
                } else {
                        a[i] = temp2[tmp_pos2];
                        tmp_pos2++;
                        i++;
                }
        }

        /* At this point, we've completely merge one of the arrays
           back into the original array.  Now, just tack on the 
           remaining elements of the other temp array */
        while (tmp_pos1 < (mid-low+1)) {
                a[i] = temp1[tmp_pos1];
                tmp_pos1++;
                i++;
        }

        while (tmp_pos2 < (high-mid)) {
                a[i] = temp2[tmp_pos2];
                tmp_pos2++;
                i++;
        }

        free(temp1);
        free(temp2);
}