Talk:Insertion sort

From Wikipedia, the free encyclopedia

This article is within the scope of WikiProject Computer science, which aims to create a comprehensive computer science reference for Wikipedia. Visit the project page for more information and to join in on related discussions.
B rated as B-Class on the assessment scale
High rated as high-importance on the assessment scale

Contents

[edit] sort

Question: Which language is the example implementation written in?

The example is written in Python. I'm planning on simplifying it a bit, in particular to remove the cmp function being passed as a parameter:

  • A built-in function cmp() already exists, and means something different (-ve if a<b, 0 if a=b, +ve if a>b);
  • IMO, Python's lambda notation and default arguments are not appropriate for near pseudo-code;
  • "a > b" is much better for trying to understand an algorithm than a function call.
  • Custom Python types can define their own comparisons anyway, so a comparator parameter is unnecessary.

--Carey Evans

Could the example semi-pseudocode on the main page please be cleaned so that it looks like one language, rather than C with a dab of Eiffel? 193.203.149.227 18:34, 28 May 2007 (UTC)

Should heapsort be listed as type of insertion sort ? --Taw

No. Heapsort is more closely related to selection sort. --Damian Yerrick

Actually heapsort and insertion sort have a pretty close relation. In both, you are taking the elements of the list and successively inserting them into a data structure from which you can, at the end, extract the sorted sequence. In selection sort you actually search the unsorted elements; in these sorts you simply present the unsorted elements sequentially. Deco 05:54, 29 Mar 2005 (UTC)

[edit] Removed mergesort statement

I removed this statement:

This is even less than Merge sort for finite n, though both do O(n log n) comparisons.

Not only is this silly (you can't sort infinite lists in finite time, especially not with this sort) but it assumes that merge sort runs in exactly n log n time, rather than just Θ(n log n) time. The truth is that:

\log n! = \log \prod_{i=1}^n i = \sum_{i=1}^n \log i.

Now, we can bound it on both sides with integrals:

n\log n - n + 1 = \int_{x=1}^{n} \log x < \sum_{i=1}^n \log i < \int_{x=1}^{n+1} \log x = (n+1)\log(n+1) - n

So in fact log n! is Θ(nlogn), no different from mergesort (if we could somehow avoid the time for the swaps.) Deco 05:54, 29 Mar 2005 (UTC)

Erm...what? XreDuex (talk) 17:50, 13 March 2008 (UTC)

[edit] Implementations

I removed the following from the article, as most (if not all) are listed on the wikibook page. They can be added back if desired --DevastatorIIC 14:00, 31 May 2006 (UTC)

I've fixed the C example of insertionSort, now works properly without memory underflow, if the unsorted element shoul dbe inserted on the first position, the algorithm could insert the number on address *(array - 1).

Also i've changed the a[] variable for array[] for more readability. This example may be more suitable for understanding because no sub-functions are used.

17:35, 16 Sep 2006

DELETED THE IMPLIMENTATIONS FROM THE TALK PAGE ::

Since they've been deleted for more than a year, they're just clutter. Fresheneesz (talk) 09:43, 5 December 2007 (UTC)

[edit] More implementations

For some reason, people feel compelled to add implementations to this page. Here are some I just removed from the page:

c++ Example:

#include <iostream>
#include <cstdio>
 
//Originally Compiled tested with g++ on Linux
using namespace std;
 
bool swap(int&, int&); //Swaps Two Ints
void print_array(int* ar, int); //Nothing Just Shows The Array Visually
int ins_sort(int*, int); //The Insertion Sort Function
 
int main()
{
        int array[9] = {4, 3, 5, 1, 2, 0, 7, 9, 6}; //The Original Array
        desc(array, 9);
        *array = ins_sort(array, 9);
        cout << "Array Sorted Press Enter To Continue and See the Resultant Array" << endl
             << "-----------------8<-------------------------------->8--------------";
        getchar();
        desc(array, 9);
        getchar();
        return 0;
}
int ins_sort(int* array, int len)
{
        for (int i = 0; i < len; i++)
        {
                int val = array[i];
                int key = i;
                cout << "key(Key) = " << key << "\tval(Value) = " << val << endl;
                for (; key >= 1 && array[key-1] >= val; --key)
                {
                        cout << "Swapping Backward\tfrom (key) " << key << " of (Value) " << array[key] << "\tto (key) " << key-1
                             << " of (Value) " << array[key-1];
                        cout << "\n\t" << key << " <----> " << key-1 << "\t( " << array[key] << "<----> " << array[key-1] << " )";
                        swap(array[key], array[key-1]);
                        desc(array, 9);
                }
        }
        return *array;
}
bool swap(int& pos1, int& pos2)
{
        int tmp = pos1;
        pos1 = pos2;
        pos2 = tmp;
        return true;
}
void print_array(int* ar, int len)
{
        cout << endl << "Describing The Given Array" << endl;
        for (int i = 0; i < len; i++)
                cout << "_______" << "\t";
        cout << endl;
        for (int i = 0; i < len; i++)
                cout << " | " << i << " | " << "\t";
        cout << endl;
        for (int i = 0; i < len; i++)
                cout << " ( " << ar[i] << " ) " << "\t";
        cout<<endl;
        for (int i = 0; i < len; i++)
                cout << "-------" << "\t";
        getchar();
}

[edit] picture requires explanation

What does that "Example of insertion sort" show exactly? What do the x and y coordinates represent? Without this kind of explanation it isn't helpful at all! Fresheneesz (talk) 09:41, 5 December 2007 (UTC)

[edit] Removed claim about real-time apps

I removed this:

It can be seen as an advantage for some real-time applications that selection sort will perform identically regardless of the order of the array, while insertion sort's running time can vary considerably.

This seems silly. Not only is it false, since caches could disrupt the running time of selection sort; if it's truly important to make insertion sort as uniformly slow as selection sort, one could always add delay loops or useless array accesses. --P3d0 (talk) 17:58, 5 February 2008 (UTC)

No argument from me. The runtime of selection sort is more predictable, but for real-time applications it suffices to determine a reasonable upper bound on runtime, which can be done by sorting a reversed list, and my expectation is that insertion sort is faster in the worst case than selection sort is in the best case. Dcoetzee 20:35, 6 February 2008 (UTC)

[edit] insertion sort psudocode

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

void insertionSort1(int *A, int length) {
        int i, j, value; /* This is from the psudocode */
        for (i = 1; i < length-1; i++) {
                value = A[i];
                j = i - 1;
                while (j >= 0 && A[j] > value) {
                        A[j + 1] = A[j];
                        j = j-1;
                }
                A[j+1] = value;
        }
}

void insertionSort2(int *A, int length) {
        int i, j, value; /* Corrected psudocode */
        for (i = 1; i < length; i++) {
                value = A[i];
                j = i - 1;
                while (j >= 0 && A[j] > value) {
                        A[j + 1] = A[j];
                        j = j-1;
                }
                A[j+1] = value;
        }
}

int main() {
        int length = 10;
        int array1[length];
        int array2[length];
        int i;
        srand(time(NULL));
        printf("Original array:\n");
        for (i = 0; i < length; i++) {
                array1[i] = array2[i] = random() % (2 * length);
                printf("\tarray[%d]: %d\n", i, array1[i]);
        }
        insertionSort1(array1, length);
        insertionSort2(array2, length);
        printf("Output of insertion sort 1 (from psudocode):\n");
        for (i = 0; i < length; i++)
                printf("\tarray1[%d] = %d\n", i, array1[i]);
        printf("Output of insertion sort 2 (corrected psudocode):\n");
        for (i = 0; i < length; i++)
                printf("\tarray2[%d] = %d\n", i, array2[i]);
        return 0;
}

/* OUTPUT: 
Original array:
        array[0]: 1
        array[1]: 16
        array[2]: 19
        array[3]: 5
        array[4]: 12
        array[5]: 4
        array[6]: 17
        array[7]: 10
        array[8]: 11
        array[9]: 13
Output of insertion sort 1 (from psudocode):
        array1[0] = 1
        array1[1] = 4
        array1[2] = 5
        array1[3] = 10
        array1[4] = 11
        array1[5] = 12
        array1[6] = 16
        array1[7] = 17
        array1[8] = 19
        array1[9] = 13
Output of insertion sort 2 (corrected psudocode):
        array2[0] = 1
        array2[1] = 4
        array2[2] = 5
        array2[3] = 10
        array2[4] = 11
        array2[5] = 12
        array2[6] = 13
        array2[7] = 16
        array2[8] = 17
        array2[9] = 19
*/

I do understand that the psudocode is zero-indexed (despite the i = 1 in the loop), however if you look closely at the code, A[length-1] (the last element in the array) will never be touched, much less sorted. —Preceding unsigned comment added by Jordanbray (talk • contribs) 19:00, 16 February 2008 (UTC)

It looks to me like the original worked fine and the revision is broken. Please can you double-check? OoS (talk) 18:47, 17 February 2008 (UTC)

Yes, the revision is broken. On the final pass through the outer loop, i=length(A) followed by value=A[i] generates an array index out of bounds exception. Silly rabbit (talk) 19:29, 17 February 2008 (UTC)
I think Jordan does not understand that the pseudocode notation "for i = 1 to length(A)-1" includes length(A)-1. The problem is in how the pseudocode was translated into the C code above, not in the pseudocode itself. Silly rabbit (talk) 19:32, 17 February 2008 (UTC)

[edit] Pseudocode fix

Implemented the pseudocode in Java today (Eclipse is my SDK). It does not sort the last index. Removing the -1 after array.length[A] (or length[A] if you prefer the pseudocode) sorts the last variable and does not throw an exception. I haven't traced the code yet to find out why, but I figured I'd fix the implementation on the article first. XreDuex (talk) 18:12, 13 March 2008 (UTC)

Please see the discussion of the preceding section. In the last loop, setting value = A[length(A)] will give an index out of bounds exception. It's likely that your Java implementation made the same error as the C implementation given above. I have reverted your erroneous correction to the pseudocode. silly rabbit (talk) 18:23, 13 March 2008 (UTC)
Instead of correcting a supposed "erroneous" error, you could try implementing it in Eclipse. No exception is thrown with or without the -1 after array.length, but with it the final index is not sorted. It also seems as if you don't understand which "-1" I am referring to (although my crappy wording in my original post probably didn't help). I am talking about the "-1" in the for loop's conditions.
public static void insSort(int[] a)
{
        for (int count = 1; count < a.length; count++)
        {
                int value = a[count];
                int temp = count-1;
                while (temp >= 0 && a[temp] > value)
                {
                        a[temp + 1] = a[temp];
                        temp = temp-1;
                }
                a[temp+1] = value;
        }
}
I also do not see how the discussion on the C implentation is relevant at all, as no error results. XreDuex (talk) 17:54, 14 March 2008 (UTC)
He's right - you did make the same error as the C program above. The upper bound in the pseudocode is inclusive, while the upper bound in your for loop is exclusive. This requires an adjustment of the loop bound by 1. I suppose the pseudocode could be clarified, but this is a minor detail that would distract from the topic at hand if given too much attention. Dcoetzee 00:31, 15 March 2008 (UTC)