Selection sort

From Wikipedia, the free encyclopedia

Selection sort is a sorting algorithm, specifically an in-place comparison sort. It has Θ(n2) complexity, making it inefficient on large lists, and generally performs worse than the similar insertion sort. Selection sort is noted for its simplicity, and also has performance advantages over more complicated algorithms in certain situations. It works as follows:

  1. Find the minimum value in the list
  2. Swap it with the value in the first position
  3. Repeat the steps above for remainder of the list (starting at the second position)

Here is an example of this sort algorithm sorting a total of five elements:

31 25 12 22 11
11 25 12 22 31
11 12 25 22 31
11 12 22 25 31

[edit] Implementation

It is a very intuitive sorting algorithm which is simple to implement on a computer, as in the following C implementation, which makes use of a swap function:

void selectionSort(int a[], int size)
{
  int i, j, min;

  for (i = 0; i < size - 1; i++)
  {
    min = i;
    for (j = i+1; j < size; j++)
      if (a[j] < a[min])
        min = j;

    swap(a[i], a[min]);
  }
}

[edit] Analysis

Selection sort is very easy to analyze since none of the loops depend on the data in the array. Selecting the lowest element requires scanning all n elements (this takes n - 1 comparisons) and then swapping it into the first position. Finding the next lowest element requires scanning the remaining n - 1 elements and so on, for a total of (n - 1) + (n - 2) + ... + 2 + 1 = Θ(n2) comparisons. Each of these scans requires one swap for a total of n - 1 swaps (the final element is already in place). Thus, the comparisons dominate the running time, which is Θ(n2).

[edit] Comparison to other Sorting Algorithms

Among simple average-case Θ(n2) algorithms, selection sort always outperforms bubble sort and gnome sort, but is generally outperformed by insertion sort. Insertion sort is very similar in that after the kth iteration, the first k elements in the array are in sorted order. Insertion sort's advantage is that it only scans as many elements as it needs to in order to place the k + 1st element, while selection sort must scan all remaining elements to find the k + 1st element.

Simple calculation shows that insertion sort will therefore usually perform about half as many comparisons as selection sort, although it can perform just as many or far fewer depending on the order the array was in prior to sorting. It can be seen as an advantage for some real-time applications that selection sort will perform identically regardless of the order of the array, while insertion sort's running time can vary considerably. However, this is more often an advantage for insertion sort in that it runs much more efficiently if the array is already sorted or "close to sorted."

Another key difference is that selection sort always performs Θ(n) swaps, while insertion sort performs Θ(n2) swaps in the average and worst cases. Because swaps require writing to the array, selection sort is preferable if writing to memory is significantly more expensive than reading, such as when dealing with an array stored in EEPROM or Flash.

Finally, selection sort is greatly outperformed on larger arrays by Θ(nlog n) divide-and-conquer algorithms such as quicksort and mergesort. However, insertion sort or selection sort are both typically faster for small arrays (ie less than 10-20 elements). A useful optimization in practice for the recursive algorithms to switch to insertion sort or selection sort for "small enough" sublists.

[edit] Variants

Heapsort greatly improves the basic algorithm by using an implicit heap data structure to speed up finding and removing the lowest datum. If implemented correctly, the heap will allow finding the next lowest element in Θ(log n) time instead of Θ(n) for the inner loop in normal selection sort, reducing the total running time to Θ(n log n).

A bidirectional variant of selection sort, called cocktail sort, is an algorithm which finds both the minimum and maximum values in the list in every pass. This reduces the number of scans of the list by a factor of 2, eliminating some loop overhead but not actually decreasing the number of comparisons or swaps. Note, however, that cocktail sort more often refers to a bidirectional variant of bubble sort.

Selection sort can be implemented as a stable sort. If, rather than swapping in step 2, the minimum value is inserted into the first position (that is, all intervening items moved down), the algorithm is stable. However, this modification leads to Θ(n2 ) writes, eliminating the main advantage of selection sort over insertion sort, which is always stable.


[edit] References

  • Donald Knuth. The Art of Computer Programming, Volume 3: Sorting and Searching, Third Edition. Addison-Wesley, 1997. ISBN 0-201-89685-0. Pages 138–141 of Section 5.2.3: Sorting by Selection.

[edit] External links