Pigeonhole sort

From Wikipedia, the free encyclopedia

Pigeonhole sorting, also known as count sort, is a sorting algorithm that takes linear time (Θ(n)), which is the best possible performance for a sorting algorithm since one must inevitably look at each of the elements being sorted at least once, regardless of the sorting algorithm. However, pigeonhole sorting is only practical if the objects you are sorting fall within (or can be mapped into) a range of possible values that is small compared to the size of the list to be sorted.

The algorithm's efficiency is reduced whenever more than one non-equivalent item gets sorted into the same pigeonhole. (If you have to sort within each pigeonhole, you've defeated the purpose because you're now inspecting each item more than once.) So for simplicity, and to keep the "classic" pigeonhole sort distinct from its many variations, we specify that the mapping must be reversible, ie, if two items end up in the same bucket then they must have the same value. Multiple items with the same value in the same bucket are no problem, just put them into the sorted list adjacent to each other (this can even be implemented as a stable sort (see sorting)).

The pigeonhole algorithm works as follows.

Set up an array of initially empty "pigeonholes", one pigeonhole for each value in the range of keys.
Go over the original array, putting each object in its pigeonhole.
Iterate over the pigeonhole array in order, and put elements from non-empty holes back into the original array.

The hidden constant for this algorithm depends critically on the density of the elements in the pigeonhole array. If there are many more array elements than items to be sorted, steps 1 and 3 will be relatively slow.

Pigeonhole sort is rarely used as the requirements are rarely met and other, more flexible, and almost as fast, sorting algorithms are easier to use. In particular, the bucket sort is a more practical variation on the pigeonhole sort.

A well-known variant of the pigeonhole sort is the tally sort; while it is only applicable to a very limited number of problems, it is widely familiar because of its use in the book Programming Pearls as an example of an unconventional solution to a particular set of limitations.

In a certain sense, each pass of quicksort is a kind of pigeonhole sort with just two pigeonholes (or three pigeonholes if it uses a Dutch National flag partitioning).

[edit] Pseudocode

Pseudocode for a pigeonhole sort for N distinct elements (using zero-based indexing):

function pigeonhole_sort(array a[n])
    array b[N]
    var i,j
    
    zero_var (b)      (* zero out array b *)
    
    for i in [0...length(a)-1]
        b[a[i]] := b[a[i]]+1

    (* copy the results back to a *)
    j := 0
    for i in [0...length(b)-1]
        repeat b[i] times
            a[j] := i
            j := j+1