Counting sort

From Wikipedia, the free encyclopedia

Counting sort is a sorting algorithm which (like bucket sort) takes advantage of knowing the range of the numbers in the array to be sorted (array A). It uses this range to create an array C of this length. Each index i in array C is then used to count how many elements in A have a value less than i. The counts stored in C can then be used to put the elements in A into their right position in the resulting sorted array. It is less efficient than pigeonhole sort.

Contents

[edit] Characteristics of counting sort

Counting sort is stable (see sorting algorithm) and has a running time of Θ(n+k), where n and k are the lengths of the arrays A (the input array) and C (the counting array), respectively. In order for this algorithm to be efficient, k must not be too large compared to n. As long as k is O(n) the running time of the algorithm is Θ(n).

The indices of C must run from the minimum to the maximum value in A to be able to index C directly with the values of A. Otherwise, the values of A will need to be translated (shifted), so that the minimum value of A matches the smallest index of C. (Translation by subtracting the minimum value of A from each element to get an index into C therefore gives a counting sort. If a more complex function is used to relate values in A to indices into C, it is a bucket sort.) If the minimum and maximum values of A are not known, an initial pass of the data will be necessary to find these (this pass will take time Θ(n); see selection algorithm).

The length of the counting array C must be at least equal to the range of the numbers to be sorted (that is, the maximum value minus the minimum value plus 1). This makes counting sort impractical for large ranges in terms of time and memory needed. Counting sort may for example be the best algorithm for sorting numbers whose range is between 0 and 100, but it is probably unsuitable for sorting a list of names alphabetically (again, see bucket sort and pigeonhole sort). However counting sort can be used in radix sort to sort a list of numbers whose range is too large for counting sort to be suitable alone.

Because counting sort uses key values as indexes into an array, it is not a comparison sort, allowing it to break the Ω(n log n) lower-bound on those sorts.

[edit] The algorithm

[edit] Informal

  1. Count the discrete elements in the array. (after calculating the minimum and the maximum)
  2. Accumulate the counts.
  3. Fill the destination array from backwards: put each element to its countth position.
    Each time put in a new element decrease its count.

[edit] C++ implementation

/// countingSort - sort an array of values.
///
/// For best results the range of values to be sorted
/// should not be significantly larger than the number of 
/// elements in the array.
/// 
/// \param A - input - set of values to be sorted
/// \param num_of_elements - input - number of elements in the input and output arrays
/// \return sorted - new ordered array of the first num_of_elements of the 
///  input array. caller is responsible for freeing this array.
///
int* countingSort(const int A[], const int num_of_elements) {
        // search for the minimum and maximum values in the input
        int min = A[0], max = min;
        for (int i = 1; i < num_of_elements; ++i)
        {
                if (A[i] < min)
                        min = A[i];
                else if (A[i] > max)
                        max = A[i];
        }

        // create a counting array, counts, with a member for 
        // each possible discrete value in the input.  
        // initialize all counts to 0.
        int distinct_element_count = max - min + 1;
        int* counts = new int[distinct_element_count]; // C
        for (int i = 0; i < distinct_element_count; ++i)
                counts[i] = 0;

        // for each value in the unsorted array, increment the
        // count in the corresponding element of the count array
        for (int i = 0; i < num_of_elements; ++i)
                ++counts[ A[i] - min ];

        // accumulate the counts - the result is that counts will hold
        // the offset into the sorted array for the value associated with that index
        for (int i = 1; i < distinct_element_count; ++i)
                counts[i] += counts[ i - 1 ];

        // store the elements in a new ordered array
        int* sorted = new int[num_of_elements]; // B
        for (int i = num_of_elements - 1; i >= 0; --i)
                // decrementing the counts value ensures duplicate values in A
                // are stored at different indices in sorted.
                sorted[ --counts[ A[i] - min ] ] = A[i];

        delete[] counts;
        return sorted;
}

[edit] References

[edit] External links

Wikibooks