Best, worst and average case

In computer science, best, worst, and average cases of a given algorithm express what the resource usage is at least, at most and on average, respectively. Usually the resource being considered is running time, i.e. time complexity, but it could also be memory or other resources.

In real-time computing, the worst-case execution time is often of particular concern since it is important to know how much time might be needed in the worst case to guarantee that the algorithm will always finish on time.

Average-case performance and worst-case performance are the most used in algorithm analysis. Less widely found is best-case performance, but it does have uses: for example, where the best cases of individual tasks are known, they can be used to improve the accuracy of an overall worst-case analysis. Computer scientists use probabilistic analysis techniques, especially expected value, to determine expected running times.
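
As a worked illustration of an expected-value calculation (not from the original article), consider a linear search over n elements where the sought element is equally likely to be at any position. The expected number of comparisons is then

```latex
% Expected comparisons for linear search with a uniformly random target position
% (illustrative derivation, assuming the target is always present).
\[
  \mathbb{E}[T(n)] = \sum_{i=1}^{n} \frac{1}{n}\, i
                   = \frac{1}{n} \cdot \frac{n(n+1)}{2}
                   = \frac{n+1}{2} = \Theta(n),
\]
```

about half the worst-case cost of n comparisons, but still linear in n.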

The terms are used in other contexts as well; for example, the worst- and best-case outcomes of an epidemic that is being planned for, the worst-case temperature to which an electronic circuit element is exposed, and so on. Where components of specified tolerance are used, devices must be designed to work properly with the worst-case combination of tolerances and external conditions.

Best-case performance for algorithms

The term best-case performance is used in computer science to describe an algorithm's behavior under optimal conditions. For example, the best case for a simple linear search on a list occurs when the desired element is the first element of the list.
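
A minimal sketch of such a linear search (the function and sample data below are illustrative, not part of the original text):

```python
def linear_search(lst, target):
    """Return the index of target in lst, or -1 if it is absent."""
    for i, value in enumerate(lst):
        if value == target:      # one equality comparison per visited element
            return i
    return -1

data = [7, 3, 9, 4, 1]
linear_search(data, 7)   # best case: the target is first, a single comparison
linear_search(data, 1)   # worst case (present): the target is last, len(data) comparisons
linear_search(data, 8)   # worst case (absent): every element is examined
```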

Development and choice of algorithms are rarely based on best-case performance: most academic and commercial enterprises are more interested in improving average-case complexity and worst-case performance. Algorithms may also be trivially modified to have a good best-case running time by hard-coding solutions to a finite set of inputs, making the measure almost meaningless.[1]
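
To make that point concrete, the purely hypothetical sketch below wraps an ordinary O(n²) sort with a hard-coded answer for one specific input, giving a trivial O(1) best case without improving the algorithm in any meaningful way:

```python
# Hypothetical illustration: hard-coding one input inflates the best case to O(1)
# while saying nothing useful about the algorithm's behaviour in general.
HARD_CODED_INPUT = (3, 1, 2)
HARD_CODED_OUTPUT = [1, 2, 3]

def selection_sort(lst):
    """Plain selection sort: O(n^2) comparisons on every input."""
    result = list(lst)
    for i in range(len(result)):
        smallest = min(range(i, len(result)), key=result.__getitem__)
        result[i], result[smallest] = result[smallest], result[i]
    return result

def sort_with_inflated_best_case(lst):
    if tuple(lst) == HARD_CODED_INPUT:    # best case: one short tuple comparison, O(1)
        return list(HARD_CODED_OUTPUT)
    return selection_sort(lst)            # every other input: still O(n^2)
```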

Worst-case versus average-case performance

Worst-case performance analysis and average case performance analysis have some similarities, but in practice usually require different tools and approaches.

Determining what constitutes a typical or "average" input is difficult, and often that average input has properties that make it hard to characterise mathematically (consider, for instance, algorithms that are designed to operate on strings of text). Similarly, even when a sensible description of a particular "average case" (which will probably apply only to some uses of the algorithm) is possible, it tends to result in equations that are more difficult to analyse.

Worst-case analysis has similar problems: it is typically impossible to determine the exact worst-case scenario. Instead, a scenario is considered that is at least as bad as the worst case. For example, when analysing an algorithm, it may be possible to find the longest possible path through the algorithm (by considering the maximum number of loop iterations, for instance) even if it is not possible to determine the exact input that would generate this path (indeed, such an input may not exist). This gives a safe analysis (the worst case is never underestimated), but one which is pessimistic, since there may be no input that actually requires this path.
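
A small sketch (an assumed example, not from the article) of this style of bound: the loop below is simply charged its maximum possible number of iterations, one per element, without trying to exhibit an input that actually follows the longest path:

```python
def first_duplicate(values):
    """Return the first repeated value, or None if all values are distinct."""
    seen = set()
    for v in values:        # safe bound: the body runs at most len(values) times, so O(n)
        if v in seen:
            return v        # early exit: many inputs never come close to the bound
        seen.add(v)
    return None
```

Here an all-distinct input does realise the full bound, but the same reasoning is applied to more complex code even when no input is known to exercise the longest path, which is exactly when the bound becomes pessimistic.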

Alternatively, a scenario which is thought to be close to (but not necessarily worse than) the real worst case may be considered. This may lead to an optimistic result, meaning that the analysis may actually underestimate the true worst case.

In some situations it may be necessary to use a pessimistic analysis in order to guarantee safety. Often however, a pessimistic analysis may be too pessimistic, so an analysis that gets closer to the real value but may be optimistic (perhaps with some known low probability of failure) can be a much more practical approach.

When analyzing algorithms which often take a small amount of time to complete, but periodically require a much larger time, amortized analysis can be used to determine the worst-case running time over a (possibly infinite) series of operations. This amortized worst-case cost can be much closer to the average-case cost, while still providing a guaranteed upper limit on the running time.
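
A minimal sketch (assumed details, not from the article) of the classic example: a dynamic array that doubles its capacity when full. A single append is occasionally Θ(n) because everything must be copied, but any sequence of n appends copies at most 1 + 2 + 4 + … < 2n elements in total, so the amortized worst-case cost per append is O(1):

```python
class DynamicArray:
    """Toy dynamic array with capacity doubling, to illustrate amortized analysis."""

    def __init__(self):
        self._capacity = 1
        self._size = 0
        self._slots = [None] * self._capacity

    def append(self, item):
        if self._size == self._capacity:       # occasional expensive step: grow and copy, Θ(n)
            self._capacity *= 2
            new_slots = [None] * self._capacity
            new_slots[:self._size] = self._slots
            self._slots = new_slots
        self._slots[self._size] = item         # usual step: O(1)
        self._size += 1
```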

The worst-case analysis is related to the worst-case complexity.[2]

Practical consequences

Many problems with bad worst-case performance have good average-case performance. For problems we want to solve, this is a good thing: we can hope that the particular instances we care about are average. For cryptography, this is very bad: we want typical instances of a cryptographic problem to be hard. Here methods like random self-reducibility can be used for some specific problems to show that the worst case is no harder than the average case, or, equivalently, that the average case is no easier than the worst case.

On the other hand, some data structures, such as hash tables, have very poor worst-case behaviour, but a well-written hash table of sufficient size will statistically never give the worst case; the average number of operations performed follows an exponential decay curve, and so the running time of an operation is statistically bounded.
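
As a hedged illustration (the toy class and hash functions below are assumptions, not from the article), a chained hash table costs one comparison per element of the probed bucket's chain. With a reasonable hash function the chains stay short on average, while a degenerate hash function that maps every key to the same bucket yields the O(n) worst case:

```python
class ChainedHashTable:
    """Toy hash table using separate chaining."""

    def __init__(self, buckets=64, hash_fn=hash):
        self._buckets = [[] for _ in range(buckets)]
        self._hash_fn = hash_fn

    def insert(self, key, value):
        chain = self._buckets[self._hash_fn(key) % len(self._buckets)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain[i] = (key, value)        # overwrite an existing key
                return
        chain.append((key, value))

    def lookup(self, key):
        chain = self._buckets[self._hash_fn(key) % len(self._buckets)]
        for k, v in chain:                     # cost is proportional to the chain length
            if k == key:
                return v
        return None

good = ChainedHashTable()                      # typical case: short chains, roughly O(1) lookups
bad = ChainedHashTable(hash_fn=lambda key: 0)  # degenerate case: one chain of length n, O(n) lookups
for i in range(1000):
    good.insert(i, i)
    bad.insert(i, i)
```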

Examples

Sorting algorithms

Algorithm         Data structure    Best time       Average time    Worst time      Worst space
Quick sort        Array             O(n log n)      O(n log n)      O(n²)           O(log n)
Merge sort        Array             O(n log n)      O(n log n)      O(n log n)      O(n)
Heap sort         Array             O(n log n)      O(n log n)      O(n log n)      O(1)
Smooth sort       Array             O(n)            O(n log n)      O(n log n)      O(1)
Bubble sort       Array             O(n)            O(n²)           O(n²)           O(1)
Insertion sort    Array             O(n)            O(n²)           O(n²)           O(1)
Selection sort    Array             O(n²)           O(n²)           O(n²)           O(1)
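
A hedged sketch illustrating the quick sort row above (a deliberately naive implementation, not how library sorts choose pivots): with a first-element pivot, an already-sorted input drives quicksort to its O(n²) worst case, while a shuffled input typically shows the O(n log n) behaviour:

```python
import random

def quicksort(lst, counter):
    """First-element-pivot quicksort that counts pivot comparisons."""
    if len(lst) <= 1:
        return lst
    pivot, rest = lst[0], lst[1:]
    left, right = [], []
    for x in rest:
        counter[0] += 1                        # one comparison against the pivot
        (left if x < pivot else right).append(x)
    return quicksort(left, counter) + [pivot] + quicksort(right, counter)

n = 500                                        # kept small so the worst case stays within Python's recursion limit
worst = [0]
quicksort(list(range(n)), worst)               # already sorted: n(n-1)/2 = 124750 comparisons
average = [0]
quicksort(random.sample(range(n), n), average) # shuffled: on the order of n log n comparisons
print(worst[0], average[0])
```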

Data structures

Average time complexity

Data structure        Indexing   Search     Insertion   Deletion
Basic array           O(1)       O(n)       -           -
Dynamic array         O(1)       O(n)       O(n)        -
Singly linked list    O(n)       O(n)       O(1)        O(1)
Doubly linked list    O(n)       O(n)       O(1)        O(1)
Hash table            -          O(1)       O(1)        O(1)
Binary search tree    -          O(log n)   O(log n)    O(log n)
B-tree                -          O(log n)   O(log n)    O(log n)
Red-black tree        -          O(log n)   O(log n)    O(log n)
AVL tree              -          O(log n)   O(log n)    O(log n)

Worst-case time and space complexity

Data structure        Indexing   Search     Insertion   Deletion   Space
Basic array           O(1)       O(n)       -           -          O(n)
Dynamic array         O(1)       O(n)       O(n)        -          O(n)
Singly linked list    O(n)       O(n)       O(1)        O(1)       O(n)
Doubly linked list    O(n)       O(n)       O(1)        O(1)       O(n)
Hash table            -          O(n)       O(n)        O(n)       O(n)
Binary search tree    -          O(n)       O(n)        O(n)       O(n)
B-tree                -          O(log n)   O(log n)    O(log n)   O(n)
Red-black tree        -          O(log n)   O(log n)    O(log n)   O(n)
AVL tree              -          O(log n)   O(log n)    O(log n)   O(n)
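
A hedged sketch (unbalanced insertion code for illustration only) of why the binary search tree entries differ between the average case, O(log n), and the worst case, O(n): inserting keys in sorted order produces a degenerate, list-shaped tree whose height equals the number of keys, while a random insertion order typically keeps the height logarithmic. Red-black and AVL trees avoid the degenerate case by rebalancing:

```python
import random

class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(root, key):
    """Insert key into an unbalanced binary search tree and return the root."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root

def height(root):
    """Height of the tree, which bounds the cost of a search."""
    if root is None:
        return 0
    return 1 + max(height(root.left), height(root.right))

keys = list(range(512))
degenerate = None
for k in keys:                    # sorted insertion order: every new node goes to the right
    degenerate = insert(degenerate, k)

random.shuffle(keys)
typical = None
for k in keys:                    # random insertion order: height stays near log2(n)
    typical = insert(typical, k)

print(height(degenerate), height(typical))   # 512 versus a few dozen
```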

References

  1. Introduction to Algorithms (Cormen, Leiserson, Rivest, and Stein) 2001, Chapter 2 "Getting Started".
  2. Worst-case complexity