Best, worst and average case

From Wikipedia, the free encyclopedia

In computer science, the best, worst and average cases of a given algorithm express what its resource usage is at least, at most and on average, respectively. Usually the resource being considered is running time, but it could also be memory or another resource.

In real-time computing, the worst-case execution time is often of particular concern, since it is important to know how much time might be needed in the worst case to guarantee that the algorithm will always finish on time.

Average performance and worst-case performance are the most used in algorithm analysis. Less widely found is best-case performance, but it does have uses: for example, knowing the best cases of individual tasks can improve the accuracy of an overall worst-case analysis. Computer scientists use probabilistic analysis techniques, especially expected value, to determine expected running times.

Worst case versus average case performance

Worst-case performance analysis and average-case performance analysis have similarities, but in practice they usually require different tools and approaches.

Determining what "average input" means is difficult, and often that average input has properties which make it difficult to characterise mathematically (consider, for instance, algorithms that are designed to operate on strings of text). Similarly, even when a sensible description of a particular "average case" is possible (and it will probably apply to only some uses of the algorithm), the resulting equations tend to be more difficult to analyse.

Worst-case analysis has similar problems: typically it is impossible to determine the exact worst-case scenario. Instead, a scenario is considered which is at least as bad as the worst case. For example, when analysing an algorithm, it may be possible to find the longest possible path through the algorithm (by considering the maximum number of loop iterations, for instance) even if it is not possible to determine the exact input that would produce it. Indeed, such an input may not exist. This leads to a safe analysis (the worst case is never underestimated), but one which is pessimistic, since no input might require this path.
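The gap between a safe structural bound and the true worst case can be illustrated with a small sketch. The routine `toy_steps` and the helpers `structural_bound` and `observed_worst_case` are hypothetical names invented for this illustration. The routine has two expensive branches guarded by mutually exclusive conditions, so the longest path through the code (both branches taken) is never exercised by any input.

```python
def toy_steps(x: int, n: int) -> int:
    """Toy routine: count the loop iterations executed for input x."""
    steps = 0
    if x % 2 == 0:          # expensive branch A, taken only for even x
        for _ in range(n):
            steps += 1
    if x % 2 == 1:          # expensive branch B, taken only for odd x
        for _ in range(n):
            steps += 1
    return steps


def structural_bound(n: int) -> int:
    """Pessimistic bound: assume every branch is taken and every loop runs fully."""
    return 2 * n


def observed_worst_case(n: int, inputs) -> int:
    """Largest step count actually seen over the given inputs."""
    return max(toy_steps(x, n) for x in inputs)


if __name__ == "__main__":
    n = 10
    print("structural bound:", structural_bound(n))
    print("observed worst case:", observed_worst_case(n, range(100)))
```

Here the structural bound is 2n, while no input needs more than n steps: the bound is safe but pessimistic. Note that the observed maximum over a sample of inputs is, in general, only a lower bound on the true worst case; in this toy example it happens to be exact because every integer is either even or odd.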

Alternatively, a scenario which is thought to be close to (but not necessarily worse than) the real worst case may be considered. This may lead to an optimistic result, meaning that the analysis may actually underestimate the true worst case.

In some situations it may be necessary to use a pessimistic analysis in order to guarantee safety. Often, however, a pessimistic analysis is too pessimistic, so an analysis that gets closer to the real value but may be optimistic (perhaps with some known low probability of failure) can be a much more practical approach.

When analysing algorithms which usually take little time to complete, but occasionally require much more, amortized analysis can be used to determine the worst-case running time over a (possibly infinite) series of operations. This amortized worst-case cost can be much closer to the average-case cost, while still providing a guaranteed upper limit on the running time.
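A common illustration of amortized analysis is appending to an array that doubles its capacity when full. The sketch below is a minimal model, not a production container: the class `DynamicArray`, the function `append_costs`, and the cost measure (one unit per element written or copied) are assumptions made for this example.

```python
class DynamicArray:
    """Minimal growable array that doubles its capacity when it is full."""

    def __init__(self):
        self.capacity = 1
        self.size = 0
        self.data = [None]

    def append(self, value) -> int:
        """Append value and return the cost: 1 write plus any elements copied."""
        cost = 1
        if self.size == self.capacity:
            # Full: allocate twice the space and copy every existing element.
            new_data = [None] * (2 * self.capacity)
            for i in range(self.size):
                new_data[i] = self.data[i]
            cost += self.size
            self.data = new_data
            self.capacity *= 2
        self.data[self.size] = value
        self.size += 1
        return cost


def append_costs(n: int) -> list:
    """Cost of each of n consecutive appends to an initially empty array."""
    arr = DynamicArray()
    return [arr.append(i) for i in range(n)]


if __name__ == "__main__":
    n = 1024
    costs = append_costs(n)
    print("most expensive single append:", max(costs))
    print("total cost of all appends:", sum(costs))
    print("amortized cost per append:", sum(costs) / n)
```

Under this cost model, a single append can cost as much as the current size plus one, yet the total cost of n appends stays below 3n, so the amortized cost per append is bounded by a constant. For n = 1024 the most expensive append costs 513 and the total is 2047.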

Examples

  • In the worst case, linear search on a list of n elements must visit every element once. This happens if the element being sought is either the last element in the list or not in the list at all. On average, however, assuming the element is in the list and is equally likely to be at any position, it visits only about n/2 elements (exactly (n + 1)/2).
  • Insertion sort applied to a list of n elements. On average, half the elements in the already sorted subarray A[1 .. j-1] are less than the element A[j] being inserted, and half are greater. Therefore, on average, half the subarray is checked, so t_j = j/2, where t_j is the number of checks made when inserting A[j]. Working out the resulting average-case running time yields a quadratic function of the input size, just like the worst-case running time.
  • The popular sorting algorithm Quicksort has an average-case performance of O(n log n), which contributes to making it a very fast algorithm in practice. But given a worst-case input, its performance can degrade to O(n^2).
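The three examples above can be checked empirically by counting basic operations rather than measuring time. The following is a rough sketch: the function names are invented for this illustration, and the quicksort variant (first element as pivot, one counted comparison per partitioned element) is only one of many possible implementations, chosen because already sorted input is a worst case for it.

```python
import random


def linear_search_visits(items, target) -> int:
    """Number of elements visited by a linear search for target."""
    visits = 0
    for item in items:
        visits += 1
        if item == target:
            break
    return visits


def insertion_sort_comparisons(items) -> int:
    """Number of key comparisons made by insertion sort on a copy of items."""
    a = list(items)
    comparisons = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while i >= 0:
            comparisons += 1
            if a[i] <= key:
                break
            a[i + 1] = a[i]   # shift the larger element one place right
            i -= 1
        a[i + 1] = key
    return comparisons


def quicksort_comparisons(items) -> int:
    """Pivot comparisons of a simple quicksort using the first element as pivot."""
    a = list(items)
    if len(a) <= 1:
        return 0
    pivot, rest = a[0], a[1:]
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    # Count one comparison against the pivot per element being partitioned.
    return len(rest) + quicksort_comparisons(smaller) + quicksort_comparisons(larger)


if __name__ == "__main__":
    n = 200
    data = list(range(n))
    shuffled = data[:]
    random.Random(0).shuffle(shuffled)   # fixed seed, chosen arbitrarily

    average = sum(linear_search_visits(data, t) for t in data) / n
    print("linear search: worst", linear_search_visits(data, -1), "average", average)
    print("insertion sort: reversed", insertion_sort_comparisons(reversed(data)),
          "shuffled", insertion_sort_comparisons(shuffled))
    print("quicksort: sorted", quicksort_comparisons(data),
          "shuffled", quicksort_comparisons(shuffled))
```

For a reversed list of n elements this insertion sort makes n(n - 1)/2 comparisons, and for already sorted input this quicksort variant makes the same number, whereas shuffled input needs far fewer comparisons in the quicksort case. A single shuffled input is only a sample, not the average case itself.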

See also

  • Sorting algorithm - an area where there is a great deal of performance analysis of various algorithms.