Best, worst and average case

From Wikipedia, the free encyclopedia

In computer science, the best, worst and average cases of a given algorithm express what its resource usage is at least, at most and on average, respectively. Usually the resource being considered is running time, but it could also be memory or another resource.

In real-time computing, the worst-case execution time is often of particular concern since it is important to know how much time might be needed in the worst case to guarantee that the algorithm would always finish on time.

Average-case and worst-case performance are the most commonly used in algorithm analysis. Best-case performance is less widely used, but it does have uses: for example, knowing the best cases of individual tasks can improve the accuracy of an overall worst-case analysis. Computer scientists use probabilistic analysis techniques, especially expected value, to determine expected running times.

Best-case performance

The term best-case performance is used in computer science to describe the way an algorithm behaves under optimal conditions. For example, a simple linear search on an array has a worst-case performance of O(n) (when the desired element is the last one, so the algorithm has to check every element; see Big O notation) and an average running time of O(n) (the average position of an element is the middle of the array, i.e. at position n/2, and O(n/2) = O(n)), but in the best case the desired element is the first element in the array and the running time is O(1).
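The three cases for linear search can be observed by counting comparisons. The following is a minimal sketch, not a reference implementation: the function name `linear_search`, the array size `n = 1000` and the choice to count element comparisons as the cost measure are all assumptions made for illustration.

```python
def linear_search(items, target):
    """Return (index, comparisons); index is -1 if target is absent."""
    comparisons = 0
    for index, value in enumerate(items):
        comparisons += 1  # one comparison per element visited
        if value == target:
            return index, comparisons
    return -1, comparisons


n = 1000  # illustrative size, chosen arbitrarily
data = list(range(n))

best = linear_search(data, data[0])[1]    # target is the first element
worst = linear_search(data, data[-1])[1]  # target is the last element
# Average over every possible position of a target that is present.
average = sum(linear_search(data, t)[1] for t in data) / n

print(best, worst, average)
```

Under these assumptions the best case costs one comparison, the worst case costs n, and the average over all present targets is (n + 1)/2, which is O(n) like the worst case.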

The development and choice of algorithms are rarely based on best-case performance: most academic and commercial enterprises are more interested in improving average performance and worst-case performance.

Worst case versus average case performance

Worst-case performance analysis and average case performance analysis have similarities, but usually require different tools and approaches in practice.

Determining what average input means is difficult, and often that average input has properties that make it difficult to characterise mathematically (consider, for instance, algorithms that are designed to operate on strings of text). Similarly, even when a sensible description of a particular "average case" (which will probably only be applicable to some uses of the algorithm) is possible, it tends to result in equations that are more difficult to analyse.

Worst-case analysis has similar problems: typically it is impossible to determine the exact worst-case scenario. Instead, a scenario is considered which is at least as bad as the worst case. For example, when analysing an algorithm, it may be possible to find the longest possible path through the algorithm (by considering the maximum number of loop iterations, for instance) even if it is not possible to determine the exact input that could generate this path. Indeed, such an input may not exist. This leads to an analysis that is safe (the worst case is never underestimated) but pessimistic, since no input might require this path.

Alternatively, a scenario which is thought to be close to (but not necessarily worse than) the real worst case may be considered. This may lead to an optimistic result, meaning that the analysis may actually underestimate the true worst case.

In some situations it may be necessary to use a pessimistic analysis in order to guarantee safety. Often, however, a pessimistic analysis may be too pessimistic, so an analysis that gets closer to the real value but may be optimistic (perhaps with some known low probability of failure) can be a much more practical approach.

When analyzing algorithms which often take a small time to complete, but periodically require a much larger time, amortized analysis can be used to determine the worst-case running time over a (possibly infinite) series of operations. This amortized worst-case cost can be much closer to the average case cost, while still providing a guaranteed upper limit on the running time.
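A growable array that doubles its capacity when full is a common illustration of amortized analysis: most appends are cheap, but an occasional append triggers a full copy. The sketch below is an assumed example, not something taken from the article; the class `DynamicArray`, the doubling policy and the cost model (one unit per element written or copied) are all hypothetical choices made for illustration.

```python
class DynamicArray:
    """Toy growable array that doubles its capacity when full."""

    def __init__(self):
        self.capacity = 1
        self.size = 0
        self.slots = [None]

    def append(self, value):
        """Append value and return the cost of this call (writes plus copies)."""
        cost = 1  # writing the new element
        if self.size == self.capacity:
            # Full: allocate twice the space and copy every existing element.
            new_slots = [None] * (2 * self.capacity)
            for i in range(self.size):
                new_slots[i] = self.slots[i]
            cost += self.size
            self.slots = new_slots
            self.capacity *= 2
        self.slots[self.size] = value
        self.size += 1
        return cost


n = 1024  # illustrative number of appends
arr = DynamicArray()
costs = [arr.append(i) for i in range(n)]

max_single = max(costs)     # most expensive individual append
amortized = sum(costs) / n  # cost per append averaged over the whole sequence

print(max_single, amortized)
```

In this model a single append can cost time proportional to the current size, yet the total over the whole sequence stays below 2n, so the amortized cost per append is bounded by a constant.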

Practical consequences

Many problems with bad worst-case performance have good average-case performance. For problems we want to solve, this is a good thing: we can hope that the particular instances we care about are average. For cryptography, this is very bad: we want typical instances of a cryptographic problem to be hard. Here methods like random self-reducibility can be used for some specific problems to show that the worst case is no harder than the average case, or, equivalently, that the average case is no easier than the worst case.

Examples

  • In the worst case, linear search on an array must visit every element once. This happens when the element being sought is either the last element in the array or not in the array at all. However, on average, assuming the element is in the array, it visits only about n/2 elements.
  • Consider applying insertion sort to n elements. On average, half the elements in the subarray A[1..j-1] are less than the element A[j], and half are greater. Therefore, on average, we check half the subarray, so t_j = j/2. Working out the resulting average-case running time yields a quadratic function of the input size, just like the worst-case running time.
  • The popular sorting algorithm Quicksort has an average case performance of O(n log n), which contributes to making it a very fast algorithm in practice. But given a worst-case input, its performance can degrade to O(n^2).
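The sorting examples above can be explored empirically by counting basic operations. The following is a rough sketch under stated assumptions: `insertion_sort_shifts` and `quicksort_comparisons` are hypothetical helper names, the quicksort variant always picks the first element as pivot (one of several possible pivot rules, chosen here because already-sorted input is then a worst case), and a shuffled input with a fixed seed stands in for an "average" input.

```python
import random


def insertion_sort_shifts(values):
    """Sort a copy with insertion sort; return the number of element shifts."""
    a = list(values)
    shifts = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        # Move larger elements one slot to the right to make room for key.
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            shifts += 1
            i -= 1
        a[i + 1] = key
    return shifts


def quicksort_comparisons(values):
    """Count pivot comparisons made by a first-element-pivot quicksort.

    Only the comparisons are counted; the sorted output is not assembled.
    An explicit stack replaces recursion to avoid deep call chains.
    """
    comparisons = 0
    stack = [list(values)]
    while stack:
        part = stack.pop()
        if len(part) <= 1:
            continue
        pivot, rest = part[0], part[1:]
        comparisons += len(rest)  # every other element is compared to the pivot
        stack.append([x for x in rest if x < pivot])
        stack.append([x for x in rest if x >= pivot])
    return comparisons


n = 500  # illustrative size
ascending = list(range(n))
descending = ascending[::-1]
shuffled = ascending[:]
random.Random(0).shuffle(shuffled)  # fixed seed so runs are repeatable

insertion_worst = insertion_sort_shifts(descending)  # reversed input
insertion_typical = insertion_sort_shifts(shuffled)
quick_worst = quicksort_comparisons(ascending)       # sorted input, first-element pivot
quick_typical = quicksort_comparisons(shuffled)

print(insertion_worst, insertion_typical, quick_worst, quick_typical)
```

With these choices, insertion sort on the shuffled input should perform roughly half as many shifts as on the reversed input, so both remain quadratic, whereas quicksort on the shuffled input should need far fewer comparisons than the n(n-1)/2 it makes on the sorted input.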
