Condition number

In the field of numerical analysis, the condition number of a function with respect to an argument measures the asymptotically worst case of how much the function can change in proportion to small changes in the argument. The "function" is the solution of a problem and the "arguments" are the data in the problem.

A problem with a low condition number is said to be well-conditioned, while a problem with a high condition number is said to be ill-conditioned.

The condition number is a property of the problem. Paired with the problem are any number of algorithms that can be used to solve the problem, that is, to calculate the solution. Some algorithms have a property called backward stability. In general, a backward stable algorithm can be expected to accurately solve well-conditioned problems. Numerical analysis textbooks give formulas for the condition numbers of problems and identify the backward stable algorithms.

As a general rule of thumb, if the condition number \kappa(A) = 10^k, then up to k digits of accuracy may be lost, on top of what would be lost to the numerical method through loss of precision in floating-point arithmetic.[1] However, the condition number does not give the exact maximum inaccuracy that may occur in the algorithm; it generally just bounds it with an estimate, whose computed value depends on the norm chosen to measure the inaccuracy.
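
This rule of thumb is easy to observe numerically. The following is a minimal sketch (assuming NumPy and SciPy are available; the Hilbert matrix is just a standard ill-conditioned test case) comparing log10 of the condition number with the relative error actually observed when solving Ax = b with a backward stable solver:

    import numpy as np
    from scipy.linalg import hilbert

    n = 10
    A = hilbert(n)              # classic ill-conditioned test matrix
    x_true = np.ones(n)         # fix the exact solution in advance
    b = A @ x_true              # build a consistent right-hand side

    x = np.linalg.solve(A, b)   # backward stable LU-based solve
    kappa = np.linalg.cond(A)   # 2-norm condition number

    print("condition number:", kappa)            # ~1.6e13 for n = 10
    print("digits at risk:  ", np.log10(kappa))  # ~13 of the ~16 available
    rel_err = np.linalg.norm(x - x_true) / np.linalg.norm(x_true)
    print("relative error:  ", rel_err)          # typically ~1e-4 .. 1e-3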


Matrices

For example, the condition number associated with the linear equation Ax = b gives a bound on how inaccurate the solution x will be after approximation. Note that this is before the effects of round-off error are taken into account; conditioning is a property of the matrix, not of the algorithm or the floating-point accuracy of the computer used to solve the corresponding system. In particular, one should think of the condition number as being (very roughly) the rate at which the solution x changes with respect to a change in b. Thus, if the condition number is large, even a small error in b may cause a large error in x. On the other hand, if the condition number is small, then the error in x will not be much bigger than the error in b.
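
For instance (a small sketch assuming NumPy; the nearly singular matrix below is chosen only for illustration), a tiny relative perturbation of b can change x by a relative amount on the order of one:

    import numpy as np

    A = np.array([[1.0, 1.0],
                  [1.0, 1.0001]])      # nearly singular, kappa ~ 4e4
    b  = np.array([2.0, 2.0001])
    db = np.array([0.0, 1e-4])         # tiny perturbation of b

    x  = np.linalg.solve(A, b)         # x  = [1, 1]
    x2 = np.linalg.solve(A, b + db)    # x2 = [0, 2]

    print(np.linalg.norm(db) / np.linalg.norm(b))      # ~3.5e-5 relative change in b
    print(np.linalg.norm(x2 - x) / np.linalg.norm(x))  # ~1.0 relative change in x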

The condition number is defined more precisely to be the maximum ratio of the relative error in x divided by the relative error in b.

Let e be the error in b. Assuming that A is a nonsingular square matrix, the error in the solution A^{-1}b is A^{-1}e. The ratio of the relative error in the solution to the relative error in b is

 \frac{ \left\Vert A^{-1} e \right\Vert / \left\Vert A^{-1} b \right\Vert }{ \left\Vert e \right\Vert / \left\Vert b \right\Vert } .

This is easily transformed to

 \left( \left\Vert A^{-1} e \right\Vert / \left\Vert e \right\Vert \right) \cdot \left( \left\Vert b \right\Vert / \left\Vert A^{-1} b \right\Vert \right) .

The maximum value (for nonzero b and e) is the product of the two operator norms: maximizing over e gives \left\Vert A^{-1} \right\Vert for the first factor, while substituting y = A^{-1} b turns the second factor into \Vert A y \Vert / \Vert y \Vert, whose maximum over nonzero y is \Vert A \Vert. The condition number is therefore

 \kappa(A) = \left\Vert A \right\Vert \cdot \left\Vert A^{-1} \right\Vert .
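
A quick numerical check of this identity (a sketch assuming NumPy; the 2x2 matrix is an arbitrary example):

    import numpy as np

    A = np.array([[4.1, 2.8],
                  [9.7, 6.6]])                        # a mildly ill-conditioned example

    norm_A     = np.linalg.norm(A, 2)                 # induced 2-norm of A
    norm_A_inv = np.linalg.norm(np.linalg.inv(A), 2)  # induced 2-norm of A^{-1}

    print(norm_A * norm_A_inv)    # product of the two operator norms
    print(np.linalg.cond(A, 2))   # agrees up to rounding error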

The same definition is used for any consistent norm, i.e. one that is submultiplicative (\Vert AB \Vert \le \Vert A \Vert \, \Vert B \Vert). For any such norm,

 \kappa(A) = \left\Vert A \right\Vert \cdot \left\Vert A^{-1} \right\Vert \ge \left\Vert A A^{-1} \right\Vert = \left\Vert I \right\Vert \ge 1 ,

so the condition number is never smaller than one.

When the condition number is exactly one, an algorithm may in principle find an approximation of the solution to arbitrary precision. However, this does not mean that the algorithm will converge rapidly to the solution, only that it will not diverge arbitrarily because of inaccuracy in the source data (backward error), provided that the forward error introduced by the algorithm does not itself diverge due to accumulating intermediate rounding errors.

The condition number may also be infinite, in which case no algorithm can reliably find a solution to the problem, not even a weak approximation of it (or even its order of magnitude), with any reasonable and provable accuracy.
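
In floating point this shows up as an infinite (or astronomically large) computed condition number; a sketch assuming NumPy:

    import numpy as np

    A = np.array([[1.0, 2.0],
                  [2.0, 4.0]])   # rank 1: the second row is twice the first

    print(np.linalg.cond(A))     # inf, or ~1e16-1e17 once rounding intervenes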

Of course, this definition depends on the choice of norm. For example, if \Vert \cdot \Vert is the operator norm induced by the Euclidean vector norm (the 2-norm), then

 \kappa(A) = \frac{ \sigma_{\max}(A) }{ \sigma_{\min}(A) } ,

the ratio of the largest to the smallest singular value of A.
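
The dependence on the norm is easy to observe numerically (a sketch assuming NumPy, whose cond accepts several norms):

    import numpy as np

    A = np.array([[1.0, 2.0],
                  [3.0, 4.0]])

    for p in (1, 2, np.inf, 'fro'):
        print(p, np.linalg.cond(A, p))   # different norms, different values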

Other contexts

Condition numbers can be defined for any function ƒ mapping its data from some domain (e.g. an m-tuple of real numbers x) into some codomain [e.g. an n-tuple of real numbers ƒ(x)], where both the domain and codomain are Banach spaces. They express how sensitive that function is to small changes (or small errors) in its arguments. This is crucial in assessing the sensitivity and potential accuracy difficulties of numerous computational problems, for example polynomial root finding or computing eigenvalues.

The condition number of ƒ at a point x (specifically, its relative condition number[2]) is then defined to be the maximum ratio of the fractional change in ƒ(x) to any fractional change in x, in the limit where the change δx in x becomes infinitesimally small:[2]

\lim_{ \varepsilon \to 0^+ }
        \sup_{ \Vert \delta x \Vert \leq \varepsilon }
        \left[  \frac{ \left\Vert f(x + \delta x) - f(x)\right\Vert }{ \Vert f(x) \Vert }
              / \frac{ \Vert \delta x \Vert }{ \Vert x \Vert }
        \right],

where \Vert \cdot \Vert is a norm on the domain/codomain of ƒ(x).
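
This sup/limit definition can be estimated directly by sampling small perturbations. Below is a crude sketch assuming NumPy, with rel_cond a hypothetical helper written only for this illustration:

    import numpy as np

    def rel_cond(f, x, eps=1e-8, samples=101):
        """Crudely estimate the sup over |dx| <= eps of the ratio above."""
        best = 0.0
        for dx in np.linspace(-eps, eps, samples):
            if dx == 0.0:
                continue
            ratio = (abs(f(x + dx) - f(x)) / abs(f(x))) / (abs(dx) / abs(x))
            best = max(best, ratio)
        return best

    # f(x) = x - 1 is ill-conditioned near x = 1: a tiny relative change in x
    # yields a huge relative change in f(x) (catastrophic cancellation).
    print(rel_cond(lambda x: x - 1.0, 1.0001))   # ~|x / (x - 1)| ~ 1e4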

If ƒ is differentiable, this is equivalent to:[2]

\frac{\Vert J \Vert}{ \Vert f(x) \Vert / \Vert x \Vert},

where J denotes the Jacobian matrix of partial derivatives of ƒ at x, and \Vert J \Vert is the matrix norm induced by the vector norms on the domain and codomain.
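
For a differentiable vector function, this quantity can be approximated with a finite-difference Jacobian (a sketch assuming NumPy; jacobian_fd and the example f are hypothetical helpers written for this illustration):

    import numpy as np

    def jacobian_fd(f, x, h=1e-7):
        """Forward-difference approximation of the Jacobian of f at x."""
        fx = f(x)
        J = np.empty((fx.size, x.size))
        for j in range(x.size):
            xp = x.copy()
            xp[j] += h
            J[:, j] = (f(xp) - fx) / h
        return J

    def f(x):
        # an arbitrary differentiable map from R^2 to R^2
        return np.array([x[0] * x[1], x[0] - x[1]])

    x = np.array([2.0, 1.0])
    J = jacobian_fd(f, x)
    # ||J|| * ||x|| / ||f(x)||, using the 2-norm throughout
    print(np.linalg.norm(J, 2) * np.linalg.norm(x) / np.linalg.norm(f(x)))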

References

  1. ^ W. Cheney and D. Kincaid, Numerical Mathematics and Computing.
  2. ^ a b c L. N. Trefethen and D. Bau, Numerical Linear Algebra (SIAM, 1997).
