Information-based complexity

Information-based complexity (IBC) studies optimal algorithms and computational complexity for the continuous problems that arise in physical science, economics, engineering, and mathematical finance. IBC has studied such continuous problems as path integration, partial differential equations, systems of ordinary differential equations, nonlinear equations, integral equations, fixed points, and very-high-dimensional integration. All these problems involve functions (typically multivariate) of a real or complex variable. Since a closed-form solution is rarely available for the problems of interest, one has to settle for a numerical solution. Since a function of a real or complex variable cannot be entered into a digital computer, the solution of continuous problems involves partial information. To give a simple illustration, in the numerical approximation of an integral, only samples of the integrand at a finite number of points are available. In the numerical solution of partial differential equations, the functions specifying the boundary conditions and the coefficients of the differential operator can only be sampled. Furthermore, this partial information can be expensive to obtain. Finally, the information is often contaminated by noise.

The goal of information-based complexity is to create a theory of computational complexity and optimal algorithms for problems with partial, contaminated and priced information, and to apply the results to answering questions in various disciplines. Examples of such disciplines include physics, economics, mathematical finance, computer vision, control theory, geophysics, medical imaging, weather forecasting and climate prediction, and statistics. The theory is developed over abstract spaces, typically Hilbert or Banach spaces, while the applications are usually for multivariate problems.

Since the information is partial and contaminated, only approximate solutions can be obtained. IBC studies computational complexity and optimal algorithms for approximate solutions in various settings. Since the worst case setting often leads to negative results such as unsolvability and intractability, settings with weaker assurances such as average, probabilistic and randomized are also studied. A fairly new area of IBC research is continuous quantum computing.

Overview

We illustrate some important concepts with a very simple example, the computation of

\int_0^1 f(x)\,dx.

For most integrands we can't use the fundamental theorem of calculus to compute the integral analytically; we have to approximate it numerically. We compute the values of f at n points

[f(t_1),\dots,f(t_n)].

The n numbers are the partial information about the true integrand f(x). We combine these n numbers by a combinatory algorithm to compute an approximation to the integral. See the monograph Complexity and Information for particulars.
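As a minimal sketch of this step, assume the sample points are the midpoints of n equal subintervals and the combining rule is a plain average. Both choices, and the name `midpoint_rule`, are illustrative and not taken from the monograph.

```python
import math

def midpoint_rule(f, n):
    """Approximate the integral of f over [0, 1] from n function values."""
    # Sample points t_1..t_n: midpoints of n equal subintervals (an assumption).
    points = [(2 * i + 1) / (2 * n) for i in range(n)]
    # Partial information about f: the n numbers f(t_1), ..., f(t_n).
    info = [f(t) for t in points]
    # From here on the algorithm uses only `info`, never f itself.
    return sum(info) / n

approx = midpoint_rule(math.exp, 100)
exact = math.e - 1  # known in closed form for this particular integrand
```

The integrand exp is chosen only because its integral is known exactly, which makes the error of the approximation easy to inspect.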

Because we have only partial information, we can use an adversary argument to tell us how large n has to be to compute an \epsilon-approximation. Because of these information-based arguments, we can often obtain tight bounds on the complexity of continuous problems. For discrete problems such as integer factorization or the travelling salesman problem, we have to settle for conjectures about the complexity hierarchy. The reason is that the input is a number or a vector of numbers and can thus be entered into the computer. Thus there is typically no adversary argument at the information level, and the complexity of a discrete problem is rarely known.
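A small sketch of an adversary argument, under the assumption (not made in the text above) that the integrands are Lipschitz on [0, 1] with constant 1. Given any sample points, the function "distance to the nearest sample point" and its negative both belong to the class and both produce the information (0, ..., 0), so no algorithm can tell them apart, and its error on one of them is at least the integral of that function. The helper names are hypothetical.

```python
def fooling_function(points):
    """Return f(x) = distance from x to the nearest sample point."""
    return lambda x: min(abs(x - t) for t in points)

def integral_of_fooling_function(points):
    """Exact integral over [0, 1] of the distance-to-nearest-point function."""
    ts = sorted(points)
    # Triangles between 0 and the first point, and the last point and 1.
    total = ts[0] ** 2 / 2 + (1 - ts[-1]) ** 2 / 2
    # Between neighbours a < b the graph is a tent of height (b - a) / 2.
    for a, b in zip(ts, ts[1:]):
        total += (b - a) ** 2 / 4
    return total

n = 10
midpoints = [(2 * i + 1) / (2 * n) for i in range(n)]
f = fooling_function(midpoints)
lower_bound = integral_of_fooling_function(midpoints)  # equals 1 / (4 n)
```

For equally spaced midpoints the bound is 1/(4n), so in this assumed class an \epsilon-approximation needs n of at least about 1/(4\epsilon) samples.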

The univariate integration problem was for illustration only. Multivariate integration is significant for many applications, where the number of variables can be in the hundreds or thousands. The number of variables may even be infinite; we then speak of path integration. The reason that integrals are important in many disciplines is that they occur when we want to know the expected behavior of a continuous process. See, for example, the application to mathematical finance below.

Assume we want to compute an integral in d dimensions (dimensions and variables are used interchangeably) and that we want to guarantee an error at most \epsilon for any integrand in some class. The computational complexity of the problem is known to be of order \epsilon^{-d}. (Here we are counting the number of function evaluations and the number of arithmetic operations so this is the time complexity.) This would take many years for even modest values of d. The exponential dependence on d is called the curse of dimensionality. We say the problem is intractable.
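The growth of \epsilon^{-d} can be tabulated directly. The sketch below works with base-10 logarithms because the counts themselves overflow floating point long before d reaches the hundreds. The function name and the sample values of d are illustrative assumptions, and constant factors are ignored.

```python
import math

def log10_worst_case_cost(eps, d):
    """Base-10 logarithm of eps**(-d), ignoring constant factors."""
    return -d * math.log10(eps)

# Modest accuracy (eps = 0.01) across a range of dimensions.
for d in (1, 10, 100, 360):
    exponent = log10_worst_case_cost(1e-2, d)
    print(f"d = {d:3d}: about 10^{exponent:.0f} evaluations")
```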

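We stated the curse of dimensionality for integration, but exponential dependence on d occurs for almost every continuous problem that has been investigated. How can we try to vanquish the curse? There are two possibilities: we can weaken the worst case guarantee and settle for a stochastic assurance, as in the randomized setting, or we can incorporate domain knowledge about the problem.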

An example: mathematical finance

Very-high-dimensional integrals are common in finance. For example, computing expected cash flows for a collateralized mortgage obligation (CMO) requires the calculation of a number of 360-dimensional integrals, the 360 being the number of months in 30 years. Recall that if a worst case assurance is required, the cost is of order \epsilon^{-d} time units. Even if the error is not small, say \epsilon=10^{-2}, this is 10^{720} time units. People in finance have long been using the Monte Carlo method (MC), an instance of a randomized algorithm. Then in 1994 a research group at Columbia University (Papageorgiou, Paskov, Traub, Woźniakowski) discovered that the quasi-Monte Carlo (QMC) method using low discrepancy sequences beat MC by one to three orders of magnitude. The results were reported to a number of Wall Street firms, to considerable initial skepticism. The results were first published by Paskov and Traub, Faster Valuation of Financial Derivatives, Journal of Portfolio Management 22, 113-120. Today QMC is widely used in the financial sector to value financial derivatives.
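To make the difference between the two methods concrete, here is a toy comparison, not the CMO computation. It assumes a 5-dimensional product integrand with exact integral 1 and uses the Halton sequence as one example of a low discrepancy sequence. The integrand, dimension, sample size, and function names are all illustrative choices, and the sketch makes no attempt to reproduce the reported speedups.

```python
import random

PRIMES = [2, 3, 5, 7, 11]  # one base per coordinate, so dim <= 5 here

def radical_inverse(index, base):
    """Van der Corput radical inverse of a non-negative integer."""
    result, scale = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * scale
        scale /= base
    return result

def halton_point(index, dim):
    """The index-th point of the Halton sequence in the unit cube."""
    return [radical_inverse(index, PRIMES[j]) for j in range(dim)]

def integrand(x):
    """Product of (x_j + 0.5); each factor integrates to 1 over [0, 1]."""
    value = 1.0
    for xj in x:
        value *= xj + 0.5
    return value

def mc_estimate(f, dim, n, seed=0):
    """Monte Carlo: average f over n pseudo-random points."""
    rng = random.Random(seed)
    return sum(f([rng.random() for _ in range(dim)]) for _ in range(n)) / n

def qmc_estimate(f, dim, n):
    """Quasi-Monte Carlo: average f over the first n Halton points."""
    # Indices start at 1 to skip the all-zeros point.
    return sum(f(halton_point(i, dim)) for i in range(1, n + 1)) / n

mc_error = abs(mc_estimate(integrand, 5, 4096) - 1.0)
qmc_error = abs(qmc_estimate(integrand, 5, 4096) - 1.0)
```

The MC error depends on the seed and shrinks like n^{-1/2} on average, while the QMC points are deterministic. Comparing `mc_error` and `qmc_error` for several values of n gives a feel for the two convergence behaviours on this one integrand only.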

These results are empirical; where does computational complexity come in? QMC is not a panacea for all high-dimensional integrals. What is special about financial derivatives? Here's a possible explanation. The 360 dimensions in the CMO represent monthly future times. Due to the discounted value of money, variables representing times far in the future are less important than the variables representing nearby times. Thus the integrals are non-isotropic. Sloan and Woźniakowski introduced the very powerful idea of weighted spaces, which is a formalization of the above observation. They were able to show that with this additional domain knowledge, high-dimensional integrals satisfying certain conditions were tractable even in the worst case! In contrast, the Monte Carlo method gives only a stochastic assurance. See Sloan and Woźniakowski, When are Quasi-Monte Carlo Algorithms Efficient for High Dimensional Integration? J. Complexity 14, 1-33, 1998. For which classes of integrals is QMC superior to MC? This continues to be a major research problem.

Brief history

Precursors to IBC may be found in work from the 1950s by Kiefer, Sard, and Nikolskij. In 1959 Traub had the key insight that the optimal algorithm and the computational complexity of solving a continuous problem depended on the available information. He applied this insight to the solution of nonlinear equations, which started the area of optimal iteration theory. This research was published in the 1964 monograph Iterative Methods for the Solution of Equations.

The general setting for information-based complexity was formulated by Traub and Woźniakowski in 1980 in A General Theory of Optimal Algorithms. For a list of more recent monographs and pointers to the extensive literature see To Learn More below.

Prizes

There are a number of prizes for IBC research.

References

Extensive bibliographies may be found in the monographs N (1988), TW (1980), TWW (1988) and TW (1998). The IBC website has a searchable database of some 730 items.

External links