Inverse problem
From Wikipedia, the free encyclopedia
An inverse problem is the task, common to many branches of science and mathematics, of obtaining the values of one or more model parameters from observed data.
The inverse problem can be formulated as follows:
- Data → Model parameters Eq. 1
The transformation from data to model parameters is a result of the interaction of a physical system, e.g. the Earth, the atmosphere, gravity etc. Inverse problems arise for example in geophysics, medical imaging (such as computed axial tomography), remote sensing, ocean acoustic tomography, nondestructive testing, and astronomy.
Inverse problems are typically ill-posed, as opposed to the well-posed problems more typical when modelling physical situations where the model parameters or material properties are known. Of the three conditions for a well-posed problem suggested by Jacques Hadamard (existence, uniqueness, and stability of the solution or solutions), the condition of stability is most often violated. In the sense of functional analysis, the inverse problem is represented by a mapping between metric spaces. While inverse problems are often formulated in infinite-dimensional spaces, the limitation to a finite number of measurements, and the practical consideration of recovering only a finite number of unknown parameters, may lead to the problems being recast in discrete form. In this case the inverse problem will typically be ill-conditioned (see condition number).
Inverse modelling is a term applied to describe the group of methods used to gain information about a physical system based on observations of that system. In other words, it is an attempt to solve the inverse problem.
Linear inverse problems
A linear inverse problem can be described by:
- d = Gm Eq. 2
where G is an operator that represents the explicit relationship between data and model parameters and is a representation of the "physical system" in Equation 1 above.
Examples
One central example of a linear inverse problem is provided by a Fredholm integral equation of the first kind:
- d(x) = ∫ g(x, y) m(y) dy, with the integral taken over a ≤ y ≤ b
For sufficiently smooth g the operator defined above is compact on reasonable Banach spaces such as Lp spaces. Even if the mapping is bijective its inverse will not be continuous. Thus small errors in the data d are greatly amplified in the solution m. In this sense the inverse problem of inferring m from measured d is ill-posed.
To obtain a numerical solution, the integral must be approximated using quadrature, and the data sampled at discrete points. The resulting system of linear equations will be ill-conditioned.
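The effect of discretization can be illustrated with a small numerical sketch. It assumes a Fredholm equation of the first kind d(x) = ∫ g(x, y) m(y) dy on the interval [0, 1] with a Gaussian kernel g; the kernel, the interval, the midpoint quadrature rule, and all names below are illustrative choices, not taken from the text above.

```python
import numpy as np

def discretize_fredholm(n, width=0.1):
    """Midpoint-rule discretization of d(x) = integral of g(x, y) m(y) dy on [0, 1].

    The Gaussian kernel g and the interval are illustrative assumptions.
    Returns the sample points x and the n-by-n matrix G such that d = G @ m.
    """
    y = (np.arange(n) + 0.5) / n      # quadrature nodes (cell midpoints)
    x = y.copy()                      # data sampled at the same points
    h = 1.0 / n                       # midpoint-rule weight
    G = h * np.exp(-(x[:, None] - y[None, :]) ** 2 / (2.0 * width ** 2))
    return x, G

x, G = discretize_fredholm(20)
m_true = np.sin(2.0 * np.pi * x)      # hypothetical "true" model
d_exact = G @ m_true

# Perturb the data slightly, then invert without any regularization.
rng = np.random.default_rng(0)
d_noisy = d_exact + 1e-6 * rng.standard_normal(d_exact.size)
m_naive = np.linalg.solve(G, d_noisy)

cond_G = np.linalg.cond(G)
err_data = np.linalg.norm(d_noisy - d_exact) / np.linalg.norm(d_exact)
err_model = np.linalg.norm(m_naive - m_true) / np.linalg.norm(m_true)
print(f"condition number: {cond_G:.2e}")
print(f"relative data error: {err_data:.2e}, relative model error: {err_model:.2e}")
```

The relative error in the recovered model is far larger than the relative error in the data, which is the discrete counterpart of the instability described above. In practice some form of regularization would be applied instead of a direct solve.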
Another example is the inversion of the Radon transform. Here a function (for example of two variables) is deduced from its integrals along all possible lines. This is precisely the problem solved in image reconstruction for X-ray computerized tomography.
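For a function f of two variables, the Radon transform can be written as follows. The parametrization of a line by an angle θ and a signed distance s from the origin is one common convention and is an assumption here, as are the symbol names.

```latex
(\mathcal{R}f)(\theta, s) = \int_{-\infty}^{\infty}
  f\bigl(s\cos\theta - t\sin\theta,\; s\sin\theta + t\cos\theta\bigr)\, dt
```

The inverse problem is then to recover f from the values of this integral for all θ and s.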
Non-linear inverse problems
An inherently more difficult family of inverse problems is collectively referred to as non-linear inverse problems.
Non-linear inverse problems have a more complex relationship between data and model, represented by the equation:
- d = G(m)
Here G is a non-linear operator and cannot be separated to represent a linear mapping of the model parameters that form m into the data. In such research, the first thing to do is to understand the structure of the problem and to give a theoretical answer to the three Hadamard questions (so that the problem is "solved from the theoretical point of view"). It is only later in the study that regularisation, and the interpretation of how the solutions evolve with new measurements (probabilistic or otherwise), can be addressed. Hence the corresponding sections below do not really apply to these problems. Whereas linear inverse problems were completely solved from the theoretical point of view by the end of the nineteenth century, only one class of nonlinear inverse problems had been so solved before 1970: that of inverse spectral and (one space dimension) inverse scattering problems, following the seminal work of the Russian mathematical school (Krein, Gelfand, Levitan, Marchenko). An extensive review of the results is given by Chadan and Sabatier in their book "Inverse Problems of Quantum Scattering Theory" (two editions in English, one in Russian).
In this kind of problem, the data are properties of the spectrum of a linear operator which describes the scattering. The spectrum is made of eigenvalues and eigenfunctions, forming together the "discrete spectrum", and generalisations, called the continuous spectrum. The very remarkable physical point is that scattering experiments give information only on the continuous spectrum, whereas knowledge of the full spectrum is necessary (and sufficient) to recover the scattering operator. Hence there are invisible parameters here, much more interesting than the null space, which has a similar property in linear inverse problems. In addition, there are physical motions in which the spectrum of such an operator is conserved. These motions are governed by special nonlinear partial differential evolution equations, for instance the Korteweg-de Vries (KdV) equation. If the spectrum of the operator is reduced to a single eigenvalue, the corresponding motion is that of a single bump which propagates at constant velocity and without deformation, a solitary wave called a "soliton".
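As an illustration, in one common normalisation (the coefficient 6 and the symbols u, c and x_0 are conventions assumed here, not taken from the text above), the Korteweg-de Vries equation and its single-soliton solution read:

```latex
u_t + 6\,u\,u_x + u_{xxx} = 0,
\qquad
u(x,t) = \frac{c}{2}\,\operatorname{sech}^2\!\left(\frac{\sqrt{c}}{2}\,\bigl(x - ct - x_0\bigr)\right),
\quad c > 0 .
```

The bump travels at constant speed c without changing shape, and its height c/2 grows with its speed.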
Such a perfect signal, and its generalisations for the KdV equation or other so-called "integrable nonlinear partial differential equations", are of great interest, with many possible applications, and have been studied as a branch of mathematical physics since about 1970. Nonlinear inverse problems are also currently studied in many fields of applied science (acoustics, mechanics, quantum mechanics, electromagnetic scattering, in particular radar soundings, seismic soundings, all kinds of imaging, etc.). The following sections, which are centered on linearised inverse problems of geophysics, cannot give a sufficient appraisal of the variety of methods and the interdisciplinary character of most questions.
Probabilistic formulation of inverse problems
In physics, inverse problem theory is used to interpret experimental data. Because measurements always carry uncertainties, one may choose to use a probabilistic formulation of the inverse problem, as described in this section.
According to Karl Popper, a physical theory must be falsifiable: a theory must predict the results of observations, and these predictions may or may not fit actual observations. While fitting many observations is never sufficient to prove a theory right, a false prediction is sufficient to prove that a theory is false (and, therefore, that it needs modification). Given a model of a physical system (and a physical theory), the problem of predicting the result of some observations is the "ordinary" or forward problem. The inverse problem consists in using the result of some observations (and a physical theory) to infer the values of the parameters representing a physical system. It is usually said that while (in non-quantum physics) the forward problem has a unique solution, the inverse problem may have many solutions, or no solution at all. This is not so with the probabilistic approach developed here: one always starts with a probability distribution representing the a priori information, and the use of observations narrows this distribution. The solution of the inverse problem is not a particular model; it is the (posterior) probability distribution over the model space.
The rigorous formulation of the probabilistic approach to inverse problems requires some intricate Bayesian reasoning and the writing of complex probabilistic equations. But the basic, and most general, idea is quite simple (Tarantola, 2005):
- one starts by defining some probabilistic rules that randomly generate models of the system under study; these probabilistic rules should encapsulate all available a priori information on the system: the more a model is (a priori) likely, the more frequently it should appear in the random sample; any possible model should eventually appear, but very unlikely models should appear very infrequently; here, a priori information means information that is independent of the data that shall be used to modify this a priori information;
- one actually uses these probabilistic rules to generate many models; as far as we are talking about principles, and not about practical implementations, "many models" may mean a trillion trillion models;
- one introduces the physical theory that, given a particular model of the system, is able to predict the result of some observations;
- one runs this prediction for all the models of the (a priori) sample;
- for each of the models, one compares this prediction of the observations with the actual observations, and one uses a sensible criterion (can just be common sense, or can use some quantitative rule) to decide which models of the a priori sample can be kept (because they fit the data) or must be discarded (because they are unfit);
- the few models that have been kept represent the most general solution of the inverse problem: this sample contains the a priori information (as we started from it) and obeys the data; here, "a few models" may mean a million models (the more complex the a posteriori distribution of models, the more models we need in the sample).
The models in the a posteriori sample can simply be "watched" (as the human brain is quite good at extracting relevant information from a sample), or they can be used to answer complex questions: to the question "what is the temperature at the center of the Earth?" one could answer with the histogram of the temperature at the center of the Earth in each of the models of the sample. Of course the models of the posterior sample can also be used to evaluate simple estimators (the mean value of some parameter, or the covariance between two parameters), but it is the sample itself that is the solution to the inverse problem.
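The procedure described above can be sketched on a toy problem. Everything specific here is a hypothetical choice made for illustration: the straight-line forward model, the uniform prior, the Gaussian misfit used as the "sensible criterion", and the scaling of the acceptance probability by the best-fitting model in the sample.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical toy system: a model is (intercept, slope) of a straight line,
# observed at five points with Gaussian noise of standard deviation sigma.
x_obs = np.linspace(0.0, 1.0, 5)
sigma = 0.1
m_true = np.array([1.0, 2.0])

def forward(models):
    """Physical theory: predict the observations for each model (one per row)."""
    models = np.atleast_2d(models)
    return models[:, [0]] + models[:, [1]] * x_obs[None, :]

d_obs = forward(m_true)[0] + sigma * rng.standard_normal(x_obs.size)

# Generate many models from the a priori rules (here: uniform on [0, 3] x [0, 3]).
n_prior = 200_000
prior_sample = rng.uniform(0.0, 3.0, size=(n_prior, 2))

# Run the prediction for every model of the a priori sample.
predictions = forward(prior_sample)

# Compare with the actual observations and keep or discard each model.
# A model is kept with probability proportional to exp(-misfit), scaled so
# that the best-fitting model of the sample is always kept.
misfit = 0.5 * np.sum(((predictions - d_obs) / sigma) ** 2, axis=1)
accept_prob = np.exp(-(misfit - misfit.min()))
keep = rng.uniform(size=n_prior) < accept_prob
posterior_sample = prior_sample[keep]

# The kept models are the solution; estimators are secondary summaries.
post_mean = posterior_sample.mean(axis=0)
prob_slope_above_2 = np.mean(posterior_sample[:, 1] > 2.0)
print(posterior_sample.shape[0], "models kept; mean model:", post_mean)
print("P(slope > 2) estimated from the sample:", prob_slope_above_2)
```

Only a small fraction of the a priori sample survives, which is why this direct approach is simple but inefficient. A histogram of any column of `posterior_sample` plays the role of the histogram mentioned above.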
All other methods of solving inverse problems can be seen as special cases of application of the previous philosophy. For instance, when all uncertainties (in the a priori information and in the observations) can be modeled by Gaussian distributions, and if the relation between model parameters and observable parameters is not strongly nonlinear, the a posteriori distribution in the model space is approximately Gaussian, and the usual least-squares formulas provide the mean and the covariance of this posterior distribution.
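For the linear Gaussian case, the least-squares formulas can be sketched as below. The notation is an assumption made for this example: G is the linear forward operator, d_obs the observed data with covariance C_D, and m_prior the prior model with covariance C_M. The straight-line test problem is hypothetical.

```python
import numpy as np

def gaussian_linear_posterior(G, d_obs, C_D, m_prior, C_M):
    """Mean and covariance of the Gaussian posterior for d = G m.

    m_post = m_prior + C_M G^T (G C_M G^T + C_D)^{-1} (d_obs - G m_prior)
    C_post = C_M - C_M G^T (G C_M G^T + C_D)^{-1} G C_M
    """
    S = G @ C_M @ G.T + C_D
    # K = C_M G^T S^{-1}, computed with a solve (S and C_M are symmetric).
    K = np.linalg.solve(S, G @ C_M).T
    m_post = m_prior + K @ (d_obs - G @ m_prior)
    C_post = C_M - K @ G @ C_M
    return m_post, C_post

# Hypothetical example: fit intercept and slope of a line from 5 noise-free data.
x = np.linspace(0.0, 1.0, 5)
G = np.column_stack([np.ones_like(x), x])
d_obs = G @ np.array([1.0, 2.0])
C_D = 0.1 ** 2 * np.eye(5)          # data uncertainty (std 0.1)
m_prior = np.zeros(2)
C_M = 10.0 ** 2 * np.eye(2)         # broad prior (std 10)

m_post, C_post = gaussian_linear_posterior(G, d_obs, C_D, m_prior, C_M)
print("posterior mean:", m_post)
print("posterior standard deviations:", np.sqrt(np.diag(C_post)))
```

With a broad prior the posterior mean is close to the ordinary least-squares fit, and the posterior covariance is much narrower than the prior covariance.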
To be more quantitative: in a typical inverse problem, there is a set of model parameters, a set of observable parameters, and a given relation between them that solves the forward problem [see Figure 1]. The model parameters are coordinates on the model parameter manifold, while the observable parameters are coordinates on the observable parameter manifold. There is no need to assume that these manifolds are linear spaces, or that the sum of two "model vectors" or two "data vectors" makes sense (it does not, in general).
The three basic elements of a typical inverse problem are as follows [see Figure 2]:
(i) some a priori information on the model parameters, represented by a volumetric probability defined over the model parameter manifold,
(ii) some experimental information obtained (through actual measurements) on the observable parameters, represented by a volumetric probability defined over the observable parameter manifold (a simple example is the Gaussian distribution suggested in Figure 4 below),
(iii) the forward modeling relation that we have just seen.
Using some basic Bayesian reasoning (and mathematics!) these three pieces of information can be combined to produce the a posteriori volumetric probability expressed at the bottom of Figure 2 (Tarantola, 2005). This posterior volumetric probability is the result of the modification of the prior volumetric probability that the new data has induced.
Once this posterior volumetric probability has been defined, the most general approach for solving an inverse problem consists in sampling it and displaying the sample points (i.e., the models). These sample points of the posterior distribution can be displayed and compared with sample points of the prior distribution, to visually appreciate the gain of information that the data has brought. More quantitatively, probabilities of events can be evaluated from these sample points (for instance, the question "what is the probability that the total volume of oil in this reservoir is larger than one billion barrels?" is answered by evaluating the proportion of models, in a large sample, that have this characteristic). A simpler, but less useful, possibility is to use the sample points to evaluate some estimators of the distribution (like the mean model, the median model, some covariances, etc.). Now, how can we obtain sample points of the posterior distribution? Figure 3 suggests the simplest (although not the most efficient) approach. To gain in efficiency one can, instead, use Metropolis-like algorithms (as suggested by Mosegaard and Tarantola, 1995), but it must be clear that the gains in efficiency are often only in the exploration of some isolated region of significant probability in the model parameter manifold, and come at the expense of making it more difficult for the algorithm to jump from one region of significant probability into another.
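The trade-off mentioned above can be seen in a minimal sketch of a random-walk Metropolis sampler. This is a generic textbook variant with a Gaussian proposal, not necessarily the algorithm of Mosegaard and Tarantola (1995); the bimodal target and all names are hypothetical.

```python
import numpy as np

def metropolis(log_posterior, m_start, step, n_steps, rng):
    """Random-walk Metropolis sampler with a Gaussian proposal (an assumption)."""
    m = np.asarray(m_start, dtype=float)
    logp = log_posterior(m)
    chain = np.empty((n_steps, m.size))
    n_accepted = 0
    for i in range(n_steps):
        candidate = m + step * rng.standard_normal(m.size)
        logp_candidate = log_posterior(candidate)
        # Accept with probability min(1, p(candidate) / p(current)).
        if np.log(rng.uniform()) < logp_candidate - logp:
            m, logp = candidate, logp_candidate
            n_accepted += 1
        chain[i] = m  # a rejected move repeats the current model
    return chain, n_accepted / n_steps

def log_posterior(m):
    # Hypothetical target: two narrow, well-separated regions of
    # significant probability centered at -5 and +5.
    return np.logaddexp(-0.5 * ((m[0] - 5.0) / 0.3) ** 2,
                        -0.5 * ((m[0] + 5.0) / 0.3) ** 2)

rng = np.random.default_rng(1)
chain, acceptance_rate = metropolis(log_posterior, [5.0], step=0.2,
                                    n_steps=20_000, rng=rng)
fraction_right_mode = np.mean(chain[:, 0] > 0.0)
print("acceptance rate:", acceptance_rate)
print("fraction of samples near +5:", fraction_right_mode)
```

Started near +5, the chain explores that region efficiently but does not reach the equally probable region near -5 within the run, so the sample misrepresents the target. Larger steps or several starting points would be needed to see both regions.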
Assume now that both the model parameter manifold and the observable parameter manifold are linear spaces, and that the coordinates we use on both manifolds are the components of the vectors. When all uncertainties can be modeled by Gaussian distributions (Figure 4), the posterior probability distribution takes a simple form (proportional to the exponential of the negative misfit function) (Figure 5), and the maximum likelihood model can be obtained using optimization techniques, such as the quasi-Newton algorithm suggested in Figure 6.
As we have seen, the model at which the algorithm converges maximizes the posterior volumetric probability. To estimate the posterior uncertainties, one can use the following property: the covariance operator of the Gaussian volumetric probability that is tangent to the posterior volumetric probability at the point of convergence is given in Figure 7.
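The figures referred to in this section are not reproduced here. As a hedged reminder, the standard Gaussian least-squares expressions usually associated with these statements are given below. The notation is assumed for this sketch: g is the forward relation, d_obs the observed data with covariance C_D, m_prior the prior model with covariance C_M, and G_n the matrix of partial derivatives of g evaluated at the current model m_n.

```latex
% Misfit function and posterior (up to a normalisation constant)
S(m) = \tfrac{1}{2}\Bigl[ (g(m) - d_{\mathrm{obs}})^{T} C_D^{-1} (g(m) - d_{\mathrm{obs}})
      + (m - m_{\mathrm{prior}})^{T} C_M^{-1} (m - m_{\mathrm{prior}}) \Bigr],
\qquad \sigma_M(m) \propto \exp\bigl(-S(m)\bigr)

% One quasi-Newton iteration
m_{n+1} = m_n - \bigl( G_n^{T} C_D^{-1} G_n + C_M^{-1} \bigr)^{-1}
          \Bigl( G_n^{T} C_D^{-1} \bigl(g(m_n) - d_{\mathrm{obs}}\bigr)
          + C_M^{-1} (m_n - m_{\mathrm{prior}}) \Bigr)

% Covariance of the tangent Gaussian at the converged model
\widetilde{C}_M \approx \bigl( G^{T} C_D^{-1} G + C_M^{-1} \bigr)^{-1}
```

In the last expression G is evaluated at the model to which the iteration has converged.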
External links
- Inverse Problems Network
- Inverse Problems page at the University of Alabama
- Another Inverse Problems web site
- Albert Tarantola's website, including a free PDF version of his Inverse Problem Theory book, and some on-line articles on Inverse Problems
Academic journals
There are three main academic journals covering inverse problems in general.
- Inverse Problems
- Journal of Inverse and Ill-posed Problems
- Inverse Problems in Science and Engineering
In addition there are many journals on medical imaging, geophysics, non-destructive testing, etc. that are dominated by inverse problems in those areas.
Books
General inverse problems
- Albert Tarantola, Inverse Problem Theory (free PDF version), Society for Industrial and Applied Mathematics, 2005. ISBN 0-89871-572-5
- Richard Aster, Brian Borchers, and Cliff Thurber, Parameter Estimation and Inverse Problems, Academic Press, 2004. ISBN 0-12-065604-3
- William Menke, Geophysical Data Analysis, Academic Press, 1989. ISBN 0-12-490920-5
- Parker, R. L., Geophysical inverse theory, Princeton University Press, 1994. ISBN 0-691-03634-9
- M Bertero and P Boccacci, Introduction to Inverse Problems in Imaging, Institute of Physics Publishing, 1998. ISBN 0-7503-0439-1
- Per Christian Hansen, Rank-Deficient and Discrete Ill-Posed Problems: Numerical Aspects of Linear Inversion, Society for Industrial and Applied Mathematics. ISBN 0-89871-403-6
- Heinz W. Engl, Martin Hanke, Andreas Neubauer, Regularization of Inverse Problems, Kluwer Academic Publishers. ISBN 0-7923-4157-0
- Curtis Vogel, Computational methods for inverse problems, Society for Industrial and Applied Mathematics. ISBN 0-89871-507-5
- David Gubbins, Time Series Analysis and Inverse Theory for Geophysicists, Cambridge University Press, 2004. ISBN 0-521-52569-1
- J. Kaipio and E. Somersalo, Statistical and Computational Inverse Problems, Springer, 2004. ISBN 0-387-22073-9
- Andreas Kirsch, An Introduction to the Mathematical Theory of Inverse Problems, Springer-Verlag, 1996. ISBN 0-387-94530-X
- S. Twomey, Introduction to the Mathematics of Inversion in Remote Sensing and Indirect Measurements, Dover Publications, 1977. ISBN 0-486-69451-8
- C. W. Groetsch, Inverse Problems in the Mathematical Sciences, Vieweg, 1993. ISBN 3-528-06545-1
- P. C. Sabatier, "Past and future of Inverse Problems", Journal of Mathematical Physics, 41, 2000, 4082-4124
Inverse problems in medical imaging
- Frank Natterer, The Mathematics of Computerized Tomography (Classics in Applied Mathematics, 32), Society for Industrial and Applied Mathematics. ISBN 0-89871-493-1
- Frank Natterer and Frank Wubbeling, Mathematical Methods in Image Reconstruction, Society for Industrial and Applied Mathematics. ISBN 0-89871-472-9
Inverse problems in ocean and atmospheric sciences
- Ian G. Enting, Inverse Problems in Atmospheric Constituent Transport, Cambridge University Press, 2002. ISBN 0-521-81210-0
- Walter Munk, Peter Worcester, and Carl Wunsch, Ocean Acoustic Tomography, Cambridge University Press, 1995. ISBN 0-521-47095-1
- Carl Wunsch, The Ocean Circulation Inverse Problem, Cambridge University Press, 1996. ISBN 0-521-48090-6
Analysis of inverse problems for partial differential equations
- Victor Isakov, Inverse Problems for Partial Differential Equations, Applied Mathematical Sciences (Springer-Verlag), Vol 127, 1997. ISBN 0-387-98256-6
Inverse scattering problems
- David Colton and Rainer Kress, Inverse Acoustic and Electromagnetic Scattering Theory, 2nd Edition, Springer-Verlag, 1998. ISBN 0-387-55518-8
- K. Chadan and P. C. Sabatier, Inverse Problems of Quantum Scattering Theory, 2nd edition, revised and augmented, Springer-Verlag, 1989. ISBN 0-387-18731-6
- R. Pike and P. Sabatier, Scattering: Scattering and Inverse Scattering in Pure and Applied Science, Vols. 1-2, Academic Press, San Diego, 2002 (1831 pp.). ISBN 0-12-613760-9
Inverse problems in geophysics (articles with historical interest)
- Aki, K., Christofferson, A., and Husebye, E. S., 1977. Determination of the three-dimensional seismic structure of the lithosphere, J. Geophys. Res., 82, 277-296.
- Aki, K., and Lee, W. H. K., 1976. Determination of three-dimensional velocity anomalies under a seismic array using first P arrival times from local earthquakes. 1. A homogeneous initial model, J. Geophys. Res., 81, 4381-4399.
- Backus, G., 1970a. Inference from inadequate and inaccurate data: I, Proc. Nat. Acad. Sci., 65, 1, 1-105.
- Backus, G., 1970b. Inference from inadequate and inaccurate data: II, Proc. Nat. Acad. Sci., 65, 2, 281-287.
- Backus, G., 1970c. Inference from inadequate and inaccurate data: III, Proc. Nat. Acad. Sci., 67, 1, 282-289.
- Backus, G., 1971. Inference from inadequate and inaccurate data, Mathematical problems in the geophysical sciences: Lectures in Applied Mathematics, 14, American Mathematical Society, Providence, RI.
- Backus, G., and Gilbert, F., 1967. Numerical applications of a formalism for geophysical inverse problems, Geophys. J. Royal Astron. Soc., 13, 247-276.
- Backus, G., and Gilbert, F., 1968. The resolving power of gross Earth data, Geophys. J. Royal Astron. Soc., 16, 169-205.
- Backus, G., and Gilbert, F., 1970. Uniqueness in the inversion of inaccurate gross Earth data, Philos. Trans. Royal Soc. London, 266, 123-192.
- Keilis-Borok, V. J., and Yanovskaya, T. B., 1967. Inverse problems of seismology (structural review), Geophys. J. Royal Astr. Soc., 13, 223-234.
- Mosegaard, K., and Tarantola, A., 1995. Monte Carlo sampling of solutions to inverse problems, J. Geophys. Res., 100, B7, 12431-12447.
- Parker, R. L., 1975. The theory of ideal bodies for gravity interpretation, Geophys. J. Royal Astron. Soc., 42, 315-334.
- Press, F., 1968. Earth models obtained by Monte-Carlo inversion, J. Geophys. Res., 73, 16, 5223-5234.
- Tarantola, A., and Nercessian, A., 1984. Three-dimensional inversion without blocks, Geophys. J. Royal Astr. Soc., 76, 299-306.
- Tarantola, A., and Valette, B., 1982a. Inverse problems = quest for information, J. Geophys., 50, 159-170.
- Tarantola, A., and Valette, B., 1982b. Generalized nonlinear inverse problems solved using the least-squares criterion, Rev. Geophys. Space Phys., 20, 2, 219-232.
A philosophical view on inverse problems
- Mario Bunge, "From Z to A: Inverse Problems", pp. 145-164 in M. Bunge, Chasing Reality: Strife over Realism, University of Toronto Press, 2006. ISBN 0-8020-9075-3