Surrogate model

From Wikipedia, the free encyclopedia

Most engineering design problems require experiments and/or simulations to evaluate design objective and constraint functions as functions of the design variables. For example, to find the optimal airfoil shape for an aircraft wing, an engineer simulates the airflow around the wing for different shape variables (length, curvature, material, ...). For many real-world problems, however, a single simulation can take many minutes, hours, or even days to complete. As a result, routine tasks such as design optimization, design space exploration, sensitivity analysis and what-if analysis become impractical, since they require thousands or even millions of simulation evaluations.

One way of alleviating this burden is to construct approximation models (known as surrogate models, response surface models, metamodels or emulators) that mimic the behavior of the simulation model as closely as possible while being computationally cheaper to evaluate. Surrogate models are constructed using a data-driven, bottom-up approach. The exact inner workings of the simulation code are not assumed to be known (or even understood); only the input-output behavior is important. A model is constructed from the response of the simulator at a limited number of intelligently chosen data points. This approach is also known as behavioral modeling or black-box modeling, though the terminology is not always consistent. When only a single design variable is involved, the process is known as curve fitting, as illustrated in the Figure.
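The single-variable case can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: `expensive_simulation` is a hypothetical stand-in for a slow simulator, and the choice of a cubic polynomial fitted by least squares is an assumption made for the example.

```python
import math

def expensive_simulation(x):
    """Hypothetical stand-in for a slow simulator (assumed for illustration)."""
    return math.sin(3.0 * x) + 0.5 * x

def solve_linear_system(a, b):
    """Solve a * coeffs = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = (m[r][n] - tail) / m[r][r]
    return coeffs

def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients ordered from the constant term upwards.
    """
    size = degree + 1
    a = [[sum(x ** (i + j) for x in xs) for j in range(size)] for i in range(size)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(size)]
    return solve_linear_system(a, b)

def evaluate_polynomial(coeffs, x):
    """Evaluate the cheap surrogate at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))

# Only six simulator runs are spent on building the surrogate.
sample_x = [0.4 * i for i in range(6)]
sample_y = [expensive_simulation(x) for x in sample_x]
surrogate = fit_polynomial(sample_x, sample_y, degree=3)

# The surrogate can now be queried as often as needed at negligible cost.
prediction = evaluate_polynomial(surrogate, 0.9)
```

The surrogate only sees input-output pairs, so the same code would work unchanged if `expensive_simulation` wrapped an external solver.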

An important distinction can be made between two different applications of surrogate models. The first involves building small and simple surrogates for use in optimization. Simple surrogates are used to guide the search towards a global optimum. Once the optimum is found, the surrogates are discarded. In the second case, one is not interested in finding the optimal parameter vector but rather in the global behavior of the system. Here the surrogate is tuned to mimic the underlying model as closely as needed over the complete design space. Such surrogates are a useful, cheap way to gain insight into the global behavior of the system. Optimization can still occur as a post-processing step.
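Because a surrogate is cheap to evaluate, optimization as a post-processing step can be as simple as an exhaustive scan. The sketch below assumes an already-fitted one-dimensional model; `surrogate` is a hypothetical placeholder, and a dense grid search is just one possible choice of optimizer.

```python
def surrogate(x):
    """Hypothetical already-fitted cheap model (assumed for illustration)."""
    return (x - 1.3) ** 2 + 0.2

def minimize_on_grid(model, lower, upper, n_points=2001):
    """Return the grid point in [lower, upper] with the smallest model value."""
    step = (upper - lower) / (n_points - 1)
    candidates = [lower + k * step for k in range(n_points)]
    return min(candidates, key=model)

# Thousands of evaluations are affordable here because no simulator is called.
x_best = minimize_on_grid(surrogate, 0.0, 3.0)
```

In practice the candidate found on the surrogate would typically be checked with one run of the real simulation, since the surrogate is only an approximation.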

The scientific challenge of surrogate modeling is the generation of a surrogate that is as accurate as possible, using as few simulation evaluations as possible. The process comprises three major steps which may be interleaved iteratively:

Sample selection (also known as sequential design, optimal experimental design (OED) or active learning)
Construction of the surrogate model and optimization of the model parameters (bias-variance trade-off)
Appraisal of the accuracy of the surrogate
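The three steps above can be interleaved in a simple loop, sketched here in Python. Everything specific in this example is an assumption chosen for brevity: `simulator` is a hypothetical expensive function, the surrogate is a piecewise-linear interpolant, accuracy is appraised by leave-one-out error, and new samples are placed next to the worst-predicted point. Real sequential design strategies are usually more sophisticated.

```python
import math

def simulator(x):
    """Hypothetical expensive function (assumed for illustration)."""
    return math.exp(-x) * math.sin(4.0 * x)

def predict(samples, x):
    """Construction: piecewise-linear surrogate through sorted (x, y) samples."""
    for (x0, y0), (x1, y1) in zip(samples, samples[1:]):
        if x0 <= x <= x1:
            t = (x - x0) / (x1 - x0)
            return y0 + t * (y1 - y0)
    raise ValueError("x lies outside the sampled range")

def leave_one_out_errors(samples):
    """Appraisal: error at each interior sample when that sample is left out."""
    errors = []
    for i in range(1, len(samples) - 1):
        reduced = samples[:i] + samples[i + 1:]
        x_i, y_i = samples[i]
        errors.append((abs(predict(reduced, x_i) - y_i), i))
    return errors

def build_surrogate(lower, upper, tolerance, max_evaluations):
    """Add samples one at a time until the appraisal passes or the budget is spent."""
    # Initial design: three evenly spaced samples.
    xs = [lower, 0.5 * (lower + upper), upper]
    samples = [(x, simulator(x)) for x in xs]
    while len(samples) < max_evaluations:
        worst_error, worst_index = max(leave_one_out_errors(samples))
        if worst_error < tolerance:
            break
        # Sample selection: split the wider interval next to the worst sample.
        left = samples[worst_index - 1][0]
        mid = samples[worst_index][0]
        right = samples[worst_index + 1][0]
        if mid - left >= right - mid:
            new_x = 0.5 * (left + mid)
        else:
            new_x = 0.5 * (mid + right)
        samples.append((new_x, simulator(new_x)))
        samples.sort()
    return samples

samples = build_surrogate(0.0, 2.0, tolerance=0.01, max_evaluations=40)
```

The `max_evaluations` budget reflects the constraint that each simulator call is expensive, so the loop stops even if the tolerance has not been reached.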

The accuracy of the surrogate depends on the number and location of samples (expensive experiments or simulations) in the design space. Various design of experiments (DOE) techniques cater to different sources of errors, in particular errors due to noise in the data or errors due to an improper surrogate model.
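As one concrete illustration of a design of experiments technique, the sketch below generates a Latin hypercube design, a common space-filling scheme. The article does not single out this method; it is used here only as an example, and the function name `latin_hypercube` and its parameters are hypothetical.

```python
import random

def latin_hypercube(n_samples, bounds, seed=0):
    """Return n_samples points with one point per stratum in every dimension.

    bounds is a list of (lower, upper) pairs, one pair per design variable.
    """
    rng = random.Random(seed)
    columns = []
    for lower, upper in bounds:
        # Draw one value uniformly inside each of n_samples equal-width strata.
        points = [(k + rng.random()) / n_samples for k in range(n_samples)]
        # Shuffle so that strata are paired randomly across dimensions.
        rng.shuffle(points)
        columns.append([lower + p * (upper - lower) for p in points])
    # Transpose the per-dimension columns into a list of design points.
    return [tuple(column[i] for column in columns) for i in range(n_samples)]

# Eight simulator inputs for two design variables.
design = latin_hypercube(8, [(0.0, 1.0), (-5.0, 5.0)])
```

Each design variable is covered evenly across its range, which helps when the budget allows only a handful of simulations.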

The most popular surrogate models are polynomial response surfaces, Kriging, support vector machines and artificial neural networks. For most problems, the nature of the true function is not known a priori, so it is not clear which surrogate model will be most accurate. In addition, there is no consensus on how to obtain the most reliable estimates of the accuracy of a given surrogate.
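One widely used, though not universally agreed upon, way to compare candidate surrogates is leave-one-out cross-validation. The sketch below is illustrative only: the three candidate models (a constant, a straight line and a nearest-neighbor predictor) are deliberately simple stand-ins rather than the model types listed above, and the sample data are hypothetical.

```python
import math

def fit_constant(xs, ys):
    """Surrogate that always predicts the mean response."""
    mean = sum(ys) / len(ys)
    return lambda x: mean

def fit_line(xs, ys):
    """First-order polynomial response surface via closed-form least squares."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return lambda x: mean_y + slope * (x - mean_x)

def fit_nearest(xs, ys):
    """Surrogate that returns the response of the closest sample."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def leave_one_out_rmse(fit, xs, ys):
    """Refit without each sample in turn and score the prediction at that sample."""
    squared = 0.0
    for i in range(len(xs)):
        model = fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        squared += (model(xs[i]) - ys[i]) ** 2
    return math.sqrt(squared / len(xs))

# Hypothetical simulator responses at nine sample locations.
xs = [0.25 * i for i in range(9)]
ys = [x * x for x in xs]

candidates = {"constant": fit_constant, "line": fit_line, "nearest": fit_nearest}
scores = {name: leave_one_out_rmse(fit, xs, ys) for name, fit in candidates.items()}
best = min(scores, key=scores.get)
```

The ranking produced this way depends on the samples available, so it should be read as an estimate rather than a guarantee of which surrogate is most accurate.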
