Subset simulation

Subset Simulation[1] is a method used in reliability engineering to compute small (i.e., rare-event) failure probabilities encountered in engineering systems. The basic idea is to express a small failure probability as a product of larger conditional probabilities by introducing intermediate failure events. This conceptually converts the original rare-event problem into a series of frequent-event problems that are easier to solve. In the actual implementation, samples conditional on intermediate failure events are generated adaptively so that they gradually populate the region from the frequent events to the rare event. These 'conditional samples' provide information for estimating the complementary cumulative distribution function (CCDF) of the quantity of interest (which governs failure), covering the high- as well as the low-probability regions. They can also be used to investigate the causes and consequences of failure events. Generating conditional samples is not trivial, but it can be done efficiently using Markov chain Monte Carlo (MCMC).

Subset Simulation treats the relationship between the (input) random variables and the (output) response quantity of interest as a 'black box'. This can be attractive for complex systems where it is difficult to use other variance reduction or rare-event sampling techniques that require prior information about the system behavior. For problems where prior information can be incorporated into the reliability algorithm, other variance reduction techniques such as importance sampling are often more efficient.

Basic idea

Let X be a vector of random variables and Y = h(X) be a scalar (output) response quantity of interest for which the failure probability P(F) = P(Y>b) is to be determined. Each evaluation of h(.) is expensive, so the number of evaluations should be kept as small as possible. Using direct Monte Carlo, one can generate i.i.d. (independent and identically distributed) samples of X and then estimate P(F) simply as the fraction of samples with Y>b. However, this is not efficient when P(F) is small, because most samples will not fail (i.e., they have Y≤b) and in many cases the estimate is 0. For N samples and small P(F), the coefficient of variation of the estimate is approximately 1/sqrt(N P(F)), so as a rule of thumb about 10 failed samples are needed to estimate P(F) with a coefficient of variation of about 30% (a moderate requirement). For example, P(F) = 0.001 requires about 10000 i.i.d. samples (and hence evaluations of h(.)).
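The direct Monte Carlo estimate and the rule of thumb above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: the function names direct_monte_carlo and cov_of_estimate, the toy response h (a single standard normal variable) and the threshold value 3.0902 are hypothetical choices made for the example, not part of the method itself.

```python
import math
import random

def direct_monte_carlo(h, sample_x, b, n, seed=0):
    """Estimate P(h(X) > b) as the fraction of n i.i.d. samples that fail."""
    rng = random.Random(seed)
    failures = sum(1 for _ in range(n) if h(sample_x(rng)) > b)
    return failures / n

def cov_of_estimate(p, n):
    """Coefficient of variation of the direct Monte Carlo estimate of p."""
    return math.sqrt((1.0 - p) / (p * n))

# Hypothetical toy problem: X is a single standard normal variable and Y = X.
def h(x):
    return x

def sample_x(rng):
    return rng.gauss(0.0, 1.0)

# The true P(F) for this threshold is about 0.001.
p_hat = direct_monte_carlo(h, sample_x, b=3.0902, n=10000)
print("estimate:", p_hat)
print("c.o.v. for P(F)=0.001, N=10000:", cov_of_estimate(0.001, 10000))
```

With these numbers only about 10 of the 10000 samples are expected to fail, which is why the estimate fluctuates noticeably from one seed to another.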

Subset Simulation attempts to convert a rare event problem into more frequent ones. Let

b_1 < b_2 < ... < b_m = b

be an increasing sequence of intermediate threshold levels. From the basic property of conditional probability,

P(Y>b) = P(Y>b_m | Y>b_{m-1}) P(Y>b_{m-1})
       = P(Y>b_m | Y>b_{m-1}) P(Y>b_{m-1} | Y>b_{m-2}) P(Y>b_{m-2})
       = ...
       = P(Y>b_m | Y>b_{m-1}) P(Y>b_{m-1} | Y>b_{m-2}) ... P(Y>b_2 | Y>b_1) P(Y>b_1)
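The product formula can be checked numerically for a case where the tail probabilities are known in closed form. The sketch below assumes, purely for illustration, that Y is standard normal. Because the events are nested, each conditional probability is a ratio of two tail probabilities. The helper names normal_tail and chain_product and the three threshold values are hypothetical.

```python
import math

def normal_tail(b):
    """P(Y > b) for a standard normal Y."""
    return 0.5 * math.erfc(b / math.sqrt(2.0))

def chain_product(levels):
    """Multiply P(Y>b_1) by the conditional probabilities P(Y>b_i | Y>b_{i-1})."""
    product = normal_tail(levels[0])
    for prev, cur in zip(levels, levels[1:]):
        # The events are nested, so the conditional probability is a ratio of tails.
        product *= normal_tail(cur) / normal_tail(prev)
    return product

# Thresholds with tail probabilities of roughly 0.1, 0.01 and 0.001.
levels = [1.2816, 2.3263, 3.0902]
print("P(Y>b_1):", normal_tail(levels[0]))
print("product of factors:", chain_product(levels))
print("direct tail P(Y>b):", normal_tail(levels[-1]))
```

Each factor here is roughly 0.1, so a probability of about 0.001 is written as a product of three probabilities that are each easy to estimate by simulation.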

The 'raw idea' of Subset Simulation is to estimate P(F) by estimating P(Y>b_1) and the conditional probabilities P(Y>b_i | Y>b_{i-1}) (i=2,...,m), anticipating an efficiency gain when these probabilities are not small. Implementing this idea raises two basic issues:

  1. Estimating the conditional probabilities by simulation requires the efficient generation of samples of X conditional on the intermediate failure events, i.e., the conditional samples. This is generally non-trivial.
  2. The intermediate threshold levels {b_i} should be chosen so that the intermediate probabilities are neither too small (which would lead back to a rare-event problem) nor too large (which would require too many levels to reach the target event). However, choosing them well requires knowledge of the CCDF, which is itself the quantity to be estimated.

In the standard algorithm of Subset Simulation, the first issue is resolved by using Markov chain Monte Carlo (MCMC). The second issue is resolved by choosing the intermediate threshold levels {b_i} adaptively, using the samples from the previous simulation level. As a result, Subset Simulation in fact produces a set of estimates of b that correspond to different fixed values of p = P(Y>b), rather than estimates of probabilities for fixed threshold values.
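The two ingredients just described (MCMC for the conditional samples and adaptively chosen thresholds) can be put together in a short Python sketch. It is an illustration under several assumptions, not a reference implementation: the input variables are taken to be i.i.d. standard normal, the chain uses a component-wise Metropolis step with a uniform random-walk proposal (a scheme often called modified Metropolis in the subset simulation literature), the level probability p0 = 0.1 is fixed, and ties between repeated samples at a threshold are not treated specially. The names mm_step and subset_simulation and all parameter names are hypothetical.

```python
import math
import random

def mm_step(h, x, y, b_i, width, rng):
    """One component-wise Metropolis step restricted to the region h(x) > b_i."""
    cand = []
    for xi in x:
        prop = xi + rng.uniform(-width, width)
        # Accept the component with probability min(1, phi(prop) / phi(xi)).
        if rng.random() < math.exp(0.5 * (xi * xi - prop * prop)):
            cand.append(prop)
        else:
            cand.append(xi)
    y_cand = h(cand)
    # Reject the whole candidate if it leaves the intermediate failure region.
    return (cand, y_cand) if y_cand > b_i else (x, y)

def subset_simulation(h, dim, b, n=1000, p0=0.1, width=1.0, max_levels=20, seed=0):
    """Sketch: estimate P(h(X) > b) for X with i.i.d. standard normal components.

    Assumes p0 * n is an integer that divides n.
    """
    rng = random.Random(seed)
    n_seeds = round(p0 * n)      # samples kept as chain seeds at each level
    chain_len = n // n_seeds     # states per chain, including the seed

    # Level 0: direct Monte Carlo.
    xs = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]
    ys = [h(x) for x in xs]
    prob = 1.0                   # running product of level probabilities

    for _level in range(max_levels):
        order = sorted(range(len(ys)), key=lambda i: ys[i], reverse=True)
        # Adaptive threshold: the n_seeds largest responses lie above it (ties aside).
        b_i = 0.5 * (ys[order[n_seeds - 1]] + ys[order[n_seeds]])
        if b_i >= b:
            break                # the current samples already reach the target
        prob *= p0
        new_xs, new_ys = [], []
        for i in order[:n_seeds]:
            x, y = xs[i], ys[i]
            new_xs.append(x)
            new_ys.append(y)
            for _step in range(chain_len - 1):
                x, y = mm_step(h, x, y, b_i, width, rng)
                new_xs.append(x)
                new_ys.append(y)
        xs, ys = new_xs, new_ys

    # Last level: fraction of current samples beyond b, times the product so far.
    return prob * sum(1 for y in ys if y > b) / len(ys)

# Hypothetical toy response: Y is standard normal, so P(Y > 3.0902) is about 0.001.
def h(x):
    return (x[0] + x[1]) / math.sqrt(2.0)

print(subset_simulation(h, dim=2, b=3.0902))
```

Each level costs roughly n evaluations of h, so in this toy case a probability of about 0.001 is reached with a few thousand evaluations. The thresholds b_i are outputs of the run rather than inputs, which reflects the remark above that the method estimates thresholds for fixed probability levels.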

References

  1. Au, S.K.; Beck, James L. (October 2001). "Estimation of small failure probabilities in high dimensions by subset simulation". Probabilistic Engineering Mechanics 16 (4): 263–277. doi:10.1016/S0266-8920(01)00019-4.
  2. Au, S.K.; Wang, Y. (2014). Engineering Risk Assessment with Subset Simulation. Singapore: John Wiley & Sons. ISBN 978-1-118-39804-3.
  3. Schuëller, G.I.; Pradlwarter, H.J. (2007). "Benchmark study on reliability estimation in higher dimensions of structural systems – An overview". Structural Safety 29: 167–182.
  4. Phoon, K.K. (2008). Reliability-Based Design in Geotechnical Engineering: Computations and Applications. Singapore: Taylor & Francis. ISBN 978-0-415-39630-1.
  5. Zio, E.; Pedroni, N. (2011). "How to effectively compute the reliability of a thermal–hydraulic nuclear passive system". Nuclear Engineering and Design 241: 310–327. doi:10.1016/j.nucengdes.2010.10.029.