Rejection sampling
From Wikipedia, the free encyclopedia
In mathematics, rejection sampling is a technique used to generate observations from a distribution. It is also commonly called the acceptance-rejection method or "accept-reject algorithm".
It generates sample values from an arbitrary probability density function f(x) by using an instrumental distribution g(x), under the only restriction that f(x) ≤ Mg(x) for all x, where M > 1 is an appropriate bound on f(x) / g(x).
Rejection sampling is usually used in cases where the form of f(x) makes sampling difficult. Instead of sampling directly from the distribution f(x), we sample from an instrumental distribution g(x) from which sampling is easier, scaled by M so that the envelope Mg(x) lies above f(x). These samples from g(x) are probabilistically accepted or rejected.
This method relates to the general field of Monte Carlo techniques, including Markov chain Monte Carlo algorithms that also use a proxy distribution to achieve simulation from the target distribution f(x). It forms the basis for algorithms such as the Metropolis algorithm.
The unconditional acceptance probability is the proportion of proposed samples which are accepted, and equals the integral over all values of x of f(x) / M, which is 1 / M. If this is high, fewer samples are rejected, and the required number of samples for the target distribution is obtained more quickly. The unconditional acceptance probability is higher the less the ratio f(x) / g(x) varies, since M can then be chosen closer to 1; however, acceptance probability 1 requires f(x) = g(x), which defeats the purpose of rejection sampling.
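As a short derivation of the acceptance probability, assume a candidate x is drawn from g(x) and accepted when an independent u from U(0,1) satisfies u < f(x) / Mg(x), and assume f(x) is a normalized density. Conditioning on x and integrating gives:

```latex
\begin{aligned}
P(\text{accept})
  &= \int P\!\left(u < \frac{f(x)}{M g(x)} \,\middle|\, x\right) g(x)\,dx \\
  &= \int \frac{f(x)}{M g(x)}\, g(x)\,dx \\
  &= \frac{1}{M}\int f(x)\,dx \;=\; \frac{1}{M}.
\end{aligned}
```

Under these assumptions the expected number of candidates needed per accepted sample is M, so a tighter bound M gives a more efficient sampler.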
Algorithm
The algorithm (due to John von Neumann) is as follows:
- Sample x from g(x) and u from U(0,1)
- Check whether or not u < f(x) / Mg(x).
- If this holds, accept x as a realization of f(x);
- if not, reject the value of x and repeat the sampling step.
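The steps above can be sketched in Python. This is a minimal illustration, not a reference implementation: the function name `rejection_sample`, its parameters, and the choice of target (the Beta(2,2) density 6x(1 − x) on [0, 1]) with a uniform instrumental distribution and M = 1.5 are all assumptions made for the example.

```python
import random


def rejection_sample(f, g_sample, g_pdf, M, n, rng):
    """Draw n samples from density f using proposal g and bound M.

    Assumes f(x) <= M * g_pdf(x) wherever g_pdf(x) > 0.
    Returns the accepted samples and the total number of candidates tried.
    """
    samples = []
    proposals = 0
    while len(samples) < n:
        x = g_sample(rng)      # candidate drawn from g
        u = rng.random()       # u drawn from U(0,1)
        proposals += 1
        if u < f(x) / (M * g_pdf(x)):
            samples.append(x)  # accept x as a realization of f
        # otherwise reject x and repeat the sampling step
    return samples, proposals


# Hypothetical target for illustration: Beta(2,2) density on [0, 1].
def target_pdf(x):
    return 6.0 * x * (1.0 - x) if 0.0 <= x <= 1.0 else 0.0


# Instrumental distribution: uniform on [0, 1).
def uniform_sample(rng):
    return rng.random()


def uniform_pdf(x):
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


M = 1.5  # the target density peaks at 1.5 (x = 0.5), so f(x) <= M * g(x)

rng = random.Random(0)  # fixed seed so the run is repeatable
samples, proposals = rejection_sample(
    target_pdf, uniform_sample, uniform_pdf, M, 10_000, rng
)
acceptance_rate = len(samples) / proposals
sample_mean = sum(samples) / len(samples)
```

With this setup the observed acceptance rate should be close to 1 / M (about 0.67), and the sample mean close to 0.5, the mean of the assumed target.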
The validation of this method is the envelope principle: when simulating the pair (x,v = u * Mg(x)), one produces a uniform simulation over the subgraph of Mg(x). Accepting only pairs such that u < f(x) / Mg(x) then produces pairs (x,v) uniformly distributed over the subgraph of f(x) and thus, marginally, a simulation from f(x).
This means that, with enough replicates, the algorithm generates a sample from the desired distribution f(x). There are a number of extensions to this algorithm, such as the Metropolis algorithm.
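The same conclusion can be checked directly, as a sketch for the one-dimensional case, assuming x is drawn from g(x), u is drawn independently from U(0,1), and F denotes the cumulative distribution function of f. The distribution of an accepted value is:

```latex
\begin{aligned}
P(X \le t \mid \text{accept})
  &= \frac{\displaystyle\int_{-\infty}^{t} g(x)\,\frac{f(x)}{M g(x)}\,dx}
          {\displaystyle\int_{-\infty}^{\infty} g(x)\,\frac{f(x)}{M g(x)}\,dx} \\
  &= \frac{\tfrac{1}{M}\,F(t)}{\tfrac{1}{M}} \;=\; F(t).
\end{aligned}
```

The factor g(x) cancels in both integrals, which is why the choice of g(x) affects only the efficiency of the method and not the distribution of the accepted values.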
Examples
As a simple geometric example, suppose it is desired to generate a random point within the unit circle. Generate a candidate point (x,y) where x and y are independent and uniformly distributed between −1 and 1. If it so happens that x² + y² ≤ 1, then the point is within the unit circle and should be accepted. If not, then this point should be rejected and another candidate should be generated.
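A minimal Python sketch of this geometric example follows. The function name `sample_unit_disk` and the bookkeeping of how many candidates were tried are assumptions added for illustration; the acceptance test is the condition that the candidate lies within the unit circle.

```python
import random


def sample_unit_disk(rng):
    """Return a point uniformly distributed in the unit disk,
    together with the number of candidates that were generated."""
    tries = 0
    while True:
        # Candidate: x and y independent, uniform between -1 and 1.
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        tries += 1
        if x * x + y * y <= 1.0:
            return (x, y), tries  # inside the unit circle: accept
        # outside the unit circle: reject and generate another candidate


rng = random.Random(42)  # fixed seed so the run is repeatable
results = [sample_unit_disk(rng) for _ in range(10_000)]
points = [point for point, _ in results]
total_tries = sum(tries for _, tries in results)
acceptance_rate = len(points) / total_tries
```

The acceptance rate here should be close to the ratio of the disk's area to the square's area, π / 4 (about 0.785).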
The ziggurat algorithm, a more advanced example, is used to efficiently generate normally-distributed pseudorandom numbers.
References
- Robert, C.P. and Casella, G. "Monte Carlo Statistical Methods" (second edition). New York: Springer-Verlag, 2004.
- J. von Neumann, "Various techniques used in connection with random digits. Monte Carlo methods", Nat. Bureau Standards, 12 (1951), pp. 36–38.