Talk:Ziggurat algorithm

From Wikipedia, the free encyclopedia

Mathematics Portal

This article is within the scope of WikiProject Mathematics, which collaborates on articles related to mathematics.

Mathematics rating:

Start Class

Low Priority

Field: Probability and statistics

Please update this rating as the article progresses, or if the rating is inaccurate. Click to show/hide comments.
Please add to or update the comments to suggest improvements to the article.
Geometry guy 17:08, 22 May 2007 (UTC)

This is the talk page for discussing improvements to the Ziggurat algorithm article.
Please sign and date your posts by typing four tildes (`~~~~`). Put new text under old text. Click here to start a new topic. New to Wikipedia? Welcome! Ask questions, get answers. This is not a forum for general discussion about the article's subject.	Be polite Assume good faith No personal attacks Be welcoming	Article policies No original research Neutral point of view Verifiability

1 Diagram needed
2 Approximations
- 2.1 Ziggurat algorithm
- 2.2 Distortionless Ziggurat Algorithm
3 Image

[edit] Diagram needed

I added an animated diagram.Cuddlyable3 17:58, 14 September 2007 (UTC)

Basically, something like Fig. 1 on p.2 of Marsaglia's paper, or Fig. 5 from the Thomas et al. paper (linked from article References). An improvement would be to add labels "y₀", "y₁", etc. along the y-axis and "x₀", "x₁", etc. above the points.

Without that picture, the best wording is awfully hard to follow.

For a normal distribution, the function to plot is y = e^-x²/2, and the 8-layer Ziggurat points on it are

x[−1]= 2.7120530222393384079012   y[−1]= 0.0000000000000000000000  —Preceding unsigned comment added by 71.41.210.146 (talk) 18:48, 1 October 2007 (UTC) 
x[0] = 2.3383716982472527044692   y[0] = 0.0649595117330380202104
x[1] = 1.9819049364005106693290   y[1] = 0.1402998180884519367696
x[2] = 1.7165081257767795707347   y[2] = 0.2291908828832851165061
x[3] = 1.4853586756432938545359   y[3] = 0.3318257830466926345376
x[4] = 1.2629701985308330330774   y[4] = 0.4504325835506035839847
x[5] = 1.0273863717802284625238   y[5] = 0.5899241094185249870813
x[6] = 0.7383689179764494421219   y[6] = 0.7614016031422429793942
x[7] = 0.0000000000000000000000   y[7] = 1.0000000000000000003253

Here's a sample gnuplot input file:

#set terminal svg
#set output "zig.svg"
unset label
set label "x-1" at 2.7120530222393384079012,0.0649595117330380202104 cent offset 0,0.5
set label "x0" at 2.3383716982472527044692,0.1402998180884519367696 cent offset 0,0.5
set label "x1" at 1.9819049364005106693290,0.2291908828832851165061 cent offset 0,0.5
set label "x2" at 1.7165081257767795707347,0.3318257830466926345376 cent offset 0,0.5
set label "x3" at 1.4853586756432938545359,0.4504325835506035839847 cent offset 0,0.5
set label "x4" at 1.2629701985308330330774,0.5899241094185249870813 cent offset 0,0.5
set label "x5" at 1.0273863717802284625238,0.7614016031422429793942 cent offset 0,0.5 
set label "x6" at 0.7383689179764494421219,1.0000000000000000000000 cent offset 0,0.5

set label "y0" at 0,0.0649595117330380202104 right offset -0.5,0
set label "y1" at 0,0.1402998180884519367696 right offset -0.5,0
set label "y2" at 0,0.2291908828832851165061 right offset -0.5,0.2
set label "y3" at 0,0.3318257830466926345376 right offset -0.5,0
set label "y4" at 0,0.4504325835506035839847 right offset -0.5,0
set label "y5" at 0,0.5899241094185249870813 right offset -0.5,-0.7
set label "y6" at 0,0.7614016031422429793942 right offset -0.5,0
set label "y7" at 0,1.0000000000000000000000 right offset -2.5,0

set xrange [0:3.5]
set yrange [0:1.05]
plot exp(-x**2/2) title "exp(-x^2/2)" with lines linewidth 2, "zig.dat" ti "" with lines

And the corresponding input file zig.dat:

0.0000000000000000000000        1.0000000000000000000000
0.7383689179764494421219        1.0000000000000000000000
0.7383689179764494421219        0.5899241094185249870813

0.0000000000000000000000        0.7614016031422429793942
1.0273863717802284625238        0.7614016031422429793942
1.0273863717802284625238        0.4504325835506035839847

0.0000000000000000000000        0.5899241094185249870813
1.2629701985308330330774        0.5899241094185249870813
1.2629701985308330330774        0.3318257830466926345376

0.0000000000000000000000        0.4504325835506035839847
1.4853586756432938545359        0.4504325835506035839847
1.4853586756432938545359        0.2291908828832851165061

0.0000000000000000000000        0.3318257830466926345376
1.7165081257767795707347        0.3318257830466926345376
1.7165081257767795707347        0.1402998180884519367696

0.0000000000000000000000        0.2291908828832851165061
1.9819049364005106693290        0.2291908828832851165061
1.9819049364005106693290        0.0649595117330380202104

0.0000000000000000000000        0.1402998180884519367696
2.3383716982472527044692        0.1402998180884519367696
2.3383716982472527044692        0.0000000000000000000000

0.0000000000000000000000        0.0649595117330380202104
2.7120530222393384079012        0.0649595117330380202104
2.7120530222393384079012        0.0000000000000000000000

71.41.210.146 16:33, 19 June 2007 (UTC)

Thanks for the image, but the caption and animation seem to imply that:

The areas A under the curve are equal, and
The right-hand (solid white) part is eliminated by rejection.

Neither of those are true. Each layer's black + vertical hatched regions total a constant area A (except for the base layer, which is special), and the right-hand region is eliminated by multiplying a [0,1)-distributed random point by the width of the slice x_i.

I tried to edit the caption to clarify the second point, but the first is pretty hard to fix.

Also, the fact that the distribution tail is, in fact infinite, is not clear from the graphics. It's asymptotic to, but never quite reaches, the X axis.

Sorry to complain, but to illustrate it accurately, you have to demonstrate:

Choose a point in a vertical interval divided evenly into 8 regions. This gives the slice number i.
Map that region number, via a loojup table, onto a slice of non-uniform height and width.
Choose a point x uniformly between 0 and x_i−1
Test if the point is less than x_i, and accept x immediately if so.
Otherwise, generate a random point y between y_i−1 and y_i and test if y < f(x). If so, accept the point. If not, restart from the beginning.
(Step 5 is different in the i=0 case, but let's not try to illustrate that.)

71.41.210.146 02:06, 25 September 2007 (UTC) (originally posted on cuddlyable3's talk page)

I agree that the animation shows

1. The areas A under the curve are equal

That appears to be the case using the 8 coordinates posted here (with high precision!) by anonymous user 71.41.210.146 on 15:26 20 June 2007. The same user now says "Each layer's black and vertical hatched regions total a constant area A" implying x_n(y_n-y_n+1) = constant. That, if true, means his coordinates are unusable. It is impossible to define the base layer where x-->infinity this way, whereas the tail area is finite.

Some other issues in the graphic:

a) The asymptotic tail of the function, as noted. I can remove a white pixel to leave a gap.

b) The left PRNG must select layers with equal probabilities. A non-linear grating suggests what would be done by a look-up table.

c)71.41.210.146 states that each output of the upper PRNG must be multiplied by the x_n of the current layer (so only small wastage) while the graphic shows no conditional multiplication (hence a larger wastage). That seems to trade time to multiply every sample with time to fetch about twice as many PRNs. Cuddlyable3 16:12, 25 September 2007 (UTC)

It is indeed true that x_n−1(y_n−y_n+1) = constant. The base layer and the tail are handled specially with the i=0 test in step 4, as described in the text leading up to and including the "fallback algorithms for the tail" section. See particularly the description of the "fictitious x₋₁"; the "box" for the base layer is special: shorter than the tail, and of the same area as the base layer under the tail.
On modern processors, a floating-point multiply is faster than an unpredictable branch, much less a branch plus an additional random number generation (which usually involves a few multiplies itself). Also, when using more than 8 layers, the top layers are very narrow, and rejection would be much more than 50%.

If the special handling of the base layer is not clear from the text, some help improving it would be appreciated! It seems clear enough to me, but if you can point out a place where you get confused, I can try to reword that area.

71.41.210.146 08:41, 29 September 2007 (UTC)

[edit] Approximations

Any digital generator of Gaussian pseudo-random samples must make the following approximations.

- Truncate the tails that extend to infinite sample values. Two ways to do this are:
  - see Fig. a). The result is part of a Gaussian distribution plus a uniform distribution with the same range
  - see Fig. b). The result is a distorted (when normalised) part of a Gaussian distribution
- see Fig. c). Sample values must be quantised to fit the finite length of binary numbers in use.

The sizes of these approximations must be chosen with regard to the intended use of the samples. Thus there is no single "right" choice.

[edit] Ziggurat algorithm

The motivation behind the ziggurat algorithm is that the majority of the samples of a non-uniform distribution such as the Gaussian that would be generated by calculating a transcendental (costly) can be provided instead by combining uniform distributions from PRNGs (cheap).

The PRNGs are assigned to layers of the area under the intended distribution curve. After above approximations, the layer areas on the left of the curve should be equal. They will be selected randomly by index and over long time each layer is selected the same number of times. (One may conveniently use a maximal-length n-bit PRNG shift register to select (2ⁿ-1) layers.)

A version of the algorithm described by Marsaglia, Tsang, Doornik et al. establishes a set of points on the curve using the approximation

x_n(y_n+1-y_n) = A constant

These are equal-area rectangles that are larger, by varying degrees, than the curve-bounded areas that they include. The disproportionality means that coresponding ranges of sample values are relatively under-represented in the output, whose distribution is distorted as shown in Fig. d). That distortion may be reduced (but not eliminated) by using more layers. The diagram shows the case of the coordinates provided here by anonymous user 71.41.210.146 .

[edit] Distortionless Ziggurat Algorithm

Two conditions together would allow the distortion in Fig. d) to be avoided:

- Select layer coordinates that satisfy

erf(x_n)-erf(x_n+1)-y_n(x_n-x_n+1)+x_n+1(y_n+1-y_n)

- Whenever a sample is rejected at a layer, try again using a new PR sample to the same layer. Due to the rarity of such cases they do not slow the algorithm significantly. Cuddlyable3 14:59, 6 October 2007 (UTC)

[edit] Image

The image of of too poor quality. It is better to remove it than keep it in the way it is. Oleg Alexandrov (talk) 01:25, 26 September 2007 (UTC)

Let's talk about how we can make it better. What are your ideas Oleg? Cuddlyable3 10:49, 26 September 2007 (UTC)

The asymptote of the distribution to the X axis isn't shown well. The distribution extends to infinity.
The areas marked "A" aren't equal area. You have to include all of the partial box (vertically hatched region) on the non-base layers, specifically including the area above the curve, to get the per-layer area A.
The way that layer selection is illustrated implies that the probability is proportional to the height. It's not; it's the same for each layer.
The way the X coordinate is chosen implies that the area to the right of the curve (solid white) is rejected. It's not; the random number is chosen to be uniform from 0 to the width of the layer.

The correct way to draw it in black and white, IMHO, is with solid white boxes under the curve and hollow boxes extending past the curve. Although I don't wish to slight the effort put into the illustration (pictographic illustration of random number generation is extremely challenging!), honestly, a non-animated illustration would do just as well.

But if you like animation, then you have to show that a layer is selected uniformly, and I'd do it like this. Rather than showing a single generation, it shows many random number generations in parallel:

Start as shown, with the two-sided distribution cut in half. Excellent, just show the asymptote of the tail.
Then show the boxes on top of the curve, and delete the curve everywhere except for the lowest layer.
Show that the layers have equal area A.
Replace the tail on the lowest layer with a box, preserving the area A.
Show a square box on the left, divided horizontally into uniform layers, also each of area A.
Fill the square box with random points.
Morph the points onto the equal-area layers under the curve. It's just a linear scaling.
Extend the right-hand margin of each box down into the box below, dividing it into two pieces (except for the top layer.) Mark the points in each box that are underneath the layer above in green (immediately acceptable). Mark the points in the right-hand portions of each box in yellow (questionable).
Make the curve re-appear. On all layers except the bottom, turn yellow points under the curve green. Turn yellow points over the curve red (rejected), then fade them out.
Have the yellow points on the bottom layer disappear and reappear as green points in the tail. Don't try to show "morphing" between them.
Disappear the boxes and just show the curve and the randomly chosen points under it.

That actually depicts the algorithm quite faithfully.

71.41.210.146 08:26, 29 September 2007 (UTC)

Thank you for your script for a new animation. The .gif animation format is poorly equipped for morphing, and with a bigger palette for colours and fading the file size will swell.Cuddlyable3 16:15, 1 October 2007 (UTC)

Okay, then let's think. We can do it with fewer colours and make the morphing easier. You're using four colours already (black, white, pink, and transparent). Can your software handle local colour tables? You're allowed a colour table per frame if you like.

Rather than sliding the points, have the layers sequentially disappear from the source rectangle and appear in the ziggurat layer destination rectangle. An arrow (or better yet, lines connecting the corners of the boxes) is optional.
How about this colour scheme: Draw the original curve in white. Then the rectangles in colour (pink). Then the points in white. Turn the points pink when accepted, or just vanish them when rejected. Then disappear the pink rectangles and have the white curve reappear.
You can fade out lines by dithering, removing every fourth pixel for four frames, say.

Does that work better? 71.41.210.146 19:13, 1 October 2007 (UTC)

I released the animation to public domain so you are free to take any part of it and show your ideas here.Cuddlyable3 19:12, 2 October 2007 (UTC)

- While improvements are made on the diagram, I'm moving it down so it doesn't dominate the article so heavily. The image should in any case come after a written introduction. jugander (t) 13:17, 7 October 2007 (UTC)

Talk:Ziggurat algorithm

From Wikipedia, the free encyclopedia

Contents

[edit] Diagram needed

[edit] Approximations

[edit] Ziggurat algorithm

[edit] Distortionless Ziggurat Algorithm

[edit] Image

Views

Navigation

Interaction

Search