Talk:Geometric distribution

From Wikipedia, the free encyclopedia

We should mention the coupon collector's problem here, i.e. the number of trials (on average) needed to complete a 'set of coupons' given a uniform underlying distribution of coupons.

The classic example came from cigarette cards or coupons. How many packs of cigarettes do you need to buy (on average) to collect all 5 movie stars? The answer is 11+5/12, assuming the underlying distribution of movie stars per pack is uniform.

This is a result of the geometric distribution,

=    5/5 + 5/4 + 5/3 + 5/2 + 5/1
= 5( 1/5 + 1/4 + 1/3 + 1/2 + 1/1 )
= 11+5/12

and is related to urn problems, the Poisson distribution and generating functions.
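A minimal sketch of the arithmetic above, assuming one coupon per pack and all n coupons equally likely (the function name expected_packs is made up for illustration). Once k distinct coupons are held, the wait for a new one is geometric with success probability (n - k)/n, and the expected total is the sum of those means:

```python
from fractions import Fraction

def expected_packs(n):
    """Expected number of packs needed to collect all n equally likely coupons."""
    # With k distinct coupons in hand, a pack yields a new one with
    # probability (n - k)/n, so the expected wait is n/(n - k).
    return sum(Fraction(n, n - k) for k in range(n))

print(expected_packs(5))  # 137/12, i.e. 11 + 5/12
```

Exact fractions are used so the result can be compared directly with 11+5/12.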


I have been looking for a proof that the expected value of a geometric distribution is 1/p, and have been unable to find anything. Does anyone have a link or know the proof? THN

proof of expected value of geometric distribution

Source: Mitzenmacher and Upfal. Randomized Algorithms and Probabilistic Analysis: A First Course.

Lemma. Let X be a discrete random variable that takes only non-negative integer values. Then:

\begin{align}
E[X] &= \sum_{i=0}^\infty i\Pr[X=i] \\
&= \sum_{i=0}^\infty i(\Pr[X\geq i]-\Pr[X\geq i+1]) \\
&= \sum_{i=1}^\infty \Pr[X\geq i]
\end{align}

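Corollary. If X is a geometric random variable with success probability p, so that \Pr[X=n]=(1-p)^{n-1}p for n = 1, 2, 3, ..., then:

\begin{align}
E[X] &= \sum_{i=1}^\infty \Pr[X\geq i] \\
&= \sum_{i=1}^\infty\sum_{n=i}^\infty(1-p)^{n-1}p \\
&= \sum_{i=1}^\infty(1-p)^{i-1} \\
&= \frac{1}{1-(1-p)} \\
&= \frac{1}{p}
\end{align}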

QED. =)
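The proof can be sanity-checked numerically. This is only a sketch: it assumes X is supported on 1, 2, 3, ..., truncates both infinite sums at a fixed number of terms, and uses illustrative function names (tail_sum_mean, direct_mean) that are not from the book.

```python
def tail_sum_mean(p, terms=10_000):
    """Approximate E[X] as the sum over i >= 1 of Pr[X >= i] = (1 - p)**(i - 1)."""
    return sum((1 - p) ** (i - 1) for i in range(1, terms + 1))

def direct_mean(p, terms=10_000):
    """Approximate E[X] as the sum over i >= 1 of i * Pr[X = i]."""
    return sum(i * (1 - p) ** (i - 1) * p for i in range(1, terms + 1))

for p in (0.2, 0.5, 0.7):
    # Both truncated sums should be close to 1/p.
    print(p, tail_sum_mean(p), direct_mean(p), 1 / p)
```

For values of p that are not too small, the truncation error is negligible and both sums agree with 1/p to within floating-point rounding.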

No time right now to put this into the article, I'm afraid. Can someone else do it? –Matt 10:46, 19 Jun 2004 (UTC)

squares

(1 − p)/p^2.

Does anyone else see a square box displayed in place of the minus sign? –Matt 10:48, 19 Jun 2004 (UTC)

Graphs need to be redone

The geometric distribution is discrete, so it is described by a probability mass function, not a density; in MATLAB one should plot it with stem. Likewise, the cdf of the geometric distribution is a piecewise-constant, right-continuous function, not a piecewise-linear one. I may redo the graphs myself next week. If someone else can do it sooner, that'd be great. PBH 17:01, 18 May 2006 (UTC)
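To make the pmf/cdf point concrete, here is a small sketch in Python rather than MATLAB. It assumes the version of the distribution supported on 1, 2, 3, ...; the names geom_pmf and geom_cdf are illustrative only. The cdf is evaluated at arbitrary real x to show that it is constant between integers and takes the upper value at each jump.

```python
import math

def geom_pmf(k, p):
    """Pr[X = k]; nonzero only at the integers k = 1, 2, 3, ..."""
    if k < 1 or k != int(k):
        return 0.0
    return (1 - p) ** (int(k) - 1) * p

def geom_cdf(x, p):
    """Pr[X <= x] for any real x: a right-continuous step function."""
    if x < 1:
        return 0.0
    return 1 - (1 - p) ** math.floor(x)

p = 0.5
for x in (1.0, 1.5, 1.999, 2.0, 2.5):
    # Flat on [1, 2), then a jump of size geom_pmf(2, p) exactly at x = 2.
    print(x, geom_cdf(x, p))
```

A plot based on these values would use isolated points (or stems) for the pmf and horizontal segments, closed on the left, for the cdf.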

You are right, my bad. I remade the graphs trying to correct the mistake. However, I didn't use stem as I think it looks bad and it makes the graph unclear when you have more than one function. If you think the graphs are correct now, please remove the notice. If you still don't like them, please tell me why. AdamSmithee 16:29, 18 May 2006 (UTC)

As far as stem is concerned, the problem is the choice of parameters. The pmfs of Geom(0.8) and Geom(0.2) coincide at 1. A better choice would be 0.2, 0.5, 0.7 or something like that. At least put filled dots at integer points. Also, personally I don't very much like the staircase look of the cdf. It does not display right-continuity well. There really ain't no "vertical" lines in a graph of any function. I'll probably make cleaner graphs for the Russian version. However, the graphs now are more acceptable, so I'll remove the notice. PBH 17:01, 18 May 2006 (UTC)
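The coincidence at 1 is easy to check, assuming the plotted pmf is that of the number of failures before the first success (support 0, 1, 2, ...), so that the value at 1 is (1 - p)p. That expression is symmetric under swapping p and 1 - p. A sketch with exact fractions (pmf_failures is a name made up here):

```python
from fractions import Fraction

def pmf_failures(k, p):
    """Pr[Y = k], where Y counts failures before the first success (k = 0, 1, ...)."""
    return (1 - p) ** k * p

low, high = Fraction(1, 5), Fraction(4, 5)  # p = 0.2 and p = 0.8
print(pmf_failures(1, low), pmf_failures(1, high))  # 4/25 4/25
```

Any pair p, 1 - p overlaps at 1 in the same way, which is why a set such as 0.2, 0.5, 0.7 avoids the problem.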

I'm not really that crazy about them, but unfortunately my graphic capabilities stop around here. If you can do something better, by all means replace them AdamSmithee 19:04, 18 May 2006 (UTC)

I've commented out the graphs because I could not edit the CAPTIONS to make them cease to be misleading. Here's the problem:

  • One graph shows the p.m.f. of Y and not of X;
  • The other shows the c.d.f. of X and not of Y.

But they're presented in a way that suggests they're both about the same probability distribution! This really needs to get fixed. The graphs should be made consistent with the information in the table. Michael Hardy 22:33, 1 June 2006 (UTC)
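For reference, a sketch of how the two variables relate, assuming X is the number of trials up to and including the first success (support 1, 2, ...) and Y = X - 1 is the number of failures before it (support 0, 1, ...). The function names are illustrative, not taken from the article.

```python
def pmf_x(k, p):
    """Pr[X = k] for k = 1, 2, 3, ..."""
    return (1 - p) ** (k - 1) * p

def pmf_y(k, p):
    """Pr[Y = k] for k = 0, 1, 2, ..."""
    return (1 - p) ** k * p

def cdf_x(k, p):
    """Pr[X <= k] for integer k >= 1."""
    return 1 - (1 - p) ** k

def cdf_y(k, p):
    """Pr[Y <= k] for integer k >= 0."""
    return 1 - (1 - p) ** (k + 1)

p = 0.3
for k in range(4):
    # Y's pmf and cdf are X's shifted one unit to the left.
    print(k, pmf_y(k, p), pmf_x(k + 1, p), cdf_y(k, p), cdf_x(k + 1, p))
```

A pmf plot for Y and a cdf plot for X therefore sit one unit apart on the horizontal axis, so they only look consistent if the captions say which variable each one shows.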

only ONE parameter

This family of distributions is parameterized by just one parameter, p. The argument n to the probability mass function is not such a parameter, i.e., we don't get a different distribution for each different value of n. I deleted the "n" from "parameters" in the table. Michael Hardy 00:54, 31 May 2006 (UTC)