Talk:Cumulative distribution function
Distribution function
This page had a redirect from distribution function, which I've now made into its own article describing a related but distinct concept in physics. I'll try to modify the pages pointing here through that redirect so that the net change in Wikipedia is minimal. SMesser 16:12, 24 Feb 2005 (UTC)
Cumulative density function
I originally created the redirect cumulative density function in March to point to this article. Why? A simple Google test for "cumulative density function" shows 41,000 hits, while "cumulative distribution function" shows 327,000 hits. Michael Hardy's contention is that "cumulative density" is patent nonsense (see deletion log) and that a redirect shouldn't exist.
Regardless of the correctness of "cumulative density", there still is a significant usage of it in reference to this article and its content. "Cumulative density function" is even used in a doctoral thesis. Hardly patent nonsense.
Even if "cumulative density function" is incorrect, someone may still look for it, find nothing, and create an article paralleling this one. If you don't buy the "it's not patent nonsense, or even just nonsense" argument, then I invoke WP:R#When should we delete a redirect?: the redirect aids accidental linking and therefore should not be deleted.
Michael, if you have a problem with the correctness of "cumulative density" then by all means add a section here or change the redirect to an article and explain it there. Either way, cumulative density function needs to be a valid link. Cburnett 14:42, 14 December 2005 (UTC)
How is this a debate?
The term "cumulative distribution function" is used in many elementary books. It is a pretty stupid term, but we are stuck with it. The best we can do is acknowledge that the term is out there, that it should simply be "distribution function", and that its definition MUST use <= or else many tables, software routines, etc. will be used incorrectly.
Doesn't make sense
"Note that in the definition above, the "less or equal" sign, '≤' could be replaced with "strictly less" '<'. This would yield a different function, but either of the two functions can be readily derived from the other. The only thing to remember is to stick to either definition as mixing them will lead to incorrect results. In English-speaking countries the convention that uses the weak inequality (≤) rather than the strict inequality (<) is nearly always used."
Surely it doesn't matter at all! Since the probability of any single value is 0, the interval boundaries can be included or excluded.
- If you're only interested in integrals. Shinobu 22:50, 7 June 2006 (UTC)
- The convention in the entire world is to use '≤', and it matters HUGELY for the binomial, Poisson, negative binomial, etc. To use anything else and to rely upon the formulas in any text would lead to substantial errors, say when one is using a table of the binomial distribution. Jmsteele 01:18, 21 October 2006 (UTC)
- I'm not sure about that. The definition: F(x) = P(X <= x)
- Because P(X <= x) = P(X < x) + P(X = x), F(x) = P(X < x) + P(X = x)
- Now for normal functions (the kind of functions you mention) P(X = x) = 0.
- Of course, there are things like delta functions, but that's not what you're talking about. Shinobu 16:27, 27 October 2006 (UTC)
Please consider some very important distributions: the binomial, Poisson, and hypergeometric. You simply MUST use the definition F(x) = P(X <= x) or else all software packages and all tables will be misunderstood. PS: I am a professor of statistics, so cut me some slack here. This is not a matter of delta functions; it is a matter of sums of coin flips ... very basic stuff.
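To make the coin-flip point concrete, here is a minimal sketch in C, assuming a binomial count of heads. The function names (binom_pmf, binom_cdf_le, binom_cdf_lt) are hypothetical and chosen only for this illustration; they are not from any table or software package mentioned above.

```c
#include <assert.h>

/* P(X = k) for X ~ Binomial(n, p).
   The first loop builds C(n, k) * p^k; the second multiplies in (1 - p)^(n - k). */
double binom_pmf(int n, int k, double p)
{
    assert(n >= 0 && p >= 0.0 && p <= 1.0);
    if (k < 0 || k > n)
        return 0.0;
    double term = 1.0;
    for (int i = 1; i <= k; i++)
        term *= (double)(n - k + i) / i * p;
    for (int i = 0; i < n - k; i++)
        term *= 1.0 - p;
    return term;
}

/* F(x) = P(X <= x): the weak-inequality convention. */
double binom_cdf_le(int n, int x, double p)
{
    double sum = 0.0;
    for (int k = 0; k <= x && k <= n; k++)
        sum += binom_pmf(n, k, p);
    return sum;
}

/* P(X < x): the strict variant, which leaves out the mass at x itself. */
double binom_cdf_lt(int n, int x, double p)
{
    return binom_cdf_le(n, x - 1, p);
}
```

For three fair coin flips, binom_cdf_le(3, 1, 0.5) gives 0.5 while binom_cdf_lt(3, 1, 0.5) gives 0.125; the gap is exactly P(X = 1) = 0.375, which is why mixing the two conventions misreads a binomial table.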
F(x) vs Phi(x)
I completely disagree with "It is conventional to use a capital F for a cumulative distribution function, in contrast to the lower-case f used for probability density functions and probability mass functions." In all the literature I have read, Φ is the cumulative distribution function and φ is used for probability density/mass functions. Where's the reference for such a bold claim that F and f are the convention? See the probit article, which uses Φ⁻¹ for the inverse of the cdf. -- Thoreaulylazy 19:13, 3 October 2006 (UTC)
- There is no such convention - you can pick any symbol you like, of course. It is common practice to use the capital for the cdf, because it's the primitive of the df. I've seen phi in quantum mechanical books, but I've also seen f and rho. Shinobu 22:58, 3 October 2006 (UTC)
- In all the literature I have read, the pair F and f was the convention. I don't mean to say that Φ and φ are wrong, but how can you be so sure as to call something else a bold claim? Many different fields have different notational conventions, and we just have to accept that. Musiphil 07:03, 3 December 2006 (UTC)
This is a collapsed distinction. One uses Phi for the normal distribution and phi for the normal density. These are reserved symbols for these purposes; see any statistics book. One uses F and f for generic distributions and densities, but these are not reserved. In many books and papers one will find G and g, H and h, etc., each time with the capital representing the distribution and the lower case the density.
Programming algorithm
I've been looking for a better algorithm to generate a random value based on an arbitrary CDF (better than the one I wrote). For example, if one would like to obtain a random value with a "flat" distribution, one can use the rand() function from C's stdlib.h. However, I wrote this function to generate the random value from an arbitrary function:
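#include <stdlib.h>  /* rand, RAND_MAX */

// xmin and xmax are the range of outputs you want
// ymin and ymax are the actual limits of the function you want
// function is a function pointer to the curve being sampled; this accept/reject
// loop returns x with probability proportional to function(x), so it treats the
// curve as a density rather than as a CDF
long double randfunc(double xmin, double xmax,
                     long double (*function)(long double),
                     double ymin, double ymax)
{
    while (1) {
        // uniform candidate in [xmin, xmax) and uniform height in [ymin, ymax)
        long double val = (xmax - xmin) * (rand() / ((long double)RAND_MAX + 1)) + xmin;
        long double y = (ymax - ymin) * (rand() / ((long double)RAND_MAX + 1)) + ymin;
        if (y < function(val))
            return val;
    }
}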
I was trying to find a way to do it faster/better. If anyone knows of anything.. let me know. Fresheneesz 07:53, 27 December 2006 (UTC)
- Wikipedia really isn't the place to ask these sorts of questions. The talk pages are more for discussion of the articles themselves. Anyway, I will tell you that the textbook Numerical Recipes in C (Press, Teukolsky, Vetterling, and Flannery) has a good discussion of random number generation and also includes algorithms. I would further advise that you read the text, not just implement the algorithms listed there; it's quite good! User A1 13:48, 12 March 2007 (UTC)
- I think it'd be nice to have something on algorithms on this page. I have actually found a better answer. It involves either integrating the CDF and using the definite integral instead of an indefinite integral, or, if no definite integral is possible, preintegrating the function and using the numbers prerendered in memory. Fresheneesz 02:39, 13 March 2007 (UTC)
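One possible reading of the "preintegrate and keep the numbers in memory" idea is inverse transform sampling: tabulate the CDF once by numerically integrating a density, then map each uniform draw u to the x at which the table reaches u. The sketch below assumes that reading; the names build_cdf_table, invert_cdf, sample_from_table, and flat_density are hypothetical, and the trapezoid rule and linear interpolation are illustrative choices rather than anything stated in the thread.

```c
#include <assert.h>
#include <stdlib.h>  /* rand, RAND_MAX */

/* Example density used only for illustration: flat on any interval. */
long double flat_density(long double x)
{
    (void)x;
    return 1.0L;
}

/* Fill cdf[0..n-1] with the running trapezoid integral of density over
   [xmin, xmax], then normalise so that cdf[0] == 0 and cdf[n-1] == 1. */
void build_cdf_table(long double (*density)(long double),
                     long double xmin, long double xmax,
                     long double *cdf, int n)
{
    assert(n >= 2 && xmax > xmin);
    long double dx = (xmax - xmin) / (n - 1);
    cdf[0] = 0.0L;
    for (int i = 1; i < n; i++) {
        long double a = density(xmin + (i - 1) * dx);
        long double b = density(xmin + i * dx);
        cdf[i] = cdf[i - 1] + 0.5L * (a + b) * dx;
    }
    long double total = cdf[n - 1];
    assert(total > 0.0L);
    for (int i = 0; i < n; i++)
        cdf[i] /= total;
}

/* Map u in [0, 1) to x: binary search for the table cell that brackets u,
   then interpolate linearly inside that cell. */
long double invert_cdf(const long double *cdf, int n,
                       long double xmin, long double xmax, long double u)
{
    int lo = 0, hi = n - 1;
    while (hi - lo > 1) {            /* keeps cdf[lo] <= u < cdf[hi] */
        int mid = lo + (hi - lo) / 2;
        if (cdf[mid] <= u)
            lo = mid;
        else
            hi = mid;
    }
    long double dx = (xmax - xmin) / (n - 1);
    long double span = cdf[hi] - cdf[lo];
    long double frac = span > 0.0L ? (u - cdf[lo]) / span : 0.0L;
    return xmin + (lo + frac) * dx;
}

/* Draw one sample: a single rand() call and no rejection loop. */
long double sample_from_table(const long double *cdf, int n,
                              long double xmin, long double xmax)
{
    long double u = rand() / ((long double)RAND_MAX + 1);
    return invert_cdf(cdf, n, xmin, xmax, u);
}
```

The table is built once (for example, build_cdf_table(flat_density, 0, 2, table, 5)), after which every call to sample_from_table costs one uniform draw and a binary search. Unlike an accept/reject loop, no draws are thrown away, at the price of a discretisation error that shrinks as the table grows.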
cdf vs pdf
Hello,
I removed the comment that "probability distribution function" is the same as CDF, which I assert to be wrong. My reference is "Probability and Statistics for Engineering and the Sciences", p. 140 (J. Devore). The PDF is the same as the probability density function, not the CDF. The CDF is the integral of the PDF, not the PDF itself.
Please comment. 129.78.208.4 05:28, 12 March 2007 (UTC)
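For reference, the relation stated above (the CDF is the integral of the PDF) can be written out along with its discrete counterpart. The symbols F, f, and p are the usual generic names and are an assumption here, not notation taken from the cited textbook.

```latex
\begin{align*}
  F(x) &= P(X \le x) = \int_{-\infty}^{x} f(t)\,dt,
  & f(x) &= \frac{d}{dx}F(x) \quad \text{wherever } F \text{ is differentiable (continuous case)} \\
  F(x) &= P(X \le x) = \sum_{k \le x} p(k)
  & & \text{(discrete case, } p \text{ the probability mass function)}
\end{align*}
```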