Talk:Pareto distribution

From Wikipedia, the free encyclopedia

Mathematics Portal

This article is within the scope of WikiProject Mathematics, which collaborates on articles related to mathematics.

Mathematics rating:

B Class

Low Priority

Field: Probability and statistics

One of the 500 most frequently viewed mathematics articles.

Please update this rating as the article progresses, or if the rating is inaccurate. Please also add comments to suggest improvements to the article.

This article is within the scope of WikiProject Statistics, which collaborates to improve Wikipedia's coverage of statistics. If you would like to participate, please visit the project page.

1 Graph
2 Technical cleanup tag
3 Pareto density at x_min
4 Error in CDF formula for Pareto
5 I got the wrong PDF?
6 Alternative R code
7 Generalized Pareto distribution
8 Relationship to the exponential distribution
9 Many types of Pareto
10 Distribution Example: size of sand particles?

Graph

For those visual thinkers among us, can we have an example graph of this?

A very dull graph: starting at x_min, the density falls as x increases and the cumulative distribution rises, each with a slope which becomes shallower for large x.

I have removed the statement "If the value of k is chosen judiciously then the Pareto distribution obeys the '80-20 rule'", since it depends on a right truncation which this distribution doesn't have; allowing such truncation judiciously would mean most distributions met the "80-20" rule. --Henrygb 00:13, 6 Aug 2004 (UTC)

Technical cleanup tag

It was commented to me that articles like this are not and should not be aimed at non-technical readers. WikiProject Science and other communal efforts I've seen generally have the goal of making the first part of the article accessible to the general public, but allowing for later parts which may be intelligible only to technical readers. That's certainly possible to do in this case.

I'm the one who commented, and I didn't say "articles like this"; I said articles on probability distributions, and I didn't just say "non-technical readers"; I said readers not familiar with the mathematical theory of probability. Michael Hardy 02:47, 6 Mar 2005 (UTC)

Because Pareto distributions are used in economics and sociology with regard to political issues of public interest, it's entirely likely that non-technical readers will arrive at this article needing to know what this thing is. Not necessarily in precise detail, but in vague outline, at least.

This article isn't very accessible even to many technical readers. I have a degree from MIT, and I've taken math up through differential equations.

But the relevant question is whether you've studied probability theory. Course 18.440 at MIT, titled Probability and Random Variables, does not require "up through differential equations", but only first-year calculus, which most MIT students have before entering as freshmen (and "first-year" is construed differently at MIT in this case). Anyone who's studied continuous probability distributions (and not just at MIT!) can understand this article. But yes, probably some things could be said for "lay" readers initially. Perhaps because of this distribution's occurrence in social sciences, that would make more of a difference in this case than with most probability distributions. But generally, articles on mathematics shouldn't need to be comprehensible to everyone who's studied only high-school math. Michael Hardy 02:47, 6 Mar 2005 (UTC)

I could make a graph (either mentally, digitally, or on paper) that plots a typical Pareto distribution, but that would be a lot of work that I shouldn't really have to do. I'm sure there are many scientists and computer engineers who would benefit from a better introduction.

... actually, if I were trying to write an introduction for non-mathematically inclined social scientists, a graph wouldn't be the first thing I would attend to. Maybe I'll work on this at some point .... Michael Hardy 02:47, 6 Mar 2005 (UTC)

Fortunately, I think all this article needs to be much more widely accessible is a graph or two of typical Pareto distributions, with labels and a brief explanation. -- Beland 02:19, 6 Mar 2005 (UTC)

I did study probability theory, back in the day, and found the article a bit terse. What I hoped to see at the start was a few extra paragraphs:

1. A brief general introductory paragraph or two pitched at people with only a craps or Texas hold 'em knowledge of probability: why it matters, the elevator-speech statement of what it means, etc.

2. Move the short section on things claimed to match a Pareto from the bottom of the article, with perhaps a few hard numbers added to it. (The usual: for k=1, x% will be <=3, with similar for k=2 or 3. This is still fluffy, but gives a numerical feel to that graph and the fluffy stuff in the first paragraph.)

Pretty much the same content, but with the take home goodies near the top. --ScottEllsworth 08:00, 20 Mar 2005 (UTC)
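As an illustration of the "hard numbers" suggested above, here is a minimal sketch, assuming the CDF 1 - (x_min/x)^k discussed further down this page and taking x_min = 1 purely for illustration. The function name pareto_cdf is made up for this example.

```python
def pareto_cdf(x, x_min=1.0, k=1.0):
    """P(X <= x) for a Pareto distribution with minimum x_min and exponent k."""
    if x < x_min:
        return 0.0  # no probability mass below the minimum
    return 1.0 - (x_min / x) ** k


# Share of values that are at most 3 (with x_min = 1) for a few exponents.
for k in (1, 2, 3):
    share = pareto_cdf(3.0, x_min=1.0, k=k)
    print(f"k={k}: {100 * share:.1f}% of values are <= 3")
```

Under these assumptions the shares come out at about 66.7%, 88.9% and 96.3% for k = 1, 2 and 3, which shows how quickly the tail thins as the exponent grows.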

Pareto density at x_min

Don't we need to come up with a value at the transition point x=x_min? I'm in favor of p(x_min)=(1/2) k/x_min because it allows definition in terms of, say, the Heaviside step function, and F^-1(F(p(x)))=p(x) uniformly, where F is the Fourier transform. Whatever we come up with, I will alter the graphic accordingly. Paul Reiser 23:27, 15 Mar 2005 (UTC)

For purposes of probability theory, the value of a density at a boundary point does not matter since it does not affect the value of any integral. But for purposes of maximum likelihood estimation in statistical inference, you'd probably want to make it the maximum. Therefore I would not use half the maximum. But of course, the inverse Fourier transform of the Fourier transform of the density may give you half the maximum. Michael Hardy 00:08, 16 Mar 2005 (UTC)
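To make the maximum likelihood point concrete, here is a minimal sketch, assuming the density k x_min^k / x^(k+1) on x >= x_min with the full value k/x_min at the boundary. Under that assumption the likelihood increases in x_min up to the smallest observation, so the estimate of x_min is the sample minimum, and the estimate of k is n divided by the sum of ln(x_i / x_min). The function name pareto_mle and the parameter values are illustrative only.

```python
import math
import random


def pareto_mle(sample):
    """Maximum likelihood estimates (x_min_hat, k_hat) for a Pareto sample."""
    # The likelihood grows with x_min until it reaches the smallest observation.
    x_min_hat = min(sample)
    # Given x_min_hat, the log-likelihood in k is maximized at n / sum(log ratios).
    log_sum = sum(math.log(x / x_min_hat) for x in sample)
    k_hat = len(sample) / log_sum
    return x_min_hat, k_hat


random.seed(0)
true_x_min, true_k = 2.0, 3.0  # illustrative values, not from the article
# Inverse-transform sampling; 1 - random.random() lies in (0, 1], so no division by zero.
sample = [true_x_min * (1.0 - random.random()) ** (-1.0 / true_k) for _ in range(20000)]
x_min_hat, k_hat = pareto_mle(sample)
print(x_min_hat, k_hat)
```

If the density were set to zero or to half the maximum at x = x_min, the sample minimum would no longer be the obvious maximizer, which is the practical reason for preferring the maximum there.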

Comment by an actuary:

Why is the exponent called "k" ? This tends to make one think that the parameter only takes on integral values, which is not true. European actuaries use alpha, Americans use "Q"--either would be better.

Technically, when the exponent is ≤ 1, the mean "does not exist"; "is infinity" is slightly off. Similar comment for ≤ 2, variance.

Consider mentioning that the Pareto is often shifted so its support starts at 0; put in a reference to "shifted distribution."

Note that conditional distribution is also Pareto with the same exponent.

Maybe note that Method of Moments parameter estimation doesn't work (even more so than usual!), because setting the mean equal to the sample mean implies an assumption that the exponent is greater than one.

Asymptotic theory says that asymptotically, tails of distributions (if not of finite support) look exponential, or Pareto. Should link.
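The remark above that the conditional distribution is also Pareto with the same exponent can be checked in one line. This is only a sketch, assuming the survival function (x_m/x)^k for x >= x_m that follows from the CDF discussed in the next section; the threshold c is a symbol introduced here for illustration.

```latex
% For a threshold c >= x_m and any x >= c:
\[
P(X > x \mid X > c)
  = \frac{P(X > x)}{P(X > c)}
  = \frac{(x_\mathrm{m}/x)^k}{(x_\mathrm{m}/c)^k}
  = \left(\frac{c}{x}\right)^k .
\]
% This is again a Pareto survival function, with minimum c and the same exponent k.
```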

Error in CDF formula for Pareto

I believe that the expression for the cumulative distribution function has an error. It currently reads

cdf = $1-\left(\frac{x_\mathrm{m}}{x_\mathrm{m}+x}\right)^k\!$

and should read

cdf = $1-\left(\frac{x_\mathrm{m}}{x}\right)^k\!$

This is perhaps part of the confusion arising out of not shifting the origin to x_m. There should also be a reference to the excellent (highly technical) MathWorld article: http://mathworld.wolfram.com/ParetoDistribution.html

Unless I get, within a short period of time, some indication that I am wrong, I will change it in the main article.
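For anyone checking the two candidate formulas, here is a short derivation. It is a sketch that assumes the density is k x_m^k / t^(k+1) for t >= x_m, which is not quoted on this page; the variable Y is introduced here only to describe the version with the origin shifted to zero.

```latex
% CDF of the unshifted form, supported on x >= x_m:
\[
F(x) = \int_{x_\mathrm{m}}^{x} \frac{k\,x_\mathrm{m}^{k}}{t^{k+1}}\,dt
     = \Bigl[-\,x_\mathrm{m}^{k}\,t^{-k}\Bigr]_{x_\mathrm{m}}^{x}
     = 1-\left(\frac{x_\mathrm{m}}{x}\right)^{k}.
\]
% CDF of the shifted variable Y = X - x_m, supported on y >= 0:
\[
P(Y \le y) = F(x_\mathrm{m}+y)
           = 1-\left(\frac{x_\mathrm{m}}{x_\mathrm{m}+y}\right)^{k}.
\]
```

So both expressions are valid CDFs, but of different random variables: the first belongs with a support starting at x_m, the second with a support starting at 0.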

I got the wrong PDF?

I changed the

cdf = $1-\left(\frac{x_\mathrm{m}}{x}\right)^k\!$

to this one:

cdf = $1-\left(\frac{x_\mathrm{m}}{x_\mathrm{m}+x}\right)^k\!$

I got the result by integrating a pdf that is a bit different from the one given; it is essentially the same one, but mine did not shift the origin to x_m. This is not a fake result or anything. Something should indicate this apparent conflict between two versions of the same probability function. But I will definitely remove my mistake, if only because the current cdf does not reflect the shifted nature of the pdf. I'll specify in the section on generating a random sample that it will generate a sample from a non-shifted Pareto distribution.
Cyberyder 04:24, 6 April 2006 (UTC)

Alternative R code

The provided R code for random sample generation does not translate from the origin to lambda, and thus yields numbers lower than lambda. A good alternative that provides the wanted values directly can be found in [1].
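The R code in question is not reproduced on this page, so the following is only a guess at the kind of discrepancy described: a sampler whose support starts at 0 versus one whose support starts at lambda. It is a minimal Python sketch using inverse-transform sampling; the function names and the values lam = 2 and k = 3 are illustrative assumptions.

```python
import random


def sample_pareto(x_min, k, rng=random):
    """Draw from the Pareto distribution supported on [x_min, infinity)."""
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids raising 0 to a negative power
    return x_min * u ** (-1.0 / k)


def sample_shifted_to_zero(x_min, k, rng=random):
    """Draw from the version whose origin is moved to 0 (support [0, infinity))."""
    return sample_pareto(x_min, k, rng) - x_min


random.seed(1)
lam, k = 2.0, 3.0  # illustrative values; "lam" plays the role of lambda above
on_lambda = [sample_pareto(lam, k) for _ in range(10000)]
on_zero = [sample_shifted_to_zero(lam, k) for _ in range(10000)]
print(min(on_lambda), min(on_zero))
```

The first sample never falls below lambda, while the second routinely does, which matches the symptom reported above.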

Generalized Pareto distribution

I have just changed Generalized Pareto Distribution so that it redirects here rather than to Generalized extreme value distribution, which was incorrect. Now we need someone to expand the new section (perhaps with reference to [2]). Any volunteers? DFH 18:59, 23 December 2006 (UTC)

Relationship to the exponential distribution

I'm not quite sure what the relationship is, but how it is defined in this article is rather ambiguous. The exponential random variable has one parameter, but the formula implies that there are two parameters for the exponential distribution. There's a relationship between the Pareto distribution and the uniform distribution, as described in Statistical Distributions, Second Edition, by Evans, Hastings, and Peacock. Perhaps this is a simpler and more meaningful relationship. Steve Simon 23:30, 22 January 2007 (UTC)

A probability distribution does not have any parameters; rather a family of probability distributions may be parameterized. The usually-seen family of exponential distributions has just one parameter. However, one may speak of an exponential distribution on an interval (a, ∞), and the minimum point a is itself a parameter. This is the conditional probability distribution of an exponential distribution on (0, ∞), given the event of being ≥ a. One then gets a more extensive family of exponential distributions, parameterized by two real parameters. Michael Hardy 20:00, 23 January 2007 (UTC)

So which parameter corresponds to the value of "a" in your example? Is it k? Is it ln(x / x_m)? Also, if the two-parameter family of exponential distributions is an important one, perhaps that should be incorporated on the Exponential distribution page. Steve Simon 21:31, 25 January 2007 (UTC)
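One way to read the relationship discussed in this thread, offered as a sketch rather than as the article's own wording: assuming the survival function (x_m/x)^k for x >= x_m, the logarithm of a Pareto variable is exponential. The symbol Y is introduced here for illustration.

```latex
% Let Y = ln(X / x_m). For y >= 0:
\[
P(Y > y) = P\bigl(X > x_\mathrm{m}e^{y}\bigr)
         = \left(\frac{x_\mathrm{m}}{x_\mathrm{m}e^{y}}\right)^{k}
         = e^{-ky},
\]
% so Y is exponential on (0, infinity) with rate k.
% Equivalently, ln X = ln x_m + Y is exponential on (a, infinity) with a = ln x_m.
```

On this reading, k is the rate of the exponential distribution, and the minimum point "a" of the two-parameter family corresponds to ln x_m.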

Many types of Pareto

Hi :) I'm studying for my Actuarial Exam 4/C along with Loss Models, and I can't help but notice that the Pareto distribution listed in Wikipedia is the one-parameter form of the distribution. In fact, this distribution is much the same as the two-parameter one, except for x_m. I was wondering if it would be possible to rename the article "Single parameter Pareto distribution"; I will eventually add a subtopic for the two-parameter distribution. 64.18.166.93 18:16, 7 April 2007 (UTC)

Distribution Example: size of sand particles?

Is it true that the sizes of sand particles (probably from the same part of the same beach) are Pareto-distributed? I would have thought they'd be normally (Gaussian) distributed...

I never looked at sand particles under a microscope, though, I just thought they were all the same size...

<--- Feels Educated now. :)
