Talk:Log-normal distribution


This article is within the scope of WikiProject Statistics, which collaborates to improve Wikipedia's coverage of statistics. If you would like to participate, please visit the project page.


[edit] Old talk

Hello. I have changed the intro from "log-normal distributions" to "log-normal distribution". I do understand the notion that, for each pair of values (mu, sigma), it is a different distribution. However, common parlance is to call all the members of a parametric family by a collective name -- normal distribution, beta distribution, exponential distribution, .... In each case these terms denote a family of distributions. This causes no misunderstandings, and I see no advantage in abandoning that convention. Happy editing, Wile E. Heresiarch 03:42, 8 Apr 2004 (UTC)

In the formula for the maximum likelihood estimate of the log standard deviation, shouldn't it be over n − 1, not n?

Unless you see an error in the math, I think it's OK. The n − 1 term usually comes in when doing unbiased estimators, not maximum likelihood estimators.
You're right; I was confused.

QUESTION: Shouldn't there be a square root at the ML estimation of the standard deviation? User:flonks

Right - I fixed it, thanks. PAR 09:15, 27 September 2005 (UTC)

[edit] Could I ask a question?

If Y = a^2, and a is a log normal distribution, then what kind of distribution is Y?

a is a lognormal distribution
so log(a) is a normal distribution
log(a^2) = 2 log(a) is also a normal distribution
a^2 is a lognormal distribution --Buglee 00:47, 9 May 2006 (UTC)

One should say rather that a has---not is---a lognormal distribution. The object called a is a random variable, not a probability distribution. Michael Hardy 01:25, 9 May 2006 (UTC)
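
For what it's worth, the identity log(a^2) = 2 log(a) used in the thread above is easy to check numerically. Below is a minimal sketch assuming NumPy is available; the parameter choices μ = 0.5, σ = 0.8 are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.lognormal(mean=0.5, sigma=0.8, size=100_000)  # a has a log-normal distribution

log_sq = np.log(a**2)  # equals 2*log(a), hence normal with mean 2*0.5 and sd 2*0.8
print(log_sq.mean(), log_sq.std())  # should be close to 1.0 and 1.6
```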


Maria 13 Feb 2007: I've never written anything in Wikipedia before, so I apologise if I am doing the wrong thing. I wanted to note that the following may not be clear to the reader: in the formulas, E(X)^2 represents the square of the mean, rather than the second moment. I would suggest one of the following solutions: 1) skip the parentheses around X and represent the mean by EX; then it is clear that (EX)^2 is its square, although one might wonder about EX^2 (which should represent the second moment). 2) skip the E operator and use a letter instead, i.e. let m be the mean and s the standard deviation; then there is no confusion. 3) add a line at some point in the text giving the notation, i.e. that E(X)^2 means the square of the first moment, while the second moment is denoted E(X^2) (I presume). I had to invert the formula myself in order to figure out what it was supposed to mean.

I've just attended to this. Michael Hardy 00:52, 14 February 2007 (UTC)

[edit] A mistake?

I think there is a mistake here: the density function should include a term in sigma squared divided by two, and the mean of the log normal variable becomes mu − sigma^2/2. Basically, I think the author forgot the Itô term.

I believe the article is correct. See for example http://mathworld.wolfram.com/LogNormalDistribution.html for an alternate source of the density function and the mean. They are the same as shown here, but with a different notation. (M in place of mu and S in place of sigma). Encyclops 00:23, 4 February 2006 (UTC)
Either the graph of the density function is wrong, or the expected value formula is wrong. As you can see from the graph, as sigma decreases, the expected value moves towards 1 from below. This is consistent with the mean being exp(mu - sigma^2/2), which is what I recall it as. 69.107.6.4 19:29, 5 April 2007 (UTC)
Here's your mistake. You cannot see the expected value from the graph at all. It is highly influenced by the fat upper tail, which the graph does not make apparent. See also my comments below. Michael Hardy 20:19, 5 April 2007 (UTC)

I've just computed the integral and I get

e^{\mu + \sigma^2/2}.\,

So with μ = 0, as σ decreases to 0, the expected value decreases to 1. Thus it would appear that the graph is wrong. Michael Hardy 19:57, 5 April 2007 (UTC)

...and now I've done some graphs by computer, and they agree with what the illustration shows. More later.... Michael Hardy 20:06, 5 April 2007 (UTC)

OK, there's no error. As the mode decreases, the mean increases, because the upper tail gets fatter! So the graphs and the mean and the mode are correct. Michael Hardy 20:15, 5 April 2007 (UTC)

You're right. My mistake. The mean is highly influenced by the upper tail, so the means are actually decreasing to 1 as sigma decreases. It just looks like the means approach from below because the modes do. 71.198.244.61 23:50, 7 April 2007 (UTC)
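
A small simulation makes the resolution of this thread concrete (a sketch assuming NumPy; μ = 0 and the σ values are arbitrary): the sample mean tracks e^{μ + σ²/2}, which grows with σ, even as the mode e^{μ − σ²} falls.

```python
import numpy as np

rng = np.random.default_rng(1)
mu = 0.0
for sigma in (0.25, 0.5, 1.0):
    x = rng.lognormal(mu, sigma, size=1_000_000)
    # sample mean vs. theoretical mean exp(mu + sigma^2/2) and mode exp(mu - sigma^2)
    print(sigma, x.mean(), np.exp(mu + sigma**2 / 2), np.exp(mu - sigma**2))
```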

[edit] A Typo

There is a typo in the PDF formula: a missing '['.

[edit] Erf and normal cdf

There are formulas that use erf and formulas that use the CDF of the normal distribution. IMHO this is confusing, because those functions are related but not identical. Albmont 15:02, 23 August 2006 (UTC)

[edit] Technical

Please remember that Wikipedia articles need to be accessible to people like high school students, or younger, or without any background in math. I consider myself rather knowledgeable in math (I had it at college level, and still do), but (taking into account that English is not my native language) I found the lead of this article pretty difficult. Please make it more accessible. -- Piotr Konieczny aka Prokonsul Piotrus | talk 22:48, 31 August 2006 (UTC)

To expect all Wikipedia math articles to be accessible to high-school students is unreasonable. Some can be accessible only to mathematicians; perhaps more can be accessible to a broad audience of professionals who use mathematics; others to anyone who's had a couple of years of calculus and no more; others to a broader audience still. Anyone who knows what the normal distribution is, what a random variable is, and what logarithms are, will readily understand the first sentence in this article. Can you be specific about what it is you found difficult about it? Michael Hardy 23:28, 31 August 2006 (UTC)

I removed the "too technical" tag. Feel free to reinsert it, but please leave some more details about what specifically you find difficult to understand. Thanks, Lunch 22:18, 22 October 2006 (UTC)

[edit] Skewness formula incorrect?

The formula for the skewness appears to be incorrect: the leading exponential factor you have is not present in the definitions given by MathWorld and NIST; see http://www.itl.nist.gov/div898/handbook/eda/section3/eda3669.htm and http://mathworld.wolfram.com/LogNormalDistribution.html.

Many thanks.
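
For what it's worth, the NIST/MathWorld form of the skewness, (e^{σ²} + 2)√(e^{σ²} − 1), is easy to check against simulated data (a sketch assuming NumPy and SciPy are available; σ = 0.5 is an arbitrary choice):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
sigma = 0.5
x = rng.lognormal(0.0, sigma, size=2_000_000)

closed_form = (np.exp(sigma**2) + 2) * np.sqrt(np.exp(sigma**2) - 1)
print(stats.skew(x), closed_form)  # both approximately 1.75
```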

[edit] X log normal, not normal.

I think the definition of X as normal and Y as lognormal in the beginning of the page should be changed. The rest of the page treats X as the log normal variable. —The preceding unsigned comment was added by 213.115.25.62 (talk) 17:40, 2 February 2007 (UTC).

[edit] Partial expectation

I think that there was a mistake in the formula for the partial expectation: the last term should not be there. Here is a proof: http://faculty.london.edu/ruppal/zenSlides/zCH08%20Black-Scholes.slide.doc See Corollary 2 in Appendix A 2.

I did put my earlier correction back in. Of course, I may be wrong (but, right now, I don't see why). If you change this again, please let me know why I was wrong. Thank you.

Alex —The preceding unsigned comment was added by 72.255.36.161 (talk) 19:39, 27 February 2007 (UTC).

Thanks. I see the problem. You have the correct expression for

g(k)=\int_k^\infty x f(x)\, dx

while what I had there before would be correct if we were trying to find

g_2(k)=\int_k^\infty (x-k) f(x)\, dx

which is (essentially) the B-S formula but is not the partial mean (or partial expectation) by my (or your) definition. (Actually, I did find a few sources where the partial expectation is defined as g_2, but this usage seems to be rare; for example [1].) The term that you dropped occurs in g_2(k) but not in g(k), the correct form of the partial mean. So I will leave the formula as it is now. Encyclops 00:47, 28 February 2007 (UTC)
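
For anyone wanting to double-check g(k) without redoing the algebra, the closed form e^{μ + σ²/2} Φ((μ + σ² − ln k)/σ) can be compared against direct numerical integration (a sketch assuming NumPy and SciPy; the parameter values are arbitrary):

```python
import numpy as np
from scipy import stats, integrate

mu, sigma, k = 0.2, 0.6, 1.5

# closed form for g(k) = integral from k to infinity of x f(x) dx
closed_form = np.exp(mu + sigma**2 / 2) * stats.norm.cdf((mu + sigma**2 - np.log(k)) / sigma)

# direct numerical integration; scipy parametrizes lognorm by s=sigma, scale=exp(mu)
numeric, _ = integrate.quad(lambda x: x * stats.lognorm.pdf(x, s=sigma, scale=np.exp(mu)), k, np.inf)

print(closed_form, numeric)  # should agree to several decimal places
```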

[edit] Generalize distribution of product of lognormal variables

About the distribution of a product of independent log-normal variables:

Wouldn't it be possible to generalize it to variables with different averages (μ not the same for every variable)?

[edit] the name: log vs exponential

The name "log normal" is sometimes a little confusing for me, so here is a little note:

For a variable Y, if X = log(Y) is normal, then Y is log normal; that is, after taking the log it becomes normal. Similarly, one might imagine an "exponential normal": a variable Z such that exp(Z) is normal. However, exp(Z) can never be normal, since it is always positive; hence the name log normal. Furthermore, if X is normal, then log(X) is undefined (X takes negative values).

In other cases, a variable X has some distribution (XXX), and we need a name for the distribution of Y = log(X) (in the case it is defined); then X = exp(Y), so such a name should be "exponential XXX". For instance, if X is IG, then Y = log(X) is exponential IG. Jackzhp 15:37, 13 July 2007 (UTC)

[edit] Mean, μ and σ

The relationship given for μ in terms of Var(x) and E(x) suggests that μ is undefined when E(x) ≤ 0. However, I see no reason why E(x) must be strictly positive. I propose defining the relationship in terms of E^2(x) such that

\mu = \frac{1}{2}\left(\ln\left(E^2(x)\right) - \sigma^2\right)

I am suspicious that this causes μ to be...well, wrong. It suggests that two different values of E(x) could result in the same μ, which I find improbable. In any case, if there is a way to calculate μ when E(x) ≤ 0 then we should include it; if not, we need to explain this subtlety. In my humble opinion. --Phays 20:35, 6 August 2007 (UTC)

I'm not fully following your comment. I have now made the notation consistent throughout the article: X is the random variable that's log-normally distributed, so E(X) must of course be positive, and μ = E(Y) = E(log(X)).
I don't know what you mean by "E^2". It's as if you're squaring the expectation operator. "E^2(X)" would mean "E(E(X))", but that would be the same thing as E(X), since E(X) is a constant. Michael Hardy 20:56, 6 August 2007 (UTC)
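
For reference, the standard inversion from E(X) and Var(X) back to μ and σ² does require E(X) > 0, as the discussion above concludes. Here is a minimal sketch; `lognormal_params` is a hypothetical helper written for illustration, not anything in the article:

```python
import math

def lognormal_params(mean, var):
    """Recover (mu, sigma^2) of log(X) from E(X) and Var(X); needs E(X) > 0."""
    if mean <= 0:
        raise ValueError("a log-normally distributed X always has E(X) > 0")
    sigma2 = math.log(1.0 + var / mean**2)
    mu = math.log(mean) - sigma2 / 2.0
    return mu, sigma2

# mu = 0, sigma^2 = 1 gives E(X) = e^(1/2) and Var(X) = (e - 1)e; invert to check:
print(lognormal_params(math.exp(0.5), (math.e - 1) * math.e))  # -> (0.0, 1.0)
```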

[edit] Maximum Likelihood Estimation

Are there mistakes in the MLE? It looks to me as though the provided method is an MLE for the mean and variance, not for the parameters μ and σ. If that is so, it should be changed to the estimated parameters \hat{E}(x) and \hat{Var}(x), and then a redirect to extracting the parameter values from the mean and variance. --Phays 20:40, 6 August 2007 (UTC)

The MLEs given for μ and σ^2 are not for the mean and variance of the log-normal distribution, but for the mean and variance of the normally distributed logarithm of the log-normally distributed random variable. They are correct MLEs for μ and σ^2. The "functional invariance" of MLEs generally is being relied on here. Michael Hardy 20:47, 6 August 2007 (UTC)
I'm afraid I still don't fully understand, but it is simple to explain my confusion. Are the parameters being estimated μ and σ^2 from
f(x;\mu,\sigma) = \frac{e^{-(\ln x - \mu)^2/(2\sigma^2)}}{x \sigma \sqrt{2 \pi}}
or are these estimates describing the mean and variance? In other words, if X is N(μ, σ) and Y = exp(X), then is E(Y)=\mu\approx \bar{\mu}? It is my understanding that the parameters in the above equation, namely μ and σ, are not the mean and standard deviation of Y. They may be the mean and standard deviation of X. --Phays 01:16, 7 August 2007 (UTC)
The answer to your first question is affirmative. The expected value of Y = exp(X) is not μ; its value is given elsewhere in the article. Michael Hardy 16:10, 10 August 2007 (UTC)
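
To make the "functional invariance" point above concrete, here is a minimal sketch of the MLEs (assuming NumPy; μ = 1.0, σ = 0.5 are arbitrary): they are simply the sample mean and the n-divisor sample variance of the logs.

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.lognormal(mean=1.0, sigma=0.5, size=50_000)

log_x = np.log(x)
mu_hat = log_x.mean()                       # MLE of mu
sigma2_hat = ((log_x - mu_hat)**2).mean()   # MLE of sigma^2 (divides by n, not n - 1)
print(mu_hat, sigma2_hat)  # approximately 1.0 and 0.25
```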

8/10/2007:

It is my understanding that confidence intervals use the standard error of a population in the calculation, not the standard deviation (sigma).

Therefore I do not understand how the table is using 2*sigma etc. for the confidence interval calculation as it pertains to the log-normal distribution.

Why is it shown as 2*sigma?

Angusmdmclean 12:35, 10 August 2007 (UTC) angusmdmclean
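
As I read it, the 2σ bands are on the log scale, so for X itself they become multiplicative intervals [e^{μ − 2σ}, e^{μ + 2σ}] with the usual normal coverage. A quick sketch of that reading (assuming NumPy; μ = 0, σ = 0.5 are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
mu, sigma = 0.0, 0.5
x = rng.lognormal(mu, sigma, size=1_000_000)

lo, hi = np.exp(mu - 2 * sigma), np.exp(mu + 2 * sigma)
print(np.mean((x >= lo) & (x <= hi)))  # about 0.954, same coverage as mu +/- 2*sigma for a normal
```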

[edit] This page lacks adequate citations!!

Wikipedia policy (see WP:CITE#HOW) suggests citation of specific pages in specific books or peer-reviewed articles to support claims made in Wikipedia. Surely this applies to mathematics articles just as much as it does to articles about history, TV shows, or anything else?

I say this because I was looking for a formula for the partial expectation of a lognormal variable, and I was delighted to discover that this excellent, comprehensive article offers one. But how am I supposed to know if the formula is correct? I trust the competence of the people who wrote this article, but how can I know whether or not some mischievous high schooler reversed a sign somewhere? I tried verifying the expectation formula by calculating the integral myself, but I got lost quickly (sorry! some users of these articles are less technically adept than the authors!) I will soon go to the library to look for the formula (the unconditional expectation appears in some books I own, but not the partial expectation) but that defeats the purpose of turning to Wikipedia in the first place.

Of course, I am thankful that Wikipedia cites one book specifically on the lognormal distribution (Aitchison and Brown 1957). That reference may help me when I get to the library. But I'm not sure if that was the source of the formula in question. My point is more general, of course. Since Wikipedia is inevitably subject to errors and vandalism, math formulas can never be trusted, unless they follow in a highly transparent way from prior mathematical statements in the same article. Pages like this one would be vastly more useful if specific mathematical statements were backed by page-specific citations of (one or preferably more) books or articles where they could be verified. --Rinconsoleao 15:11, 28 September 2007 (UTC)

Normally I do not do this because I think it is rude, but I really should say {{sofixit}} because you are headed to the library and will be able to add good cites. Even if we had a good source for it, the formula could still be incorrect due to vandalism or transcription errors. Such is the reality of Wikipedia. Can you write a program to test it, perhaps? Acct4 15:23, 28 September 2007 (UTC)
I believe Aitchison and Brown does have that formula in it, but since I haven't looked at that book in many years I wouldn't swear by it. I will have to check. I derived the formula myself before adding it to Wikipedia; unfortunately there was a slip-up in my post, which was caught by an anonymous user and corrected. FWIW, at this point I have near 100% confidence in its correctness. And I am watching this page for vandalism or other problems. In general your point is a good one. Encyclops 22:34, 28 September 2007 (UTC)

Why has nobody mentioned whether the mean and standard deviation are calculated from x or y, if y = exp(x)? The mean and stdev are from the x values. Book by Athanasios Papoulis. Siddhartha, here. —Preceding unsigned comment added by 203.199.41.181 (talk) 09:26, 2 February 2008 (UTC)

[edit] examples for log normal distributions in nature/economy?

Some examples would be nice! —Preceding unsigned comment added by 146.113.42.220 (talk) 16:41, 8 February 2008 (UTC)

One example is neurological reaction time. This distribution has been seen in studies on automobile braking and other responses to stimuli. See also mental chronometry.--IanOsgood (talk) 02:32, 26 February 2008 (UTC)
This is also useful in telecom, in order to model slow fading effects on a transmitted signal. -- 82.123.94.169 (talk) 14:42, 28 February 2008 (UTC)

I think the Black–Scholes option model uses a log-normal assumption about the price of a stock. This makes sense, because it's the percentage change in the price that has real meaning, not the price itself. If some external event makes the stock price fall, the amount that it falls is not very important to an investor; it's the percent change that really matters. This suggests a log-normal distribution. PAR (talk) 17:13, 28 February 2008 (UTC)
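
The percentage-change intuition is easy to see in a simulation (a sketch assuming NumPy; the return parameters are arbitrary): compounding many small independent percentage changes makes the log of the price approximately normal.

```python
import numpy as np

rng = np.random.default_rng(5)
daily = rng.normal(loc=0.0005, scale=0.01, size=(100_000, 250))  # small daily % changes
prices = 100.0 * np.prod(1.0 + daily, axis=1)                    # simulated year-end prices

log_p = np.log(prices)
print(log_p.mean(), log_p.std())  # log-price looks normal, so price looks log-normal
```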

[edit] Parameters boundaries ?

If the relationship between the log-normal distribution and the normal distribution is right, then I don't understand why μ needs to be greater than 0 (since μ is expected to be a real number with no boundary in the normal distribution). At least, it can be zero, since that's the case in the graphs shown for the pdf and cdf (I've edited the article accordingly). Also, it's not σ that needs to be greater than 0, but σ^2 (which simply means that σ can't be zero, since it's a real number). -- 82.123.94.169 (talk) 15:04, 28 February 2008 (UTC)

Question: what can possibly be the interpretation of, say, σ = −3 as opposed to σ = 3? By strong convention (and quite widely assumed in derivations), standard deviations are taken to be in the domain [0, ∞), although I suppose in this case σ can algebraically be negative... It's confusing to start talking about negative sds, and unless there's a good reason for it, please don't. --128.59.111.72 (talk) 22:59, 10 March 2008 (UTC)

Yes, you're right: σ can't be negative or zero (it's also obvious from reading the PDF formula). I was confused by the normal distribution article, where only σ^2 is expected to be positive (which is also not sufficient there). Thanks for your answer, and sorry for that. I guess μ can't be negative either, because that would be meaningless (even if it would be mathematically correct). -- 82.123.102.83 (talk) 19:33, 13 March 2008 (UTC)

[edit] Logarithm Base

Although yes, any base is OK, the derivations and moments, etc. are all done assuming a natural logarithm. Although the distribution would still be lognormal in another base b, the details would all change by a factor of ln(b). A note should probably be added in this section that we are using the natural logarithm by convention. (And possibly re-mention it in the PDF.) --128.59.111.72 (talk) 22:59, 10 March 2008 (UTC)
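
The rescaling is easy to verify numerically (a sketch assuming NumPy; μ = 1, σ = 0.5, b = 10 are arbitrary): if ln(X) ~ N(μ, σ²), then log_b(X) = ln(X)/ln(b) ~ N(μ/ln b, (σ/ln b)²).

```python
import numpy as np

rng = np.random.default_rng(6)
mu, sigma, b = 1.0, 0.5, 10.0
x = rng.lognormal(mu, sigma, size=500_000)

log_b_x = np.log(x) / np.log(b)  # same as np.log10(x) for b = 10
print(log_b_x.mean(), log_b_x.std())      # sample values
print(mu / np.log(b), sigma / np.log(b))  # theoretical mu/ln(b) and sigma/ln(b)
```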

[edit] Product of "any" distributions

I think it should be highlighted in the article that the log-normal distribution is the analogue of the normal distribution in this way: if we take n independent distributions and add them, we "get" the normal distribution (NB: I am being lazy on purpose here; the precise idea is the central limit theorem). If we take n positive independent distributions and multiply them, we "get" the log-normal (also lazy; see the sketch below). Albmont (talk) 11:58, 5 June 2008 (UTC)

This is to some extent expressed (or at least suggested) where the article says "A variable might be modeled as log-normal if it can be thought of as the multiplicative product of many small independent factors". Perhaps it could be said better, but the idea is there. Encyclops (talk) 14:58, 5 June 2008 (UTC)
So we're talking about the difference between "expressed (or at least suggested)" on the one hand, and on the other hand "highlighted". Michael Hardy (talk) 17:39, 5 June 2008 (UTC)
Yes, the ubiquity of the log-normal in Finance comes from this property, so I think this property is important enough to deserve being stated in the initial paragraphs. Just MHO, of course. Albmont (talk) 20:39, 5 June 2008 (UTC)
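
A quick simulation of the "lazy" statement above (a sketch assuming NumPy and SciPy; uniform factors on (0.5, 1.5) are an arbitrary choice): the log of a product of independent positive variables is a sum of independent terms, so the CLT pushes it toward normality.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
factors = rng.uniform(0.5, 1.5, size=(200_000, 50))  # 50 positive independent factors per row
product = factors.prod(axis=1)

# near-normality of log(product): skewness and excess kurtosis should be near 0
print(stats.skew(np.log(product)), stats.kurtosis(np.log(product)))
```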