Talk:Logistic regression

From Wikipedia, the free encyclopedia

I think this article should be merged with logit. Pdbailey 03:19, 20 April 2006 (UTC)

Interesting question: the logit link function is the inverse of the logistic function, which also has its own article (that talks about epidemiology, etc.). However, I think the inverse function is only really used in logistic regression, so the merge does make sense. -- hike395 14:31, 20 April 2006 (UTC)

NO! This suggestion is totally wrong! Logit, logistic, and logistic regression are different things, and should not be mixed up merely because they are closely related. For example, I was initially interested in the logistic function, then thought logit could be used for other purposes, but not for logistic regression. And I don't need to know logistic regression at all to use the logistic function. --Pren 14:43, 24 April 2006 (UTC)

Pren, thanks for the input. Can you please expand on why you think the two should not be merged? I'm specifically interested in what purpose you had for using the logit function that was not associated with logistic regression. Thanks a lot for including your input. Pdbailey 16:45, 24 April 2006 (UTC)

I agree with Pren. The logit function is just a function, easily described by a formula. Logistic regression is a mathematical procedure in applied mathematics, that makes use of the logit function. Of course both articles should link to each other, but they are different things, objects of a different category and complexity. The logit function is interesting in itself. --zeycus 11:25, 25 April 2006 (UTC)

Pren and Zeycus, have you read the logit entry and the Wikipedia page on [[wp:mm|merges]]? It looks to me like these pages meet the second or third criterion for merging, which are:
  • 'There are two or more pages on related subjects that have a large overlap. Wikipedia is not a dictionary; there doesn't need to be a separate entry for every concept in the universe. For example, "Flammable" and "Non-flammable" can both be explained in an article on Flammability.'
  • 'If a page is very short and cannot or should not be expanded terribly much, it often makes sense to merge it with a page on a broader topic.'
Certainly the portion of the logit page that is on logistic regression can be merged with this page and then removed from that page. Once that is done, the logit page is one paragraph long and is either a stub or should be deleted. This raises the question: are we then making Wikipedia a dictionary by including it? If there is some real substantive material about logit that does not fold well into other areas, probably not, and it should probably stay. But I don't think the logit is as important a function as, say, the gamma function. I take as evidence of its unimportance that it does not appear in "Handbook of Mathematical Functions." Just saying that it is a function distinct from the regression that it is often used for does not seem sufficient. Pdbailey 14:42, 25 April 2006 (UTC)
Thank you, Pdbailey, I read what you suggested and I see your point. So now the question seems subjective to me, and I don't dare to defend any of the options. --zeycus 16:41, 26 April 2006 (UTC)
I tend towards inclusionism -- I recommend that we delete the overlapping part of logit, but then leave the rest alone: it may expand in the future to include history or other applications, who knows? -- hike395 17:31, 26 April 2006 (UTC)
Hike395, -isms aside, can you identify what is included in logit that should not be included in this page? I'm not sure I see anything. Pdbailey 04:59, 27 April 2006 (UTC)
How about
In mathematics, especially as applied in statistics, the logit (pronounced with a long "o" and a soft "g", IPA /loʊdʒɪt/) of a number p between 0 and 1 is
{\rm logit}(p)=\log\left( \frac{p}{1-p} \right) =\log(p)-\log(1-p).
Plot of logit in the range 0 to 1, base is e
The logit function is the inverse of the "sigmoid", or "logistic" function. If p is a probability then p/(1 − p) is the corresponding odds, and the logit of the probability is the logarithm of the odds; similarly the difference between the logits of two probabilities is the logarithm of the odds-ratio, thus providing an additive mechanism for combining odds-ratios.
-- hike395 05:40, 27 April 2006 (UTC)
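The identities in the proposed text above can be checked numerically. This is only a minimal sketch: the function names `logit`, `logistic`, and `odds` and the probabilities 0.8 and 0.5 are illustrative choices, not taken from either article, and the natural logarithm is assumed.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability p strictly between 0 and 1 (natural log)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def logistic(x: float) -> float:
    """Logistic (sigmoid) function, the inverse of logit."""
    return 1.0 / (1.0 + math.exp(-x))


def odds(p: float) -> float:
    """Odds corresponding to a probability p."""
    return p / (1.0 - p)


# Illustrative probabilities (assumed values, not from the article).
p1, p2 = 0.8, 0.5

# logit(p) = log(p) - log(1 - p)
assert math.isclose(logit(p1), math.log(p1) - math.log(1.0 - p1))

# logistic undoes logit.
assert math.isclose(logistic(logit(p1)), p1)

# The difference of two logits is the log of the odds ratio.
log_odds_ratio = math.log(odds(p1) / odds(p2))
assert math.isclose(logit(p1) - logit(p2), log_odds_ratio)
```

The last assertion is the "additive mechanism" mentioned in the text: odds ratios multiply, so their logarithms add.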

Okay, I can see why (given your Wikipedia philosophy) you want to keep that article, and I think it's just subjective at this point. I'll just say now that, short of another voice, we can use your proposed text. That said, let me tell you why I disagree that there is value in keeping that article separate. I think that it might make new users think that it is all Wikipedia has to say on logistic regression. After all, search for logit on Google and you get that page. If you still disagree, again, I'll concede the point, and I'll update both articles after a few days, leaving time for others to throw in their two cents. Pdbailey 14:20, 27 April 2006 (UTC)

Yes, I can see your point, it's valid. How about we append
The logit function is an important part of logistic regression: for more information, please see that article.
Would that take care of your objection? -- hike395 03:33, 28 April 2006 (UTC)

I am reading 'Gatrell, A.C. (2002) Geographies of Health: an Introduction, Oxford: Blackwell.' today and it discusses the 'logistic regression model' in a health geography context of cases and controls. The term appears in a lot of academic literature, so why is Wikipedia trying to call it something else? Supposed 19:22, 9 May 2006 (UTC)


This article would be more useful if an example could be given of how logistic regression is used in statistical analysis. For example, it would be great if someone could use actual data to describe how logistic regression makes X concept more clear. Zminer 01:58, 15 May 2006 (UTC)

Mistake?

What does i, = 1, ..., n mean? What does the comma after i stand for? -- Neoforma 12:54, 13 July 2006 (UTC)

Was just about to answer the wrong question. Yup, that's a mistake. -- cBuckley (Talk, Contribs) 17:42, 13 July 2006 (UTC)

binomial distributed errors?

Since when does the logit model have binomially distributed errors? They must follow a standard logistic distribution (mean 0, scale s = 1).

I improved the wording of this part to be more accurate. Have a look. Baccyak4H (talk) 17:42, 22 November 2006 (UTC)
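One way to reconcile the two views in this thread is the latent-variable reading of the logit model, sketched below under stated assumptions: the outcome is 1 exactly when a linear predictor plus a standard logistic error is positive. The names `logistic_cdf`, `prob_y_is_one`, and `eta`, and the sample values of `eta`, are hypothetical and chosen only for illustration. The check is exact rather than simulated.

```python
import math


def logistic_cdf(t: float) -> float:
    """CDF of the standard logistic distribution (location 0, scale s = 1)."""
    return 1.0 / (1.0 + math.exp(-t))


def prob_y_is_one(eta: float) -> float:
    """P(y = 1) when y = 1 iff eta + eps > 0 and eps is standard logistic."""
    # P(eta + eps > 0) = P(eps > -eta) = 1 - F(-eta)
    return 1.0 - logistic_cdf(-eta)


# Illustrative values of the linear predictor (assumed, not from the article).
for eta in (-2.0, 0.0, 0.7, 3.0):
    p = prob_y_is_one(eta)
    # By symmetry of the logistic distribution, 1 - F(-eta) equals F(eta).
    assert math.isclose(p, logistic_cdf(eta))
    # Taking the logit of that probability recovers the linear predictor.
    assert math.isclose(math.log(p / (1.0 - p)), eta, abs_tol=1e-9)
```

Under this reading the latent error is logistic, while the observed 0/1 outcome is Bernoulli with success probability `logistic_cdf(eta)`, so both descriptions refer to the same model.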

Along similar lines, the article said that the dependent variable was Bernoulli distributed. I changed this to binomially distributed because the binomial is a generalization of the Bernoulli to more than one trial. Perhaps this further clarifies things. Pdbailey 01:39, 5 March 2007 (UTC)
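The relationship between the two distributions can be shown in a few lines. This is a small sketch with hypothetical helper names (`binomial_pmf`, `bernoulli_pmf`) and an assumed success probability of 0.3; it requires Python 3.8 or later for `math.comb`.

```python
import math


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(K = k) for K ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


def bernoulli_pmf(k: int, p: float) -> float:
    """P(K = k) for a single trial with success probability p, k in {0, 1}."""
    return p if k == 1 else 1.0 - p


p = 0.3  # illustrative success probability (assumed value)

# With a single trial (n = 1) the binomial reduces to the Bernoulli.
for k in (0, 1):
    assert math.isclose(binomial_pmf(k, 1, p), bernoulli_pmf(k, p))

# For any n the binomial probabilities sum to 1.
assert math.isclose(sum(binomial_pmf(k, 5, p) for k in range(6)), 1.0)
```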