Talk:Generalized linear model


This text should be read with care because of the errors in the text and formulas.


Cleanup

There are no errors here, but this article needs to be cleaned up and fleshed out a bit. I will try to work on it soon... --shaile 00:33, 14 April 2006 (UTC)

I don't like: "In statistics the generalized linear model (GLM) generalizes the ordinary least squares regression." Surely OLS is an estimation method, not a model? In my view, it should say "In statistics the generalized linear model (GLM) generalizes the linear model." Blaise 21:45, 23 September 2006 (UTC)
You better believe OLS implies a model. Granted, it is a model that most would agree is wrong for their data, but some models are useful, so we use them. I'd suggest merging those two; they cover the exact same subject. Pdbailey 13:39, 26 September 2006 (UTC)

proposed changes

The edit I made (that was rved) did trim the overall size of the page and reduce the number of examples, but as it stands it is very difficult to understand what a GLM is. The basic question is: what is a GLM? After we have said that, we need to say why you might want to use one, and then I think we should get a little into how parameters are estimated. I'm going to rv it back and expand substantially on what I wrote last time.

It would be great if we could expand on the part about using any CDF. Here we have the alternative example where Y = 1 if η > 0 and is zero otherwise. Pdbailey
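
For reference, one common reading of the "any CDF" remark is the latent-variable construction sketched below. The error term \epsilon_i, its CDF F, and the symmetry condition are assumptions added for illustration; they are not spelled out in the comment above.

```latex
% Latent-variable sketch (assumed notation: \epsilon_i has CDF F)
\eta_i = X_i^T\beta, \qquad
Y_i = \begin{cases} 1 & \text{if } \eta_i + \epsilon_i > 0 \\ 0 & \text{otherwise} \end{cases}
\\[4pt]
P(Y_i = 1) = P(\epsilon_i > -\eta_i) = 1 - F(-\eta_i) = F(\eta_i)
\quad \text{(last step: } F \text{ symmetric about } 0\text{)}
\\[4pt]
\text{so } g^{-1} = F: \quad F = \Phi \text{ gives probit}, \quad
F(\eta) = \frac{1}{1 + e^{-\eta}} \text{ gives logit.}
```

Under this reading, any continuous CDF F that is symmetric about zero can serve as the inverse link g^{-1}.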

Would you explain what you meant by this sentence: "Because the variance portion is not constant, there is no sense to least squares, instead the parameters must be estimated with maximum likelihood or quasi maximum likelihood." It is not clear at all... Thanks! (Also, PLEASE sign your comments! All it takes is 4 ~'s) --shaile 22:21, 20 April 2006 (UTC)
Sorry, I don't have time to fix it up right now. The general idea is that OLS won't work because none of the assumptions hold (independence, constant variance). Pdbailey 00:11, 21 April 2006 (UTC)
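
To make the estimation point concrete, here is a minimal sketch of maximum likelihood fitting by iteratively reweighted least squares (IRLS) for a Poisson model with a log link. The function name `fit_poisson_irls`, the simulated data, and the coefficient values are all hypothetical choices for illustration; this is not taken from the article or from any particular package.

```python
import numpy as np

def fit_poisson_irls(X, y, n_iter=25, tol=1e-10):
    """Fit a Poisson GLM with log link by IRLS (illustrative sketch)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta                # linear predictor
        mu = np.exp(eta)              # inverse of the log link
        # For the log link with Poisson variance, the working weights
        # are mu and the working response is eta + (y - mu) / mu.
        z = eta + (y - mu) / mu
        XtW = X.T * mu                # X^T W with W = diag(mu)
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta

# Simulated data (assumed values, for illustration only)
rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(-1, 1, size=n)
X = np.column_stack([np.ones(n), x])
true_beta = np.array([0.5, 1.0])
y = rng.poisson(np.exp(X @ true_beta))

beta_hat = fit_poisson_irls(X, y)
print(beta_hat)
```

Each iteration is a weighted least squares fit whose weights depend on the current fitted means, which is one way to see why a single unweighted least squares pass is not appropriate when the variance changes with the mean.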

on using η

So the reason that I like the η terminology is that it separates out the linear part of the equation (Xβ) from the random part (Y). It makes it clear that there is a linear model in there. Also, the second equation as it stands now (E(Y_i) = \mu_i = g^{-1}(X_i^T\beta), so that g(\mu_i) = X_i^T\beta) is a bit of a garbled mess. Pdbailey 02:21, 21 April 2006 (UTC)

I guess I see your point. On the other hand, I think maybe it's better to keep a notation similar to that of Linear model, thus dropping the η. As for the second equation, I think it should be either g(\mu_i) = X_i^T\beta, or g(E(Y_i)) = X_i^T\beta. You're right, the inverse g function is a bit confusing, and I think g(\mu_i) = X_i^T\beta is the more standard notation. I need to find the paper for this, I have it around somewhere... --shaile 04:21, 21 April 2006 (UTC)
I'm working off lecture notes from McCullagh (author of one of the References) and I'll have to admit that my notes are not an example of clarity (hence g instead of g^{-1}). However, I really like the separation that McCullagh emphasizes between the linear part of the model (which describes the mean behavior with a linear equation) and the variance part of the model, which describes the dispersion of the Y values. In fact, he argues that it is always clearer to write a model in this fashion -- to separate out the expected value from the dispersion. In a GLM, when they are separated it is clear that the link, well, links the two portions of the model. It gives some perspective into the relevance of the link function and separates out clearly the three components that are present in all GLMs. Some people might think that probit and logit are worlds apart, but in this framework, it is clear that they are minor variants on each other.
This form is also echoed in programs such as Stata, which has a linear model, a link function, and a variance function. But I'm afraid that we just disagree on this one -- you like the simplicity of having it all in one formula, I like seeing all three components separated. Pdbailey 05:19, 21 April 2006 (UTC)
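
As a rough illustration of the three-component view, the sketch below keeps the linear predictor, the inverse link, and the variance function as separate pieces, so that logit and probit differ only in which inverse link is plugged in. All function names and the numeric values of `x` and `beta` are hypothetical and chosen only for this example.

```python
import math

def inv_logit(eta):
    """Inverse logit link: the logistic CDF."""
    return 1.0 / (1.0 + math.exp(-eta))

def inv_probit(eta):
    """Inverse probit link: the standard normal CDF, via erf."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))

def bernoulli_variance(mu):
    """Variance function for a Bernoulli response."""
    return mu * (1.0 - mu)

def glm_mean(x, beta, inverse_link):
    """Mean response: inverse link applied to the linear predictor."""
    eta = sum(xj * bj for xj, bj in zip(x, beta))  # linear part
    return inverse_link(eta)

# Assumed covariates and coefficients, for illustration only
x = [1.0, 0.3]
beta = [-0.2, 1.5]

for name, inverse_link in [("logit", inv_logit), ("probit", inv_probit)]:
    mu = glm_mean(x, beta, inverse_link)
    print(name, mu, bernoulli_variance(mu))
```

The linear part and the variance function are shared; only the link changes. The same coefficients give different fitted means under the two links because the logistic and normal CDFs have different scales.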
I just changed the page in light of this discussion not to rv but to state the model more clearly in the way I'm arguing for. Pdbailey 05:27, 21 April 2006 (UTC)
Actually, this is fine. It was just more confusing the way you had it before.  :) --shaile 13:21, 21 April 2006 (UTC)
Based on the objections that the two of you raised, I changed it back to look more like it used to. I think that it is easier to get a handle on this way quickly. Pdbailey 16:10, 27 April 2006 (UTC)
Would someone clarify this: \epsilon_i = f( g^{-1}(X_i^T \beta) )? It's not at all clear, and I don't see how the error term is a function of the other parts. Actually, I don't think it should be stated this way. Also, we need more details in there; it was better when the three parts of the GLM were clarified separately. Any objections? --shaile 19:22, 27 April 2006 (UTC)

Reorganization and exponential family detail

I have been reorganizing this article a bit, starting at the top. I wish however to point out a small but important change I made to the definition of exponential family here. Where before it contained a term a(y)b(θ), McCullagh & Nelder (p. 28) clearly shows a(y). I have made this change and eliminated the reference to the b function. If there is a more reputable source than M&N (hard to imagine) which has the more general a(y)b(θ) form, feel free to revert but please include the source. Baccyak4H 18:50, 27 October 2006 (UTC)

I think that's a good change you made to the definition of the exponential family. Do you think we could include how this relates to the link function? (I know how to do that, but I'd have to do a few to remember exactly where the link function comes from...) --shaile 20:57, 27 October 2006 (UTC)

Thanks. I was planning to address some more advantages of this form (M&N's form) including sufficient statistics, variance as function of a, c and d, and the canonical parameter. But that must wait; my copy is elsewhere now ;-). I certainly will be continuing my reorganization (mostly to make different editors' contributions sound like they came from one editor -- my pet peeve). While I may do a lot of tweaking, please feel free to improve my efforts. Baccyak4H 03:06, 28 October 2006 (UTC)

While M&N use that unusual exponential family form in their book, the two forms are equivalent (in the sense that you point out), and I think it makes sense for Wikipedia to use the broader definition. Which is to say that the notational convenience that it affords one book may not be as nice in an encyclopedia. Also, this article is for a much broader audience than the book and covers the topic in a lot less detail. Pdbailey 03:51, 28 October 2006 (UTC)

I suppose in the end a more general formula would be better. Is there a version like the current version which also contains the so-called dispersion parameter? M&N has that (φ), and given it appears in an overdispersion context twice (one of which I just added), I see some merit in including it in the formula: one of the great merits as I see it is the unification that the theory provides - another reason I may describe the canonical parameter some more. (Note: I removed it from the link table title, not because it was wrong (it was not), but because it was unexplained. I plan on returning to fix that.) Baccyak4H 03:00, 29 October 2006 (UTC)

Well, here is a possible generalization of the exponential family formula which includes a dispersion parameter φ. The inclusion of φ is in the spirit of M&N's definition, but the rest of the formula is the same as the current one.

Old: f_y(y; \theta) = \exp\left( a(y)b(\theta) + c(\theta) + d(y) \right).
New: f_y(y; \theta) = \exp\left( \frac{a(y)b(\theta) + c(\theta)}{h(\phi)} + d(y,\phi) \right).

I tried to reword the discussion there to apply to this. Baccyak4H 14:58, 30 October 2006 (UTC)
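
As a check on the New form, here is how two familiar distributions would fit into it. The particular assignments of a, b, c, d and h below are worked out for illustration and are not taken from M&N's text; other equivalent assignments are possible.

```latex
% Poisson with mean \theta (no free dispersion: h(\phi) = 1)
f_y(y; \theta) = \frac{\theta^{y} e^{-\theta}}{y!}
  = \exp\left( y \log\theta - \theta - \log y! \right)
\\
a(y) = y, \quad b(\theta) = \log\theta, \quad c(\theta) = -\theta, \quad
h(\phi) = 1, \quad d(y,\phi) = -\log y!
\\[6pt]
% Normal with mean \theta and variance \phi
f_y(y; \theta) = \frac{1}{\sqrt{2\pi\phi}} \exp\left( -\frac{(y-\theta)^2}{2\phi} \right)
  = \exp\left( \frac{y\theta - \theta^2/2}{\phi} - \frac{y^2}{2\phi} - \frac{1}{2}\log(2\pi\phi) \right)
\\
a(y) = y, \quad b(\theta) = \theta, \quad c(\theta) = -\frac{\theta^2}{2}, \quad
h(\phi) = \phi, \quad d(y,\phi) = -\frac{y^2}{2\phi} - \frac{1}{2}\log(2\pi\phi)
```

In the normal case the dispersion φ is the variance, which is the kind of role the extra parameter is meant to capture.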

It just dawned on me that for the b(θ) formula, if b is invertible then it is exactly equivalent to the version without b; they merely represent a reparameterization of each other. Baccyak4H 16:34, 31 October 2006 (UTC)
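
The reparameterization can be written out in a few lines. The symbols \theta' and \tilde{c} are introduced here only for illustration.

```latex
% Assume b is invertible and define the new parameter \theta' = b(\theta),
% so that \theta = b^{-1}(\theta').
f_y(y; \theta) = \exp\left( a(y)\,b(\theta) + c(\theta) + d(y) \right)
\\
\phantom{f_y(y; \theta)} = \exp\left( a(y)\,\theta' + c\!\left(b^{-1}(\theta')\right) + d(y) \right)
\\
\phantom{f_y(y; \theta)} = \exp\left( a(y)\,\theta' + \tilde{c}(\theta') + d(y) \right),
\qquad \tilde{c} = c \circ b^{-1}
```

So the form with b(θ) and the form without it describe the same family of densities, indexed by θ in one case and by θ' = b(θ) in the other.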