Talk:G-test
Feasibility of Fisher Exact test
Before writing the words below, I ran several such calculations using this web-based application: http://home.clara.net/sisa/twoby2.htm with the Firefox browser: for examples in which all cells had values between 10,000 and 20,000 it took about 30 seconds to finish the calculations.
For example, a laptop with a 1.7 GHz Pentium and 1 GB of RAM, specifications not considered particularly high-end in 2006, can readily handle cases of the Fisher exact test in which each cell's value is around 10,000, using commonly available statistical software.
- Reverted as off topic; not really about the G-test. Pete.Hurd 17:31, 31 July 2006 (UTC)
similarity to Kullback-Leibler divergence
Do the G-test and the Kullback-Leibler divergence express the same thing from different points of view?
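A small numerical sketch of how the two relate, assuming the usual definition G = 2 Σ O_i ln(O_i / E_i) with natural logarithms: writing p_i = O_i / N and q_i = E_i / N, G works out to 2N times the Kullback-Leibler divergence of p from q. The function names and the counts below are made up for illustration.

```python
import math

def g_statistic(observed, expected):
    """G = 2 * sum(O_i * ln(O_i / E_i)); cells with O_i = 0 contribute 0."""
    return 2.0 * sum(o * math.log(o / e)
                     for o, e in zip(observed, expected) if o > 0)

def kl_divergence(p, q):
    """D_KL(p || q) in nats for two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

observed = [30, 50, 20]          # hypothetical observed counts
expected = [25.0, 50.0, 25.0]    # hypothetical expected counts, same total
n = sum(observed)

p = [o / n for o in observed]    # observed proportions
q = [e / n for e in expected]    # expected proportions

g = g_statistic(observed, expected)
g_from_kl = 2 * n * kl_divergence(p, q)
print(g, g_from_kl)              # the two values agree up to rounding
```

So the statistic is a rescaled divergence; the test itself additionally relies on the chi-squared approximation to the distribution of G.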
G^2
Note that the "G-test" is referred to as the G^2 (g-squared) test (at least in psychology-related statistics).
- Humph. I've never seen that - please give a reference. seglea 23:29, 22 July 2005 (UTC)
To name a few references to G^2 in psychological statistics (this is common in multinomial modeling work in the memory literature and is becoming more common in fitting other types of models as well):
Dodson, Holland, & Shimamura (1998). Using Excel to estimate parameters from observed data: An example from source memory data. Behavior Research Methods, Instruments, & Computers, 30(3), 517-526.
Batchelder & Riefer (1999). Theoretical and empirical review of multinomial process tree modeling. Psychonomic Bulletin & Review, 6(1), 57-86.
Bayen, Murnane, & Erdfelder (1996). Source discrimination, item detection, and multinomial models of source monitoring. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22(1), 197-215.
Erdfelder & Buchner (1998). Process-dissociation measurement models: Threshold theory or detection theory? Journal of Experimental Psychology: General, 127(1), 83-96.
fisher.g.test in GeneTS not G-test as described?
fisher.g.test implemented in GeneTS is an exact test for whether a time series is different from Gaussian white noise, not the alternative to the chi-square test as described.
Where does the 2 come from?
I've been trying to work out how Pearson's formula is an approximation for this test.
$G = 2 \sum_i O_i \ln(O_i / E_i)$
(since $\sum_i (O_i - E_i) = 0$)
This is the formula for $\chi^2$, except that the factor of 2 is still there. What was my error? Thanks! -- ciphergoth 14:11, 3 June 2006 (UTC)
- Your approximation for ln(1+x) wasn't good enough; it roughly works for each term, but its errors for positive and negative x reinforce each other instead of cancelling, and that is enough to account for the factor of 2. Taking the -x^2/2 term and another approximation should get you there. --Henrygb 15:01, 9 March 2007 (UTC)
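Spelling out the second-order step as a sketch, writing $\delta_i = O_i - E_i$ (a symbol introduced here for convenience, not taken from the article) and assuming each $|\delta_i|$ is small relative to $E_i$:

```latex
\begin{aligned}
G &= 2\sum_i O_i \ln\frac{O_i}{E_i}
   = 2\sum_i (E_i+\delta_i)\,\ln\!\left(1+\frac{\delta_i}{E_i}\right) \\
  &\approx 2\sum_i (E_i+\delta_i)\left(\frac{\delta_i}{E_i}-\frac{\delta_i^2}{2E_i^2}\right) \\
  &= 2\sum_i\left(\delta_i+\frac{\delta_i^2}{2E_i}-\frac{\delta_i^3}{2E_i^2}\right) \\
  &\approx \sum_i \frac{\delta_i^2}{E_i} = \chi^2
\end{aligned}
```

The first-order terms sum to zero because $\sum_i \delta_i = 0$, so the whole statistic comes from the second-order terms. Keeping only $\ln(1+x) \approx x$ counts $\delta_i^2/E_i$ in full and misses the $-\delta_i^2/(2E_i)$ correction, which is where the spurious factor of 2 comes from.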
Then please tell me where my error is:
$G = 2 \sum_i O_i \ln(O_i / E_i)$
- You made a mistake in the second-to-last equality:
- Preceding unsigned comment added by 87.69.46.105 (talk) 07:38, 23 February 2008 (UTC)
More precise stating of distribution of G under null hypothesis
This sentence should be made more precise:
- Given the null hypothesis that the observed frequencies result from random sampling from a distribution with the given expected frequencies, the distribution of G is approximately that of chi-squared, with the same number of degrees of freedom as in the corresponding chi-squared test.
Does it converge in distribution? So does the $\chi^2$ statistic, right? Is the asymptotic rate of convergence quicker for G than for $\chi^2$? I don't have any references on G so I'm afraid I won't be of any help answering these questions.
Andyrew609 19:39, 27 November 2006 (UTC)
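One way to probe the approximation empirically is a small simulation under the null hypothesis. The sketch below is illustrative only and does not settle the question about rates of convergence. It assumes three categories (2 degrees of freedom), for which the chi-squared survival function is exactly exp(-x/2), so the critical value needs no statistics library. The probabilities, sample size, and function names are all made up.

```python
import math
import random

def g_statistic(observed, expected):
    """G = 2 * sum(O_i * ln(O_i / E_i)); cells with O_i = 0 contribute 0."""
    return 2.0 * sum(o * math.log(o / e)
                     for o, e in zip(observed, expected) if o > 0)

def pearson_statistic(observed, expected):
    """Pearson's chi-squared statistic, sum((O_i - E_i)^2 / E_i)."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def simulate_rejection_rates(probs, n, trials, alpha, seed=0):
    """Fraction of null samples in which each statistic exceeds the
    chi-squared critical value. Restricted to 3 categories (2 df)."""
    assert len(probs) == 3
    rng = random.Random(seed)
    critical = -2.0 * math.log(alpha)   # chi-squared quantile for 2 df
    expected = [n * p for p in probs]
    categories = range(len(probs))
    g_rejects = x2_rejects = 0
    for _ in range(trials):
        draws = rng.choices(categories, weights=probs, k=n)
        observed = [draws.count(c) for c in categories]
        g_rejects += g_statistic(observed, expected) > critical
        x2_rejects += pearson_statistic(observed, expected) > critical
    return g_rejects / trials, x2_rejects / trials

g_rate, x2_rate = simulate_rejection_rates(
    [0.2, 0.3, 0.5], n=200, trials=2000, alpha=0.05)
print(g_rate, x2_rate)   # both should land near the nominal 0.05
```

Repeating this with smaller n shows how quickly each statistic's rejection rate drifts from the nominal level.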
splitting of the G statistics
I am currently going through Agresti's Categorical Data Analysis (2002); on page 82 he gives a clean explanation of how to partition the G statistic (P.S.: be aware that in the 2007 edition of the book most of this section was cut, so don't bother looking for it there).
This partitioning is useful, so it might be worth noting in the article. Talgalili -- Preceding unsigned comment added by Talgalili (talk • contribs) 17:38, 5 September 2007 (UTC)
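A minimal sketch of one such partition, assuming the common scheme for a 2×J table: compare the first two columns, then pool them and compare the pooled column against the next one. This is not copied from the cited page, and the counts and function name are hypothetical. The component G values add up exactly to the G for the full table.

```python
import math

def g_independence(table):
    """G statistic for independence in a two-way table of counts
    (given as a list of rows); zero cells contribute 0."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    g = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            if o > 0:
                e = row_totals[i] * col_totals[j] / n   # expected count
                g += o * math.log(o / e)
    return 2.0 * g

table = [[20, 30, 50],
         [30, 30, 40]]   # hypothetical 2x3 counts

# Component 1: columns 1 and 2 only.
first_two = [row[:2] for row in table]
# Component 2: columns 1 and 2 pooled, compared against column 3.
combined = [[row[0] + row[1], row[2]] for row in table]

g_total = g_independence(table)
g_parts = g_independence(first_two) + g_independence(combined)
print(g_total, g_parts)   # equal up to floating-point rounding
```

Each component has 1 degree of freedom here, matching the 2 degrees of freedom of the full 2×3 table.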
Maybe a squeamish comment about notation
In the G formulae, the larger brackets should sit outside the summation operator and enclose the whole expression of the indexed terms.
How to handle zero frequencies in observations?
Since the formula
$G = 2 \sum_i O_i \ln(O_i / E_i)$
uses the logarithm, how are terms with $O_i = 0$ handled?
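The usual convention is to treat a term with O_i = 0 as contributing zero, because x ln x tends to 0 as x tends to 0 from above. A minimal sketch under that convention follows; the function name and counts are hypothetical, and returning infinity when E_i = 0 but O_i > 0 is a choice made here for illustration.

```python
import math

def g_statistic(observed, expected):
    """G = 2 * sum(O_i * ln(O_i / E_i)) with the convention 0 * ln(0) = 0."""
    total = 0.0
    for o, e in zip(observed, expected):
        if o == 0:
            continue          # limit of x*ln(x) as x -> 0+ is 0
        if e == 0:
            return math.inf   # observed something the null says is impossible
        total += o * math.log(o / e)
    return 2.0 * total

# A table with an empty observed cell: only the non-empty cells add terms.
g_with_zero = g_statistic([0, 12, 8], [2.0, 10.0, 8.0])
g_manual = 2.0 * (12 * math.log(12 / 10.0) + 8 * math.log(8 / 8.0))
print(g_with_zero, g_manual)
```

The empty cell still influences G indirectly, since the counts missing from it show up as excesses in the other cells.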