Talk:Cross tabulation
From Wikipedia, the free encyclopedia
The "Wiki users are smarter" bit is cute, but hardly NPOV, is it? Joshua McGee (talk) 06:20, 29 Jul 2004 (UTC)
- It is just my feeble attempt at introducing some humor into an otherwise dry subject. If it offends you, change it. mydogategodshat 17:41, 29 Jul 2004 (UTC)
- I wasn't trying to be a jerk about it. Personally, I like it. But it's hard to imagine the Encyclopedia Britannica doing something like this, right? :-) I did forget to thank you for a very clear example; I like the article. Joshua McGee (talk) 23:58, 29 Jul 2004 (UTC)
Something seems wrong with the explanation in the second half of the second paragraph. The rows don't add to 100% because the table shows wiki use for each underwear type, not underwear type for each group of people defined by Wikipedia usage. I can't explain it well or I would rewrite it, but I'm hoping someone else who reads this can. Furthermore, the last sentence is blatantly wrong because it implies that the whole table should sum to 100%. It should probably read something like: each cell gives the percentage of people in the column grouping who also fall into the row grouping, but that's a bit long. Scott 04:56, 13 Jan 2007 (UTC)
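To make the point about percentage bases concrete, here is a minimal sketch in Python. The counts and labels are invented for illustration and are not the article's figures, and `column_percentages` is a hypothetical helper name. It expresses each cell as a percentage of its column total, so each column sums to 100% while the rows and the table as a whole generally do not.

```python
# Hypothetical counts, invented purely for illustration (not from the article).
# Outer key = row group (underwear type), inner key = column group (wiki use).
counts = {
    "boxers": {"wiki user": 30, "non-user": 20},
    "briefs": {"wiki user": 10, "non-user": 40},
}


def column_percentages(table):
    """Return each cell as a percentage of its column total."""
    columns = list(next(iter(table.values())))
    col_totals = {c: sum(row[c] for row in table.values()) for c in columns}
    return {
        r: {c: 100.0 * row[c] / col_totals[c] for c in columns}
        for r, row in table.items()
    }


pct = column_percentages(counts)
for label, row in pct.items():
    # Print each row with its cells rounded to one decimal place.
    print(label, {c: round(v, 1) for c, v in row.items()})
```

Dividing by row totals instead would make the rows sum to 100%, which is the other reading of the table discussed above.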
Wiki users wear boxers or briefs
The article says that these categories are all-inclusive. In order for that to be the case, we would have to assume that all Wiki users are male, or that female Wiki users wear men's underpants.
Need a section on statistical software
Crosstabs are computed with software on a computer. Large, comprehensive statistical software languages have one or two routines for computing crosstabs.
Language, Crosstab procedure
Minitab, ?
SAS, Proc Freq, Proc Tabulate
SPSS, CROSSTABS
S (open-source version "R"), table(), xtabs()
Systat, ?
Stata, ?
MS Access, Crosstab query
MS Excel, Pivot tables
Lotus Approach, ?
EPI Info, FREQ
Geoda, ?
Jim.Callahan,Orlando 01:43, 20 October 2007 (UTC)
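For readers without any of the packages listed above, the core of a crosstab routine can be sketched in a few lines of Python using only the standard library. This is an illustrative sketch, not how any of those packages implement it; the function name `crosstab` and the sample responses are assumptions made up for the example.

```python
from collections import Counter


def crosstab(row_values, col_values):
    """Count how often each (row, column) pair occurs in two parallel lists."""
    if len(row_values) != len(col_values):
        raise ValueError("both variables need one value per observation")
    cells = Counter(zip(row_values, col_values))
    row_labels = sorted(set(row_values))
    col_labels = sorted(set(col_values))
    # Missing combinations count as zero because Counter returns 0 for them.
    return {r: {c: cells[(r, c)] for c in col_labels} for r in row_labels}


# Hypothetical survey responses, invented for illustration.
underwear = ["boxers", "briefs", "boxers", "boxers", "briefs", "briefs"]
wiki_use = ["yes", "no", "yes", "no", "no", "yes"]

table = crosstab(underwear, wiki_use)
for label, row in table.items():
    print(label, row, "row total:", sum(row.values()))
```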
Regression based statistical analysis of contingency tables (General Linear Model or GLM)
The current statistical discussion of the analysis of contingency tables speaks with an ANOVA accent. A simpler but more powerful way of looking at contingency tables is to use a regression with a set of zero-one (dummy) indicator variables to code the levels.
$Y = b_0 + b_1 X_1 + b_2 X_2$
$Y$ = variable to be explained or predicted in terms of the other variables, $Y = f(X_1, X_2, \ldots)$
$b_0$ = Y-intercept (beta-zero)
$b_1$ = estimated regression coefficient for the first variable (beta-one)
$X_1$ = first variable (x-one)
$b_2$ = estimated regression coefficient for the second variable (beta-two)
$X_2$ = second variable (x-two)
A complete regression model would include an interaction term ($X_1 X_2$) and squared terms ($X_1^2$ and $X_2^2$).
$X_1 X_2$ = interaction term (its coefficient measures the interaction, a twisted response surface, between $X_1$ and $X_2$)
$X_1^2$ = squared term for $X_1$; measures curvature (fits a parabola), indicating a non-linear effect
$X_2^2$ = squared term for $X_2$; measures curvature (fits a parabola), indicating a non-linear effect
Again we need a software table.
SAS, PROC GLM
S (open source version "R"), lm()
Jim.Callahan,Orlando 01:37, 20 October 2007 (UTC)
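As a rough illustration of the dummy-variable approach, here is a sketch in plain Python that fits a model with an intercept, two zero-one indicators and their interaction by ordinary least squares. The data, the helper names `solve` and `fit_least_squares`, and the choice of solving the normal equations directly are all assumptions made for this example; real work would use a routine such as those in the table above. Squared terms are left out because squaring a zero-one indicator returns the same variable, so they would add nothing in this case.

```python
def solve(a, b):
    """Solve the linear system a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_least_squares(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


# Hypothetical observations (x1, x2, y), invented for illustration.
data = [(0, 0, 2.0), (0, 0, 4.0), (1, 0, 5.0), (1, 0, 7.0),
        (0, 1, 1.0), (0, 1, 3.0), (1, 1, 9.0), (1, 1, 11.0)]

# Design matrix columns: intercept, X1, X2, X1*X2.
design = [[1.0, x1, x2, x1 * x2] for x1, x2, _ in data]
y = [obs[2] for obs in data]
b0, b1, b2, b3 = fit_least_squares(design, y)
print(b0, b1, b2, b3)
```

With all four terms the model is saturated for a 2x2 layout, so the fitted coefficients reproduce the cell means: b0 is the mean of the (0, 0) cell, b0 + b1 the mean of the (1, 0) cell, and so on.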