Talk:Cross tabulation

This article is within the scope of WikiProject Statistics, which collaborates to improve Wikipedia's coverage of statistics. If you would like to participate, please visit the project page.

The "Wiki users are smarter" bit is cute, but hardly NPOV, is it? Joshua McGee (talk) 06:20, 29 Jul 2004 (UTC)

It is just my feeble attempt at introducing some humor into an otherwise dry subject. If it offends you, change it. mydogategodshat 17:41, 29 Jul 2004 (UTC)
I wasn't trying to be a jerk about it. Personally, I like it. But it's hard to imagine the Encyclopaedia Britannica doing something like this, right? :-) I did forget to thank you for a very clear example; I like the article. Joshua McGee (talk) 23:58, 29 Jul 2004 (UTC)

Something seems wrong with the explanation in the second half of the second paragraph. The reason the rows don't add to 100% is that the table shows the use of wiki for people of each underwear type, not underwear type for groups of people based on Wikipedia usage. I can't explain it well or I would rewrite it, but I'm hoping someone else who reads this can. Furthermore, the last sentence is blatantly wrong because it implies that the whole table should sum to 100%. It should probably read something like: each cell gives the percentage of people who fall into the column grouping who also fall into the row grouping, but that's a bit long. Scott 04:56, 13 Jan 2007 (UTC)
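
To make the point about percentages concrete, here is a minimal sketch in Python. The counts are invented purely for illustration and are not the figures from the article; the layout (underwear type in the columns, wiki usage in the rows) is also an assumption. It computes column percentages, so each column sums to 100% while the rows and the whole table do not.

```python
# Hypothetical counts, invented for illustration only.
# Rows: wiki usage group; columns: underwear type (assumed layout).
counts = {
    "wiki users": {"boxers": 30, "briefs": 10},
    "non-users": {"boxers": 20, "briefs": 40},
}
columns = ["boxers", "briefs"]

# Column totals: how many people fall into each column grouping.
col_totals = {c: sum(row[c] for row in counts.values()) for c in columns}

# Column percentages: of the people in a column grouping,
# the share who also fall into each row grouping.
col_pct = {
    r: {c: 100.0 * counts[r][c] / col_totals[c] for c in columns}
    for r in counts
}

for r, row in col_pct.items():
    print(r, {c: round(p, 1) for c, p in row.items()})

# Each column sums to 100; the rows and the grand total generally do not.
col_sums = {c: sum(col_pct[r][c] for r in col_pct) for c in columns}
row_sums = {r: sum(col_pct[r].values()) for r in col_pct}
grand_total = sum(row_sums.values())
```

With these made-up counts the columns each sum to 100, the rows sum to 80 and 120, and the whole table sums to 200, which is why neither the rows nor the table as a whole should be expected to reach 100%.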

Wiki users wear boxers or briefs

The article says that these categories are all-inclusive. In order for that to be the case, we would have to assume that all Wiki users are male, or that female Wiki users wear men's underpants.

Need a section on statistical software

Crosstabs are usually computed with software. Large, comprehensive statistical packages typically have one or two routines for computing them.

Language, Crosstab procedure
Minitab, ?
SAS, Proc Freq, Proc Tabulate
SPSS, CROSSTABS
S (open source version "R"), table(), xtabs()
Systat, ?
Stata, ?
MS Access, Crosstab query
MS Excel, Pivot tables
Lotus Approach, ?
EPI Info, FREQ
Geoda, ?
Jim.Callahan,Orlando 01:43, 20 October 2007 (UTC)
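
For anyone drafting that section, the computation these routines perform can be sketched in a few lines of plain Python. This is only an illustration of the idea, not how any of the packages above implement it; the `crosstab` helper and the observations are hypothetical.

```python
from collections import Counter

# Hypothetical raw observations: (row category, column category) pairs.
observations = [
    ("boxers", "user"), ("boxers", "user"), ("briefs", "user"),
    ("boxers", "non-user"), ("briefs", "non-user"), ("briefs", "non-user"),
]

def crosstab(pairs):
    """Count how often each (row, column) combination occurs."""
    cell_counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    # Counter returns 0 for combinations that never occur,
    # so every cell of the table is filled in.
    return {r: {c: cell_counts[(r, c)] for c in cols} for r in rows}

table = crosstab(observations)
for row_label, row in table.items():
    print(row_label, row)
```

Each cell is simply the number of observations that share a row category and a column category; percentages and test statistics are computed from these counts afterwards.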

Regression based statistical analysis of contingency tables (General Linear Model or GLM)

The current statistical discussion of analysis of contingency tables speaks with an ANOVA accent. A simpler, but more powerful way of looking at the contingency tables is using a regression with a collection of zero-one (dummy) indicator variables to code levels.

Y = b0 + b1*X1 + b2*X2 (note: 0, 1 and 2 should be subscripts)

Y = variable to be explained or predicted in terms of the other variables = f(X1, X2, ...)

b0 = Y-intercept (beta-zero)
b1 = estimated regression coefficient for first variable (beta-one)
X1 = first variable (x-one)
b2 = estimated regression coefficient for second variable (beta-two)
X2 = second variable (x-two)

A complete regression model would include an interaction term (X1*X2) and squared terms (X1*X1) and (X2*X2).

X1*X2 = interaction term (the coefficient of this term measures interaction (twisted response surface) between X1 and X2)
X1*X1 = squared term (for X1) measures curvature (fits a parabola) indicating a non-linear effect
X2*X2 = squared term (for X2) measures curvature (fits a parabola) indicating a non-linear effect
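
As a rough illustration of the dummy-variable approach, the sketch below fits Y = b0 + b1*X1 + b2*X2 + b12*X1*X2 by ordinary least squares on a 2x2 layout. The data, the helper names `solve` and `ols`, and the coefficient name `b12` are all made up for this example. The squared terms are left out here because for a zero-one variable X*X equals X, so they would only duplicate existing columns.

```python
def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(design, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical observations (X1, X2, Y); X1 and X2 are zero-one dummies.
data = [
    (0, 0, 1.0), (0, 0, 3.0),
    (1, 0, 4.0), (1, 0, 6.0),
    (0, 1, 2.0), (0, 1, 4.0),
    (1, 1, 9.0), (1, 1, 11.0),
]

# Design matrix columns: intercept, X1, X2, interaction X1*X2.
design = [[1.0, x1, x2, x1 * x2] for x1, x2, _ in data]
y = [obs[2] for obs in data]

b0, b1, b2, b12 = ols(design, y)
print(b0, b1, b2, b12)
```

With both dummies and their interaction in the model, the fitted values reproduce the mean of Y in each cell of the 2x2 table: b0 is the mean of the (0, 0) cell, b1 and b2 are the differences from it, and b12 is the extra amount in the (1, 1) cell beyond what b1 and b2 predict.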

Again we need a software table.

SAS, PROC GLM
S (open source version "R"), lm()

Jim.Callahan,Orlando 01:37, 20 October 2007 (UTC)