Talk:Resampling (statistics)
From Wikipedia, the free encyclopedia
I originally wrote the permutation test article. I understand that in the Wikipedia world that doesn't mean much, but it had gotten so convoluted with parenthetical phrases and qualifications and such that it was virtually impenetrable. So I edited it. I appreciate the helpful additions as well as putting the article into the general rubric of resampling which makes sense. - Respectfully, WJF.
I made changes to Bootstrap and Jackknife, wrote a much longer text on Permutation test, and moved the reference list after Bootstrap to the reference list at the end of the chapter. My writing is based on my experience as an applied statistician and a developer of statistical software with emphasis on resampling techniques, except for the text about Jackknife, which borrows heavily from Mooney & Duval (see list of references). I have tried to keep as much as possible of the original text, but in some cases where it clashed with my own writing it was removed. For example, the sentence
"permutation tests usually involve calculation of test statistics from and permutation of the observed data, as opposed to other non-parametric tests which may involve analysis of the ranks of data points"
has been removed because it may confuse readers, as rank tests are not 'other non-parametric tests'. Rank tests, for example the Mann-Whitney U test and the Spearman rank correlation test, are permutation tests. - Respectfully, VS.
I made some changes to the Permutation test section to correct some vagueness and misleading comments. I am not an expert in this area, but the current section does not appear to present a balanced perspective on parametric vs. permutation tests. In addition, all sections would benefit mightily from simple examples of each technique. - Ken K 21:15, 1 March 2006 (UTC)
I changed the "approximation" section title to "Monte Carlo Testing" and some of the language therein. Monte Carlo testing is not an approximation, but an exact test (meaning that the true alpha = nominal alpha) and is asymptotically equivalent to the test performed by enumerating all of the possible arrangements. - Ken K 19:45, 30 March 2006 (UTC)
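For readers following this thread, a minimal sketch of a Monte Carlo permutation test for a difference in means may help. It is an illustration only, not code from the article: the function names, the default of 9999 resamples, and the seed are assumptions made here. The p-value uses the common (k + 1) / (B + 1) convention, which counts the observed arrangement as one of the resamples; that convention is what keeps the true alpha at or below the nominal alpha.

```python
import random


def mean_diff(x, y):
    """Difference between the two sample means."""
    return sum(x) / len(x) - sum(y) / len(y)


def monte_carlo_permutation_test(x, y, n_resamples=9999, seed=0):
    """Two-sided Monte Carlo permutation test for a difference in means.

    n_resamples and seed are illustrative defaults, not recommendations.
    """
    rng = random.Random(seed)
    observed = abs(mean_diff(x, y))
    pooled = list(x) + list(y)
    n_x = len(x)
    k = 0  # random arrangements at least as extreme as the observed one
    for _ in range(n_resamples):
        rng.shuffle(pooled)  # random reallocation of values to the two groups
        stat = abs(mean_diff(pooled[:n_x], pooled[n_x:]))
        if stat >= observed - 1e-12:  # small tolerance for float rounding
            k += 1
    # Count the observed arrangement itself, so the p-value is never zero.
    return (k + 1) / (n_resamples + 1)
```

As the number of resamples grows, this p-value approaches the one obtained by enumerating every arrangement.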
Wikipedia wrote: "An important consequence of the exchangeability assumption is that tests of difference in location (like a permutation t-test) require equal variance." I'm wondering... equal variance is required to infer what? Do you mean to draw an inference about the population from which the samples are drawn? Okay, maybe so. But there is a radically different way of thinking about permutation tests: as not only distribution free but POPULATION FREE. If the inference is limited to the sample at hand (or to put it a different way, if the entire population is being measured) then I don't see how equal variance is necessary. Why do we need statistical inference if we have the whole population? Because we need to know whether the difference between groups is plausibly attributable to chance (random assignment or simply chance factors).
Answer to the previous post: A test of group difference is not 'POPULATION FREE'. It is a test of whether the observed data belong to one population or to two different populations. This holds regardless of whether the test is parametric or non-parametric, and the requirement of exchangeability is likewise independent of whether we regard the observed sample as a random sample from a larger population or as the population in itself. For a comprehensive explanation of this, read the article by Welch. But the requirement is also easy to understand if we think about a concrete example of testing that two groups have the same mean. Assume that we have two samples (or two complete populations) with different variances, that we randomly draw one observation from the combined sample, and that this observation happens to have a value in the tail of the combined distribution. A permutation test is a conditional test, which means that the marginal distribution of the combined sample is fixed. So if we observe an extreme value and (for example) know that the first group has a larger variance than the second group, then, if the null hypothesis is true, the probability that the observation belongs to the first group is larger than the probability that it belongs to the second group. This invalidates the basic assumption of a permutation test that all permutations of the observed sample have equal probability when H0 is true. Permutation in this situation is equivalent to the allocation of an observation to the first or second group. This means that if the two groups have very different variances, the significance from the permutation test of group difference in mean may be completely misleading. V.S. 28 July 2006
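The point about unequal variances can be checked with a small simulation. The sketch below is an illustration under assumptions chosen here, not something from the article or from Welch: normal data with equal means, group sizes of 5 and 20, 199 random permutations per test, and hypothetical function names. It estimates how often a permutation test of the mean difference rejects a true null hypothesis of equal means.

```python
import random


def perm_p_value(x, y, rng, n_resamples=199):
    """Monte Carlo permutation p-value for the absolute mean difference."""
    n_x = len(x)
    pooled = list(x) + list(y)

    def abs_mean_diff(values):
        left, right = values[:n_x], values[n_x:]
        return abs(sum(left) / len(left) - sum(right) / len(right))

    observed = abs_mean_diff(pooled)
    k = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        if abs_mean_diff(pooled) >= observed - 1e-12:
            k += 1
    return (k + 1) / (n_resamples + 1)


def rejection_rate(sd_x, sd_y, n_x, n_y, n_sims=400, alpha=0.05, seed=1):
    """Fraction of simulated data sets with equal means that are rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        # Both groups have mean 0, so H0 (equal means) is true by construction.
        x = [rng.gauss(0.0, sd_x) for _ in range(n_x)]
        y = [rng.gauss(0.0, sd_y) for _ in range(n_y)]
        if perm_p_value(x, y, rng) <= alpha:
            rejections += 1
    return rejections / n_sims


if __name__ == "__main__":
    print("equal variances:  ", rejection_rate(1.0, 1.0, 5, 20))
    print("unequal variances:", rejection_rate(5.0, 1.0, 5, 20))
```

With equal variances the rejection rate should stay close to the nominal alpha. When the smaller group has the much larger variance, one would expect the rate to rise well above alpha, which is the sense in which the reported significance can mislead.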
the external link at the bottom of the page (to some random verizon user's page) is (i) broken and (ii) an advertisement (to a "Statistical Consultants for Clinical Trials, Legal Affairs, and Marketing." company). i suggest deletion. -c.w., nyc, Mon Sep 4 05:37:07 EDT 2006
This article is not very clear about what a permutation test actually is. I read the main section on permutation test several times and compared it to other sources and I'm still kind of fuzzy on it. Specifically I'm confused over how you compare the results after the permutations. Is it necessarily implied by permutation test that you order all of the test statistic values, find the number of t values "more extreme" than your t value (I'll call it k), and say that your confidence of the null hypothesis is (k/n!)? Or is that just one way to do it? -Anadverb 16:24, 24 September 2006 (UTC)
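One concrete reading of the procedure asked about above, as a minimal sketch: enumerate every way of splitting the pooled data into two groups of the original sizes, compute the test statistic for each split, and report the fraction of splits whose statistic is at least as extreme as the observed one. The function name and the choice of the absolute mean difference as the statistic are assumptions made here for illustration. Enumerating the distinct splits gives the same fraction as enumerating all n! orderings, because each split corresponds to the same number of orderings. The resulting k / (number of arrangements) is a p-value, that is, the probability under the null hypothesis of a statistic at least this extreme, rather than a confidence in the null hypothesis.

```python
from itertools import combinations


def exact_permutation_p_value(x, y):
    """Exact two-sided permutation p-value for a difference in means."""
    pooled = list(x) + list(y)
    n_x, n_y = len(x), len(y)
    total = sum(pooled)

    def statistic(sum_first):
        # Absolute difference in means, given the sum of the first group.
        return abs(sum_first / n_x - (total - sum_first) / n_y)

    observed = statistic(sum(x))
    k = 0               # arrangements at least as extreme as the observed one
    n_arrangements = 0  # all distinct splits into groups of size n_x and n_y
    for indices in combinations(range(len(pooled)), n_x):
        n_arrangements += 1
        sum_first = sum(pooled[i] for i in indices)
        if statistic(sum_first) >= observed - 1e-12:  # float tolerance
            k += 1
    return k / n_arrangements
```

For example, with x = [1, 2, 3] and y = [4, 5, 6] there are 20 distinct splits, and only two of them (the observed split and its mirror image) reach the observed difference, so the p-value is 2/20 = 0.1.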
Question
What's a "reference distribution"? There's no definition, and no wikipedia article on it.