Talk:Confidence interval


Give common Z-values

It would be great if someone could provide typical Z-values for 90%, 95% and 99% confidence intervals. 128.12.20.32 20:09, 11 February 2006 (UTC)

I just added a small section to the article about using Z-scores and a small table with common values. It's not well done, but at least it's something. I'm not sure that this article is the absolute best place to have put that data, but since it's what I came here for as well, I figure other people besides just the two of us would probably find it helpful. -- Zawersh 22:22, 17 May 2006 (UTC)
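
For quick reference, the usual two-sided values (the same numbers any standard normal table gives) are:

 confidence level    z-value
 90%                 1.645
 95%                 1.960
 99%                 2.576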

Explanation?

Could someone explain this:

"[the] frequentist interpretation which might be expressed as: "If the population parameter in fact lies within the confidence interval, then the probability that the estimator either will be the estimate actually observed, or will be closer to the parameter, is less than or equal to 90%""

I have a hard time figuring out how to write or deduce this in an actual mathematical formulation. It would help to have the claim formulated precisely, stating which variables denote random variables and which identifiers are realizations of the random variables, and maybe the demonstration too (such as where the inequality comes from (Markov's etc.)).

Bayesian approach

A considerable part of this page is devoted to explaining what a confidence interval is not. I think this is appropriate, given how often the confidence intervals of classical statistics are misunderstood. However, I don't agree that the statement "There is only one planet Neptune" has any force. A person may say "Suppose that when Nature created the planet Neptune, this process selected a weight for the planet from a uniform distribution from ... to ....". This, of course, is a Bayesian approach.

No, it's not! That is a frequentist parody of the Bayesian approach! The Bayesian approach is to regard probabilities as degrees of belief (whether logically determined or subjective) in uncertain propositions. (Bayesians have generally done a lousy job of defending themselves against the above attack, though. In one notable case, Jim Berger, now of Duke University, I think, it is simply because he doesn't really care about the issue involved. (I realize that the statement above that I called an "attack" probably wasn't intended to be an attack, but it almost may as well have been.)) Michael Hardy 00:27, 21 Dec 2004 (UTC)

If the fact that "There is only one planet Neptune" had any such implications, then the Bayesian approach would be rendered invalid.

Absolutely wrong. For the reason I gave above. The fact that there is only one planet Neptune does not invalidate Bayesianism precisely because Bayesianism assigns probabilities to uncertain propositions, not to random events. Michael Hardy 00:27, 21 Dec 2004 (UTC)

I don't think this is what the author of the page intended.

I'm the author and I'm a Bayesian. Deal with it. You have simply misunderstood. Michael Hardy 00:27, 21 Dec 2004 (UTC)

The difference between the Bayesian approach and the classical approach is not that they obtain different answers to the same mathematical problem. It is that they set up different mathematical problems for the same practical problem. The Bayesian approach allows creating a stochastic model for the process that picks a value that may have been selected in the past.

No. "In the past" misses the point completely. Michael Hardy 00:27, 21 Dec 2004 (UTC)

Classical statistics does not allow creating such a model for certain parameters that are already realized. (However it does allow probability models for things like sample means, which may, in fact, have been already realized by the time the statistician is reading about a given classical approach.)

I think there are two better ways to convey why the usual misinterpretation of classical confidence intervals is wrong. First we may appeal to the fact that "conditional probabilities" are different from "unconditional probabilities". As an example of this, suppose we select a value of a random variable X from a normal distribution with fixed, but unknown, mean M and fixed, but unknown, standard deviation Sigma. What is the probability that X is equal to or greater than M? The answer is 0.5.

What is the conditional probability that X is equal to or greater than M, given that X > 100.0? We can no longer give a numerical answer: the value now depends on where the unknown M lies relative to 100. What is the probability that X is equal to or greater than M given that X = 103.2? We cannot give a numerical answer for this conditional probability either.
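
To make the difficulty explicit,

\Pr(X \ge M \mid X > 100) = \frac{\Pr(X \ge M \text{ and } X > 100)}{\Pr(X > 100)},

which works out to 1 when M \le 100 and to (1/2)/\Pr(X > 100) when M > 100; either way the value depends on the unknown M (and Sigma), so no single number can be attached to it.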

This impasse is a consequence of the mathematical formulation used. If we had set the problem up so that there was a prior distribution for the mean, then we might find an answer, or at least say that an answer different from 0 or 1 exists. As the problem is stated, we can only say that the above probabilities have unknown values and that each value is either 0 or 1.

People find it counterintuitive that it can be possible to answer a question and then, after being given more information, impossible to answer it. This is the paradoxical sensation that conditional probabilities produce. Confidence intervals are just one example.

A second way of explaining the misinterpretation is the phrase "The reliability of a reporting process is different from the reliability of a specific report". The "confidence" of classical confidence intervals is a number which describes the reliability of the sampling process with respect to producing intervals that span the True value of the parameter in question. This will be different from the "confidence" we can assign to a specific report, a numerical interval.

As a simple example: Suppose we have a not-necessarily-fair coin. (Pretend it is bent and deformed.) A lab assistant takes the coin into a different room and flips it. He writes the actual result ("H" or "T") down on two slips of paper. Then he writes a False result down on one slip of paper. (So, if the coin lands heads, he creates two slips of paper which say "H" and one that says "T".) He picks a slip of paper at random and gives it to us.

What is the probability that the slip of paper we receive contains the correct result? The answer is 2/3. This is the reliability of the reporting process.

If the paper says "H", what is the probability that the coin landed heads? The answer to this is not necessarily 2/3. This is a question about the reliability of a specific report. Let p be the probability that the coin landed heads. Then if p is small, the probability that the report is true is smaller than 2/3. This can be shown by computing the conditional probability. (I am sure that readers who refuse to accept the correct answer to the Monty Hall problem can start a lively debate about this example too!)
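
Spelling out that computation (writing p for the probability of heads):

\Pr(\text{heads}\mid\text{slip says H}) = \frac{\tfrac{2}{3}\,p}{\tfrac{2}{3}\,p + \tfrac{1}{3}\,(1-p)} = \frac{2p}{p+1},

which equals 2/3 only when p = 1/2 and falls below 2/3 whenever p < 1/2.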

If we refuse to admit a probability model for the coin being tossed then all we can say is that the report of "H" is either True or False. There is no probability of 2/3 about it. The probability that it is heads is unknown but either 0 or 1. People can object to such a statement by saying things like "If the weather person says there is a probability of 50% that it rains, it either rains or doesn't. This does not contradict the idea of probability." The reply to this is that the weatherperson, by giving a probability, has assumed that there is a probability model for the weather. Unless the statement of a mathematical problem admits such a model, no probability other than 0 or 1 can be assigned to its answer.

If we rewrite the above example so that the lab assistant is opening a sealed container which contains either Hydrogen or Thorium, then perhaps the non-probabilistic model comes to mind before the one that admits probability.

Another topic is whether an entry on "confidence intervals" should treat only frequentist confidence intervals, since there is such a thing as a Bayesian confidence interval. The correct interpretation of a Bayesian confidence interval sounds like the misinterpretation of the frequentist confidence interval.

For example, suppose the practical problem is to say what can be deduced from a random sample of the weights of 10-year-old children if the sample mean is 90.0 and the standard deviation is 30. A typical frequentist approach is to assume the population is normally distributed with a standard deviation that is also 30.0, and that the population mean is a fixed but unknown number. Then, ignoring the fact that we have a specific sample mean, we can calculate the "confidence" that the population mean is within, say, plus or minus 10 lbs of the sample mean produced by the sampling process.

But we can also take a Bayesian approach by assuming that the Natural process that produced the population picked the population mean at random from a uniform distribution from, say, 40 lbs to 140 lbs. One may then calculate the probability that the population mean is in the interval from 90 − 10 lbs to 90 + 10 lbs. This number is approximately the classical answer for "confidence". Only intervals outside or on the edge of the interval 40 to 140 lbs produce answers that differ significantly from the frequentist "confidence". These calculations can be done with the same table of the Normal Distribution as frequentists use.
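
As a sketch of that Bayesian calculation (writing SE for the standard error of the sample mean, which the example leaves unspecified): with a flat prior on 40 to 140 lbs, the posterior for the population mean is just the normal likelihood renormalized over that interval, so for a sample mean of 90, well inside the interval,

\Pr(80 < \mu < 100 \mid \bar{x} = 90) \approx \Phi(10/SE) - \Phi(-10/SE),

which is numerically the same figure the frequentist computation yields.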

The fact that Bayesian confidence intervals on uniform priors often give the same numerical answers as the usual misinterpretation of frequentist confidence intervals is very interesting.


I have not yet read all of the anonymous comments above, but I will comment on a small part:

A person may say "Suppose that when Nature created the planet Neptune, this process selected a weight for the planet from a uniform distribution from ... to ....". This, of course, is a Bayesian approach.

That is wrong. That is not the Bayesian approach. Rather, it is a frequentist parody of the Bayesian approach. The mathematics may be identical to what is done in the Bayesian approach, but the Bayesian approach does not use the mathematics to model anything stochastic; rather, it uses it to model uncertainty by interpreting probabilities as degrees of belief rather than as frequencies. Bayesianism is about degree-of-belief interpretations of mathematical probability. Michael Hardy 19:09, 7 Sep 2003 (UTC)

Bayesians vs Flaming Bayesians

Isn't anyone who is willing to assume a prior distribution qualified to be a Bayesian?

Certainly not. A frequentist may use a prior distribution when the prior distribution admits a frequency interpretation. Michael Hardy 00:33, 21 Dec 2004 (UTC)

Must it be done with a particular interpretation in mind? Doesn't someone who assumes a probability model use subjective belief?

Not necessarily. Michael Hardy 00:33, 21 Dec 2004 (UTC)

Is saying "I assume the following probability model for the process that created Neptune: ..." less or more subjective than saying "My subjective belief for the probability of the parameters of Neptune is given by the following distribution..."? The former might be a modest kind of Bayesian, but I think it is a good way to introduce prior distributions to a world that perceives frequentist statistics to somehow be "objective" and Bayesian statistics to be "subjective".

Frequentist statistics is willing to assume a probability model for the sample parameters but unwilling to assume a probability model for the population parameters in the typical textbook examples of confidence intervals and hypothesis testing. But a frequentist analysis of something like "Did the crime rate in Detroit remain the same in 2002 as it was in 2001?" will be analyzing (here in the year 2003) events (the "sample" realization of a number of crimes from some imagined distribution or distributions) that have already happened. The fact that something (the reported numbers of crimes) already has a True and realized value does not prevent a frequentist from computing probabilities, since he has assumed a probabilistic model for the sample parameters. So I don't think the fact that there is only one planet Neptune is a pro-frequentist or anti-frequentist fact. The mathematical interpretation of a given type of confidence interval depends on the mathematical problem that is stated. The same physical situation can be tackled in different ways depending on what mathematical model for it is picked. If I want to be a flaming Bayesian then I announce that my prior represents my subjective belief. But there are less blatant ways to be subjective. One of them is to say that I have assumed a simplified probability model for the creation of Neptune's parameters.

The only place where I see a difference between the modest Bayesian and the flaming Bayesian is in the use of "improper priors". These are "mathematical objects" which are not probability distributions, but we compute with them as if they were. And not all flaming Bayesians would go for such a thing.

I didn't mean to be anonymous, by the way. I'm new to this Wikipedia business and since I was logged-in, I thought my name would be visible. Also, my thanks to whoever or whatever fixed up the formatting of the comments. I pasted them in the miserable little text window from a *.txt file created by Open Office and they looked weird but I assumed the author of the page might be the only one who read them and I assumed he could figure them out! I notice these comments which were diligently typed in the text window without accidental carriage returns also look wrong in the preview. The lines are too wide.

Stephen Tashiro

PS: I think some of your comments have considerable merit, even though you've horribly mangled the definition of Bayesianism. Michael Hardy 00:33, 21 Dec 2004 (UTC)

The legibility problem resulted from your indenting by several characters. When you do that, the text will not wrap. Michael Hardy 12:24, 9 Sep 2003 (EDT) PS: More later.... Michael Hardy 12:24, 9 Sep 2003 (EDT)

Posing the problem well

I think the issues in the article (and in my previous comments) would be clarified by distinguishing between 1) the disagreements over how a statistical problem is posed as a mathematical problem and 2) the correct and incorrect interpretation of the solutions to the mathematical problems. If we assume the reader is already familiar with the different mathematical problems posed by Bayesian and frequentist statistics, then he may extract information about 2) from it. But a reader not familiar with any distinction in the way the problem is posed may get the impression that mathematical results have now become a matter of opinion. It seems to me that the frequentists pose one problem: the parameter of the distribution is in a fixed but unknown location; what is the probability that the sampling process produces a sample parameter within delta of this location? The Bayesians pose another problem: there is a given distribution for the population parameter; given the sample parameter from a specific sample, what is the probability that the population parameter is within delta of this location? The disagreements in interpretation involve different ways of posing a practical situation as a mathematical problem. They don't involve claims that the same well-posed mathematical problem has two different answers, or does or does not have a solution. If there are indeed Bayesians or frequentists who say that they will accept the mathematical problem as posed by the other side but interpret the answer in a way not justified by the mathematics, then this is worthy of note.

Question about one equation

In this equation:

S^2=\frac{1}{n-1}\sum_{i=1}^b\left(X_i-\overline{X}\,\right)^2.

The term "b" isn't explained. I would've thought it should be "n"; we're taking the sum of the squares of the differences from the mean of all the members of the sample, and the sample has already been defined as X1, ..., Xn. If this isn't a typo, would someone be able to explain it in the article? JamesMLane 07:48, 20 Dec 2004 (UTC)

Just a typographical error. I've fixed it. Michael Hardy 03:11, 21 Dec 2004 (UTC)
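
(For reference, the corrected equation is

S^2=\frac{1}{n-1}\sum_{i=1}^n\left(X_i-\overline{X}\,\right)^2.)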

Credible Intervals

I'm not happy with the following...

...if we are Bayesians rather than frequentists, can we then say we are 90% sure that the mass is between 82 − 1.645 and 82 + 1.645? Many answers to this question have been proposed, and are philosophically controversial. The answer will not be a mathematical theorem, but a philosophical tenet.

It obscures the fact that Bayesians have a good solution to this problem (namely, credible intervals), which

a) do not have a counter-intuitive interpretation, and
b) take the information in the prior into account, so the calculated interval is an interval estimate based on everything you know.

It is not clear why the author considers credible intervals (a mainstream Bayesian concept) to be philosophically controversial. Blaise 19:27, 1 Mar 2005 (UTC)

I wrote those words. They are NOT about Bayesian credible intervals; obviously they are about CONFIDENCE INTERVALS. Bayesian credible intervals are another matter. Michael Hardy 22:13, 1 Mar 2005 (UTC)

Incomprehensible

In regard to the sentence in the article which states...

"Critics of frequentist methods suggest that this hides the real and, to the critics, incomprehensible frequentist interpretation which might be expressed as: "If the population parameter in fact lies within the confidence interval, then the probability that the estimator either will be the estimate actually observed, or will be closer to the parameter, is less than or equal to 90%".

...the statement is indeed incomprehensible but it is not a valid interpretation, frequentist or otherwise.

Speaking of incomprehensible... isn't Wikipedia supposed to be accessible to all? I just opened up this article and was immediately turned off by the math-heavy first sentence. I am not being anti-intellectual - obviously, there is a (large) place for math (or symbols, etc.) in an article such as this. But I think a simple layman's explanation ought to come first.

Many thousands of Wikipedia mathematics articles are comprehensible only to mathematicians, and I don't think that's a problem. This article is certainly comprehensible to a broader audience than mathematicians, but yes, in the case of this article, that could be made still broader. But it's not appropriate to insist we get rid of all the tens of thousands of math articles that cannot be made comprehensible to lay persons. Michael Hardy 23:15, 27 December 2005 (UTC)

Re: Speaking of incomprehensible... I agree, lay persons cannot get a quick idea of math concepts at Wikipedia, but they should be able to, and stating that opinion is not the same as "insist[ing] we get rid of all the tens of thousands of math articles that cannot be made comprehensible to lay persons." There should be a higher, more general discussion of the concept first; then you math geeks can duke it out over the nitty-gritty stuff afterwards. Would someone mind writing a more general explanation of CI? Maybe for someone like me, who took intro statistics in junior college but forgot most of it. Thanks

I am not sure concerning the so-called frequentist interpretation of the Neptune example. I would say that if I tried to determine the weight of the planet 100 times, 95% of my intervals would contain the true weight. Accordingly, if I pick one of those 100 intervals randomly, my chances of getting one covering the true weight are 95%. At least with my current understanding, I would interpret that as a 95% probability that the interval contains the parameter. What is wrong with that, and what is the practical conclusion from a 95% credible interval? harry.mager@web.de

I wrote the following paragraph (starting with "I share Harry...") and after a few months I realized it is nonsense, and that the article is correct. Nevertheless I will maintain this paragraph so that readers who have the same illusions as me know that they are wrong. Nagi Nahas

I share Harry's opinion that the interpretation "If the population parameter in fact lies within the confidence interval, then the probability that the estimator either will be the estimate actually observed, or will be closer to the parameter, is less than or equal to 90%" is wrong. I do not want to modify it myself as I know that Michael is more knowledgeable in this field than I am, but I hope he will look into this. I think the following interpretation is correct: the 90% confidence interval is the set of all possible values of the actual parameter for which the probability that the estimator will be the estimate actually observed, or a more extreme value, is 90% or more. I agree with Harry's interpretation: "if I tried to determine the weight of the planet 100 times, 95% of my intervals would contain the true weight." I have some reservations about the statement "At least with my current understanding, I would interpret that as a 95% probability that the interval contains the parameter." The reason for my objection is the ambiguity of the statement: it can refer to two different experiments:

1) You fix a certain value of the parameter, take several samples of the distribution, and count how many times the true value of the parameter falls within the confidence interval. In that case Harry's interpretation is correct.
2) Same as 1), but using a different value of the parameter each time. That's totally wrong, because you'll then have to specify the probability distribution you're taking the parameter values from, and in that case you're doing Bayesian stuff.

Nagi Nahas. Email: remove_the_underscores_nnahas_at_acm_dot_org.
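
Harry's repeated-sampling reading (experiment 1) above) is easy to check numerically. Below is a minimal Perl sketch; the values of $mu, $sigma, $n and the trial count are arbitrary choices for illustration, not from the discussion. It repeatedly draws samples from one fixed normal distribution, forms the usual 95% interval around each sample mean (treating sigma as known), and counts how often the fixed true mean is covered.

use strict;
use warnings;
use constant PI => 3.1415926536;

my ($mu, $sigma, $n, $trials) = (100, 15, 25, 10_000);   # arbitrary example values
my $z = 1.959964;   # two-sided 95% point of the standard normal
my $covered = 0;

for (1 .. $trials) {
    my $sum = 0;
    for (1 .. $n) {
        # Box-Muller transform: two uniforms give one standard normal draw
        my $u1 = rand() || 1e-12;    # avoid log(0)
        my $g  = sqrt(-2 * log($u1)) * cos(2 * PI * rand());
        $sum  += $mu + $sigma * $g;
    }
    my $mean = $sum / $n;
    my $half = $z * $sigma / sqrt($n);   # sigma treated as known
    $covered++ if $mean - $half <= $mu && $mu <= $mean + $half;
}
printf "fraction of intervals covering the true mean: %.3f\n", $covered / $trials;

The printed fraction should come out near 0.95; the 95% describes the procedure, not any single interval it produces.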

How to calculate

I have normally distributed observations xi, with i from 1 to N, with a known mean m and standard deviation s, and I would like to compute the upper and lower values of the confidence interval at a parameter p (typically 0.05 for 95% confidence intervals.) What are the formulas for the upper and lower bounds? --James S. 20:30, 21 February 2006 (UTC)

I found http://mathworld.wolfram.com/ConfidenceInterval.html but I don't like the integrals, and what is equation diamond? Apparently the inverse error function is required. Gnuplot has:

     inverf(x)         inverse error function of x

--James S. 22:00, 21 February 2006 (UTC)

The simplest answer is to get a good book on statistics, because they almost always have values for the common confidence intervals (you may have to look at the tables for Student's t-distribution in the row labelled n = infinity). Confusing Manifestation 03:57, 22 February 2006 (UTC)
The question was how to calculate, not how to look up in a table. --171.65.82.76 07:36, 22 February 2006 (UTC)
In which case the unfortunate answer is that you have to use the inverse error function, which is not calculable in terms of elementary functions. Thus, your choices are to look up a table, use an inbuilt function, or use the behaviour of erf to derive an approximation such as a Taylor series. (Incidentally, I think equation "diamond" is meant to be the last part of equation 5.) Confusing Manifestation 16:12, 22 February 2006 (UTC)
Any general statistical package worth its salt (including free ones such as R) will let you calculate the inverse error function mentioned above for any p, as will most spreadsheets (e.g. see the NORMSINV function in MS Excel). -- Avenue 12:07, 23 February 2006 (UTC)
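
Put as a single formula, using the notation of the question (and assuming the interval wanted is the usual one for a mean, with m read as the sample mean): the bounds are

m \pm z_{1-p/2}\,\frac{s}{\sqrt{N}},

where z_{1−p/2} is the standard normal quantile at 1 − p/2, about 1.960 for p = 0.05; this is exactly the number NORMSINV in Excel or qnorm in R returns.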

Perl5's Statistics::Distributions module has source code for

 $u=Statistics::Distributions::udistr (.05);
 print "u-crit (95th percentile = 0.05 sig_level) = $u\n";

... from which I posted a wrong solution. I should have been using:

# This routine returns the upper-tail probability P(Z > x) for a standard
# normal Z, using a polynomial approximation for |x| < 1.9 and a
# continued-fraction expansion for 1.9 <= |x| <= 100.  It relies on the
# module's PI constant, i.e. something like:
use constant PI => 3.1415926536;

sub _subuprob {
    my ($x) = @_;
    my $p = 0;                # remains 0 when |x| > 100
    my $absx = abs($x);

    if ($absx < 1.9) {
        # polynomial approximation, raised to the -16th power
        $p = (1 +
              $absx * (.049867347
              + $absx * (.0211410061
              + $absx * (.0032776263
              + $absx * (.0000380036
              + $absx * (.0000488906
              + $absx * .000005383)))))) ** -16 / 2;
    } elsif ($absx <= 100) {
        # continued-fraction expansion of the normal tail
        for (my $i = 18; $i >= 1; $i--) {
            $p = $i / ($absx + $p);
        }
        $p = exp(-.5 * $absx * $absx)
             / sqrt(2 * PI) / ($absx + $p);
    }

    $p = 1 - $p if ($x < 0);  # symmetry for negative x
    return $p;
}

Would someone please compare the Perl code output to a table?

Thanks in advance. --James S. 18:16, 25 February 2006 (UTC)
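
One quick way to do it, assuming the _subuprob routine quoted above (and its PI constant) is in scope: evaluate it at a few standard points and compare with the upper-tail column of any normal table.

foreach my $x (0, 1.0, 1.645, 1.96, 2.576) {
    # _subuprob returns the upper-tail probability P(Z > x)
    printf "P(Z > %.3f) = %.5f\n", $x, _subuprob($x);
}
# a standard table gives approximately:
# 0.50000, 0.15866, 0.05000, 0.02500, 0.00500

If the printed values agree to four decimal places or so, the code matches the table.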

Concrete practical example

Would the use of critical values instead of the t-distribution be better here? --James S. 08:13, 27 February 2006 (UTC)

... Not?

This article is not structured well for reference or orientation.

The text frequently and colloquially digresses into what the topic is NOT, which limits the reference/skim value (more separation needed). It's not that the latter points aren't valid -- they are just not clearly nested within the context of what is ultimately a positive statement/definition/reference entry.

I feel like this article could be distilled by 15-20% with refactoring (e.g. two sections after the intro: what CIs ARE and what CIs are NOT; clearer separation of the negative points; emphasising bullets), and removal of the conversational tone that adds unessential bulk.

I am not an expert in this field and do not feel qualified to make these mods.

Confidence interval is not a credible interval

This whole article appears to blur the distinction between a (frequentist) confidence interval and a (Bayesian) credible interval. E.g., right at the start, it claims that a confidence interval is an interval that we have confidence that the parameter is in, and then again in the practical example: "There is a whole interval around the observed value 250.2 of the sample mean with estimates one also has confidence in, e.g. from which one is quite sure the parameter is in the interval, but not absolutely sure...Such an interval is called a confidence interval for the parameter". This is surely wrong. Anyone care to defend it? Jdannan 05:28, 6 July 2006 (UTC)

OK, no-one replied, so I've had a go at clarifying the confidence/credible thing. Please feel free to improve it further. Jdannan 06:54, 10 July 2006 (UTC)

95% almost always used?

The statement near the beginning, "In modern applied practice, almost all confidence intervals are stated at the 95% level", needs a citation. Although 95% CIs are common, especially in textbooks and in certain fields, 95% is certainly not the most common level of confidence in all fields, especially in some engineering fields. There is really no reason for this statement, especially in the introduction. Perhaps a separate section could be added in the body that discusses what levels of confidence are commonly used in various fields. —The preceding unsigned comment was added by 24.214.57.91 (talkcontribs) .

Misleading CI chart

The removed chart's caption read: "The diagram with confidence intervals, ready for publication in a biological research paper. The differences on the right are not significant."

I've removed this chart from the article. I feel the caption is misleading, because the implication is that you can tell directly from the diagram whether the differences on the right are statistically significant or not. But you can't; this depends instead on whether the confidence interval for the difference includes zero. (If the comparison was for two independent groups of equal sizes with equal variances, then the confidence interval for the difference would be roughly 1.4 times larger than the confidence interval for each group. Judging by eye, any difference here would then be only marginally significant.)
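
(For the record, the 1.4 figure is just \sqrt{2}: with equal standard errors SE in the two groups, the standard error of the difference is \sqrt{SE^2 + SE^2} = \sqrt{2}\,SE \approx 1.41\,SE.)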

I think the diagram could lead readers to believe that overlapping confidence intervals imply that the difference is not significant, which seems to be a common fallacy. -- Avenue 03:47, 3 August 2006 (UTC)

Help a Psychology Grad Student

My neighbor is working on her thesis (for her PhD, iiuc) and she mentioned that (I forget exactly how she phrased it) she essentially had to figure out how many times she had to run her experiment to determine that the result had something like a 95% confidence level of being correct.

(Actually, the conversation occurred about a year ago.) I finally thought of checking Wikipedia to see if the calculation was here or was linked from here. If it is here, I don't recognize it. I'm guessing (hoping?) that it is a fairly simple calculation, although maybe you have to make some assumptions about the (shape of the probability) distributions or similar.

Can anyone point to a short article which explains how to do that, along with the appropriate caveats (or pointers to the caveats)? (I'm assuming that by now she has solved her problem, but I am still curious -- I thought that 35 years or so ago I had made a calculation like that in the Probability and Statistics course I took, but couldn't readily find it when re-reviewing the textbook.)

Rhkramer 16:01, 3 September 2006 (UTC)

Maybe your neighbor wanted to know how many observations are needed to get a confidence interval no longer than a specified length? Or maybe what she was concerned with was hypothesis testing? Anyway, it probably is a pretty routine thing found in numerous textbooks, but your question isn't very precisely stated. Michael Hardy 01:51, 4 September 2006 (UTC)
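
For the first possibility (an interval no longer than a specified length), the standard back-of-envelope answer, assuming a known standard deviation sigma and a desired half-width E, is

n \ge \left(\frac{z_{1-\alpha/2}\,\sigma}{E}\right)^2,

so, for example, a 95% interval with half-width one quarter of sigma needs n \ge (1.96 \times 4)^2 \approx 62 observations.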

Formula used in the definition of a CI

In the article, \Pr(U<\theta<V|\theta)=x is used to define a CI. But what is θ in this? Formally, the thing behind the bar (|) must be an event, but what event is "θ"? Would be glad to receive some clarifications. hbf 22:35, 6 December 2006 (UTC)

No, it's not necessary for the thing behind the bar to be an event. Certainly the concept of conditioning in probability as usually first presented to students does require that. But there are more general concepts. One can condition on the value of a random variable and get another random variable. And one can even condition on the value of a parameter that indexes a family of probability distributions. That's what's being done here. Statisticians do this every day; probabilists less often. Michael Hardy 23:57, 6 December 2006 (UTC)