Talk:Bayesian probability/Archive 2

From Wikipedia, the free encyclopedia

This is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.



Gillies on only fair betting quotient

I have a suggestion: first of all, could Logicus please provide a quote from Gillies about the betting quotient and universal hypotheses. (Sorry if you already did. This page is rather long.) The quote is probably just for us to look at on this talk page. Then, my suggestion is that we have the article say something like "Gillies states that ...", so that it's attributed to Gillies, not to Wikipedia. I still have a number of problems with the statement [1] about the universal hypothesis. How does one win a bet -- by convincing a human judge of the truth of a proposition? Or by it being objectively true? Or what? and, for whom is it fair? It seems to me that it's unfair for the other player, who has to put some money down but may never get it back. Those are on top of the concerns I've raised already, which I'm not convinced have been adequately addressed. --Coppertwig 23:10, 31 August 2007 (UTC)

Logicus to Coppertwig: I don't understand your problem here. There is no attribution to Wikipedia; rather, Wikipedia is just reporting a recognised problem in the literature, and gives references for such. And your own original research views or problems with this problem are simply not relevant here. There are far more dubious wholly unsourced claims in this article to pick on than Logicus's genuine additions. But I will look for a quote for you. By the way, there is also the addition about the problem of fallibilism to come. --Logicus 14:53, 1 September 2007 (UTC)

I appreciate your taking the time to look for a quote for me. If you wish to challenge unsourced claims, I believe the usual method is to put {{fact}} after them in the article, which makes a footnote like this [citation needed], and then wait a period of time -- I think several weeks at least is usually expected -- and then delete them if nobody has provided sources. --Coppertwig 17:21, 1 September 2007 (UTC)
Coppertwig, if you really must put something from Gillies in the article, I suggest the following footnote: "e.g. see Gillies 2000, p55: "My own view is that betting does give a reasonable measure of the strength of a belief in many cases, but not in all. In particular, betting cannot be used to measure the strength of someone's belief in a universal scientific law or theory." " as a footnote to the very first sentence of my addition.--Logicus 18:09, 4 September 2007 (UTC)

Logicus, why is Gillies' personal view on betting odds, cited from a 2000 publication, relevant to the history of Bayesian probability? If this objection is so critical to Bayesianism's history, surely there would be some citation in an older publication by a more prominent author. -- 158.83.15.85 14:33, 18 September 2007 (UTC)

Logicus to 158.83.15.85: Thank you for this anonymous mistaken comment. Please note that contrary to your assumption, I have never claimed Gillies' views on betting odds are relevant to "the history of Bayesian probability" nor to "Bayesianism's history", nor that his objection is critical to Bayesianism's history as you variously suggest. The point at issue here is neither about 'Bayesian probability' nor about 'Bayesianism', and nor indeed about their histories. Rather it is ONLY about the Bayesian PHILOSOPHY OF SCIENCE, that is, a specific APPLICATION of Bayesian epistemic probability to the specific domain of scientists' beliefs and reasoning about scientific propositions, an application listed amongst other such specific applications as 'e-mail spam filtering' in that specific section of the article called 'Applications'. As I understand it, this application to philosophy of science is a relatively novel application of Bayesianism that only really took off in the 1990s. SO I REPEAT, contrary to what you and some other Wiki editors such as Coppertwig, Jefferys, BenE etc sometimes mistakenly presume, the subject at issue here is not the much older topic of 'Bayesian probability', that is, a specific interpretation of the meaning of 'probability' in the probability calculus to mean 'strength of belief that a proposition is true', but rather only about the specific application of that general interpretation of probability and the probability calculus to the domain of scientists' beliefs about the propositions of science to try and explain such as their acceptance and rejection of them.
As for Gillies, his view that the probability of all universal laws is zero is relevant to Bayesian PHILOSOPHY OF SCIENCE at least because (i) he is a professional academic philosopher of science (ii) he was a Cambridge double first maths wrangler, (iii) was a PhD student in the philosophy of probability of one of the most brilliant philosophers of science and of maths of the 20th century, Imre Lakatos (who also incidentally maintained the probability of all scientific laws is zero), (iv) is an ex President of the British Society for the Philosophy of Science and (v) part of his 2000 book on 'Philosophical theories of probability' does deal with Bayesian philosophy of science, which only took off in the previous decade.
As for your surmise, when appropriately re-interpreted, that if this objection is critical to Bayesian philosophy of science then "surely there would be some citation in an older publication by a more prominent author.", it is indeed correct, as you might have discovered yourself had you bothered to read the article's footnote references for this objection and the literature listed in the article's References and done some thinking before putting pen to paper, or rather fingers to keyboard. For the saleswise more prominent authors Howson & Urbach discussed it on pages 72 and 263-4 in their 1989 'Scientific Reasoning: The Bayesian Approach' listed in the article's References , as mentioned in the article's footnote. Gillies' specific views on this objection were put into the article simply because it was specifically his views Coppertwig requested I provide, as you will see from the heading of this particular Talk section, although I have no idea why Coppertwig picked on Gillies. And nor, I suspect, does Coppertwig.
If you wish to trawl through the literature in the article's References for further earlier citations of this objection, please feel free to do so. But please note that whether or not particular authors agree or disagree about the validity of this objection is irrelevant to the issue of correcting this article's highly biased pro-Bayesian viewpoint, which mentions hardly any of the many problems of Bayesianism and Bayesian philosophy of science, by at least mentioning some of them, and thus giving it a somewhat more NPOV. --Logicus 14:46, 23 September 2007 (UTC)
I can't easily look through the literature. For example, I checked the local public library for the Gillies book and it doesn't have it. However, I have several requests. I think you misunderstood an earlier request I made, and that request still stands. I was not asking for a quote from Gillies for the purpose of inserting the quote into the article. Rather, because you want to insert into the article a statement about the only fair betting quotient in certain circumstances being zero, and since I can't easily check the reference you attached to the statement, I asked you to present a quote on this talk page as a substitute for me looking in the book myself. Different people interpret things differently, so I wanted to check that whatever in that book you're interpreting as supporting that statement would also be interpreted by myself (and others) as supporting that statement. Although you've provided a quote from Gillies, it does not, in my opinion, indicate that Gillies believes that the only fair betting quotient in some particular situation is zero; for example, it does not include the words "fair" or "zero" or synonyms of them in my opinion.
I would like to ask you, Logicus, to do six things: each of the three following requests applied to each of the two following statements you want to insert into the article: the statement about the only fair betting quotient in certain circumstances being zero, and the statement about the probability of scientific laws being zero. For each of these two, would you please:
  1. Since you presumably have the books at hand and I may not easily be able to access them, would you please provide on this talk page for the convenience of myself and other editors here a quote from the book that makes the statement so that we can verify that, in our opinion, the statement made in the book is essentially the same as the statement being made here. (I'm not proposing that the quote be included in the article. Possibly I or someone else might later propose including the quote in the article, but that is not the purpose of this request.)
  2. Please explain the relevance of the statement to "Bayesian probability", the topic of this article.
  3. When inserting the statement in the article, rather than asserting the statement, assert that a certain book has asserted it, perhaps like this: "Gillies (2000) states that ...".
Thank you for considering my requests. Of course you don't have to do them, but doing them successfully may lessen my opposition to the insertion of those statements.
Since Gillies is not Bayes, I wonder what the relevance of Gillies' opinion is here. Similarly for Popper. Perhaps statements by these people can only be considered relevant to this article if they mention "Bayes" or "Bayesian" in the context. --Coppertwig 15:32, 23 September 2007 (UTC)
I'm not convinced that the idea about the only fair betting quotient being zero is a previously-published idea, so I've edited it. Also, it may only be Gillies' opinion that there is a problem, so this page should not state that there is a problem -- maybe state that Gillies says there is a problem, or (as in my edit) that there may be a problem. Perhaps some of the sentences that followed it also need to be deleted or modified for similar reasons. --Coppertwig 01:14, 1 October 2007 (UTC)
I'm with you here.--BenE 14:16, 1 October 2007 (UTC)
Logicus to Coppertwig of 1 October: Would you please restore the original text before your edit of 1 October, and instead post your proposed change and its justification here on the Talk page first for critical discussion before any implementation. Also note your edit introduces a fatal omission. Let us see if you can discover what it is by yourself.
Also note that your view that the claim about the only fair betting quotient being zero was not published before Gillies is mistaken. It was also in Howson & Urbach 1989, the basic teaching text in 'Bayesian' philosophy of science listed in the article's references, as I pointed out. Please pay attention and stop intervening in issues in a literature and subject, philosophy of science, you are patently unfamiliar with and not competent in. Your personal problems of lack of access to the basic literature are surely sufficient to debar you from commenting. Why should anybody suffer the burden of convincing you of anything ? Are you an employed editor of Wikipedia ?
However, I should say I am not wholly opposed to the spirit of your proposal of only saying some people say there is a problem, and will consider a modification. --Logicus 12:43, 5 October 2007 (UTC)
As it seems Gillies is not a Bayesian, and that you are only quoting a general encyclopaedia, this literature has little authority on the subject. I don't have the encyclopaedia here, but one of the quotes you put on the page seems patently false:
See p50-1, Gillies 2000 "The subjective theory of probability was discovered independently and at about the same time by Frank Ramsey in Cambridge and Bruno de Finetti in Italy."
People were debating the 'subjective' theory well before de Finetti; see here. This doesn't give much credibility to Gillies. —Preceding unsigned comment added by BenE (talkcontribs) 14:55, 5 October 2007 (UTC)

I removed the following, which is in any case too specific. It might go on a page about De Finetti since it only applies to his specific flavor of Bayesianism.

"This problem of the Bayesian philosophy of probability becomes a fundamental problem for the Bayesian philosophy of science that scientific reasoning is subjective Bayesian probabilist, which thereby seeks to reduce scientific method to gambling, but some regard it as solvable.[1] But it is also noteworthy that by 1981 De Finetti himself came to reject the betting conception of probability.[2]"--BenE 15:45, 5 October 2007 (UTC)

Logicus to Coppertwig of 1 October: How about the following quote as evidence of recognition in the literature that the positive undecidability of universal hypotheses poses a fundamental problem for the degree of belief as betting-quotients interpretation of subjective probability ?

"...critics [of the standard Dutch Book argument] have not been slow to point out that the postulate that degrees of belief entail willingness to bet at the odds based on them is vulnerable to some telling objections. One is that there are hypotheses for which the wise choice of odds bears no relation to your real degree of belief: if 'h' is an unrestricted universal hypothesis over an infinite domain, for example, then while it may in certain circumstances be possible to falsify 'h', it is not possible to verify it. Thus the only sensible practical betting quotient to nominate on 'h' is 0; for you could never gain anything if your betting quotient was positive and 'h' was true, whilst you would lose if 'h' turned out to be false. Yet you might well believe that 'h' stands a non-zero chance of being true. " [p90. Howson & Urbach 1993]

Please don't come back with the standard positivist rap about omniscient oracles as supposedly solving this problem, which is logically irrelevant to the issue of whether it is recognised in the literature as a fundamental problem requiring solution, whether or not you personally believe such alleged solutions are valid or invalid.--Logicus 19:01, 13 October 2007 (UTC)
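The arithmetic behind the Howson & Urbach passage quoted above can be sketched in a few lines of Python (a hypothetical illustration, not drawn from the source; the function name and stake figures are mine). Betting quotient q on an unrestricted universal hypothesis h means paying q*stake up front: if h is true the bet can never be settled in your favour, because h is unverifiable, and if h is false the payment is simply lost.

```python
def expected_gain(q, stake, p_true):
    """Expected gain from betting on an unverifiable universal hypothesis h.

    You pay q*stake up front. If h is true, the payout can never be
    collected (h cannot be verified); if h is false, nothing further is
    lost beyond the payment. Either way the payment q*stake is gone.
    """
    win_if_true = 0.0      # payout never collectable
    loss_if_false = 0.0    # no loss beyond the up-front payment
    return p_true * win_if_true + (1 - p_true) * loss_if_false - q * stake

# Any positive betting quotient guarantees a loss, whatever you believe:
for q in (0.1, 0.5, 0.9):
    assert expected_gain(q, stake=100, p_true=0.7) < 0
# Only q = 0 avoids a sure loss:
assert expected_gain(0.0, stake=100, p_true=0.7) == 0.0
```

This is why the quoted passage says the only sensible practical betting quotient on h is 0, even for someone whose real degree of belief in h is non-zero.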

The problem of fallibilist philosophy of science for epistemic Bayesian probabilist philosophy of science

The following is proposed as an addition to the second paragraph of the ‘Applications’ section of the article. Its relative length is apparently required to overcome the difficulty some Wikipedia editors have in understanding the point at issue, and to answer their objections.

However a fundamental problem for all probabilist philosophy of science is posed by radical fallibilist philosophy of science which maintains all scientific laws are false and will be refuted and replaced by hopefully better false laws that will in turn be refuted and revised again and so on ad infinitum in a potentially endless series of false laws.F1 For insofar as scientists believe this radical fallibilist philosophy, as it seems most do nowadays F2, then according to the canons of the subjectivist Bayesian method according to which probabilities are assigned to propositions in proportion to strength of belief in their truth, they must therefore assign zero prior probability to all scientific laws since they believe them to be false.F3 But this would render probabilist epistemology practically inoperable, since by Bayes' Theorem all evidentially posterior probabilities must therefore also be zero, thus putting all laws on an epistemic par and so eliminating any way of choosing between them epistemically within probabilist epistemology.F4 Thus philosophers of science who maintain scientific reasoning is consistently probabilist must deny most scientists are radical fallibilists, or at the very least show some scientists believe their theories are true, or at least not definitely false, in order for their probabilist theory of scientific reasoning to have any valid domain whatever.F5

F1 [As Duhem expressed the key tenet of this philosophy "Thus, the struggle between reality and the laws of physics will go on indefinitely: to every law that physics may formulate, reality will sooner or later oppose a rude refutation in the form of a fact, but, indefatigable, physics will improve, modify, and complicate the refuted law in order to replace it with a more comprehensive law in which the exception raised by the experiment will have found its rule in turn." p177 Duhem's The Aim and Structure of Physical Theory, Athaneum 1962]

F2 [Even Bayesian statistician George Box has admitted: "All models are wrong; but some models are useful." (Bill Jefferys, would you please kindly provide the reference here for this Box quotation you gave ?)]

F3 [This is because a hypothesis that is refuted and thus falsified must be assigned zero probability in Bayesian epistemic probability theory: 'If a hypothesis h entails a consequence e, then P(h / ~ e) = 0. Interpreted in the Bayesian fashion, this means that h is maximally disconfirmed when it is refuted. Moreover...once a theory is refuted, no further evidence can ever confirm it, unless the refuting evidence or some portion of the background assumptions is revoked.' [p119, Howson & Urbach 1993] ]

F4 [Hence this problem is apparently fatal to the Bayesian theory of scientific method. For as Howson & Urbach admit, if it were correct that the prior probability of all unrestricted universal laws must be zero, "then that would be the end of our enterprise in this book" [p391 Howson & Urbach 1993], which is to demonstrate that scientific reasoning, and most especially its grounds for the acceptance and rejection of hypotheses, is subjective Bayesian probabilist reasoning. [p1] ]

F5 [For an example of some probabilist philosophers of science who do deny all scientists are radical fallibilists, see the desperate appeal to Einstein's (ironic?) dogmatic view on the truth of his GTR by Howson & Urbach on p394 of their 1993 Scientific Reasoning. But of course, one swallow does not a summer make.]
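The formal mechanism the proposed addition relies on, that a zero prior is absorbing under Bayes' theorem, so no evidence can ever raise it, can be checked with a few lines (illustrative numbers only; the function name is mine):

```python
def posterior(prior_h, likelihood_e_given_h, p_e):
    """Bayes' theorem: P(h|e) = P(e|h) * P(h) / P(e)."""
    return likelihood_e_given_h * prior_h / p_e

# A zero prior is absorbing: however favourable the evidence,
# the posterior stays at zero.
assert posterior(prior_h=0.0, likelihood_e_given_h=1.0, p_e=0.5) == 0.0

# By contrast, a non-zero prior can be raised by confirming evidence:
assert posterior(prior_h=0.2, likelihood_e_given_h=1.0, p_e=0.5) == 0.4
```

This is the sense in which assigning zero priors to all laws would put them on an epistemic par: every posterior is zero too, so the probabilities give no basis for choosing between them.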

The main stumbling block on the part of some Wikipedia editors such as Jefferys, Johnston, BenE, Coppertwig etc in understanding that scientists' belief in radical fallibilist philosophy of science is fatal to probabilist philosophy of science seems to be their refusal to accept that, according to the Bayesian philosophy of science literature and thus also in this article, Bayesian epistemic probability interprets 'probability' as 'strength of belief that a proposition is TRUE', whereby if a proposition is believed to be false it must therefore be assigned probability zero. For they do not contest, and indeed apparently agree, that scientists believe all scientific laws are false. But they seek to avoid the conclusion that they must therefore assign them probability zero by doing original research and illegitimately redefining 'probability' as 'strength of belief that a proposition is USEFUL for making novel predictions'. But since this is a non-Bayesian interpretation of probability, if they are right they are in effect demonstrating that scientific reasoning is not Bayesian probabilist.

LOGICUS 16 September 2007 —Preceding unsigned comment added by Logicus (talkcontribs) 14:45, 16 September 2007 (UTC)

I think the idea that all scientific laws have probability zero counts as "original research" under WP:NOR. I think that idea is not stated by any of the given sources but is a conclusion reached by Logicus, and furthermore that it is not generally accepted by scientists. Therefore, it should not be stated in the article -- unless a source can be found that states it, and then at most in could be mentioned in a quote or indirect speech, as in "so-and-so says that all such probabilities are zero," not "All such probabilities are zero" as if that's what Wikipedia is asserting. Probabilities are manipulated within mathematical frameworks in which certain sets of scientific laws are "assumed" to be true. Besides, the edit is too long and I disagree with the premise of the argument for including such a long quote. --Coppertwig 17:18, 16 September 2007 (UTC)
Coppertwig, please stop giving me your baloney! Do be a good fellow and go and read the literature, including the sources I give, where you should discover what you say is nonsense ! It is most definitely not Wiki-original research, whereas what you claim is. For example, it is well known that Popper maintained the probability of all laws must be zero, whether or not he was right or wrong. Please stop lecturing me on subjects about which you are either clearly ignorant or logically confused. Best wishes.

Logicus 18:14, 17 September 2007 (UTC)

The problems of Bayesian Philosophy of Science

The problems of Bayesian Philosophy of Science as distinct from those of the Bayesian Philosophy of Probability

Further to my comments of 11 October above in the 'What is probability' section, they do not really belong to this section on the concepts of probability. Maybe some confusion has arisen because BenE's critical comments of 20 September in this section commenced with his assertion that he held the same views as Bill Jefferys. But Jefferys' views and my debate with him concerned the philosophy of science, that is, the nature of scientific reasoning, and whether it is Bayesian probabilist or not. Thus it was reasonable to interpret BenE's statements as about the same issue, rather than about the philosophy of probability, which might be the issue he actually had in mind, even if unclear from his comments. As I have pointed out before to try and clarify this crucial distinction, one may hold a Bayesian philosophy of probability, but be a vehement anti-probabilist and thus anti-Bayesian in the philosophy of science, regarding it as utterly absurd that scientists' beliefs in the truth of theories obey the probability calculus or are even logically consistent.

Anyway, my point here is that the above discussion does not really belong to this section, but rather to a section on the problems of Bayesian philosophy of science, that is, the thesis that scientific reasoning is probabilist and Bayesian, not to be confused with theories about what is the best interpretation of the notion 'probability'. And so I copy it to another section devoted specifically to discussing the problems of Bayesian philosophy of science. This, for example, is the appropriate place to discuss whether the belief that all scientific laws are false, whereby they must be assigned probability zero when 'probability' is defined as 'strength of belief a proposition is true', is recognised as posing a fundamental problem for Bayesian and probabilist theories of scientific reasoning. Or what episodes in the history of science are recognised as being successfully accounted for by a Bayesian probabilist theory of scientific reasoning, such as 'the Copernican revolution', 'the anti-Cartesian Newtonian revolution', 'the Einsteinian revolution'.--Logicus 16:15, 12 October 2007 (UTC)

BenE's further comments: I'm going to add, since you keep calling me a fundamentalist for believing in Jaynes' theories, that I am far from the only one with these views.

There is an influential group of Bayesians at the Future of Humanity Institute, which is part of the Faculty of Philosophy of Oxford University, which the Philosophical Gourmet Report has recently ranked "the most important ranking of Graduate Programs in Philosophy in the English speaking world." They have a blog which frequently talks about Bayesianism as the probability theory AND as the philosophy of science. One of their contributors, Eliezer Yudkowsky, wrote here:

"Previously, the most popular philosophy of science was probably Karl Popper's falsificationism - this is the old philosophy that the Bayesian revolution is currently dethroning. Karl Popper's idea that theories can be definitely falsified, but never definitely confirmed, is yet another special case of the Bayesian rules; if p(X|A) ~ 1 - if the theory makes a definite prediction - then observing ~X very strongly falsifies A. On the other hand, if p(X|A) ~ 1, and we observe X, this doesn't definitely confirm the theory; there might be some other condition B such that p(X|B) ~ 1, in which case observing X doesn't favor A over B. For observing X to definitely confirm A, we would have to know, not that p(X|A) ~ 1, but that p(X|~A) ~ 0, which is something that we can't know because we can't range over all possible alternative explanations. For example, when Einstein's theory of General Relativity toppled Newton's incredibly well-confirmed theory of gravity, it turned out that all of Newton's predictions were just a special case of Einstein's predictions."

Another good article of his extolling Bayesianism can be found here.

I may be young and naive and I may be suffering from intellectual "growing pains" but at least I am up to date with recent developments. And I doubt Oxford's Faculty of Philosophy is considered a fundamentalist group.--BenE 00:18, 13 October 2007 (UTC)

Logicus continues to misunderstand my position. I am not talking about philosophy of science, but what scientists actually do. For some reason, many philosophers of science have the view that their ruminations about philosophy have something to do with what scientists actually do. This is generally false, since most philosophers of science have never done science and therefore have no notion of what scientists actually do.
What scientists actually do is to construct models, compare with data, and try to find models that explain the available data well and predict future data well. A Bayesian analysis of a specifically identified set of models is a good way to choose between models that have been identified (even if we do not believe that any of the models in this restricted set is "ultimate truth"), and model-fitting criteria at any particular time will guide us in our quest to invent models that do a better job.
Such a specifically identified set of models does not have the defect that Logicus claims to be a problem, that is, that you have to put zero prior probability on each. Indeed, once you restrict yourself to a specific set of models, you are required to set priors that add to unity, so not all can have zero prior probability. Whether the "ontologically true" model is within that set is not relevant for model comparison.
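Jefferys' point about a restricted model set can be sketched numerically (hypothetical figures and function name, mine): once the priors over the identified models are constrained to sum to one, none of them is forced to zero, and the data can discriminate between them whether or not the "true" model is in the set.

```python
def compare(priors, likelihoods):
    """Posterior probability of each model in a restricted set, by Bayes' theorem."""
    assert abs(sum(priors) - 1.0) < 1e-9, "priors over the set must sum to 1"
    joint = [p * l for p, l in zip(priors, likelihoods)]
    evidence = sum(joint)  # P(data) within this restricted set
    return [j / evidence for j in joint]

# Two candidate models with equal priors; the data favour model 1:
post = compare(priors=[0.5, 0.5], likelihoods=[0.02, 0.08])
assert abs(post[0] - 0.2) < 1e-9
assert abs(post[1] - 0.8) < 1e-9
```

The comparison is relative: it says which model in the set explains the data better, not that any of them is "ontologically true".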
I provided a joke a while ago about the Dean who, when approached by the Physics department chair about an expensive piece of equipment, complained that the mathematicians needed only pencil, paper and a wastebasket, and that the philosophers needed only pencil and paper. Logicus answered with a lame retort supposed to show the superiority of philosophers, that unfortunately completely missed the point, to wit:
Of course the philosophers' joke on the joke is that the dean was herself a philosopher and correctly believed philosophers are infalible [sic], so don't need wastebaskets.
Logicus' recent comments continue to prove that he has no clue about what scientists actually do or the way they actually think. I think, also, that it would be useful for anyone following this discussion to read Logicus' recent [rant] (which he later removed, but which, thanks to the WikiPedia gods, is preserved in perpetuity). Bill Jefferys 00:59, 13 October 2007 (UTC)

Logicus to Bill: Bill, you seem to have a literacy problem here. I never removed my contribution of 11 October you link up to here. It is still there as above for the literate to see and be enlightened from the Bayesian positivist nightmare. Thanks for advertising it though. Also thanks for regaling us with your philosophy of science yet again. The question you have to answer is: IF 'probability' means 'strength of belief a proposition is true' and somebody believes a proposition is false, what probability should they assign a false proposition other than zero ? In answering this question you must set aside the fact that you yourself reject this conception of probability on which this article is based, but have failed to propose an alternative conception that agrees with the literature referenced at the end of the article. Also remember that, just as fish have no good ideas about hydrodynamics, most scientists haven't a clue about what they actually do because their heads are usually filled with some ideological philosophy of science view about what they do, i.e. they suffer false consciousness. The task of critical philosophy of science is to analyse what they actually do rather than what they say they do. --Logicus 19:26, 13 October 2007 (UTC)

If you will click on the [rant], you will find on the left hand side of the page a lot of stuff that you wrote, in red, that does not appear on the right hand side. This is a "diff", which shows what you deleted and what you added. I think you need your reading glasses checked.
As for what you claim to be the task of critical philosophy of science, you may think anything you wish about what it thinks it does. This does not mean that it does it, and I think you will have a hard time showing that this is what scientists actually do.
And, as to your claim that I have proposed no alternative conception that agrees with the literature referenced at the end of the article, I have provided citations that support my so-called "alternative" conception about the real role of models in scientific inference. Bill Jefferys 23:36, 13 October 2007 (UTC)

Logicus to Jefferys: If anybody who is literate clicks on the 'rant', providing they are wearing any reading glasses they may need, they should find 'the stuff I wrote in red on the left hand side of that page' also appears on this page above in my 11 October contribution, and hence that I never removed it, contrary to Jefferys' bizarre claim that I did remove it and insinuation that one has to go to this link to find it. And there they should also see that it is not a rant, but helpful advice to Jefferys' fellow 'Bayesian fundamentalist' philosopher of science BenE about the existence of non-probabilist philosophies of science.

But what the literate reader will not find anywhere from Jefferys is a simple concise definition of his conception of 'probability', nor an answer to the key question of what probability a subjective Bayesian probabilist should assign to propositions they believe to be false. His problem here is that of avoiding the appalling unthinkable conclusion that scientific reasoning is not Bayesian probabilist. For if scientists proceed as he claimed they do on 13 October and 15 August, and thus assign non-zero positive prior probabilities to propositions they actually believe to be false, then on the normal view of it they cannot be subjective Bayesians for whom 'probability' means 'strength of belief a proposition is true' whereby a proposition believed to be false is assigned probability zero, that is, no strength of belief it is true simply because it is believed to be false. Thus scientists are not subjective Bayesians on Jefferys' view of scientific practice. Precisely my point. QED.

What Jefferys fails to grasp in this key issue of whether scientific practice is subjective Bayesian probabilist or not is that in this instance it is not his account of scientific practice that is being challenged, but rather whether that practice as he represents it is a subjective Bayesian practice or not on the standard definition of Bayesian probability this article is based upon, that is, 'strength of belief a proposition is true'. And if scientists assign non-zero positive priors to hypotheses they believe to be false, as Jefferys claims they do, then clearly it is not. Thus to establish his thesis that scientific practice is Bayesian probabilist, Jefferys must reject this article's definition of that conception of probability and replace it with an alternative definition, such as his own declared conception of it as 'strength of belief that a proposition is likely to be useful for predicting novel facts'. But of course for many reasons he cannot. These include the fact that this does not square with even the pro-Bayesian literature, and the problem of yet again avoiding the conclusion that scientific reasoning is not probabilist: on this redefinition scientists breach Axiom 2 of the probability calculus by not assigning probability 1 to tautologies, unless it can be explained why completely useless propositions for predicting novel facts, such as 'The Moon is the Moon', are not given probability zero. Thus the very learned self-alleged Emeritus Professor Jefferys remains impaled on the horns of a dilemma: on the one horn, his equivocation between an instrumentalist idealist philosophy of science on which scientific hypotheses are neither true nor false but instruments of prediction, and a contrary radical fallibilist realist view on which they are all false but possibly useful instruments of prediction; on the other horn, his fundamentalist belief that scientific reasoning is Bayesian probabilist.
Hardly surprising that in this unenviable situation he simply protests that he really does have a coherent and empirically adequate Bayesian philosophy of science that Logicus has simply not understood, but demurs at presenting it on these pages, claiming he can only present it to Logicus by personal e-mail outside Wikipedia, rather than on the Wikipedia Talk page where anybody can see it. Or else he suddenly becomes quasi-Emeritus and makes excuses that he must rush off to prepare his lessons for the new semester instead of answering Logicus's challenges, put as follows:

Logicus to Jefferys 18 September: On another point, since you yourself at least agree with the radical fallibilist philosophy of science that all scientific laws are false according to your 15 August testimony on this Talk page, then what probability would you assign a scientific law if you assigned probabilities to propositions according to strength of belief in their TRUTH, as subjective Bayesian epistemic probabilists do according to the literature ? (I appreciate you do not accept the subjective Bayesian epistemic interpretation of probability, but have your own utilitarian pragmatic interpretation of it as 'likely usefulness of a hypothesis in making novel predictions', but just imagine you did accept it. What probability would you assign a hypothesis you believe to be definitely false ?)

I would also be grateful to know why you assign tautologies probability 1, and thus why you evaluate them as maximally useful for making novel predictions. For instance, how is 'The Moon is the Moon' useful for making novel astronomical predictions ?

Jefferys to Logicus 19 September: Since the semester started, I have been rather busy, and expect to be so for quite some time, so will answer only the question about the Box quotation, and make one more comment. The other questions will have to wait, weeks probably. I can only say that you profoundly misunderstand my position. I still invite you to contact me directly. Believe me, I do have coherent and justifiable reasons to write what I did.

But the learned now quasi-Emeritus Professor manages one last comment by way of trying to teach Logicus his rotten Bayesian statistics before rushing off to spread his gospel wider:

I'll make one more comment. The reason why you have to consider more than one hypothesis is that if only one hypothesis is possible (that is, the universe of hypotheses under consideration consists of that one hypothesis alone), its prior and posterior probabilities are by definition 1. This is well-known, and any decent book on Bayesian statistics should set you straight on this point. Try Jim Berger's book.

But this is of course illogical nonsense. There is no reason in the probability calculus nor in Bayesian probability why a lone hypothesis on any subject matter should be believed to be certainly true or a tautology, and indeed it may be believed to be false because it is believed to have a counterexample somewhere sometime, and thus assigned prior probability zero. Consider the case of there being only one theory of the Moon's constitution, namely that it is made of green cheese. By what definition of what concept it must have prior and posterior probability 1, Jefferys notably does not reveal. --Logicus 18:10, 15 October 2007 (UTC)

BenE to Logicus: As I have written again and again, Bayesian probability is interpreted as a degree of belief that is calculated through Bayes theorem. Hence the name Bayesian. Your statement that one must assign probability 0 to scientific theories is nonsensical in this context, as it is not possible to arrive at this probability value through Bayes theorem (unless you pull this zero value out of your ass). We have no clue what the absolute prior is for a scientific theory, for two reasons: first, we don't have any idea of the size of the hypothesis space, and second, we don't even know that the theories are mutually exclusive. (E.g. the theory that the human body temperature is between 95-105 degrees does not exclude the theory that it is between 90-110.) The only thing we can do is assign a maximum entropy prior between alternative theories and thus represent our state of ignorance. This is equivalent to setting the priors to be equal. When we calculate the odds ratio, the prior terms vanish in the equation, as P(H1)/P(H2)=1, and we never actually need to assign a number to these priors! By following Bayes theorem and the MaxEnt principle we arrive at a ratio based solely on the data, and that is therefore based on how the theory fits the data: how well it predicts the data.

Choosing theories based on 'strength of belief that a proposition is likely to be useful for predicting novel facts' is not a supposition but a result of the Bayesian approach. It's a result of applying Bayes theorem with a maximum entropy (equal) prior.

I provided many citations supporting this. However, why do you even make a fuss? Even if the people posting on this discussion page and the guys in Oxford's philosophy department think it is the best candidate for a theory of science, the article in its present form doesn't even mention the word science once! Nothing in the article should bother you. --BenE 19:50, 15 October 2007 (UTC)
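BenE's cancellation argument above can be sketched numerically. This is only an illustration under invented assumptions: two made-up binomial hypotheses about a success rate, with equal (maximum-entropy) priors, so the posterior odds reduce to the likelihood ratio and no absolute prior number is ever needed.

```python
# Sketch of BenE's point: with equal priors, the posterior odds of
# two hypotheses reduce to the likelihood ratio (Bayes factor).
# The hypotheses and data below are invented for illustration.
from math import comb

def binom_lik(k, n, p):
    """Likelihood of k successes in n trials under success rate p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

data_k, data_n = 7, 10      # observed: 7 successes in 10 trials
h1, h2 = 0.7, 0.5           # two competing success-rate hypotheses

prior_odds = 1.0            # P(H1)/P(H2) = 1 under equal priors
bayes_factor = binom_lik(data_k, data_n, h1) / binom_lik(data_k, data_n, h2)
posterior_odds = prior_odds * bayes_factor

print(round(bayes_factor, 3))   # ≈ 2.277: the prior term has cancelled
```

The binomial coefficient cancels in the ratio too, so the result depends only on how well each hypothesis predicts the data, which is exactly the point BenE is making.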


I apologize to Logicus, for he is correct, he did not remove the material I thought he had removed. I was misled by the [diff page], which showed the material deleted on the left (in red, with minus sign indicating deletion) but not restored on the right. There is evidently something that I do not understand about the way the diff is presenting things. But when I searched the version that Logicus edited for the material, it was there. I am sorry.
I also apologize for using intemperate language. I hope that all of us will change our ways and try to be more respectful of the views of our fellow editors, even when we don't agree with them. Bill Jefferys 22:48, 16 October 2007 (UTC)
Logicus to Bill: Well thanks for that. I also find diff confusing, and often cannot find history of edits obviously made.--Logicus 18:15, 17 October 2007 (UTC)
BenE stated "As I have written again and again, bayesian probability theory is interpreted as a degree of belief that is calculated through Bayes theorem." This is not correct. Bayes' theorem may or may not use a subjective "degree of belief" as an input (it can also use objective inputs), but the degree of belief is not the result of Bayes' theorem. ERosa (talk) 07:19, 10 February 2008 (UTC)

Logicus endorses ERosa

The complaint of ERosa of 10 Feb, reproduced below with some insertions added in square brackets, is surely entirely justified: there is nothing inherently subjectivist about Bayesian probability, whereby this article’s definition of Bayesian probability is mistaken and the article is systematically confusing in its untenable conflation of Bayesian probability with subjective epistemic probability, which defines the ‘probability’ of a proposition as subjective strength of belief that the proposition is true. The truth of the matter seems to be that there is no such thing as distinctly Bayesian probability in the sense of using Bayes’ Theorem, since the conditional probability calculus entails Bayes’ Theorem and it seems all statisticians use it, as ERosa testifies. So non-Bayesian probability can only mean the unconditional or absolute probability calculus.

Thus the article remains the ultimately confused nonsense it has been for some time on this basic issue. The way forward is surely to retitle it as what it is essentially about by virtue of its opening definition, namely ‘Subjective Epistemic Probability’. In epistemic probability, probability is about the TRUTH of propositions or the degree of certainty in their TRUTH, as opposed to being about properties of extra-linguistic events, and also as opposed to being about the usefulness of propositions or the degree of certainty in their usefulness rather than their truth, such as their usefulness for making predictions, for example. In the subjectivist variant of epistemic probability, the probability of a proposition is the subjective strength of belief in its truth, measured on a scale of fortitude from 0 to 1. Thus the article’s current definition of ‘Bayesian probability’

“Bayesian probability interprets the concept of probability in the probability calculus as the degree of strength of belief in the truth of a proposition”

is in fact rather a definition of subjective epistemic probability, and so should instead say

“Subjective epistemic probability interprets the concept of probability in the probability calculus as the degree of strength of belief in the truth of a proposition”

This would then enable the elimination of all the pedagogically confusing stuff that there are Bayesian probabilists who reject Bayesian probability as defined here because they reject subjective epistemic probability.

I have today edited the opening definition of ‘Bayesian probability’ pro tem just to clarify the concept of subjective epistemic probability that then follows as being concerned with belief in the TRUTH of propositions, but in fact like ERosa I deny this conflation of Bayesian probability with subjective epistemic probability.

I propose this article now be retitled ‘Subjective epistemic probability’ and then purged of its current confusing content about ‘objectivist Bayesian probability’, and an article on Bayesian probability be started briefly explaining there is really no such thing except for the whole conditional probability calculus, the only non-Bayesian probability being the absolute or unconditional probability calculus.


ERosa's Complaint

ERosa: Actually, if the title of the article was "Subjectivist probability" and not "Bayesian probability" it would cause much less confusion. Again, there is nothing inherently subjectivist about Bayes theorem. The only possible tie is that, since Bayes theorem allows the use of prior knowledge - whether subjective or based on observed frequencies - it has sometimes been mistaken as based *only* on subjective probabilities. ERosa (talk) 07:09, 10 February 2008 (UTC) Logicus to ERosa: Amen to that! But the remaining problem is what then is the differentiating specificity of 'Bayesian probability', whereby on the one hand it is not necessarily subjectivist as you say, but on the other hand does not include all conditional probability? The article's current definition of Bayesian probability is in fact rather a definition of subjective probability. Defining Bayesian probability is a difficult business and indeed possibly impossible if it is only a historically mistaken pseudo-category. Maybe more later.... --80.6.94.131 (talk) 15:26, 24 February 2008 (UTC) --Logicus (talk) 15:29, 24 February 2008 (UTC) ERosa: I'm not sure I can fully construct the meaning of that first sentence. Are you saying that Bayes Theorem pertains to any conditional probability? If so, then yes. [Logicus 4 March: Yes!] It is another example of an unfortunate naming convention confusing a lot of people. [Logicus 4 March: I agree] The subjectivist philosophy of probabilistic knowledge has little to do with Bayes Theorem, which is derived mathematically from fundamental axioms of probability theory - the same rules any frequentist would feel subject to. But I don't think it's really that difficult to define. There is really nothing in statistics called a "Bayesian probability". There is Bayes Theorem, which is used to compute a conditional probability, but a conditional probability isn't uniquely Bayesian.
It makes no more sense than to talk about a gram as a "digital gram" because it was weighed on a digital scale. Anytime the term "Bayesian probability" is used they really mean "subjectivist view of probability". Bayes theorem gives exactly the right answer for p(x|y) when you give it p(x), p(y) and p(y|x). Any "frequentist" would use the same formula to compute p(x|y). A subjective probability can be used as one input to Bayes Theorem, but like every other formula in math or science, the formula doesn't care HOW we come up with the numbers. It just gives us the answer with the numbers we give it. ERosa (talk) 02:44, 25 February 2008 (UTC) [Logicus 4 March: Right On] --Logicus (talk) 20:25, 4 March 2008 (UTC)
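ERosa's point, that Bayes' theorem is plain arithmetic and does not care how its inputs were obtained, can be shown in miniature. All numbers below are invented for illustration (a 95%-sensitive test, a 1% base rate, and a 5% false-positive rate):

```python
# Bayes' theorem: p(x|y) = p(y|x) * p(x) / p(y).
# The formula is indifferent to whether p(x) came from a
# subjective judgement or from observed frequencies.

def bayes(p_x, p_y_given_x, p_y):
    """Return p(x|y) via Bayes' theorem."""
    return p_y_given_x * p_x / p_y

p_x = 0.01                          # prior p(x), however obtained
p_y_given_x = 0.95                  # likelihood p(y|x)
p_y = 0.95 * 0.01 + 0.05 * 0.99     # total probability p(y)

posterior = bayes(p_x, p_y_given_x, p_y)
print(round(posterior, 3))          # 0.161, whoever supplies the inputs
```

A self-described frequentist and a self-described subjectivist feeding in the same three numbers get the same answer, which is ERosa's "digital gram" point.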

History needs updating

If someone wants to edit the history section, there is a great starting point for their research here. Currently, apart from the mention of Bayes himself, the history starts way too late (1930!) --BenE 01:52, 21 September 2007 (UTC)

I've read that paper and the Bayesian movement really took off as late as the 1960s. iNic (talk) 00:42, 4 March 2008 (UTC)

Painting a picture of too much conflict?

This article seems very POV to me in that it paints a picture of the Bayesian view of probability as more controversial than it actually is, and of more conflict (see the use of the word "antagonism") between the two "schools of thought" than there actually is. Although people talk about the "schools of thought", I think the evidence that these schools are as all-encompassing as they are made out to be is shaky at best.

For example, compare the books by Casella & Lehmann (Theory of Point Estimation), which arguably takes a heavily frequentist perspective... and the J.O. Berger (Statistical Decision Theory and Bayesian Analysis) book, which is arguably written from a (very strongly) Bayesian perspective. The Berger book, which is about as "rabid" a Bayesian text as you can find, still embraces the frequentist interpretation of probability as one way of looking at things and fully lays out how to use such an interpretation. Similarly, the Lehmann book is very heavily slanted towards the frequentist perspective, but it offers over a full chapter dedicated to Bayesian methods and discusses the philosophical aspects of the Bayesian interpretation of probability at great length.

I think that this article needs to be seriously rewritten to reflect the real state of things. All the modern texts talk about the "controversy" between Bayesian and frequentist methods as something that is more or less historical. People quibble about use of this technique and that, how it is applied, when it is appropriate, but most people agree that each interpretation has a certain domain where it is useful and another where it is not, and furthermore, there is a general consensus that both interpretations can be combined in a given problem for both philosophical and practical reasons. Do people agree with that? Cazort 23:37, 3 December 2007 (UTC)

Well both yes and no. If you read just some random section of this talk page you will probably find some strong feelings with heated discussions. If you read philosophy papers about this you will find some strong feelings too. However, if you read mathematical books in probability theory and statistical methods you will find only dry expositions, as math books are in general the wrong forum for debates. (But there are some exceptions here too.) So it all depends on where you look if you will find heated debates or not.
I think it's good that the article stresses the differences in view that exist, and the ongoing debate. This is the kind of information readers new to the subject/concepts want to know. What could be made a bit clearer, I think, is that there aren't only two views but many. The debate is historical in the sense that it's an old debate—it can easily be traced back to the old debate between materialism versus idealism—not in the sense that it's over. iNic (talk) 23:26, 17 December 2007 (UTC)
No, the article simply perpetuated a point of confusion about the debate. Bayes theorem and Bayesian methods are routinely used by statisticians. Bayes theorem itself is a matter of mathematical proof and is not just subjective. But this article seemed to continue with a common misconception that confuses the philosophical debate about the nature of uncertainty with the methods statisticians actually use. I'm a statistician and I use both methods. When you have prior knowledge then a Bayesian analysis will actually give a better result if you track it over time and compare it to methods that ignore prior knowledge. Often, I've used Bayesian analysis when the prior knowledge was actually derived from other standard sampling methods, not subjective estimates. I've made the changes and, unlike the previous version, inserted specific citations for the claims. Hopefully, the first year stats students who have written the material previously will provide better citations if they want to refute what I just wrote.
Actually, if the title of the article was "Subjectivist probability" and not "Bayesian probability" it would cause much less confusion. Again, there is nothing inherently subjectivist about Bayes theorem. The only possible tie is that, since Bayes theorem allows the use of prior knowledge - whether subjective or based on observed frequencies - it has sometimes been mistaken as based *only* on subjective probabilities. ERosa (talk) 07:09, 10 February 2008 (UTC)
Logicus to ERosa:Amen to that ! But the remaining problem is what then is the differentiating specificity of 'Bayesian probability', whereby on the one hand it is not necessarily subjectivist as you say, but on the other hand does not include all conditional probability ? The article's current definition of Bayesian probability is in fact rather a definition of subjective probability. Defining Bayesian probability is a difficult business and indeed possibly impossible if it is only a historically mistaken pseudo-category. Maybe more later.... --80.6.94.131 (talk) 15:26, 24 February 2008 (UTC) --Logicus (talk) 15:29, 24 February 2008 (UTC)
I'm not sure I can fully construct the meaning of that first sentence. Are you saying that Bayes Theorem pertains to any conditional probability? If so, then yes. It is another example of an unfortunate naming convention confusing a lot of people. The subjectivist philosophy of probabilistic knowledge has little to do with Bayes Theorem, which is derived mathematically from fundamental axioms of probability theory - the same rules any frequentist would feel subject to. But I don't think it's really that difficult to define. There is really nothing in statistics called a "Bayesian probability". There is Bayes Theorem, which is used to compute a conditional probability, but a conditional probability isn't uniquely Bayesian. It makes no more sense than to talk about a gram as a "digital gram" because it was weighed on a digital scale. Anytime the term "Bayesian probability" is used they really mean "subjectivist view of probability". Bayes theorem gives exactly the right answer for p(x|y) when you give it p(x), p(y) and p(y|x). Any "frequentist" would use the same formula to compute p(x|y). A subjective probability can be used as one input to Bayes Theorem, but like every other formula in math or science, the formula doesn't care HOW we come up with the numbers. It just gives us the answer with the numbers we give it. ERosa (talk) 02:44, 25 February 2008 (UTC)
The acid test of whether or not one's a Bayesian is not (and never has been) whether or not one believes in Bayes theorem. Everyone does. It's a theorem.
Rather, the acid test is whether or not one believes it can ever be meaningful to talk about a probability P(X), if X is an event which has already happened, but about which you do not know the outcome. To a Bayesian, this is not only meaningful, it should be the central quantity of inference. To a Frequentist, it is not meaningful, and one should only talk about estimators, confidence limits and so forth; and discuss questions like "unbiasedness", which to a Bayesian can seem wholly misleading.
That's been the meaning of "Bayesian" since the word was coined, in the 1950s.
European universities tend to allow their lecturers more flexibility; but I have it on authority that, at least until very recently, there were still U.S. colleges where a lecturer would find themselves barred from teaching the course again if they ever talked about P(X) in a first-year statistics course, where X was to represent an event which had already happened. Jheald (talk) 10:00, 25 February 2008 (UTC)
Your claim that lecturers were barred from talking about P(x) makes no sense. Every one of the six undergraduate stats textbooks on my bookshelves starts with chapters that use the term "P(x)". Are you saying using P(x) for "probability of x" is somehow uniquely Bayesian? The same term is used throughout statistics whether the author is "frequentist" or not. I agree with ERosa that this entire article confuses Bayesian with subjectivist. As you both pointed out, Bayes Theorem is mathematically proven. But the problem is that the word "Bayesian" has come to mean two very different things. Over the last couple of decades, statisticians have been using Bayes Theorem more often (even though Bayes Theorem is much older than that) to properly incorporate prior known constraints on potential values. We have to separate the philosophical argument, which I think is better labeled frequentist vs. subjectivist, from Bayes entirely. And I think it mischaracterizes the frequentist view that p(x) must relate only to a past event. The frequentist view is that p(x) only has meaning as the frequency of x over a large number of trials. But I like the argument presented earlier that "degrees of belief" can also be tested by frequentist means. If, of all the times a person says they are 80% confident, they are right 80% of the time, then you have confirmed their "degree of belief" with the frequency of being right. So, even in a purely philosophical sense, I see no real conflict, much less within the pragmatic use of statistics. ChicagoEcon (talk) 15:03, 25 February 2008 (UTC)
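The calibration idea in the comment above can be sketched as a simulation (all values invented): a forecaster whose "80% confident" statements come true 80% of the time has a degree of belief that survives a frequency test.

```python
# Sketch: checking a "degree of belief" by frequentist means.
# Simulate a calibrated forecaster whose 80%-confidence claims
# are true 80% of the time, then observe the hit frequency.
import random

random.seed(42)
claims = [random.random() < 0.80 for _ in range(100_000)]
hit_rate = sum(claims) / len(claims)

print(round(hit_rate, 2))  # close to 0.8: belief matches frequency
```

Over many such statements the stated belief and the observed frequency converge, which is the sense in which the two interpretations need not conflict in practice.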
No, my claim is that a Bayesian will feel free to discuss the probability of x, where X is an event which has already taken place. A frequentist would resist this; and would resist talking about probability even of events in the future, if they could not be related to a frequency over a large number of trials.
"Bayesian" has been used in this sense, ie usages of probability not related to frequency over a large number of trials, ever since the word was coined, in the 1950s. An out-and-out frequentist may use Bayes theorem; but they are unlikely to describe either themselves, or their calculation, as "Bayesian". Jheald (talk) 15:42, 25 February 2008 (UTC)
In that case no statisticians are frequentists, since all statistical estimators of means of populations are actually the P(a<x<b) where a and b are the bounds on some interval. This P(x) means that if we continued to sample the population until we got every possible member of the population, then the actual population mean has the stated chance of falling within those bounds. But, since it would be absurd that no statisticians are frequentists, and since that would be the logical conclusion from your claim, I would say that your initial characterization of frequentists resisting using P(x) is wrong. A frequentist uses P(x) but holds the position that the only meaning is what it means for the frequency of occurrence over a large number of trials. A subjectivist would say it has another meaning - that it can mean degree of belief. I say frequentists' analysis of degrees of belief also shows it meets the frequentist criterion (a large number of trials of degree-of-belief statements can be observed with frequentists' observations). Both groups use P(x) and neither resists using it in any sense. ERosa (talk) 20:30, 25 February 2008 (UTC)
I think if you look closer, you will find that that is not the case. A frequentist statistician will not make assertions about the probabilities of a parameter θ of a distribution. Rather, they will make assertions about the probability of an estimator \hat{\theta}, and how often it might or might not be an amount δ different from θ if hypothetically a large number of similar trials were to be carried out.
A Bayesian will feel free to discuss the probability of θ itself. But for a by-the-book frequentist θ is a fixed parameter, not a random variable; so not something about which they can ever talk about a probability distribution. Jheald (talk) 21:23, 25 February 2008 (UTC)
Then, as you define it, I've never met a frequentist statistician. And that would be something, since I'm a statistician who has worked, among other places, with the Census Bureau (where there are over 1000 professional statisticians). I've also worked with the statisticians at the EPA and with many academic researchers. My contact list has over 100 people with advanced degrees in statistics. And I've never met anyone who does not talk in terms of the probability of a parameter falling within stated bounds. It's simply the normal language among every statistician I know. And, believe me, I've "looked closely". Now, you should compute the odds that, even with a somewhat biased sample, I would by chance have never met a frequentist statistician if there are any more than a tiny minority. (Hint: You can use Bayes Theorem) ERosa (talk) 15:49, 27 February 2008 (UTC)
That's interesting. So are these people actually calculating probabilities P(θ|data) ? Or are they calculating confidence intervals and then misrepresenting the meaning of their calculation ? Jheald (talk) 16:35, 27 February 2008 (UTC)
If you understand what "confidence interval" means, you know that the "confidence" that the "interval" a to b contains the population parameter x is P(a<x<b|data). It's not a misrepresentation. They are saying that there is a 90% chance that, if we continued to sample the entire population, we would find the mean to be within the 90% CI. In fact, simulations of samples of populations will actually prove that. Where are you learning what you have "learned" about statistics? I honestly can't think of a single professor of stats, text, or PhD researcher who makes these fundamental errors you seem to be making. Can you provide a citation? ERosa (talk) 19:12, 27 February 2008 (UTC)
If you really want P(θ|data) you do it the Bayesian way: you start with a prior P(θ|I), and update it according to Bayes theorem. Confidence interval calculations don't do that: they calculate P(interval|θ), without any consideration of the priors on θ. As a result, there are cases where frequentist methods can report very high "confidences" in parameter ranges which may nevertheless still actually have rather low probability. Jheald (talk) 20:22, 27 February 2008 (UTC)
You didn't answer my question about a source. You were surprised that calculating a CI is actually calculating a particular P(X) (in this case, P(a<x<b) where x is a population parameter). Why would you think this is a misrepresentation and what is your source? ERosa (talk) 23:28, 27 February 2008 (UTC)
But of course it is not calculating a P(X). The notation P(a<θ<b) is misleading, because θ is not a random variable. The interval is the random quantity, and it is fixed so that P(a < \tilde{\theta} < b \mid \theta) = 0.95 when \theta = \hat{\theta}. It's a very odd calculation, when you actually write it out properly; but it has nothing to do with getting a probability distribution for θ. Jheald (talk) 00:51, 28 February 2008 (UTC)
But of course it IS, and you are seriously misled. Again, provide a citation for your claim. The notation P(a<x<b) is quite standard, and what is, in fact, random is the estimate of x relative to the true population mean of x. A large number of simulations of samples from known populations show that the 90% CI contains the known population mean 90% of the time. By the way, a colleague of mine once wrote for the Journal of Statistics Education, which talks a lot about bizarre misconceptions about statistics. I think you will make an excellent subject. 74.93.87.210 (talk) 04:49, 28 February 2008 (UTC) Forgot to sign in. ERosa (talk) 04:58, 28 February 2008 (UTC)
By the way, the wikipedia article on confidence intervals seems to use notation entirely consistent with what I'm saying and contrary to what you say. You should also set out to "correct" that error. And all the errors in every stats text I pick up. You have a lot of work to do.ERosa (talk) 04:58, 28 February 2008 (UTC)
P(a<θ<b|data), calculated using Bayes theorem, is called a Bayesian credible interval. It coincides with a frequentist confidence interval only if the prior probability P(θ|data) is uniform. Otherwise, as you can verify for yourself, the calculations are different. And it's a well known fact, that if you bet against a Bayesian who has an accurate prior, you will tend to lose. Jheald (talk) 09:51, 28 February 2008 (UTC)
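The credible-vs-confidence distinction above can be made concrete with a small sketch. The normal-normal setup and all numbers are illustrative only, not anything from the thread: for one observation x with known sigma, the Bayesian credible interval under a normal prior is pulled toward the prior mean, and it coincides with the usual interval only as the prior becomes flat.

```python
# Sketch: a Bayesian credible interval vs the frequentist interval.
# Normal-normal conjugate update for one observation x ~ N(theta, sigma).
import math

def posterior(x, sigma, prior_mean, prior_sd):
    """Normal-normal conjugate update: returns (post_mean, post_sd)."""
    w = (1 / prior_sd ** 2) / (1 / prior_sd ** 2 + 1 / sigma ** 2)
    post_mean = w * prior_mean + (1 - w) * x
    post_sd = math.sqrt(1 / (1 / prior_sd ** 2 + 1 / sigma ** 2))
    return post_mean, post_sd

x, sigma, z = 10.0, 1.0, 1.96

# Informative prior centred at 0: the 95% credible interval sits near 5.
m1, s1 = posterior(x, sigma, prior_mean=0.0, prior_sd=1.0)
print(m1 - z * s1, m1 + z * s1)

# Nearly flat prior: the credible interval is essentially 10 +/- 1.96,
# the same numbers the frequentist confidence interval gives here.
m2, s2 = posterior(x, sigma, prior_mean=0.0, prior_sd=1e6)
print(m2 - z * s2, m2 + z * s2)
```

With the informative prior the two intervals disagree, which is the sense in which betting against a Bayesian holding an accurate prior tends to lose.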
By the way, with regard to the Wikipedia article on confidence intervals, note that the definition in that article is in terms of
\Pr(U<\theta<V|\theta),
ie probabilities of the interval given theta.
Note also the section Meaning and Interpretation:
"It is very tempting to misunderstand this statement in the following way... The misunderstanding is the conclusion that \Pr(u<\theta<v)=0.9,\, so that after the data has been observed, a conditional probability distribution of θ, given the data, is inferred... This conclusion does not follow from the laws of probability because θ is not a "random variable"; i.e., no probability distribution has been assigned to it."
(emphasis added). Jheald (talk) 10:01, 28 February 2008 (UTC)
Wow, this debate has generated a lot of text! Actually, I think the entry in the confidence interval article needs to be corrected if it means that a 90% confidence interval doesn't have a 90% *probability* of containing the true value. And those who insist on the distinction between "credible interval" and "confidence interval" make the same mistake Jheald makes, since the distinction has no bearing on observed outcomes. I believe what ERosa was referring to earlier was the fact that if you take, say, 30 samples from a large population where you already know the mean, compute the 90% confidence interval, and repeat this thousands of times, you will find that 90% of the time the known population mean actually fell between the upper and lower bounds of the 90% confidence interval. This claim is experimentally verifiable. Neither the math nor experimental observations contradict ERosa. This is another example of how people have some strange ideas about probability theory. Hubbardaie (talk) 13:43, 28 February 2008 (UTC)
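The coverage simulation described in the comment above can be sketched as follows. The population parameters, seed, and trial count are invented; the critical value 1.699 is the t-quantile for 29 degrees of freedom (95th percentile), standing in for the normal value since the sample is small.

```python
# Sketch of the coverage simulation: draw samples of 30 from a
# population with a known mean, compute a 90% confidence interval
# each time, and count how often it contains the true mean.
import random
import statistics

random.seed(1)
TRUE_MEAN, TRUE_SD = 50.0, 10.0
N, TRIALS = 30, 10_000
T90 = 1.699                 # t quantile, df = 29, 95th percentile

hits = 0
for _ in range(TRIALS):
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    m = statistics.mean(sample)
    se = statistics.stdev(sample) / N ** 0.5
    if m - T90 * se < TRUE_MEAN < m + T90 * se:
        hits += 1

print(hits / TRIALS)        # close to 0.90, as the coverage claim predicts
```

This confirms the aggregate coverage claim; note it is a statement about the long-run behaviour of the procedure, which is exactly the point Jheald's examples below press on for individual realised intervals.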
I noticed that the section of the confidence interval article that Jheald cites had no citations for its arguments (much like Jheald's arguments in here). So I added fact flags. When I get a chance I will rewrite that fundamentally flawed section. This is the problem when people who barely understand the concepts try to get philosophical.Hubbardaie (talk) 14:18, 28 February 2008 (UTC)

Arbitrary section break (Confidence limits)

Here's a concrete example of the problems you can get into with confidence limits.
Suppose you have a particle undergoing diffusion in a space with one degree of freedom, so that the probability distribution for its position x at time t is given by
P(x \mid t)\, dx = \frac{1}{\sqrt{2 \pi \mu t}} \exp\left(\frac{-x^2}{2 \mu t}\right) dx
Now suppose you observe the position of the particle, and you want to know how much time has elapsed.
It's easy to show that
 \hat{t} = \frac {x^2}{\mu}
gives an unbiased estimator for t, since
 E(\hat{t} | t) = t.
We can duly construct confidence limits, by considering for any given t what spread of values we would be likely (if we ran the experiment a million times) to see for \hat{t}.
So for example for t=1 we get a probability distribution of
P(\,\hat{t}\,)\, d\hat{t} \propto \hat{t}^{-1/2} \exp\left(\frac{-\hat{t}}{2}\right) d\hat{t}
from which we can calculate lower and upper confidence limits -a and b, such that:
P(-a < t - \hat{t} < b) = 0.95
Having created such a table, suppose we now observe x = \sqrt{\mu}. We then calculate \hat{t} = 1, and report that we can state P(\hat{t}-a < t < \hat{t}+b) with 95% confidence, or that the "95% confidence range" is 1-a < t < 1+b.
But does that give a 95% probability range for the likely value of t given x? No, it does not; because we have calculated no such thing.


The difference becomes perhaps clearest if we think what answer the method above gives, if the data came in that x=0\;.
That gives <math>\hat{t}=0</math>. Now when t=0, the probability distribution for x is a delta function at zero, as is the distribution for <math>\hat{t}</math>. So a and b are both zero, and so we must report a 100% confidence range, <math>0 \le t \le 0</math>.
Does that give a 100% probability range for the likely value of t given x? No, because we have made a calculation of no such quantity. The particle might actually have returned to x=0 at any time. The likelihood function, given x=0, is actually
<math>L(t;x) = \frac{1}{\sqrt{2 \pi \mu t}}</math>
Conclusion: confidence intervals are not probability intervals for θ given the data. Jheald (talk) 15:54, 28 February 2008 (UTC)

Certainly confidence intervals are not probability intervals. Here's a simple example: two independent observations are uniformly distributed on the interval from θ − 1/2 to θ + 1/2. Call the larger of the two observations max and the smaller min. Then the interval from min to max is a 50% confidence interval for θ since P(min < θ < max) = 1/2. But if you observe min = 10.01 and max = 10.02, it would be absurd to say that P(10.01 < θ < 10.02) = 1/2; in fact, by any reasonable standard it would be highly improbable that 10.01 < θ < 10.02 unless you had other information in addition to that given above (e.g. if you happened to know the actual value of θ). And if you observed min = 10.01 and max = 10.99, then it would be similarly absurd to say that P(10.01 < θ < 10.99) = 1/2; again, it would be highly improbable that θ is not in that interval. Michael Hardy (talk) 20:54, 28 February 2008 (UTC)
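Michael Hardy's uniform example above can be simulated directly. The sketch below (my own illustration, not part of the thread; θ = 10.5 is an arbitrary choice) confirms both halves of the point: the interval [min, max] covers θ in about 50% of trials overall, yet whenever the two observations happen to land far apart (width > 0.9, as in the min = 10.01, max = 10.99 case) the interval contains θ every single time.

```python
import random

random.seed(7)
THETA = 10.5            # the parameter (known here only so we can score coverage)
TRIALS = 200_000

covered = 0
wide_covered = wide_total = 0
for _ in range(TRIALS):
    # two independent observations, uniform on (theta - 1/2, theta + 1/2)
    x1 = random.uniform(THETA - 0.5, THETA + 0.5)
    x2 = random.uniform(THETA - 0.5, THETA + 0.5)
    lo, hi = min(x1, x2), max(x1, x2)
    hit = lo < THETA < hi
    covered += hit
    if hi - lo > 0.9:   # data like min = 10.01, max = 10.99
        wide_total += 1
        wide_covered += hit

print(covered / TRIALS)            # ~0.50: the unconditional 50% coverage
print(wide_covered / wide_total)   # 1.0: a wide interval always contains theta
```

The second printed value is exactly 1.0, because any interval wider than 1/2 must straddle θ; the data themselves tell you which side of the 50/50 split you are on, which is precisely Hardy's point about conditioning on the observations.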

I think where Hardy and Jheald are differing with Hubbardaie and myself is in two ways. First, as I've said before, Jheald need only repeat this process in a large number of trials to show that the CI will capture the mean exactly as often as the CI would indicate. In other words, if the 95% CI is a to b, and we compute a large number of intervals a to b based on separate random samples, we will find that the known mean falls within 95% of the computed intervals. Second, Hardy is calling the result absurd because he is taking prior knowledge about the distribution into account. But, again, if this sampling is repeated a large number of times, he will find that only 5% of the computed 95% CIs will fail to contain the answer. If we move away from the anecdotal to the aggregate (where the aggregate is the set of all CIs ever properly computed on any measurement) we find that P(X within interval of Y confidence)=Y.ERosa (talk) 21:40, 28 February 2008 (UTC)

I did not call it "absurd" because of prior knowledge; I said it's absurd UNLESS you have prior knowledge. It is true that in 50% of cases this 50% confidence interval contains the parameter, but in one of my cases the data themselves strongly indicate that this is one of the OTHER 50%, and in the other of my cases, the data strongly indicate that this is one of the 50% where θ is covered, so one's degree of confidence in the result would reasonably be far higher than 50%. Michael Hardy (talk) 16:13, 29 February 2008 (UTC)
Also, Jheald commits a non-sequitur and begs the question. He shows a calculation for a CI and up to the point of that answer, he is doing fine. But then he asked "But does that give a 95% probability range for the likely value of t given x?" and then states "No, it does not; because we have calculated no such thing". You correctly compute a confidence interval, but then make an unfounded leap to what it means or doesn't mean. You have not actually proved that critical step and your claim that you have not computed that is simply repeating the disputed point (i.e. begging the question).ERosa (talk) 21:44, 28 February 2008 (UTC)
Well, the 95% CI calculation is different to what a calculation of a 95% probability range for the likely value of t given x would look like. But rather than labour the point, surely the coup-de-grace is what follows?
If you observe x=0 in the example I've given above, the CI calculation gives you a 100% confidence interval for t=0.
But the likelihood <math>L(t;x) = \frac{1}{\sqrt{2 \pi \mu t}}</math>
So there is the key ingredient for the probability of t given x (give or take whatever prior you want to combine it with), and it is not concentrated as a delta-function at zero. Jheald (talk) 13:02, 29 February 2008 (UTC)
But now in your new response I see you are backing off from your original claim that given a particular set of data x, the CI will accurately capture the parameter 95% of the time. Now I see that you are replacing that with the weaker claim that given a particular parameter value, t = t*, the CI will accurately capture the parameter 95% of the time. Alas, this also is not necessarily true.
What is true is that a confidence interval for the difference <math>\hat{t} - t</math> calculated for a correct value of t would accurately be met 95% of the time.
But that's not the confidence interval we're quoting. What we're actually quoting is the confidence interval that would pertain if the value of t were <math>\hat{t}</math>. But t almost certainly does not have that value; so we can no longer infer that the difference <math>\hat{t} - t</math> will necessarily be in the CI 95% of the time, as it would if t did equal <math>\hat{t}</math>.
If you don't believe me, work out the CIs as a function of t for the diffusion model above; and then run a simulation to see how well it's calibrated for t=1. If the CIs are calculated as above, you will find those CIs exclude the true value of t a lot more than 5% of the time. Jheald (talk) 14:23, 29 February 2008 (UTC)
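Jheald's suggested simulation can be run as follows. This is my own construction, under stated assumptions: x is taken as normally distributed with mean 0 and variance μt (which is what the unbiasedness of t̂ = x²/μ requires), so that t̂/t follows a chi-square distribution with 1 degree of freedom, and the plug-in recipe of computing the limits "as if t equalled t̂" reduces to reporting the interval [q_lo·t̂, q_hi·t̂], with q_lo and q_hi the hard-coded 2.5% and 97.5% χ²(1) quantiles.

```python
import math
import random

random.seed(3)
MU = 1.0
T_TRUE = 1.0
# 2.5% and 97.5% quantiles of the chi-square distribution with 1 d.o.f.
Q_LO, Q_HI = 0.000982, 5.0239
TRIALS = 100_000

hits = 0
for _ in range(TRIALS):
    # particle position: x ~ Normal(0, mu * t), so E(x^2 / mu) = t
    x = random.gauss(0.0, math.sqrt(MU * T_TRUE))
    t_hat = x * x / MU
    # plug-in interval: limits computed as if the true t equalled t_hat
    lo, hi = Q_LO * t_hat, Q_HI * t_hat
    if lo < T_TRUE < hi:
        hits += 1

print(hits / TRIALS)  # around 0.65, far below the nominal 0.95
```

The observed coverage lands near 65%, not 95%, which is the miscalibration being claimed here: the plug-in interval is a correct 95% procedure only when t actually equals t̂.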
I've written in the confidence interval article talk something this entire discussion has been seriously lacking...citations! See the rest there.Hubbardaie (talk) 14:37, 29 February 2008 (UTC)
By the way, I also made a similar argument to ERosa's that, over a large number of trials, 95% of 95% CIs will contain the true mean of a population. I haven't backed off of it and I don't see where ERosa has. In fact, the Student's t-distribution (which I wrote about in my book) was initially empirically derived with this method. So, alas, it IS true that the 95% CI must contain the true value 95% of the time. If you don't believe me, run a simulation on a spreadsheet where you randomly sample from a population and compute a CI over and over again. So much for the coup-de-grace. But this is getting us nowhere. Refer to the citations I provided in the confidence interval article. You also have to provide verifiable citations for anything you say or you run the risk of violating the NOR rule.Hubbardaie (talk) 14:45, 29 February 2008 (UTC)
But I'm not calculating the mean of a population. I'm trying to get a confidence limit for an unknown time, given a position measurement.
The CIs I get don't reflect the probability distribution P(t|x) for that unknown time, given the measurement.
That is sufficient to dispose of the assertion that confidence intervals necessarily reflect the probability distribution for their parameter given the data.
You might also like to reflect that WP:NOR specifically does not apply to talk pages, and per WP:SCG the creation of crunchy examples and counter-examples is not considered OR. Jheald (talk) 15:47, 29 February 2008 (UTC)
First, I didn't say NOR applied to talk pages. Of course, knock yourself out and apply all the original research you want in here (that is, one might presume it's original since you never provide a citation). I'm just cautioning you for when and if you decide to modify the actual article. In there you will need citations, so why not show them here, too? And you seem to have backed off of your original position quite a lot. As I review your conversations with me and others over the last couple of weeks, you originally said that a frequentist would resist using P(X) at all. This morphed into a conversation about whether a confidence interval a to b has a probability of P(a<x<b). The fact that William Sealy Gosset derived the first t-stats by empirical methods settles that issue. The citations I showed in the confidence interval page contradict your position. You haven't proven anything. You, again, made an unrelated point followed by an unfounded leap to the original debated assertion. And you have, again, confused situations where a possible but unlikely set of observations can in one situation produce a range that doesn't contain the true value, when the true value is known, with situations where you don't know the true value to begin with and are trying to assess the probability distribution of possible population parameter values.Hubbardaie (talk) 16:03, 29 February 2008 (UTC)
In the particular case of a Student t-test, the 95% confidence interval does match a Bayesian 95% credible interval. (For a derivation, see eg Jeffreys, Theory of Probability). Student in fact used an inverse probability approach to derive his distribution; similar to Edgeworth, who'd used a full Bayesian approach back in 1883. The reason the two match is (i) we assume that we can adopt a uniform prior for the parameter; (ii) that the function P(θ'|θ) is a symmetric function that depends only on (θ'-θ), with no other dependence on θ itself; and also that (iii) θ' is a sufficient statistic.
Under those conditions, a 95% confidence interval will match a Bayesian 95% credible interval. But in the general case, ie in other situations, as in the example I gave higher up, the two do not match. Jheald (talk) 18:12, 29 February 2008 (UTC)
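The agreement claimed here for the t-test case can be checked numerically. The sketch below is my own illustration (the critical value 2.093 for 19 d.f. is hard-coded, and the posterior sampling assumes the prior p(μ, σ²) ∝ 1/σ², under which σ²|data is scaled inverse chi-square and μ|σ², data is normal). It computes a classical 95% t confidence interval from one simulated dataset, then a Monte Carlo 95% credible interval from the same data, and prints both.

```python
import random
import statistics

random.seed(5)
# one observed dataset
n = 20
data = [random.gauss(10.0, 2.0) for _ in range(n)]
xbar, s = statistics.mean(data), statistics.stdev(data)

# classical 95% confidence interval for the mean
T95 = 2.093  # two-sided 95% critical value of Student's t, 19 d.f.
ci = (xbar - T95 * s / n ** 0.5, xbar + T95 * s / n ** 0.5)

# Monte Carlo from the posterior under the prior p(mu, sigma^2) ~ 1/sigma^2:
# sigma^2 | data ~ (n-1) s^2 / chisq_{n-1}, then mu | sigma^2, data ~ N(xbar, sigma^2/n)
draws = []
for _ in range(50_000):
    chisq = sum(random.gauss(0.0, 1.0) ** 2 for _ in range(n - 1))
    sigma2 = (n - 1) * s * s / chisq
    draws.append(random.gauss(xbar, (sigma2 / n) ** 0.5))
draws.sort()
cred = (draws[1_250], draws[48_750])  # central 95% credible interval

print(ci)
print(cred)  # nearly identical to the confidence interval
```

The two intervals agree to within Monte Carlo error, illustrating the special conditions listed above; in the diffusion example earlier in the thread those conditions fail, and no such agreement holds.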
So: does a confidence interval a to b in general have a probability of P(a<x<b|data) = 0.95 ? In general, no. And even when it does (like the case of the t-test), in moving from one to the other, one is (either consciously or unconsciously) making a transition of worldview, from the frequentist to the Bayesian.
I don't back down from what I said above. The notion of a conditional point probability, or an interval probability, for P(θ|data) is not a Frequentist notion. A proper frequentist would not talk about P(θ) at all. Talk about P(θ|data), where θ is a non-random parameter, is only meaningful in the context of a Bayesian outlook. If somebody does believe in P(θ|data), then they either don't care about Frequentism, or don't understand it. Jheald (talk) 18:28, 29 February 2008 (UTC)
Look, I respect where you are coming from. You are clearly not a total layman on the topic. But I won't repeat how your claim doesn't address the issue of how, when you have no a priori knowledge of a population's mean or its variance, the CI is meant to mean what the sources I cite say it means. I understand you continue to insist that when someone says a CI is a range that contains the true value with a given probability, they must be wrong or misleading, contrary to what the authoritative sources I cite clearly state. Let's just lay out the citations and show both sides in the article. Hubbardaie (talk) 22:43, 29 February 2008 (UTC)


Undo justification

I've removed Logicus' addition, as it contains sentences such as "However the dominant 20th century and contemporary philosophy of science apparently believed by most scientists, namely realist instrumentalist fallibilism that maintains all scientific laws are false but are more or less useful instruments of prediction, poses an intractable problem for the thesis that scientific reasoning is subjective epistemic probabilist, based on degrees of strength of belief that scientific laws are true, since if all scientific laws are believed to be false, they must be assigned prior probability zero whereby all posteriors must also be zero, thus offering no epistemic differentiation between theories." This is utterly incomprehensible and poor style. Tomixdf (talk) 17:45, 4 March 2008 (UTC)

Logicus on Tomixdf's deletion:

The whole Logicus text that Tomixdf has removed is as given below in curly brackets, preceded by the text and claim it commented upon.

Tomixdf’s justification for its removal was

“(Removed unreferenced/confusing POV section)”,

but as can be seen, in fact it contained many references, including links, contrary to Tomixdf’s claim. Its basic purpose was to correct the article’s current uncritical pro-subjective-probabilist POV on the ‘Bayesian’ logical positivist philosophy of scientific method. Whether it is confusing or clarificatory, or indeed any more confusing than this currently highly confused article, I leave it to the reader to judge.

Whether Tomixdf’s other, insulting justification, namely that its first sentence is “utterly incomprehensible and poor style”, is also false, the reader may again judge for themselves, or even suggest improvements.

It should be noted that since 18 February Tomixdf has made some 25 or so edits of this article without offering a single prior discussion of any of those proposed edits on this Talk page, before this attempted justification for removing somebody else's addition.

Overall it should be borne in mind that Logicus's additions deal with the most important critical issue of all for 'Bayesian' probabilist philosophy of scientific method, the 'Logic of Science' being THE main concern of the probabilist philosophy of such writers as Jaynes and others. Not to mention the main problems of and alternatives to probabilist theories of scientific method is extremist POV.

The text in question in the 'Applications' section:

“Some regard the scientific method as an application of Bayesian probabilist inference because they claim Bayes's Theorem is explicitly or implicitly used to update the strength of prior scientific beliefs in the truth of hypotheses in the light of new information from observation or experiment. This is said to be done by the use of Bayes's Theorem to calculate a posterior probability using that evidence and is justified by the Principle of Conditionalisation that P'(h) = P(h/e), where P'(h) is the posterior probability of the hypothesis 'h' in the light of the evidence 'e', but which principle is denied by some [8] Adjusting original beliefs could mean (coming closer to) accepting or rejecting the original hypotheses. { Logicus's addition: However the dominant 20th century and contemporary philosophy of science apparently believed by most scientists, namely realist instrumentalist fallibilism that maintains all scientific laws are false but are more or less useful instruments of prediction, poses an intractable problem for the thesis that scientific reasoning is subjective epistemic probabilist, based on degrees of strength of belief that scientific laws are true, since if all scientific laws are believed to be false, they must be assigned prior probability zero whereby all posteriors must also be zero, thus offering no epistemic differentiation between theories. < Fn. As Duhem expressed the falsity and endless refutation of all scientific laws posited by fallibilism, "Thus, the struggle between reality and the laws of physics will go on indefinitely: to every law that physics may formulate, reality will sooner or later oppose a rude refutation in the form of a fact, but, indefatigable, physics will improve, modify, and complicate the refuted law in order to replace it with a more comprehensive law in which the exception raised by the experiment will have found its rule in turn." 
Duhem's 1905 Aim and Structure of Physical Theory, p177 of the Athaneum 1962 edition.> Alternative non-probabilist fallibilist philosophies of science such as those of Duhem, Popper, Lakatos, Laudan, Worrall and others based upon objective normative criteria of supercession of theoretical systems that specify when one system is better than another, such as when predicting more novel facts or having less anomalies than a competitor and thus constituting scientific progress, do not suffer this difficulty. So they raise the question of what problems of the theory of scientific theory change are solved, if any, by subjectivist probabilism that are not solved by non-probabilist objectivist theories. Another defect of the purely subjective expert-opinion based approach of subjectivist epistemic probabilism is the political criticism that it is elitist and anti-democratic because it makes the rationale of science and evaluation of its theories depend solely upon the purely subjective strengths of belief of an elite in their truth, as opposed to their objective publicly ascertainable ability to predict novel facts and solve problems, for example. Subjectivist probabilism runs counter to the democratic ethos that public research funds are not to be doled out on the basis of the subjective strength of some academic’s belief that their theory is true. Thus for example, the early 18th century changeover to heliocentric astronomy from geoheliocentric astronomy dominant in the 17th century is apparently easily explained by heliocentrism’s successful prediction of the novel fact of stellar aberration experimentally confirmed in 1729, not predicted by any form of geocentrism. And the earlier 17th century changeover from pure geocentrism to geoheliocentrism is easily explained by the 1610 telescopic confirmation of the novel fact of the phases of Venus first predicted by Capellan geoheliocentric astronomy. 
What more is needed to explain such theory changes that only subjectivist epistemic probabilism can explain, or beyond explaining problems of its own making ? }

--Logicus (talk) 21:16, 4 March 2008 (UTC)

You are spamming the discussion page; please keep it short and to-the-point. About your contribution: (a) it contained overly long sentences that are unreadable (very bad style); (b) it does contain many unreferenced statements (starting at the first sentence!); and (c) it contains lots of POV/OR (example: Another defect of the purely subjective expert-opinion based approach of subjectivist epistemic probabilism is the political criticism that it is elitist and anti-democratic because it makes the rationale of science and evaluation of its theories depend solely upon the purely subjective strengths of belief of an elite in their truth, as opposed to their objective publicly ascertainable ability to predict novel facts and solve problems, for example.'. In conclusion, I stand by my decision to delete your section. Again: if you respond to this, keep it short and to-the-point, thanks. -Tomixdf 130.225.125.174 (talk) 10:57, 5 March 2008 (UTC)