Talk:Bayesian inference

About legal applications

I believe that the example is a wrong application of Bayes' theorem. The prior probability of "defendant is guilty" can only be 0 or 1: a person is, or is not, guilty, and should have the benefit of the doubt in the other cases. As a consequence, applying Bayes' theorem does not bring any value. So, I agree with the Court of Appeal.

This does not mean that there would be no other legal application. Could you find an example where another prior probability is used? Pcarbonn 19:03, 23 May 2004 (UTC)

It sounds like you may be a frequentist, instead of a Bayesian. A frequentist believes that probabilities should only be assigned to truly random events, while a Bayesian uses probabilities to quantify his or her own limited knowledge about the events. Thus, a frequentist would recoil in horror at the thought of assigning a probability to "defendant is guilty" (that being a deterministic concept), while a Bayesian would happily assign a probability between 0 and 1. A frequentist would thus agree with the Court of Appeal.
Since this article is about Bayesian inference, we should adopt the Bayesian view of probabilities for this article. The appropriateness of the legal application can be debated (I didn't like it so much, myself), but it should not be deleted because it is Bayesian. -- hike395 05:15, 26 May 2004 (UTC)
To insist that the prior probability of guilt or innocence be either 1 or 0 (as Pcarbonn suggests) would be to dispense with the need for a trial. One has a trial in order to make the best judgment possible about the probability of guilt or innocence and, given that probability estimate, make the decision that the law requires. Furthermore, the presumption of innocence in criminal cases does not entail that the prior probability of guilt be set at zero; it only means that the prior probability of guilt be no more than, perhaps, the prior probability that any other person in the country or, possibly, the prior probability that anyone else in the world is guilty. Such a prior probability would be small, but it would not be 0 (zero). (anonymous law teacher)
Courts generally do not use Bayesian logic in an "explicit" way. But this may just mean that they don't use symbolic and mathematical notation when they reason about evidence. It is possible, for example, that the following rule about the relevance requirement for the admissibility of evidence is the equivalent of, or assumes, Bayesian logic:

Federal Rule of Evidence 401: "'Relevant evidence' means evidence having any tendency to make the existence of any fact that is of consequence to the determination of the action more probable or less probable than it would be without the evidence."

(anonymous law teacher)


I understand the difference between Bayesian and frequentist, and I accept the Bayesian view in general. However, I do not accept it in this legal application (am I POV?): I would not want to be wrongly put in jail because, statistically, I'm "80% guilty". A way out may be for the article to better explain this point, and be more NPOV with respect to the decision of the Court of Appeal than the initial article. An article should never take only one position in the controversy (e.g. Bayesian only), but state both positions honestly. Pcarbonn 05:45, 26 May 2004 (UTC)

Well, decision theory states that you should be put in jail if you are >= 80% guilty if putting an innocent person in jail costs (the jury? society?) 4 times more than letting a guilty person go free. You probably don't like it as an innocent person, because you would bear the costs and none of the benefit :-).
I personally didn't like the legal example, and thought that the last sentence about the Court of Appeal was POV (I was too lazy to fix it, though). So, as far as I am concerned, it can stay deleted. But I think the article would benefit from some worked-through example. The example at Naive Bayesian classification is much too complicated for an introduction to the method. I wouldn't mind bringing back the example and making the decision be about something else, less controversial. Do you have any suggestions?
As for the Bayesian vs frequentist, I think this controversy is well-explained at Bayesian and frequentist probability, and does not have to be fully repeated here. I would suggest that a simple link there can suffice. -- hike395 14:06, 26 May 2004 (UTC)
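The 80% figure in the decision-theory remark above can be checked with a short expected-cost calculation. This is only a sketch: the function names and the 4:1 cost ratio are illustrative assumptions taken from the wording of the comment, not from any legal standard.

```python
def conviction_threshold(cost_false_conviction, cost_false_acquittal):
    """Probability of guilt at which convicting and acquitting cost the same."""
    return cost_false_conviction / (cost_false_conviction + cost_false_acquittal)


def should_convict(p_guilty, cost_false_conviction, cost_false_acquittal):
    """Convict only if that has the lower (or equal) expected cost."""
    # Convicting is costly only when the defendant is in fact innocent.
    expected_cost_convict = (1 - p_guilty) * cost_false_conviction
    # Acquitting is costly only when the defendant is in fact guilty.
    expected_cost_acquit = p_guilty * cost_false_acquittal
    return expected_cost_convict <= expected_cost_acquit


# Assumed costs: jailing an innocent person is 4 times as bad
# as letting a guilty person go free.
threshold = conviction_threshold(4.0, 1.0)
print(threshold)  # 0.8
```

With a different cost ratio the threshold moves accordingly; for example, a 99:1 ratio would give a threshold of 0.99.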

Aren't there some good worked-through examples in the Bayes' theorem article already, in particular the medical one? I can't think of any legal example. OK for the link to the controversy article. Pcarbonn 17:18, 26 May 2004 (UTC)

Deleting the legal section was over the top. How do you think criminal courts work? The prosecution makes an allegation. It then produces evidence to show that what it has said is almost certainly true (i.e. it can be believed beyond a reasonable doubt). The defence may either seek to show that there are gaps in the prosecution evidence or that some previously unconsidered possibility is credible (in either case increasing the doubt about the prosecution case). The jury then assesses its belief about the prosecution case and how it has changed given the evidence presented by both sides. That is where Bayesian inference can be used in terms of degrees of belief about what has already happened but is not known. --Henrygb 13:37, 28 May 2004 (UTC)
If I'm not mistaken, the courts are hesitant to admit actual computations of probability in testimony. I've read of a couple of cases in which numerical computations were made by the prosecution but the conviction was later overturned on appeal, on the grounds that the computations were inadmissible. Now I am certainly leaving out many important details and mixing things up, so I'll try to find a reference so that we can have a proper discussion about it. Regards, Wile E. Heresiarch 04:16, 2 Jun 2004 (UTC)
About legal applications, I see the present discussion as somewhat problematic. Essentially what we have right now is a calculation as it could be carried out, but in fact not usually, and maybe never. The few court cases that I've heard about which involved probability are cautionary tales -- the calculation was thrown out for one reason or another. One that I've seen referenced a couple of times is People vs Collins. This paper [1] gives a summary of that case and some others. See also Peter Tillers's web site [2] -- he is a law professor with a longstanding interest in Bayesian inference and he usually has something interesting to say. Incidentally I think Tillers states the editorial remark that was recently struck out -- that in the Regina versus Denis Adams case, the court allowed any irrational inference but disallowed a rational one -- in another context; I'll try to find it. -- Bayesian inference in law is something of a mess, I'm afraid; I don't know where to go from here, maybe someone has some ideas. Happy editing, Wile E. Heresiarch 21:49, 8 Jun 2004 (UTC)
There are quite rational methods to supply priors in legal cases. For example, if one can reasonably assume that a crime was committed by someone living within the city where the crime was committed (and statistics on convictions ought to give good information about that), then P(suspect committed the crime|this background information) is approximately equal to 1/N where N is the population of the city. In other words, if the police were simply to arrest someone at random, that's the probability that they chose the guilty person, i.e., the appropriate prior.
Of course, the police don't arrest people at random, but regardless of their reasons for arresting a particular individual, these reasons have to be examined by the jury and cannot enter into the prior probability. Thus, all the evidence that resulted in the arrest has to be presented at trial. In particular, the jury should not take the prior probability of guilt equal to or greater than 1/2 (the test for probable cause) simply because the police have to have probable cause to arrest. For them to do that and then to consider the evidence that led to arrest would be to (illegitimately) use the same data twice, a no-no. Rather, the jury (if they are Bayesian and such calculations are allowed in the privacy of the jury room) should choose a prior that reflects the probability of guilt in the absence of any evidence that would likely be presented at the trial.
This is quite apart from Wile E. Heresiarch's comment about the admissibility of probability calculations in courts of law. I don't know what the answer to that is. I'm only dealing with a hypothetical court that acts in a Bayesian manner.
But, it most certainly is appropriate to provide priors on guilt, and indeed if one did not do this one would be in the unenviable position of "Verdict first, trial afterwards." Bill Jefferys 20:10, 13 June 2006 (UTC)
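To make the 1/N prior described above concrete, here is a minimal sketch of how such a prior would combine with a single piece of evidence through the odds form of Bayes' theorem. The city population and the likelihood ratio are made-up numbers chosen purely for illustration, and the function name is hypothetical.

```python
def posterior_from_prior_and_lr(prior, likelihood_ratio):
    """Odds form of Bayes' theorem: posterior odds = prior odds * likelihood ratio."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)


# Hypothetical city of N = 200,000 people: before any trial evidence,
# each resident is treated as equally likely to be the culprit.
city_population = 200_000
prior = 1 / city_population

# Hypothetical evidence that is a million times more probable if the
# suspect is guilty than if the suspect is innocent.
likelihood_ratio = 1_000_000

posterior = posterior_from_prior_and_lr(prior, likelihood_ratio)
print(round(posterior, 3))
```

Even with evidence this strong, the small prior keeps the posterior well short of certainty, which is why the choice of prior matters and why the same evidence must not be counted both in the prior and in the update.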

Focus on Algorithm ?

As it stands, this is an odd article. Perhaps it is misnamed and should be moved? Bayesian probability already describes the notion of subjective probability (degrees of belief). Bayes' theorem already talks about the theorem and a simple worked-out example. This article seems to be about Applications of Bayes' theorem.

I propose that this article should be about algorithms for performing Bayesian inference, in a Bayesian network. For example, the message passing of Judea Pearl (max product and sum product belief propagation), loopy belief propagation, and perhaps a link to Expectation maximization. --- hike395 15:04, 28 May 2004 (UTC)

Sounds like a good idea to me! (But I do not know enough to expand it that way.) Pcarbonn 18:21, 28 May 2004 (UTC)
I would disagree. Bayesian inference should be about how evidence affects degrees of belief. Have a page Bayesian algorithm if you wish, or add to Bayesian network. But most Bayesian inference is not about the specialist subject of networks. --Henrygb 19:05, 28 May 2004 (UTC)
Henry: what can you say about Bayesian inference that is not already described in the Bayes' theorem article (specifically the Examples section) and that is not about Bayes net algorithms? And is not a list of applications? This is not a rhetorical question --- your answer should become the content of this article. If there is nothing more to add, then this article should be a redirect to Examples of Bayes' theorem, and we can spawn two new articles --- Applications of Bayes' theorem, and Inference in Bayes nets, perhaps. -- hike395 02:34, 29 May 2004 (UTC)
This page is designed to be the main introduction to Bayesian statistics and Bayesian logic which both redirect here. Nobody who understands probability theory would reject Bayes theorem, but many refuse to apply Bayesian methods. So Bayes theorem is not the place for an introduction to the methods. I have added a section explaining what I think this is about in terms of inference and the scientific method. The legal section (especially R v. Adams comments) is really about inference rather than Bayes theorem. --Henrygb 00:51, 31 May 2004 (UTC)
I agree. Bayes' theorem is a simple mathematical proposition that is not controversial. Bayesianism, on the other hand, is the philosophical tenet that the same rules of probability should apply to degrees of belief that apply to relative frequencies and proportions. Not at all the same thing. Michael Hardy 01:08, 31 May 2004 (UTC)
Well, I wouldn't mind putting examples of Bayesian inference here --- I just object to the redundancy with the material in the Bayes' theorem article. If you look over there, much of the article is about Bayesian methods. If the general consensus is that Bayesianism should be described here, we should probably move a lot of the material from the other article over here. -- hike395 01:50, 31 May 2004 (UTC)
Hello. I agree that there is some redundancy among the various articles about Bayesian stuff. I think the basic examples in Bayes' theorem should stay there, and more specialized stuff should be developed in the other articles. It's true that Bayesian inference is pretty sketchy at present, but I think it should be filled out with new material, not by moving material over from Bayes' theorem. Happy editing, Wile E. Heresiarch 04:12, 2 Jun 2004 (UTC)
It seems that we have an impasse. Henrygb and Michael Hardy want to distinguish Bayes' theorem from Bayesian philosophy. I don't want to have 2 duplicate articles. Wile E. Heresiarch doesn't want me to move material over. I don't know how to make edits that fulfill all of these goals. --- hike395 06:04, 3 Jun 2004 (UTC)
I guess I'm not seeing what there is to be moved over from the Bayes' theorem article. Presumably you don't mean the historical remarks, statements of the theorem, or the references. Among the examples, the cookies example seems like a trivial one that's appropriate for an introductory article. The medical test example is more compelling but really no more complex than the cookies (and therefore still appropriate). I guess that leaves the binomial parameter example. Maybe we can bring that one over. In the same vein, Bayesian inference could talk about other conventional statistical models reinterpreted in a Bayesian fashion. I guess I should get off my lazy behind and get to it. Happy editing, Wile E. Heresiarch 06:47, 4 Jun 2004 (UTC)

Move material from Bayes' theorem

If we take it as given that this article should be the introductory article about Bayesian reasoning, and that Bayes' theorem should be devoid of "Bayesianism" (because that belongs here), then the following sections reflect "Bayesianism" and belong here:

  • Historical remarks --- treats a parameter as a random variable,
  • Examples --- All of the examples treat unknowns as random variables.

I propose moving these over as examples of inference here, which would then flesh the article out and leave Bayes' theorem agnostic to Bayesian inference.

--- hike395 16:00, 4 Jun 2004 (UTC)

It seems pointless to obscure the fact that the major uses of Bayes' theorem are Bayesian. Henrygb and Michael Hardy have made some general statements, but they didn't mention moving anything, so it looks to me like you're putting words in their mouths in defense of your personal program. In any event Historical remarks is a report of Bayes' original essay; one can't accurately report the essay without reporting its motivations. I've restored the historical remarks to Bayes' theorem; cutting them out is exactly analogous to striking out the discussion of Martin Luther's theology on the grounds that the Catholics will find it objectionable. -- As for the examples, I'm not opposed to moving them so long as they're replaced by something else. One trivial example and one more interesting example seems a good formula. Regards, Wile E. Heresiarch 16:43, 4 Jun 2004 (UTC)
Wile: "you're putting words in their mouths in defense of your personal program" is completely incorrect. I am actually a Bayesian (although I dislike that term) and have written programs that use Bayesian inference. I was responding to what I thought Henry and Michael wanted. Please do not attribute sinister motives to me. --- hike395 16:59, 4 Jun 2004 (UTC)
Well, I guess I couldn't tell what you're getting at. Anyway I am happy to let it go. Let's make some new edits and talk about that instead. Wile E. Heresiarch 18:26, 4 Jun 2004 (UTC)
I agree that keeping the Historical Remarks section over there is probably better. I would still like to avoid redundancy between articles: so, to try and make everyone happy (including myself), over in Bayes' theorem, I added a one-sentence mild warning that examples of Bayes' theorem typically involve assuming Bayesian probability ("Bayesianism"), and then a link that points to this article's examples section. I believe that this can make everyone happy: Bayes' theorem can remain relatively pure (except for historical commentary, which is fair enough); readers of Bayes' theorem can very easily look at examples; and we don't need to have multiple pages about Bayesian inference. Comments? --- hike395 07:08, 6 Jun 2004 (UTC)
Looks OK to me. Wile E. Heresiarch 16:16, 9 Jun 2004 (UTC)

Further to this discussion, I've made a proposal on how to improve the relationship between the Bayesian inference article and the article on Bayes' theorem. Please see the Bayes' theorem talk page for details. Cheers, Ben Cairns 07:57, 23 Jan 2005 (UTC).


Evidence and the Scientific Method

I strongly object to this piece

Supporters of Bayesian method argue that even with very different assignments of prior probabilities sufficient observations are likely to bring their posterior probabilities closer together. This assumes that they do not completely reject each other's initial hypotheses; and that they assign similar conditional probabilities. Thus Bayesian methods are useful only in situations in which there is already a high level of subjective agreement.


I am puzzled by the line

This assumes that they do not completely reject each other's initial hypotheses; and that they assign similar conditional probabilities.

To completely reject someone else's hypothesis one would have to assign it a probability of zero. This is a big no-no in Bayesian circles. It is called Cromwell's Rule: never assign prior probabilities of zero or 1, as you then lose the ability to learn from any subsequent information. As Dennis Lindley puts it, "You may be firmly convinced that the Moon is not made of green cheese, but if you assign a probability of zero, whole armies of astronauts coming back with their arms full of cheese will not be able to convince you."


If two people with different initial priors see the same data, use the same likelihood and both use Bayes' theorem to coherently incorporate the data, then as the data accumulates their posteriors will converge. (If they are not using the same likelihood then, of course, no agreement is to be expected).

The next line does not follow at all...

Thus Bayesian methods are useful only in situations in which there is already a high level of subjective agreement.

Sir David Cox (who is not a Bayesian) put it nicely: "Why should I be interested in your prior?". To answer this, there are 3 cases to consider:

a) My prior is very like your prior.

In this situation it is reassuring to find that others have independently come to the same conclusions as oneself.

b) My prior is unlike your prior, but the volume of data available has swamped the prior information, so that our posteriors are similar.

This tells me that my results are robust to major changes of prior, and this tends to make it easier to convince others.

c) My prior is different to your prior and my posterior is different to your posterior. In this situation a Bayesian would say that reasonable men and women can come to different conclusions, given the state of the evidence. The onus is then on them to get more evidence to resolve the matter. In this situation a frequentist analysis which gives a single, supposedly 'objective' answer is a travesty of the scientific method. (Indeed, Bayesians sometimes reverse-engineer frequentist analyses, assuming them to be equivalent to some Bayesian analysis with some sort of uninformative prior. The quest is then to find out exactly what that uninformative prior is and to consider whether it is a reasonable prior.) Blaise 17:42, 27 Feb 2005 (UTC)
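Two of the claims in the comment above lend themselves to a quick numerical check: that posteriors converge when different priors meet the same data and likelihood (case b), and that a prior of exactly zero can never be revised (Cromwell's Rule). The sketch below uses a Beta prior on a success rate and a two-hypothesis update; all priors, data and likelihood values are invented for illustration.

```python
def beta_posterior_mean(prior_a, prior_b, successes, failures):
    """Posterior mean of a Beta(prior_a, prior_b) prior after Bernoulli data."""
    return (prior_a + successes) / (prior_a + prior_b + successes + failures)


def bayes_update(prior, likelihoods):
    """Apply Bayes' theorem over a finite list of hypotheses."""
    unnormalised = [p * l for p, l in zip(prior, likelihoods)]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


# Two observers with very different priors about the same success rate.
optimist = (20.0, 2.0)  # Beta(20, 2), prior mean about 0.91
sceptic = (2.0, 20.0)   # Beta(2, 20), prior mean about 0.09

gaps = []
for n in (0, 10, 100, 1000, 10000):
    successes = (3 * n) // 10  # hypothetical data: 30% successes
    failures = n - successes
    gap = abs(beta_posterior_mean(*optimist, successes, failures)
              - beta_posterior_mean(*sceptic, successes, failures))
    gaps.append(gap)
    print(n, round(gap, 4))  # the gap between the two posterior means shrinks


def belief_after_evidence(prior_cheese, rounds=5):
    """Belief in 'the Moon is cheese' after repeated reports favouring it 99:1."""
    belief = [prior_cheese, 1.0 - prior_cheese]
    for _ in range(rounds):
        belief = bayes_update(belief, [0.99, 0.01])
    return belief[0]


dogmatic = belief_after_evidence(0.0)      # prior of exactly zero
open_minded = belief_after_evidence(1e-6)  # tiny but non-zero prior
print(dogmatic, round(open_minded, 4))
```

The observer who starts at exactly zero stays at zero however much evidence arrives, while the observer who starts at one in a million ends up nearly convinced.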

Clear statements are welcome

Please give your reasoning or build an argument when adding a notice to the article. A bald notice is not helpful to the editors. Ancheta Wis 19:06, 26 August 2005 (UTC)



Diagrams for formulas

I put a diagram I frequently use on here, all about Bayes Theorem...Image talk:Bays Rules.jpg Other novices like me might also find it useful. --Marxh 09:04, 13 September 2005 (UTC)

Compliments

User Ralph sent the following e-mail to the help desk.

very nice writeup on bayesian inference, I'm impressed........Ralph

Well done to everyone who has worked on it.

Capitalistroadster 23:43, 29 November 2005 (UTC)

Anonymous Observer's Comments

Anonymous Observer made the following comment in the main page:

  • ANONYMOUS OBSERVER: The above exposition is lovely -- as far as it goes. But the above discussion ignores possible important uncertainties. For example, did a police officer "plant" the DNA "found" at the crime scene? This possibility may be more than de minimis. Furthermore, the posterior probability that a defendant was the source of the DNA found at a crime scene is not necessarily equivalent to the posterior probability of the defendant's guilt -- because, for example, the defendant may have deposited his or her DNA before or after the crime was committed; and because, alternatively, someone in addition to the defendant may have left his or her DNA at the crime scene; and because, alternatively, the defendant alone may have deposited his or her DNA at the time of the crime but yet not have committed the crime charged (because, e.g., no one committed the crime -- there was no crime -- and the defendant was an innocent passerby who touched the person who suffered harm). Moral of the story: if one uses numbers to express uncertainties, one must try to identify all important uncertainties and then represent them with an appropriate probability statement, expression, or symbol.

While I agree with the sentiments, this is the wrong way and place to include them in the article. Work needs to be done to seamlessly incorporate this information into the main article. The article is not a talk page, and these comments look like "Comment, response", not like an encyclopedia entry.

So let's discuss how to include this information! Bill Jefferys 23:39, 6 December 2005 (UTC)

I neglected to mention that this had been in the In the courtroom section. Bill Jefferys 23:41, 6 December 2005 (UTC)

See Edwin Thompson Jaynes. Probability Theory: The Logic of Science. Cambridge University Press, (2003). ISBN 0521592712. This book has a beautiful quantitative treatment of this problem of competing hypotheses. The hypothesis that the defendant is guilty is not supported if the hypothesis that the evidence has been tampered with is likely to be true. Bo Jacoby 10:31, 7 December 2005 (UTC)

I have the book and I agree. I'm not quarrelling with the thrust of the comment. The point is, what is the right way to put this information into the article? Anonymous Observer's method wasn't the right way because the main article is supposed to be an encyclopedia entry, not a talk page. So again, how do we include this information in the article in a seamless way that makes it read like a professionally written encyclopedia entry?

Perhaps A. O. or someone else can have a crack at doing this the right way. Bill Jefferys 13:58, 7 December 2005 (UTC)
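The point made in this thread about competing hypotheses can be illustrated with a small calculation. The sketch below is not taken from Jaynes's book; the three hypotheses, their priors and the likelihoods of a reported DNA match are all invented numbers, chosen only to show the mechanism.

```python
def posterior(priors, likelihoods):
    """Bayes' theorem over a set of mutually exclusive, exhaustive hypotheses."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    evidence = sum(joint.values())
    return {h: joint[h] / evidence for h in joint}


# Hypothetical priors before the DNA report is considered.
priors = {
    "guilty": 0.0010,       # defendant left the DNA while committing the crime
    "planted": 0.0005,      # the sample was planted or tampered with
    "coincidence": 0.9985,  # someone else left the DNA
}

# Probability of a reported match under each hypothesis (assumed values).
# A match is certain if the defendant is guilty OR if the sample was planted,
# so the report cannot tell those two hypotheses apart.
likelihoods = {"guilty": 1.0, "planted": 1.0, "coincidence": 1e-6}

post = posterior(priors, likelihoods)
for hypothesis, probability in post.items():
    print(hypothesis, round(probability, 4))
```

Under these assumed numbers the match almost eliminates coincidence, but the ratio between "guilty" and "planted" stays at its prior value of 2:1, so the probability of guilt cannot rise above about two thirds on this evidence alone.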


Another anonymous observer's comments:

Why is this stuff presented as if it were a matter for controversy, or as if Bayesian statisticians were members of a sect? (Bayesians believe, etc.) This stuff is just math. Sure, there are cases where you start with a prior probability which is controversial, and that casts doubt on one's conclusions, but that does not make the deduction process itself controversial. OK now, excellent rewrite!

Correction required to "in the courtroom"

Looking at the case Regina v Dennis John Adams mentioned in the article, it would appear from the report at http://www.bailii.org/ew/cases/EWCA/Crim/1996/222.html, cited as DENNIS JOHN ADAMS, R v. [1996] EWCA Crim 222 (26th April, 1996), that Adams' conviction was quashed and a re-trial was ordered, precisely because the use of Bayesian logic was deemed inappropriate for use in a criminal trial.

I agree; it is ironic that this was the outcome, since the use of Bayesian logic clearly undermines the prosecution's case. Were the justices suggesting that eliminating Bayesian inference would be an advantage to the defendant? I don't know. Bill Jefferys 03:22, 12 December 2005 (UTC)

David Hume

I am pretty sure that Hume has made pretty good criticism of (or criticism that can be used against) Bayesian inference, most of all in his works on causality. 09:19, 15 May 2006 172.172.67.230

External links - applications of Bayesian inference?

Hi, I'm not a mathematician at all, but I've written an article on how to easily implement "Bayesian Rating and Ranking" on web pages here: http://www.thebroth.com/blog/118/bayesian-rating. From my own experience, I wished that there'd been such an article before - so I simply wrote one myself. It's a practical guide how to use it in a real world application. And we all know, most rating systems currently in use on the internet could do with a good dose of Bayesian! So - any suggestions what to do? Is the article good enough to be listed on Bayesian_inference or another Bayesian related article, possibly in the external links section? Wyxel 10:47, 13 June 2006 (UTC)

Hmm, nobody objected or commented, so I added the link. I hope it is considered a useful resource.--Wyxel 05:47, 23 June 2006 (UTC)
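For readers wondering what a "Bayesian rating" might look like in practice, here is a minimal sketch of one common form, sometimes called a Bayesian average, in which each item's mean rating is pulled toward a prior mean. This is an assumption about the general technique and not necessarily the method used in the linked article; the prior mean, prior weight and ratings are invented.

```python
def bayesian_average(ratings, prior_mean, prior_weight):
    """Mean rating shrunk toward prior_mean.

    prior_weight acts like a number of imaginary ratings at prior_mean,
    so items with few real ratings stay close to the prior.
    """
    return (prior_weight * prior_mean + sum(ratings)) / (prior_weight + len(ratings))


# Hypothetical site-wide settings: prior mean 3.0 stars, worth 10 ratings.
PRIOR_MEAN = 3.0
PRIOR_WEIGHT = 10

new_item = [5.0, 5.0]           # two perfect ratings
established_item = [4.5] * 200  # two hundred ratings averaging 4.5

new_score = bayesian_average(new_item, PRIOR_MEAN, PRIOR_WEIGHT)
established_score = bayesian_average(established_item, PRIOR_MEAN, PRIOR_WEIGHT)
print(round(new_score, 2), round(established_score, 2))
```

Ranked by raw mean, the new item would come first; ranked by the shrunken score, the established item does, because two ratings are weak evidence compared with two hundred.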

Is there a Bayesian fallacy?

If there is, then I think it should be included in the article. I'm thinking about where people confuse p(a|b) with p(b|a), assuming that they are roughly equal in value. —The preceding unsigned comment was added by 81.104.12.8 (talk • contribs) .

For example, because young men are over-represented in prison relative to their share of the population, i.e. p(young_man|criminal) is high, people can fallaciously infer that young men are likely to be criminals, i.e. that p(criminal|young_man) is high, leading to demonization and prejudice. —The preceding unsigned comment was added by 62.253.48.86 (talk • contribs) .

Whoops! I meant Conditional probability rather than Bayesian, but the above comments still apply. —The preceding unsigned comment was added by 62.253.48.86 (talk • contribs) .
See our article on the Prosecutor's fallacy. I agree our Conditional probability article should link to this, but I don't see it as being a particularly Bayesian issue. -- Avenue 23:41, 22 July 2006 (UTC)
Confusing p(a|b) with p(b|a) is essentially affirming the consequent. --AceMyth 22:33, 4 October 2006 (UTC)
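A small count-based sketch shows how far apart p(a|b) and p(b|a) can be. The population figures below are entirely made up for illustration and do not describe any real population.

```python
# Hypothetical counts, invented purely to illustrate the inversion.
population = 1_000_000
young_men = 100_000
criminals = 2_000
young_men_who_are_criminals = 1_200

# p(young_man | criminal): the share of criminals who are young men.
p_young_given_criminal = young_men_who_are_criminals / criminals

# p(criminal | young_man): the share of young men who are criminals.
p_criminal_given_young = young_men_who_are_criminals / young_men

# Bayes' theorem links the two through the base rates p(criminal) and p(young_man).
p_criminal = criminals / population
p_young = young_men / population
via_bayes = p_young_given_criminal * p_criminal / p_young

print(p_young_given_criminal, p_criminal_given_young, via_bayes)
```

In this invented population most criminals are young men, yet only about one young man in a hundred is a criminal; the two conditional probabilities differ by the ratio of the base rates.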