Talk:Raven paradox


This article is within the scope of WikiProject Philosophy, which collaborates on articles related to philosophy. To participate, you can edit this article or visit the project page for more details.
This article has been rated as B-Class on the quality scale.
This article has been rated as Mid-importance on the importance scale.
This article has been reviewed by the Version 1.0 Editorial Team and has been selected for Version 0.7 and subsequent release versions of Wikipedia.


Proof of Santa?

Can you not use the same reasoning to provide evidence for statements like:

"All Christmas presents are made in Santa's workshop"

Finding non-Christmas presents that are not made in Santa's workshop is easy. I think that the article should clarify whether this kind of reasoning also applies to statements that are obviously wrong.

On "Using Bayes theorem"

It would be nice to explain the meaning of X, T, and I in terms of ravens and apples. Ulan Degenbaev 20:58, 16 July 2007 (UTC)

Quote:

The statement "all ravens are black" is logically equivalent to the statement "all non-black-things are non-ravens".

The argument above is simply the contrapositive, which is logically equivalent. There should not be any discussion about this. -- Ram-Man

Ummm....please see: Tuxedos, night skies, crude oil, soot. The statement "all [objects] have [quality]" is not equivalent to "all [things not that kind of object] do not have [quality]". For example: All humans are mammals. All frogs are not mammals. So far, so good. But all cows are mammals! Danger, Will Robinson! Obviously "All non-humans are non-mammals" is ridiculous.

Should be rephrased as "All non-black-things are *not* ravens". This is obvious, like saying all non-mammals are not humans. There is no paradox, as far as I can see.

-Emmett

Not really. Your example wasn't done properly. Let me redo it for you.

Following what the quote says, "All humans are mammals" is equivalent to "All non-mammals are non-humans". Let's test that using your examples:

All frogs are non-human -- looks good
All cows are non-human -- looks bad - cows are mammals, and we're supposed to be checking non-mammals according to the quote.

As you say, obviously "All non-humans are non-mammals" is ridiculous, but that's not what the quote is saying. It says that "All non-mammals are non-humans", which is obviously common sense. -- Derek Ross


The argument is flawed. If you found a green raven, is it a raven? Is its greenness enough to make it a different bird? In that case "all ravens are black" is a tautology: ravens are by definition black. Finding a green raven will either change your definition or it won't. See No true Scotsman. EW

That's just stepping into biology. Of course, if you found a green raven, then that would alter your belief in the statement "all ravens are black". But this article is not about that. It's about observing a non-black thing which is not a raven (clearly not a raven). -- Tarquin 11:43 Oct 28, 2002 (UTC)

Tarquin is absolutely correct. In fact you do occasionally find white ravens in the wild. So ravens are not by definition black. They're actually the children of older ravens. But all that means to the argument is that we'd eventually work our way through checking the non-black objects until we identified one particular (white) non-black object as a raven. At which point we'd note that the statements "All non-black objects are non-ravens" and "All ravens are black" are both false but both still equivalent, and decide to pick a better example. -- Derek Ross 12:19 Oct 28, 2002 (UTC)

Let me rework my argument. The article assumes that the statement "All ravens are black" has an absolute meaning independent of context, but that is not so. It can mean either "All normal ravens are black" or "There are no creatures that are fundamentally similar to ravens that are not black."

This ambiguity exists in all statements of this type. It can only be resolved if there is some way of determining the true meaning of the statement. In other words, the argument is intentionally misleading because it assumes something that is not true. EW

"All ravens are black" is just an example. We're using it because it's traditionally used for this paradox, to the point that the paradox bears the name of this example. -- Tarquin 15:24 Oct 28, 2002 (UTC)

Perhaps this example will clear things up. Let's take the statement "All Klefs are Smodgy" where a "Klef" is a prime number between 0 and 2,000,000,000,000 and an object is "Smodgy" if its name has appeared in print somewhere prior to the 1st of January 2000.

The sort of ambiguity that you see in the Raven statement doesn't appear in this one. There is no doubt about which numbers we are talking about. Neither does the self-reference appear: Smodginess is not a part of the Klef definition in the way that blackness is (arguably) part of the Raven definition. Finally, the statement isn't assumed to be true. The point is that it has to be checked. We can't say whether the statement is true or not because we don't know whether the Klefs are Smodgy for sure until we check. And that's what the "paradox" is really about: the checking process. It states that we can get the same results whether we:

  1. Check all the objects which we know to be Klefs in order to find out whether they are also Smodgy.
  2. Check all the objects which we know to be not Smodgy in order to find out whether they are also not Klefs.

Logic states that these two methods are equivalent, as far as checking whether the statement is true or not is concerned, but we know that the latter method takes a lot more work before we can be sure whether "All Klefs are Smodgy" is a true statement or a false one, and so to many people it seems paradoxical that it can work at all. Hence the title of this article. -- Derek Ross 17:42 Nov 10, 2002 (UTC)
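A minimal Python sketch of this checking-equivalence may help; the four-object universe and its properties are invented purely for illustration, standing in for the full Klef/Smodgy domains:

universe = [
    {"name": "2",    "klef": True,  "smodgy": True},
    {"name": "3",    "klef": True,  "smodgy": True},
    {"name": "4",    "klef": False, "smodgy": True},
    {"name": "zorp", "klef": False, "smodgy": False},
]

# Method 1: check every Klef for Smodginess.
method1 = all(obj["smodgy"] for obj in universe if obj["klef"])

# Method 2: check every non-Smodgy object for non-Klef-ness.
method2 = all(not obj["klef"] for obj in universe if not obj["smodgy"])

# Both checks amount to "no object is a non-Smodgy Klef", so they must agree,
# even though method 2 may have to inspect a very different number of objects.
assert method1 == method2
print(method1)  # True: "All Klefs are Smodgy" holds in this toy universe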

I do appreciate your clarification of the paradox, but I still run into a problem. The problem is with the word paradox. I understand the word paradox to mean a contradiction that cannot be satisfactorily resolved no matter how you work at it. After your clarification of the problem, it is so clear that there is nothing to bother you about it any longer, and it no longer fits my definition of paradox. <grin> Ezra Wax 01:39 Nov 11, 2002 (UTC)

All paradoxes can be resolved. If there were an actual contradiction in the world, the world wouldn't exist (at least mathematically speaking) :-) AxelBoldt 02:09 Nov 11, 2002 (UTC)

Further discussion of paradoxes in general should probably go in Talk:Paradox... on another topic, I'm concerned about the paragraph that starts, "This principle is known as "Bayes' theorem". It is foundational to the mathematics of probability and statistics..." Now, I am personally a Bayesian, and I agree with the content of the paragraph — but any devoted frequentist would likely find it pretty inflammatory. A rewrite is needed... Cyan 07:36 Apr 12, 2003 (UTC)


The claim that the paradox does not arise using Bayes’ theorem strikes me as somewhat contentious. In the first example, the criterion for selection is an apple, not a non-black thing or a raven, so the example is irrelevant, and it is not surprising that selecting the apple makes no difference to the belief. In the second example, in which the selection is relevant, the paradox arises, albeit to a very small degree. Why should observing a red apple be supporting evidence for ‘all ravens are black’ at all, even to a very slight degree? Banno 02:02, 20 Dec 2003 (UTC)


please re-read what you have written above: "an apple, not a non-black thing or a raven". An apple IS a non-black thing, unless your apples are very different to mine. -- Tarquin 18:50, 20 Dec 2003 (UTC)

The article says: 'If you ask someone to select an apple at random and show it to you, then the probability of seeing a red apple is independent of the colors of ravens' - what is specifically requested is an apple. On the implicit assumption that no apples are black (except the one I pulled out of my daughter’s schoolbag after the vacation) the fact that the probability of the apple being red is independent of the colour of ravens is trivial. That is, the sampling method biases the result. If, instead, you were asked to select a non-black thing, then the probability of picking a non-black raven must indeed be included, and so the probability of seeing a red apple would not be independent of the colour of ravens.

Re-stating my point, the first example is not an example of the Raven paradox, and the second example (in the last paragraph) shows that the raven paradox holds; therefore the statement that the paradox does not arise is simply not true. I suggest replacing it with 'this principle shows that the influence of such examples is vanishingly small' or some such. Banno 20:37, 20 Dec 2003 (UTC)

I believe the paradox arises because the induction principle ("If an instance X is observed that is consistent with theory T, then the probability that T is true increases") doesn't take background information (such as sampling conditions) into account, so it sometimes leads to improper inferences. Consider the case that you are an expert ornithologist (i.e. untrickable) and you find a white raven feather. While this object is indeed a non-black non-raven, it's pretty obvious that blind application of the induction principle leads one to a logically indefensible increase in the probability that all ravens are black.
Bayes' theorem (as I've formulated it in the article) explicitly forces one to consider the prior information that goes into the probability calculation, such as the sampling conditions and whatnot. Sometimes using Bayes' theorem leads one to the same conclusions as the induction principle; other times, it will not. (For instance, try applying it to my example above.) Bayes' theorem resolves the paradox because it takes care of the cases when the induction principle fails due to lack of consideration for background information. -- Cyan 02:23, 21 Dec 2003 (UTC)

I agree with your analysis. So in effect the Bayesian approach agrees with that of the red apple example given earlier in the article (from Quine isn’t it?)– observing a red apple really does increase the chance of all ravens being black, since it slightly increases the confirmation in the background information. OK, thanks for the reply. Banno 10:58, 21 Dec 2003 (UTC)

Actually, I'm inclined to say that given the implicit background information most people are using, the observation of a red apple (drawn at random from a world of objects) conveys no information about ravens at all: the color of the apple is irrelevant to (i.e. independent of) the color of any or all ravens, and vice versa. The reason I say this is because the red apple gives equal support to all theories of the form "All ravens are Z", where Z is any color. In this case, Pr(X|TI) = Pr(X|I) for any given T (in English, the probability of observing the apple is the same no matter what theory of raven color we assume), so by Bayes' Theorem, Pr(T|XI) = Pr(T|I) for any given T. -- Cyan 17:26, 21 Dec 2003 (UTC)
The red apple does not give equal support to all theories of the form "All ravens are Z", where Z is any color. Consider the case where Z is the colour red. In this case it is impossible to draw a red apple from the set of non-red objects and therefore it is impossible for a red apple to support the theory that All ravens are red.
(I interject: you are correct. My argument above is flawed. -- Cyan 08:14, 23 Feb 2004 (UTC))
Even in the case where Z is some other colour such as blue, it is wrong to say that observing a red apple can give us no information about the colour of ravens. Suppose that we methodically check the non-blue objects for ravens, one colour subset at a time, starting with the red objects. We check the red objects one after another until only one red object remains to be checked and so far we have seen no red ravens. If the last red object turns out to be a raven then we have gained the information that the hypothesis All ravens are blue is false. If the last red object turns out to be an apple we have gained the information that there are no red ravens which makes the hypothesis that all ravens are blue slightly more likely. Of course it also increases the possibility that they are all green, yellow, white, black, or that the hypothesis is false. This increase in the likelihood of the remaining colours continues as we rule out each colour until only one possibility remains (with a likelihood of 100% of course). I hope that this clarifies why we now know a little more about the colour of ravens than we did before owing to our observation of the red apple. -- Derek Ross 05:11, 20 Feb 2004 (UTC)
The logical flow of the preceding argument is unnecessarily convoluted. All I mean to demonstrate is that Bayes' Theorem allows one to hypothesize independence when it makes sense to do so, whereas the induction principle sometimes forces logical dependence inappropriately. -- Cyan 18:23, 21 Dec 2003 (UTC)
I disagree. When Bayes theorem is applied to hypotheses involving a finite set of objects, such as the number of objects in the universe, it will come to the same conclusions as the induction principle in all cases. In fact if the numbers of the various objects are known it can give us an exact calculation of how much the observation of a red apple will affect the likelihood that all ravens are black at any stage of the observation process. I would not be surprised if complete logical independence is possible only when there are an infinite number of objects to be considered. -- Derek Ross 05:26, 20 Feb 2004 (UTC)

Now it's my turn to expose a flaw in your argument. (Heh heh.) Consider the following toy problem. Our universe is limited to two possibilities: first, there are n black ravens and m non-black non-ravens (henceforth NBNRs) (label this possibility "T"); second, there are n-k black ravens, k white ravens, and m NBNRs (label this possibility "~T"). Let us assume that sampling objects occurs uniformly; call the observation of an NBNR "E". Call the specification of the entire situation "X". It's easy to demonstrate that E and T are independent; that is, Pr(E|T and X) = Pr(E|X).

By the law of total probability:

\Pr(E|X) = \Pr(E \mbox{ and } T|X) + \Pr(E \mbox{ and } \sim T|X)

\Pr(E|X) = \Pr(T|X)\cdot \Pr(E|T \mbox{ and } X) + \Pr(\sim T|X)\cdot \Pr(E|\sim T \mbox{ and } X)

Can you give a little more detail here ? I see no way of getting from Pr(E and T|X) to Pr(T|X) · Pr(E|T and X) (even if we assume the independence of E and T which of course we are not allowed to do), so I'd like to see how you do the substitution. -- Derek Ross 20:46, 23 Feb 2004 (UTC)
Conditional probability is typically defined by the relation Pr(A|B) = Pr(A and B) / Pr(B), which can also be written Pr(A and B) = Pr(B) * Pr(A|B). The above manipulation is clarified by identifying B with T (or ~T) and A with E, keeping in mind that X is just the prior information that specifies the problem (usually people don't bother to include it in the probability symbols, but it's always there implicitly). -- Cyan 21:07, 23 Feb 2004 (UTC)
Thanks, I see now. The extra X's were confusing me. Your original problem specification guarantees that Pr(X) = 1 which means that Pr(Z|X) = Pr(Z) and Pr(Z|Y and X) = Pr(Z|Y) too. That makes all the equations simpler. -- Derek Ross

Now note that Pr(E|T and X) = Pr(E|~T and X) = m/(m + n). In words, the likelihoods for the two possibilities are equal to each other: this is the key point of the toy example, and I discuss it more fully below. We proceed:

\Pr(E|X) = \left(\Pr(T|X) + \Pr(\sim T|X)\right)\Pr(E|T \mbox{ and } X)

\Pr(E|X) = \Pr(E|T \mbox{ and } X)

This example demonstrates the independence I alluded to above, even though the number of objects involved is finite.
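A quick numerical check of the toy problem in Python; the counts (n = 50 black ravens, m = 950 NBNRs, k = 10 ravens white under ~T) and the prior 0.7 are arbitrary choices made for illustration, not part of the argument above:

n, m, k = 50, 950, 10   # k only recolours ravens; the object counts are fixed
p_T = 0.7               # any prior Pr(T|X) will do

p_E_given_T = m / (m + n)      # chance a uniform draw is an NBNR under T
p_E_given_notT = m / (m + n)   # ~T has the same number of NBNRs, so same value

# Law of total probability:
p_E = p_T * p_E_given_T + (1 - p_T) * p_E_given_notT
assert abs(p_E - p_E_given_T) < 1e-12   # Pr(E|X) = Pr(E|T and X)

# Bayes' theorem therefore leaves the prior unchanged:
p_T_given_E = p_T * p_E_given_T / p_E
print(p_T_given_E)   # ~0.7: observing an NBNR teaches us nothing about T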

Now, in reality, the prior information X at our disposal is at once more rich and more vague than that of the toy problem. In particular, our way of sampling objects in our surroundings in no way resembles the uniform sampling we assumed for the toy problem. We can identify three cases:

Case 1:

\frac{\Pr(E|T \mbox{ and } X)}{\Pr(E|\sim T \mbox{ and } X)} > 1

For whatever reason, we believe that it is more likely that E will occur if T is true than if T is false. Once E occurs, by Bayes' theorem, our personal odds that T is true increase, in agreement with the induction principle.

Case 2:

\frac{\Pr(E|T \mbox{ and } X)}{\Pr(E|\sim T \mbox{ and } X)} = 1

For whatever reason, we believe that the occurrence of E is independent of T. The occurrence of E has no influence on our belief in T; this disagrees with the induction principle.

Case 3:

\frac{\Pr(E|T \mbox{ and } X)}{\Pr(E|\sim T \mbox{ and } X)} < 1

For whatever reason, we believe that it is more likely that E would occur if "all ravens are black" is false. For example, E might be the observation of an object we recognize as a white raven feather. Although a white raven feather is a NBNR, our prior information strongly suggests that it is evidence for the existence of a non-black raven. Once again, this disagrees with the induction principle.

-- Cyan 08:14, 23 Feb 2004 (UTC)

This discussion strikes me as closely related to that between the philosophers mentioned in the article who question the principle of equivalence, and those that say our intuition is flawed – Cyan’s position is similar to the former, Derek’s to the latter. In either case, it is apparent that, although the Bayesian treatment goes a long way towards explaining away the paradox, the statement in the article that “Using this principle, the paradox does not arise” is not quite right. Instead, the paradox is transformed, becoming the problem being discussed here. Banno 20:33, 20 Feb 2004 (UTC)

I want to be clear: the Bayesian argument does not reject the principle of equivalence; it rejects Hempel's induction principle. -- Cyan 08:14, 23 Feb 2004 (UTC)

Non-black non-ravens should still increase your certainty that all ravens are black because each one is another thing that fits in with it, more evidence. The thing about logic is that (A \Rightarrow B) \Leftrightarrow (\neg B \Rightarrow \neg A). The more non-raven non-black things you see, the more of the universe you've seen which supports your hypothesis, and the closer you are to being certain. --mjec 16:09, 22 Aug 2004 (UTC)

Agreed. And for finite sets such as "All objects in the universe", that should be true whether you look at things from an inductive viewpoint or from a Bayesian one. -- Derek Ross | Talk 18:32, 2004 Aug 22 (UTC)

"What in the crap is this article about?" 68.18.169.220 03:49, 12 November 2005 (UTC)

I would have thought...

It seems to me the solution to the raven paradox is that observing a non-black non-raven, or indeed observing a black raven, should not actually increase your belief in the statement "All ravens are black," at least, not when it's phrased as openly as that. If you got to the stage where you'd observed all the ravens in the world and found they were all black, or else observed all non-black things and found none of them to be ravens, you could believe the statement. But before that, it depends how you find the things you observe.

If you see a red apple in a supermarket, obviously it wouldn't increase your belief in the raven statement, because you know there are no ravens there, and that there are apples. So it's not just bare observation - there's some choice involved on your part. (Similarly, if you went around observing ravens in a place called "The Black Ravens Only Zoo and Gift Shop," those observations wouldn't increase your belief in the statement, as you are self-selecting black ravens.) However, if someone says to you: "I'm going to show you a selection of objects that aren't black, chosen at random from all the non-black objects in the Universe," then under those circumstances, observing each non-black non-raven would indeed increase your belief in the proposition, because if there were non-black ravens, you'd expect to see one sooner or later in the great procession of non-black objects.

Agreed. The statement "All objects which are Ravens are black" refers to ALL the Raven objects in the universe and the statement "All objects which are not black, are not Ravens" refers to ALL the non-black objects in the universe. The two statements are only equivalent if the number of objects in the universe is finite and if we are talking about ALL the objects. -- Derek Ross | Talk 18:31, 2004 Nov 7 (UTC)
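The "great procession of non-black objects" intuition can be checked numerically. In this Python sketch (all numbers invented for illustration) T says there are no ravens at all among the N non-black objects, the rival hypothesis hides r non-black ravens among them, and we draw k objects without replacement, finding no ravens:

from math import comb

N, r, k = 10_000, 5, 1_000   # non-black objects; hidden non-black ravens; draws
p_T = 0.5                    # arbitrary prior for "all ravens are black"

p_data_given_T = 1.0                              # no ravens to stumble upon
p_data_given_notT = comb(N - r, k) / comb(N, k)   # all k draws miss the r ravens

posterior = (p_T * p_data_given_T /
             (p_T * p_data_given_T + (1 - p_T) * p_data_given_notT))
print(posterior)  # ~0.63 > 0.5: under this sampling scheme, each non-raven
                  # drawn from the non-black pool raises belief in the hypothesis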

Another possible resolution

I remember reading someone (Gardner? Hofstadter?) reacting to this paradox by proposing a restriction of domain. I'm pretty sure the wording was almost exactly this: "When we say, 'all ravens are black', we are not claiming of all objects that if it is a raven, then it is black. Rather, we claim of ravens that all of them are." Then the contrapositive becomes "Among ravens, if it is not-black, then it does not exist" and this can safely be used for induction.

Unfortunately I can't recall where exactly I read this, and I cannot find any references to it. Does it sound familiar to anyone? --67.180.142.97 01:11, 5 Mar 2005 (UTC)

That doesn't really work. Definitions in general single out a specified subset (domain) from a larger set, by imposing some required property for membership in the subset. When you adopt a set theoretic approach to logic, there is no difference between P1(x)="x is a raven" where we know that ravens are objects, and P2(x)="x is an object such that x is also a raven". Putting it another way, when you use the word "raven" it inherently denotes just those objects that have whatever properties go into the definition of what a raven is supposed to be. If the definition is a good one, then there is no question about the selected domain. Indeed, we need to consider other specified domains during the Bayesian analysis. — DAGwyn (talk) 00:26, 28 March 2008 (UTC)

I was thinking of this, and on the surface it seems quite simple:

"Domain: Ravens

∀x[Bx]"

Which simply doesn't have the logical equivalence of "all non-black things". Problem solved! But then I started to worry about the "Domain" bit. What does it mean? It means something like "for all ravens the following statement is true". So let's give the "following statement" a letter, say "B". We're saying if R then B, and we're back with the paradox! Translating the contrapositive we get the (now garbled) "IF it is a raven THEN if it is not-black then it does not exist"; again, a few deft letters to show the logical form of the sentence and we are once again back where we started. --87.80.133.64 19:54, 9 May 2007 (UTC)

This is related to Popper's critique of Induction

David Hume: induction could not be logically justified. Karl Popper (L.Sc.D., Conj. & Ref.): scientific theories cannot be verified, only falsified, etc. So this is not a paradox; it is a refutation of positivism. (A crucial test [falsification attempt] for a statement is not a crucial test for its contrapositive. There should be a Bayesian argument for the previous sentence, but it is not in C.&R., and I don't have L.Sc.D. at hand.)

Could someone fold these ideas into the article - I just don't have the energy. --mpcalc

The Above Comment is Correct

What Hempel's Paradox tells us is that confirmation can be somewhat meaningful, or trivial or even meaningless. As a red apple tends to confirm that all ravens are black, it also tends to confirm that all ravens are white (even the same apple in the same instance). Confirmation of an empty set is meaningless and confirmation of a necessary paradox is (if possible) even less meaningful.--The Vampire LOGOS

As you say the red apple tends to confirm that ravens are black, or white, or blue, or yellow, etc. However the one thing that it does not do is tend to confirm that ravens are red. In short the red apple tends to confirm that ravens are not-red. When put in those words, the statement is neither meaningless nor paradoxical. -- Derek Ross | Talk 23:10, 10 October 2005 (UTC)

I'd also like to point out that while "the above statement is correct" for infinite sets, where Bayesian methods are the only ones that make sense, the Raven Paradox discusses an enormous but finite set, the number of objects in the universe, where frequentist methods can, in principle if not in practice, be used. -- Derek Ross | Talk 23:23, 10 October 2005 (UTC)

Um, yeah, so every object in the universe must help us prove that ravens are all black. So I was wondering whether black or even green raven DNA might qualify as a raven-object. Finding a piece of fresh raven spit DNA on a grain of sand might increase our belief that there was a raven there? Thus you will technically have to examine every piece of the universe to look for raven DNA. There could be a piece of raven DNA stuck inside a solid block of stone. No? How would you ever really know? Am I going nuts? -- (Anon who didn't sign)

Wow

Looking at this talk page, it appears that quite a lot of people did not exactly put on their thinking caps when they read the article. Matt Yeager (Talk?) 01:04, 23 February 2006 (UTC)

Removed a paragraph

However, a hypothetical experiment can demonstrate that there is a problem with the above reasoning: Suppose all the ravens in the universe were magically placed inside a large box, and that you have not seen a raven yet. Now imagine going over every non-black thing in the universe (with the exception of looking inside the box) and verifying it is indeed not a raven. The problem is that after doing this you could still not make any statement regarding the color of a raven. The reason is that you could also have verified in the same manner that every non-red, non-green or non-blue thing in the universe is also not a raven (since the ravens are safely in the box). So at the end of the experiment you have seen almost every non-black thing in the universe and verified it is not a raven, but having done that did not contribute anything to knowing the color of one raven.

The preceding paragraph misses the entire point of the paradox--that the object being shown must be drawn from the pool of EVERY item in the universe. Showing a non-raven item is identical to showing an apple. The paragraph is redundant and confusing, so I tossed it. Matt Yeager (Talk?) 05:39, 18 August 2006 (UTC)

Good man! -- Derek Ross | Talk 06:29, 27 September 2006 (UTC)
Thanks! Matt Yeager (Talk?) 06:08, 28 September 2006 (UTC)

What kind of pseudo-logic is this?

In the above context, seeing a Red Apple does not support (and never confirms) Ravens of any color other than Black, in the sense that seeing an example that agrees with your Hypothesis supports your Hypothesis to a small degree, in relation to the number of observations and the size of the set of things you are looking at. Seeing a Red Apple does not support 'Ravens are not Red', because 'Red-ness' is not being tested; 'Black-ness' and 'Raven-ness' are what is being tested. Being Not Black AND Being Not Raven COMBINED is what supports 'All Non-Black things are Non-Raven Things' and, by the original logic, 'All Ravens are Black'. This is only the very slightest of support, but it is there. Seeing a Black Raven does increase your belief in 'All Ravens are Black' AND 'All Non-Black things are Non-Raven Things' significantly, because your sample size has increased by one observation, it agrees with your Hypothesis directly, and the sets involved (Raven-Things and Black-Things) are comparatively small. One interesting point is that nothing is true when you have made zero observations, but as soon as you make one... and with every other one afterwards, your probability will eventually go in the direction of the truth.

Replace the "All" in the original statement with "100%", then perform your observations, do the math, and allow the percentage to adjust in the original statement, sampling from each set involved (Raven, Non-Raven, Black, and Non-Black, which combined is everything in the universe); to have a valid test you must have at least a sample from each set. If, after making lots of observations, you come across a Non-Black Raven, then depending on the number of supporting observations already made this will decrease your probability below 100%, but only slightly, unless it is among your first observations.


Theoretically, with enough samples, you can approach an estimate of the probability density of Black Ravens in the Universe as well. Then we can go for a ride on the Heart of Gold, with the Infinite Improbability Drive, and see where the Non-Black Non-Ravens take us!

Also, in the magic box scenario above, if you observe absolutely every non-black thing in the universe and none of them are Ravens, and someone holds this magic box and says all of the Ravens are in it, you can absolutely say that all Ravens are Black without looking, because you have observed everything of every other color in the universe; that is deduction, not induction. Either the Ravens are Black or the box is empty. If the logic of 'All Ravens are Black' is true, then you will observe every non-black thing without looking in the box; if it is not true, you have not observed every non-black thing. You must be free to select any complete condition to sample from (either all Raven things, all non-raven things, all black things, or all non-black things) to apply this deduction. This is similar to Epimenides' (non-)paradox of the liar, where you make two contradicting statements and call it a paradox; it is no more a paradox than saying 0 = 1, you are contradicting yourself.

15.251.169.70 23:43, 9 January 2007 (UTC)chayden@hp.com

A false premise

"If an instance X is observed that is consistent with theory T, then the probability that T is true increases"

This is a false premise. Observing a black raven does NOT in itself increase the probability that all ravens are black.

Surprising? Not really. The plausibility of any hypothesis can only be judged against other hypotheses, unnamed by Hempel. For example, we could be contemplating one world in which there are 100 ravens - all black - out of a total 1,000,000 birds against another world in which there are 200,000 black ravens and 1,800,000 white ravens. Now, let's say that we observe a black raven. This (due to Bayes' theorem) increases by a factor of 1,000 the odds that we are living in the second world, in which most ravens are NOT black.
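The arithmetic is easy to verify in Python, assuming a bird is sampled uniformly at random in each world:

w1_black_ravens, w1_birds = 100, 1_000_000       # world 1: all ravens black
w2_black_ravens, w2_birds = 200_000, 2_000_000   # world 2: most ravens white

p_obs_w1 = w1_black_ravens / w1_birds   # 0.0001
p_obs_w2 = w2_black_ravens / w2_birds   # 0.1

# Integer cross-product avoids floating-point noise in the likelihood ratio:
bayes_factor = (w2_black_ravens * w1_birds) / (w2_birds * w1_black_ravens)
print(bayes_factor)  # 1000.0: the black raven shifts the odds a thousandfold
                     # towards the world in which most ravens are NOT black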

In short, to judge a hypothesis's probability given some evidence, we have to consider the alternative hypotheses as well as their prior probabilities.

The above argument against the "paradox" is due to I. J. Good (1967). I found it quoted by E. T. Jaynes in Chapter 5 of "Probability Theory: The Logic of Science", page 522.

Funny how everyone falls into the trap of thinking about apples ;-) —The preceding unsigned comment was added by 84.129.55.251 (talk) 20:47, 6 December 2006 (UTC).

That's a good argument when the only possibilities are the two worlds mentioned. If I knew for certain that one of the two had to be true, observation of a black raven would definitely make me think that I lived in world two (because the chances of seeing a raven at all in world one are so low). Moreover, seeing 101 black ravens makes it completely certain that I am living in world two where some ravens are white. However the maths changes depending on how many possible worlds you admit. For instance, if you admit a third possible world where there are 200,001 black ravens and 1,800,000 white ravens and a fourth where there are 1,999,999 black ravens and 0 white ravens the factor changes (and 101 ravens is no longer a magic number). Since there is a very large number of possible worlds in the real universe (and an infinite number of possible worlds in the problem universe) which all need to be taken into account to model, "Bayesianly", a statement like "all ravens are black" which makes no comment about the actual numbers of objects/ravens involved, the calculation needs to become a lot more complex than the simple one presented by I J Good. It needs to include all the possible combinations of non-ravens/black ravens/non-black ravens as possible worlds and then show how observation of a black raven affects the chances that you live in one of the worlds where all ravens are black. It would be interesting to see what conclusion it would come to though. Who knows ? It might even show the premise to be true after all (in the real universe at any rate). -- Derek Ross | Talk 00:02, 7 December 2006 (UTC)
I think that Good/Jaynes simply want to show that the problem is not specified enough for making any conclusions (as you also noticed). Hempel's initial assumption upon which the paradox is built is unfounded. The "paradox" aspect refers to the difficulty of interpreting the non-black non-raven observation. I agree with you that the "two worlds" argument does not quite hit the nail on its head. However, we can easily show the problem is missing information to permit ANY conclusions. Let's just attempt to apply the Bayes theorem to find out the updated odds for the AllRavensBlack hypothesis:
O(AllRavensBlack | BlackRavenSeen) = O(AllRavensBlack) * P(BlackRavenSeen | AllRavensBlack) / P(BlackRavenSeen | SomeRavensNotBlack)
The likelihood ratio P(BlackRavenSeen | AllRavensBlack) / P(BlackRavenSeen | SomeRavensNotBlack) can be anything you like, depending on the relative number of black and non-black ravens in the (one, actual, our) world. So Hempel's premise that "BlackRavenSeen supports AllRavensBlack" crumbles. He then proceeds to deduce something from his false premise and is surprised that deductions can go either way. It seems like a rather embarrassing blunder. —The preceding unsigned comment was added by 134.106.27.84 (talk) 11:21, 7 December 2006 (UTC).
But Good's "possible worlds" approach actually does give us a method of working with the hypothesis, "AllRavensBlack". Although the likelihood ratio in our actual world can't be calculated directly because we don't know the numbers, we know that our actual world is one of a finite number of possible worlds in each of which the likelihood ratios are calculable because the numbers of non-ravens, black and non-black ravens are all known.
The possible worlds can be enumerated as:
There are 0 objects in the universe, 0 are black ravens, 0 are other ravens;
There is 1 object in the universe, 0 are black ravens, 0 are other ravens;
There is 1 object in the universe, 0 is a black raven, 1 is another raven;
There is 1 object in the universe, 1 is a black raven, 0 are other ravens;
There are 2 objects in the universe, 0 are black ravens, 0 are other ravens;
There are 2 objects in the universe, 0 are black ravens, 1 is another raven;
There are 2 objects in the universe, 0 are black ravens, 2 are other ravens;
There are 2 objects in the universe, 1 is a black raven, 0 are other ravens;
There are 2 objects in the universe, 1 is a black raven, 1 is another raven;
There are 2 objects in the universe, 2 are black ravens, 0 are other ravens;
There are 3 objects in the universe, 0 are black ravens, 0 are other ravens;
There are 3 objects in the universe, 0 are black ravens, 1 is another raven;
There are 3 objects in the universe, 0 are black ravens, 2 are other ravens;
There are 3 objects in the universe, 0 are black ravens, 3 are other ravens;
There are 3 objects in the universe, 1 is a black raven, 0 are other ravens;
There are 3 objects in the universe, 1 is a black raven, 1 is another raven;
There are 3 objects in the universe, 1 is a black raven, 2 are other ravens;
There are 3 objects in the universe, 2 are black ravens, 0 are other ravens;
There are 3 objects in the universe, 2 are black ravens, 1 is another raven;
There are 3 objects in the universe, 3 are black ravens, 0 are other ravens;
etc. (to a very large but finite number for the real world)
This is obviously not a Bayesian calculation for the faint-hearted but it is a possible one (and its sum is 1 since one and only one of the hypotheses must be true). However, forgetting the difficult calculation for the moment, one can still see that each new observation rules out all possible worlds (ie hypotheses) in which the universe contains fewer objects than have been seen so far and, in particular, that each observation of a black raven rules out all hypotheses in which fewer black ravens than that are posited. Since each hypothesis has a likelihood associated with it, that means that the overall likelihood for the set of hypotheses where "0 are other ravens" (which is the set that corresponds to the statement, "All Ravens are Black") will change too. It's an interesting question as to exactly how it changes when we see a black raven (or a red apple for that matter). The answer would tell us whether Hempel's premise, "BlackRavenSeen supports AllRavensBlack", is out to lunch or not. -- Derek Ross | Talk 18:28, 7 December 2006 (UTC)
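The calculation is tedious by hand but easy to brute-force. In this Python sketch the cap on universe size and the uniform prior over the enumerated worlds are arbitrary choices, made only so that something concrete can be computed:

MAX_OBJECTS = 40

worlds = [(n, b, o)   # n objects in total, b black ravens, o other ravens
          for n in range(MAX_OBJECTS + 1)
          for b in range(n + 1)
          for o in range(n - b + 1)]
prior = {w: 1 / len(worlds) for w in worlds}

def update_on_black_raven(dist):
    # Likelihood that one uniform draw from the n objects is a black raven.
    like = {(n, b, o): (b / n if n else 0.0) for (n, b, o) in dist}
    z = sum(dist[w] * like[w] for w in dist)
    return {w: dist[w] * like[w] / z for w in dist}

def p_all_ravens_black(dist):
    # "All ravens are black" holds exactly in the worlds with o == 0.
    return sum(p for (n, b, o), p in dist.items() if o == 0)

post = update_on_black_raven(prior)
print(p_all_ravens_black(prior), p_all_ravens_black(post))
# How the figure moves when the black raven is seen depends on the prior
# over worlds, which is precisely the open question raised above.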
Wow. That's a very, very interesting string of logic, there, by both of you. I never even thought about it like that, but after a while, this thought occurred to me: that the premise (seeing a black raven increases the odds for the statement to be true) IS true. It has to be true, simply because there is a finite number of objects in the universe. Every object we observe (whether a black raven or a red apple) that is not a non-black raven increases the probability that the statement is true. I think the statement is valid and true on that reasoning alone. Matt Yeager (Talk?) 01:47, 22 December 2006 (UTC)

I came across this article and was astonished to note that it did not include a discussion of I. J. Good's argument or a citation. It should, since Good's argument shows with a clear counter-example that the assertion that observing an instance (e.g., a black raven) necessarily supports the hypothesis that all ravens are black, is simply false. Thus, no paradox can exist since one of Hempel's crucial premises is false.

I believe that some mention of this should appear in the main article. Without an appropriate mention, the article is unbalanced.

The relevant papers are:

Good, I. J., "The Paradox of Confirmation," Br. J. Phil. Sci. 11, 145-149 (1960)

Good, I. J., "The Paradox of Confirmation (II)," Br. J. Phil. Sci. 12, 63-64 (1961)

Good, I. J., "The White Shoe is a Red Herring," Br. J. Phil. Sci. 17, 322 (1967)

Hempel, C. G., "The White Shoe: No Red Herring," Br. J. Phil. Sci. 18, 239-240 (1967)

Good, I. J., "The White Shoe Qua Red Herring is Pink," Br. J. Phil. Sci. 19, 156-157 (1968)

Good's first (1960) paper contains an error which is corrected in the second. The Hempel paper is a response to Good's third paper, and Good refutes Hempel's argument in the last paper. Hempel imagines that one can argue in the absence of background information and alternative models, but Good shows otherwise. As far as I know, Hempel did not respond further. Bill Jefferys 19:21, 15 August 2007 (UTC)

Derek's observation that you need to enumerate all of the possible worlds is on the right track but not quite correct as he stated it. When fixed, it reveals why Hempel's argument (and also the purported Bayesian calculation in the article) isn't correct.

Derek's idea of enumerating the possible worlds is sound, but he didn't enumerate all of them that are needed. Recall that there have to be some non-black non-ravens in the world, and he didn't list any worlds of that sort. So let i=number of black ravens in a world, j=number of non-black ravens in a world, k=number of black non-ravens in a world and l=number of non-black non-ravens in a world. The possibilities are therefore represented by all quadruples (i,j,k,l) with i,j,k,l integers greater than or equal to zero; we can write the prior probability that the actual world is represented by a particular quadruple by P(i,j,k,l), which is a real number on [0,1]; the numbers so assigned are the prior probabilities of each of these worlds, and they must be assigned so that they add up to 1.

But now we see that observing a non-black non-raven, or even observing a black raven, may decrease, increase, or leave unchanged the posterior probability that we are in a world where all ravens are black. We see this immediately in the simple cases described by I. J. Good in the citations I gave above, as well as the two-world case given at the top of this section. But that's just a particular choice of prior, where the priors on two particular worlds are given positive values and the priors on all other worlds are zero. This is a counter-example to the claim that the article's purported Bayesian calculation makes, as well as a counter-example to the entire Hempel argument. It's clear that whether one's confidence is increased, decreased or remains unchanged depends critically on the assignment of the prior probabilities P(i,j,k,l). Therefore, the problem as stated is simply ill-posed. There is no paradox.
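Good's counter-example can be written directly in this (i,j,k,l) notation. A Python sketch with invented numbers, showing that the very same NBNR observation can push the posterior either way depending on which two worlds carry the prior mass:

def p_all_black_after_nbnr(world_all_black, world_mixed, prior_a=0.5):
    # Posterior P(all ravens black) after one uniform draw is an NBNR,
    # when only these two worlds (i, j, k, l) have non-zero prior.
    def like(w):
        i, j, k, l = w              # black ravens, non-black ravens,
        return l / (i + j + k + l)  # black non-ravens, NBNRs
    a = prior_a * like(world_all_black)
    b = (1 - prior_a) * like(world_mixed)
    return a / (a + b)

# With one pair of worlds the NBNR observation raises the posterior...
print(p_all_black_after_nbnr((10, 0, 10, 980), (10, 10, 10, 970)))  # ~0.503
# ...and with another pair it lowers it.
print(p_all_black_after_nbnr((10, 0, 10, 80), (5, 5, 10, 980)))     # ~0.449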

The point really points up two critical mistakes that I have seen frequently in the philosophy of science literature when alleged Bayesian arguments have been made. These are points made trenchantly and convincingly in Jaynes's book (cited above). Mistake number 1: You cannot make a correct Bayesian calculation without enumerating all the alternative hypotheses and assigning a prior on them. Here, the alternatives are represented by the mutually exclusive and exhaustive quadruples (i,j,k,l). Mistake number 2: Being vague about what background information is being used. The discussion of Bayesian resolutions in the article is vague about the background information. As we have seen, the background information includes the priors assigned to all the various possible worlds. Without this background information, no conclusions can be reached.

The Bayesian part of the article is wrong and needs to be rewritten. Bill Jefferys (talk) 17:39, 10 January 2008 (UTC)

(Guilty as charged, <grin>. I didn't explicitly list non-black, non-ravens, (although I did implicitly list non-ravens). Perhaps wrongly, I didn't think that it was necessary to go to that level of detail for the point I was trying to make. But for the point that you are making I fully agree with your approach in extending to a quadruple. -- Derek Ross | Talk 18:41, 3 April 2008 (UTC))
Bill, your explanation of how to apply Bayes's theorem to this is correct, but even so, a paradox does remain. It's called a paradox because the application here of logical reasoning, including Bayesian inference, leads to a counterintuitive conclusion with no clear reason why it's reasonable. In your formulation, you agree it's possible that selection of certain priors (estimates about the frequency of possible worlds) can influence whether "observation of a non-black non-raven" should make one alter his estimate of the probability he assigns to "all ravens are black". It is this fact that makes it a paradox: Why, we wonder, should an observation of a red apple, have any influence whatsoever, on the probability of the claim "all ravens are black"? What does apple color have to do with raven color? So, AFAICT, the paradox is unresolved until you can repair the intuition that leads to rejecting the raven/apple color connection, OR you show a relevant way in which "all ravens are black" is unconnected to its contrapositive. MrVoluntarist (talk) 21:56, 2 April 2008 (UTC)

I don't agree. The problem is ill-posed, as Good's example shows, so there cannot be a paradox. Bill Jefferys (talk) 18:04, 3 April 2008 (UTC)

I don't know which "problem" you refer to as being posed. The paradox is precisely that a reasonable-sounding line of argumentation leads to an unintuitive and apparently wrong conclusion. A resolution of that paradox requires showing where the argument went astray. — DAGwyn (talk) 23:53, 2 May 2008 (UTC)

Theory vs. Statement

Can somebody change the article to replace the word "theory" in the opening paragraphs. It is misleading, and it connotes something different for me as a reader. A better word might be "statement." 71.136.50.128 22:47, 21 December 2006 (UTC)


The theorem is sound; there really is no paradox. (Let's dismiss the nitpicking about potential white ravens; we are looking at the inductive logic.)

First, you must use hierarchical relationships carefully in your example (Mammals being the superset, and humans, ravens, crows or cows being the subset). In the intended usage of Bayes' Theorem or the paradox statement, the events should be de-coupled (logically independent variables) or properly ordered sets that are being analyzed for dependency or correlation. The difference here is that nowhere is it stated that all things black are ravens (equivalence). If we say all Ravens are Black, then we assign Ravens as a subset of all Black-things, and the reverse (all Black things are Ravens) is not implied. So when stating the inverted logic (All Non-Black things are Non-Ravens) you must place the larger superset first, because in the original statement (All Ravens are Black) we proclaim the smaller set as a member of the larger set, not an equivalence (NOT Black = Raven NOR Non-Raven = Non-Black). This is just the syntax of the English language.

Second, the set of all ravens is very small compared to the set of all non-ravens; also the set of all black things is small compared to the set of all non-black things (regardless of the actual physical frequencies/occurrences in nature, we have many different non-black colors and many different non-raven things defined, regardless of observing them). Thus, seeing a Red Apple should increase the belief in both 'All Non-Black Things are Non-Ravens' and 'All Ravens are Black' in proportion to the set of all Non-Black AND Non-Raven things (a very small proportion), because it was a non-black, non-raven thing observed. Seeing a Black Raven yields a greater belief increase in both statements because the observation is in proportion to Black things and Raven things (much smaller sets), and is thus a much larger proportion.

It would be interesting to see an evaluation of what effect observing a Black Apple has on the probabilities involved....

Of course, observing a White Raven would cause the entire function to collapse in a true-false sense, but if the record of observables were stated as a probability or percentage and fed back into the original statement (99.9% of all Ravens are Black) through several observations and iterations, it would still hold that 99.9% of all Non-Black things are Non-Raven Things; even though these sets are larger, so are the number of observables that satisfy the logic. The probability that a Raven is White is then very slight, much smaller than the remaining 0.1%, because it is in proportion to Non-Black things (a huge set) and there are still other color possibilities (if Ravens with dyed feathers are accounted for to fill in that remaining 0.1%).

15.251.169.70 21:31, 9 January 2007 (UTC)chayden@hp.com

Confusion over the paradox

This is NOT A PARADOX, for the following reason:

Statement A: "All ravens are black."
Statement B: "Everything that is not black is not a raven."

The argument is correct in saying that A and B are logically equivalent. However, the argument is WRONG in saying that observation of a red apple proves statement B.

Observation of a red apple actually proves "SOMETHING that is not black is not a raven". Observation of many red apples proves that "MANY THINGS that are not black are not ravens". In order to prove statement B, we would have to observe EVERY SINGLE NON-BLACK OBJECT in the universe. Then we would be able to say that there are no non-black ravens in existence (because we looked at ALL non-black objects and none of them were ravens), therefore, if any ravens exist, they must be black.

For the purposes of the paradox, it doesn't matter whether Statement A is actually true in nature. We don't need to know whether it is true. We simply need to know that A and B are logically equivalent. As I have discussed, IF we can prove Statement B beyond the shadow of a doubt, then we have also proven Statement A -- and vice versa, and for disproving as well.

No paradox at all. False alarm. INTARWEB LIES 2 U!

"The origin of the paradox lies in the fact that the statements "all Ravens are black" and "all non-black things are non-ravens" are indeed equivalent, while the act of finding a black raven is not at all equivalent to finding a non-black non-raven. Confusion is common when these two notions are thought to be identical." --Gwern (contribs) 18:22 23 February 2007 (GMT)
Certainly they are equivalent, though. Finding a black raven in no way proves that all ravens are black. Finding a non-black non-raven likewise in no way proves that all non-black objects are not ravens. Seems perfectly equivalent to me. Jarandhel (talk) 17:24, 13 April 2008 (UTC)
They are not equivalent as observations; Gwern is correct. The statements "all ravens are black" and "all non-black things are non-ravens" are statements that have a universal quantifier; that is, they are referring to the set of ravens, the set of non-black things. On the other hand, the observation of a black raven, or of a non-black non-raven, is a unique singular observation, not a statement about a universal property of some set. As observations they not only have different impacts upon our beliefs, but can even have contrary impacts. Read the articles by I. J. Good described elsewhere on this page. Bill Jefferys (talk) 17:52, 13 April 2008 (UTC)

It's just missing one statement

Can you just say:

1. Ravens are black
2. All non-black things are non-ravens
3. BUT, NOT ALL BLACK THINGS ARE RAVENS EITHER

because then things like coal would, by that logic, be ravens

190.44.226.122 18:54, 7 April 2007 (UTC)

I'm not exactly sure what you're saying here, but maybe it will help to note that the following argument is invalid:
1. All ravens are black.
2. All non-black things are non-ravens.
----
All black things are ravens.
Simões (talk/contribs) 19:43, 7 April 2007 (UTC)

Huh? #1 and #2 are entirely equivalent. So erase #2 and any resemblance to a syllogism vanishes. Bill Jefferys 01:09, 30 August 2007 (UTC)

Doesn't seem like a paradox

Seems intuitive enough to me. I think the article could use a little more explanation as to why people might find this counterintuitive. Awinkle

Poem?

The paraphrase of the Burgess poem seems to me completely irrelevant, and besides that is unsourced commentary. I am removing it; if you see a reason it should be there, feel free to say why and revert it. bolddeciever

Rewrite

I heavily edited the page.

There were several serious errors:

1) The Bayesian solution is very popular, but the assertion that it has solved the ravens paradox is extremely contentious and doesn't belong in an encyclopedia.

2) This strange 'principle of induction' is not very helpful in this context. Nicod's criterion (which contradicts this principle of induction) is the traditional way of introducing the ravens paradox. But this is also needlessly complex.

3) There were some silly assertions about science and induction.


I've tried my best to correct these and others. Many things still need to be improved:

1) We need to add the full Bayesian proof that discovering an item from a small group confirms better. Trouble is, the proof is moderately long, obvious and dull. I couldn't think of a way to get it in without dragging the whole article down to boring levels. Likewise with the proof that the Bayesians need P(random object is black) to be unaffected by learning 'all ravens are black'. Hopefully, the references will do for now.

2) Exploring the alternative options: Lipton's and perhaps Goodman's need more work.

3) We need to solve the damned paradox!

Ibn Sina —Preceding unsigned comment added by 81.101.136.31 (talk) 16:08, 7 September 2007 (UTC)

Good work. The article flows much better now. Thanks. -- Derek Ross | Talk 22:08, 7 September 2007 (UTC)

Contrapositive of #1

‘All ravens are black’ is logically equivalent to its contra-positive ‘All non-black things are non-ravens.’

This is not correct. The contrapositive of 'All things with the property "raven" are things with the property "black".' is "Any thing without the property "black" cannot have the property "raven".' The negation of "All A" is "Any not-A", not "All not-A". -RobKinyon 20:05, 10 September 2007

Yes it is. Both 'all Fs' and 'any F' are used to make general claims about Fs in English. 'All non-Fs are non-Gs' and 'Any non-F is a non-G' are logically equivalent. Both will be rendered as ∀x(~Fx → ~Gx) in predicate logic. 82.35.83.52 23:02, 10 September 2007 (UTC)
User:82.35.83.52 is correct. -- Derek Ross | Talk 23:43, 10 September 2007 (UTC)

Confirmation

What does it mean to "confirm" something? According to this article it would seem a person can be confirmed dead by finding someone who isn't them to be alive, or that first finding a featured article on Wikipedia means all articles are featured articles. Unless of course I don't know what confirm means. --WikiSlasher 08:08, 11 September 2007 (UTC)

It means to check whether something is true or false. For example, imagine that you are given a sealed box and a note stating that "this sealed box contains a rock". You can confirm the statement on the note by opening the box. If it really does contain a rock, you will have confirmed that the statement on the note is true. If it does not contain a rock, you will have confirmed that the statement is false.
A person can't be confirmed dead by finding someone who isn't them to be alive but they can be confirmed dead by finding that everyone who is alive isn't them. For instance if I have been told that Jim Jones (born in Auchtermuchty in 1973) is dead, I can either ask for him to be brought to me and take a good look to confirm that he is dead or I can ask for all the "non-dead" people (ie the living people) to be brought to me and take a good look to confirm that all of them are not Jim Jones (born in Auchtermuchty in 1973). Either way will let me confirm whether Jim Jones is dead but the second way is about six billion times more work. The first method is like checking whether all ravens are black things; the second method is like checking whether all non-black things are non-ravens. The thing is that Jim Jones is only confirmed dead once you have looked at all the living people without finding him. Looking at one or two of the living tells you next-to-nothing.
Likewise you can confirm the statement "all articles on Wikipedia are featured articles" either by looking at all the articles on Wikipedia to see whether they are featured (this will confirm that the statement is untrue); or you can look at all the non-featured items (including images, talk pages, user pages, policy pages, etc., etc.) to confirm that they are all non-articles (this will also confirm that the statement is untrue). So both methods will give the same answer. But in both cases, if the statement is true you have to look at all of the relevant items in the group that you are checking before you can confirm the statement true. If the statement is false (as it actually is at the moment) you will be able to confirm it false as soon as you find an article-which-is-not-featured in the first case or an unfeatured-item-which-is-not-a-non-article in the second.
Hope this makes things a bit clearer. -- Derek Ross | Talk 15:43, 11 September 2007 (UTC)
It was "For example, ‘all ravens are black’ is confirmed by discovering my white shoes, or any other object which is neither a raven nor black" in an earlier revision of the page that confused me, as well as "Instance Confirmation: positive instances of a hypothesis confirm that hypothesis. That is, for any properties A and B, discovering an object which is both A and B confirms ‘All As are Bs’." The article makes more sense to me now. --WikiSlasher 12:05, 17 September 2007 (UTC)
Aah, right. I see. Well, the examples I gave were all-or-nothing confirmations (where you know for sure after you've checked) whereas that type of confirmation is a partial confirmation where you suspect more strongly that something is true or false as a result of a particular piece of evidence without knowing for sure. I'm glad the article is making a bit more sense to you now though. Cheers -- Derek Ross | Talk 14:51, 17 September 2007 (UTC)

Induction

I think that this article needs some rewriting. It was not apparent to me until late in the article that the paradox involved inductive rather than deductive reasoning, and hence that "confirms" is being used in the sense of "provides an additional piece of evidence for." Looking at previous comments, I see at least two that seem to involve the same misapprehension. Apologies: I'd normally do it myself, but I've got a lot on my plate right now and I can't. If no-one else does it, I'll try to get back to it at some point, but I hope someone else can take it on, as it's also not my area of expertise. Eggsyntax 20:23, 11 September 2007 (UTC)

[edit] Retarded

"All Ravens are black" has nothing to do with the color of non-Ravens. I don't get it.--Loodog (talk) 14:32, 6 December 2007 (UTC)

Notice the stipulation "in strictly logical terms": if you were to express the statement "All ravens are black" that way, it would be: for every x, if X then Y, with X = "x is a raven" and Y = "x is black". With this you can perform modus tollens and get a relationship between color and non-ravens (namely that all non-black things are also non-ravens).--droptone (talk) 13:42, 7 December 2007 (UTC)
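The equivalence droptone appeals to can be checked exhaustively by tabulating both conditionals over all truth-value combinations (reading "if ... then" as material implication); a minimal sketch:

    # Material implication: "if p then q" is false only when p is true and q false.
    def implies(p, q):
        return (not p) or q

    # For every combination of "x is a raven" and "x is black", the conditional
    # and its contrapositive have the same truth value.
    for raven in (True, False):
        for black in (True, False):
            assert implies(raven, black) == implies(not black, not raven)

Note that this establishes only the classical equivalence; whether that reading of "if ... then" is the right one is exactly what is disputed further down this page.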

[edit] Question

Question: the first statement, that "in all circumstances where (2) is true, (1) is also true", can't be correct if there are no black objects, because if (2) is true, all this means is that all black objects are ravens; if there is nothing black then there are no ravens? —Preceding unsigned comment added by 78.145.44.244 (talk) 22:34, 17 January 2008 (UTC)

Let's change that to a slightly different statement where there definitely are no examples: unicorns. If I say that (1) "All unicorns are magical", that is the same thing as saying (2) "All non-magical objects are not unicorns". The fact that no object in the universe is magical implies that there are no unicorns. Likewise if there were no black objects in the universe that would imply that there were no ravens. Does that make it clearer ? -- Derek Ross | Talk 15:08, 28 March 2008 (UTC)
Further thought about your question leads me to the idea that when there are no ravens, both of the statements "All ravens are black" and "All ravens are not black" are true; when there is one raven, only one of them is true; and when there are two or more ravens, one of the statements may be true, or neither of them may be true. Interesting. -- Derek Ross | Talk 06:09, 8 April 2008 (UTC)
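That zero-raven case is the standard "vacuous truth" behaviour of universal statements, and it is easy to check mechanically; a minimal sketch, with the lists of raven colours invented for illustration:

    # With no ravens at all, a universally quantified claim about ravens is
    # vacuously true -- all() over an empty sequence returns True.
    ravens = []  # a universe containing no ravens
    assert all(colour == "black" for colour in ravens)   # "All ravens are black"
    assert all(colour != "black" for colour in ravens)   # "All ravens are not black"

    # With one raven, exactly one of the two claims holds:
    ravens = ["black"]
    assert all(c == "black" for c in ravens) != all(c != "black" for c in ravens)

    # With two ravens of different colours, neither claim holds:
    ravens = ["black", "white"]
    assert not all(c == "black" for c in ravens)
    assert not all(c != "black" for c in ravens)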

[edit] Recently Proposed Solution

A recently proposed solution goes like this:

The act, D, of attributing the predicate "black" to the subject "all ravens" is not equivalent to the act, E, of attributing the predicate "non-raven" to "all non-black things".

The two acts have different effects on the probabilities that "Socrates is a raven" and "Socrates is black":

P(Socrates is a raven|D) = P(Socrates is a raven)

P(Socrates is black|D) = P(Socrates is black OR Socrates is a raven)

while

P(Socrates is not a raven|E) = P(Socrates is not a raven OR Socrates is not black)

P(Socrates is not black|E) = P(Socrates is not black)

which are equivalent to:

P(Socrates is a raven|E) = P(Socrates is a raven AND Socrates is black)

P(Socrates is black|E) = P(Socrates is black)

In short, if we attribute a predicate to a subject then we do not change our estimate of how many of the subjects there are, but we increase our estimate of how many things have the predicate. If we decide to believe that "The students in the class are lazy", then we increase our estimate of how many lazy things there are but it does not affect our estimate of how many students there are in the class.
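These update rules can be checked mechanically on a toy joint distribution over the four possibilities (raven or not, black or not). The numbers below are invented purely for illustration, and the two update functions are my rendering of the acts D and E described above:

    # Joint distribution over (is_raven, is_black) for Socrates. Invented numbers.
    P = {(True, True): 0.10, (True, False): 0.05,
         (False, True): 0.25, (False, False): 0.60}

    def marginal(dist, event):
        return sum(p for outcome, p in dist.items() if event(outcome))

    def update_D(dist):
        """D: attribute "black" to "all ravens" -- move the raven-and-non-black
        mass onto raven-and-black, leaving P(raven) unchanged."""
        Q = dict(dist)
        Q[(True, True)] += Q[(True, False)]
        Q[(True, False)] = 0.0
        return Q

    def update_E(dist):
        """E: attribute "non-raven" to "all non-black things" -- move the same
        mass onto non-raven-and-non-black, leaving P(non-black) unchanged."""
        Q = dict(dist)
        Q[(False, False)] += Q[(True, False)]
        Q[(True, False)] = 0.0
        return Q

    raven = lambda o: o[0]
    black = lambda o: o[1]
    D, E = update_D(P), update_E(P)

    # The four identities claimed above:
    assert abs(marginal(D, raven) - marginal(P, raven)) < 1e-12
    assert abs(marginal(D, black) - marginal(P, lambda o: black(o) or raven(o))) < 1e-12
    assert abs(marginal(E, black) - marginal(P, black)) < 1e-12
    assert abs(marginal(E, raven) - marginal(P, lambda o: raven(o) and black(o))) < 1e-12

Both updates zero out the non-black-raven mass, but they relocate it differently, which is the asymmetry being claimed.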

This subject-predicate asymmetry is mirrored by an asymmetry between the antecedent and the consequent in if-then judgments; if a person decides "If the weather is good tomorrow then I will go outside" he does not change the probability that the weather will be good but he increases the probability that he will go outside, setting it equal to the old probability that "The weather will be good tomorrow OR I will go outside tomorrow".

The asymmetry only appears when you consider the /change/ in probabilities that occurs when a decision is made to believe a proposition. When we decide to believe that all the ravens are black, we use induction to attribute the predicate "black" to "all ravens", so it is the predicate "black" which is extended to cover more things than it had previously been thought to cover. Extending "non-raven" to "all non-black things" would be an entirely different action, since afterwards more things will be thought to be non-ravens than before, while your estimate of the number of non-black things is unchanged.

The different actions require different evidence - if enough evidence supporting D: "All ravens are black" accumulates, we will make decision D. We will go from "All these ravens are black" to "All ravens are black", making the changes to the probabilities shown above for D, which are different to the changes caused by E. In order to take action E: "All non-black things are non-ravens", we would need to start with "These non-black things are non-ravens", so we would need to begin with examples of things like white shoes.

Mathematically, we can calculate the weight of evidence provided by the proposition that "Socrates is black" in favour of the proposition that "All ravens are black" when it is assumed that "Socrates is a raven". This is positive:

e("Socrates is black" -> D) = log ( P(Socrates is black|D) / P(Socrates is black|not D) ) = log ( P(Socrates is black OR Socrates is a raven) / P(Socrates is black|not D) ) > log ( P(Socrates is black) / P(Socrates is black|not D) ) >= log ( P(Socrates is black) / P(Socrates is black) ) = 0

so e("Socrates is black" -> D) > 0

"Weight of evidence" has been used in Good's sense.

When we calculate the weight of evidence provided by the proposition that "Socrates is black" in favour of the proposition that "All non-black things are non-ravens", we find that it is *exactly* zero:

e("Socrates is black" -> E) = log ( P(Socrates is black|E) / P(Socrates is black|not E) )

But P(Socrates is black|E) = P(Socrates is black) [ see above ], which by the law of total probability implies P(Socrates is black|not E) = P(Socrates is black)

So e("Socrates is black" -> E) = log ( P(Socrates is black) / P(Socrates is black) ) =0

So discovering that "Socrates is black" when it is known that "Socrates is a raven" provides exactly zero evidence supporting the inductive action of extending the predicate "Non-raven" to "All non-black things".

Likewise, finding that a shoe is white or that a white thing is a shoe provides exactly zero evidence supporting the inductive action of extending the predicate "black" to "all ravens".
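The two calculations above can be reproduced numerically from the toy joint distribution in the earlier sketch, under one simplifying assumption of mine: that declining the inductive act leaves the prior unchanged, i.e. P(Socrates is black|not D) = P(Socrates is black|not E) = P(Socrates is black), which is the special case of the inequality used in the derivation:

    import math

    # Toy joint over (is_raven, is_black) for Socrates; invented numbers.
    P = {(True, True): 0.10, (True, False): 0.05,
         (False, True): 0.25, (False, False): 0.60}

    p_black = P[(True, True)] + P[(False, True)]      # P(Socrates is black)
    p_black_or_raven = p_black + P[(True, False)]     # P(black OR raven)

    # Per the update rules: P(black|D) = P(black OR raven), P(black|E) = P(black).
    # Assumption (mine): P(black|not D) = P(black|not E) = P(black).
    e_D = math.log(p_black_or_raven / p_black)   # weight of evidence for D
    e_E = math.log(p_black / p_black)            # weight of evidence for E

    assert e_D > 0 and e_E == 0.0

On this reading, learning that Socrates is black supports the act D but gives exactly zero support to E.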

There's a (moderated) discussion of this solution going on at sci.math.research with the title "The paradox of confirmation" and a more detailed description of the solution is at http://arxiv.org/abs/0712.4402 and http://philsci-archive.pitt.edu/archive/00003932/ .

I think it's the unique correct solution, but I'm the author so I'm biased. I think it's worth editing the wikipedia entry if nobody can find a flaw.

--latexallergy (talk) 01:22, 31 March 2008 (UTC)

One of the major problems I see in the above argument is that it unnecessarily introduces additional terminology "attributes a predicate" and "evidence provided by a proposition", which require discussion and clarification. The "how many things" idea needs to be investigated further. It's also confusing to use the concrete "Socrates" when a variable should be used instead. I think that by the time you plug all the holes in the argument, to the extent it is correct it is also identical to the Bayesian argument already explained (more clearly, I think) in the article. — DAGwyn (talk) 20:20, 31 March 2008 (UTC)
I think it may not be possible to understand induction without understanding evidence and predicates. I agree that it would be preferable to leave Socrates out of the discussion, but if X is a variable then "X is a raven" is not a proposition and can't be assigned a probability, so it is not possible to talk about the change in the probability that X is a raven when the inductive step is made.
There are some rather important differences between my solution and the "Bayesian" one:
a. The Bayesian solution accepts the equivalence between "All ravens are black" and "All non-black things are non-ravens"; my solution does not accept that equivalence and points out the difference between them.
b. The Bayesian solution accepts the conclusion of the paradox - that seeing a white shoe does indeed support the conclusion that all ravens are black. My solution does not accept this conclusion, and my solution calculates the amount of evidence provided by the observation of a white shoe in favour of the inductive act of attributing the predicate "black" to "all ravens" and shows it to be exactly zero.
c. The Bayesian solution involves purely *deductive* reasoning. The Bayesian solution involves no generalization. According to the Bayesian way of thinking, I can observe that a million ravens are black and it does not affect the probability that the remaining ravens are black. According to the Bayesian way of thinking, the proposition that "All ravens are black" becomes more probable only because some possible counterexamples have been ruled out and the remaining possible counterexamples are no less probable. In the Bayesian way of thinking, there is no way to get from "These ravens are black" to "The other ravens are black". My solution involves *inductive* reasoning; we go from "These ravens are black" to "All ravens are black" in *one* step, called induction, which involves taking the predicate "Black", which has been observed of *some* ravens and predicating "black" of *all* ravens. Induction involves a *risk of error*; after making the inductive step we think that all ravens are black and we might be wrong. In the Bayesian way of thinking there is no risk of error.
The raven paradox is clearly about inductive reasoning, not deductive reasoning. The Bayesian solution is inadequate for this reason. My solution is not.
It might help to compare it to Goldbach's conjecture. Many people believe that all even numbers greater than two are the sum of two primes, because all of the even numbers greater than two which have been checked so far have turned out to be expressible as the sum of two primes. These people might be wrong, but they have made the inductive step of going from "These even numbers greater than two are the sum of two primes" to "All even numbers greater than two are the sum of two primes", and they did not use Bayesian reasoning to do it. Euler, for example, said "That every even number is a sum of two primes, I consider an entirely certain theorem in spite of that I am not able to demonstrate it." Bayesian reasoning can never get you to certainty; all of the infinite number of possible counterexamples remain possible counterexamples. Bayesian reasoning says "I have seen that the first N even numbers greater than two are expressible as the sum of two primes, and that tells me nothing about the remaining numbers." Inductive reasoning says "I have seen that the first N even numbers greater than two are expressible as the sum of two primes, and now I think that the remaining ones are also expressible that way."
I think that in the light of these differences it is not plausible that my solution is the same as the Bayesian one. Latexallergy (talk) 22:46, 31 March 2008 (UTC)
Just to clarify what I mean by evidence and predicate (although these concepts were not introduced by me):
Evidence: Suppose that a coin is biased so that it lands on one side twice as often as it lands on the other. We can try to find out which side the bias favours by tossing the coin repeatedly and keeping track of how many times it lands on each side. After we have observed N occurrences of heads and M occurrences of tails, the weight of evidence, measured in bits, in favour of the proposition that the bias favours heads is E=N-M. For each coin toss, there is a corresponding proposition which says that the result of the toss is heads. When we observe the result of the coin toss, we discover that this proposition is true. We then increment E by a certain amount. This amount is called the amount (or weight) of evidence provided by the proposition, H, that the result of the coin toss is heads in favour of the proposition, B, that the bias of the coin is in favour of heads. It coincides with the amount by which the logarithm of the odds of B changes when H is given:
log( P(B|H) / P(not B|H) ) = log( P(B) / P(not B) ) + log( P(H|B) / P(H|not B) )
Hence the quantity log( P(H|B) / P(H|not B) ) is given the title "The weight of evidence provided by H in favour of B".
See the weight of evidence article or read Good's book "Probability and the Weighing of Evidence" or his paper at http://links.jstor.org/sici?sici=0035-9246%281960%2922%3A2%3C319%3AWOECEP%3E2.0.CO%3B2-W or read chapter 6 of Jaynes' book "Probability Theory: The Logic of Science" for more information about weight of evidence.
A sensible person will not require the coin to be tossed an infinite number of times before accepting that the bias is in favour of heads. If the coin has been tossed three million times and two million results were heads and one million were tails, then most people would begin to operate under the assumption that the bias is in favour of heads. The amount of evidence provided has exceeded a threshold beyond which they are willing to accept the risk of being wrong because it is so small. When they decide to accept this risk, they attribute the predicate "biased in favour of heads" to the coin. This is comparable to attributing the predicate "black" to "all ravens" when enough evidence has accumulated, which also carries a risk of error. —Preceding unsigned comment added by Latexallergy (talkcontribs) 01:59, 1 April 2008 (UTC)
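A minimal sketch of that bookkeeping, assuming the 2:1 bias described above (taking logarithms to base 2 so the evidence comes out in bits):

    import math

    # Under "bias favours heads", P(heads) = 2/3; under "bias favours tails", 1/3.
    def evidence_bits(results):
        """Total weight of evidence, in bits, for "the bias favours heads"."""
        e = 0.0
        for r in results:
            p_given_heads_bias = 2/3 if r == "H" else 1/3
            p_given_tails_bias = 1/3 if r == "H" else 2/3
            e += math.log2(p_given_heads_bias / p_given_tails_bias)
        return e

    # Each head contributes +1 bit and each tail -1 bit, so E = N - M:
    results = ["H"] * 7 + ["T"] * 4   # an invented run: N=7 heads, M=4 tails
    assert abs(evidence_bits(results) - (7 - 4)) < 1e-9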
Predicates: In the sentence "John is taller than Mary", "John" is called the subject and "taller than Mary" is called the predicate. The subject is the thing about which the sentence says something. The predicate is the thing which is said of the subject.
When we attribute a predicate to a subject, we come to believe that the subject has the predicate. If we do not already believe that John is taller than Mary and we then attribute the predicate "taller than Mary" to "John", then afterwards we will believe that John is taller than Mary. In doing this, we change what is believed *about the subject*.
This is why attributing the predicate "taller than Mary" to "John" is very different to attributing the predicate "shorter than John" to "Mary". If we attribute "taller than Mary" to "John" then we are not changing what we think about Mary, but we are increasing our estimate of John's height. If we attribute the predicate "shorter than John" to "Mary" then we are decreasing our estimate of Mary's height.
This asymmetry between the subject and the predicate is why attributing the predicate "black" to the subject "all ravens" has a different effect to attributing the predicate "non-raven" to "all non-black things". My solution is the only one which recognizes this asymmetry. Recognizing this asymmetry resolves the paradox. Latexallergy (talk) 00:39, 1 April 2008 (UTC)
The image at: http://www.physics.rutgers.edu/~oflan/ravens.png might make what I am saying obvious. Latexallergy (talk) 05:51, 1 April 2008 (UTC)
I have several disagreements with the above. When I said that once the argument is sufficiently patched up it becomes the Bayesian argument, I was also implying that to the extent that it differs from the Bayesian argument it has failed to take something(s) into account. If you draw a diagram of a simple discrete set of ravens and apples, some of each being red and some being black (preferably with different numbers for each pairing), which represents the initial background domain of existents (usually denoted by "U"), and "Socrates" denotes a uniform random specific choice from the available domain, you can see immediately that some of the formulae are wrong. P(Socrates is a raven|D) is larger than P(Socrates is a raven), because in the former case the domain (U&D) is smaller than U; another way of putting this is that the two uses of "Socrates" refer to draws from different domains, since in the |D case Socrates definitely cannot be a red raven whereas in the latter case he can. Equating those two uses of the same proper noun is an error.
Assuming excluded middle, there actually is a strict logical equivalence between "raven implies black" and "non-black implies non-raven". That's not Bayesian; it's just simple set-based logic and can easily be seen from truth tables. So whatever underlies the paradox, it's not inherent in those phrases by themselves.
What the Bayesian argument "accepts" is merely its conclusion, so if the conclusion differs from that of another argument (and both arguments use the same input) then at least one of the arguments is wrong.
You misstated a result of the Bayesian argument. Observing each black raven does increase one's a posteriori assessment of the likelihood of the hypothesis (that all ravens are black), which can be applied to predict the outcome of the next observation. This is explained in the article (in text I added a few days ago). Each likelihood estimate of course involves a recognition that the hypothesis has a chance of being wrong. (It also generalizes nicely to fuzzy inference, but we don't need to digress into that here.)
To actually perform the likelihood computations, one would need some reasonable estimate of how many things exist (and for some purposes, an estimate of what fraction of all things are ravens). However, even without specific numerical values the Bayesian approach explains inductive reasoning and resolves the paradox, in terms of a deductive mechanism that shows how it works (when correctly applied), rather than considering it to be a completely separate form of reasoning.
There are examples of mathematical "theorems" about which many were "certain" that turned out not to be true (e.g. a counterexample was eventually discovered). One could reasonably question whether it is a flaw for a methodology to be unable to produce that flavor of "certainty". Also note that many philosophers (especially Kant) considered that mathematical truths were a different kind of knowledge than knowledge about the empirical world, on the grounds that certainty was possible in the former case but not in the latter. Empirical inductive reasoning does not seem to be valid to apply to truth values for unproven mathematical results. Further, a problem with application to numbers is that their domain is infinite, unlike tangible objects in the real world. If you applied Bayesian analysis (ignoring that the very notion of "probability" of truth is problematic for mathematics), the observations can still totally disprove a hypothesis via an observed counterexample, but do not alter one's a priori assessment of the truth of a hypothesis via an observed noncontradictory example, due to the infinite domain involved. As an example, I have a short computer program that for each nonnegative integer input produces a single nonnegative integer output. You can probe it by supplying various inputs, and I would be willing to bet that you see nothing but 0 outputs. Yet, the hypothesis "for any nonnegative integer input, this program outputs 0" would be false, and I could easily prove that it is false. There are an infinitude of such examples available, several of them being of greater mathematical interest than my program.
I suspect that I've read more of Good's writings than you have, over a longer period of time. The Bayesian approach is completely compatible with "weight of evidence", and indeed Good is widely considered to be a "Bayesian". — DAGwyn (talk) 18:58, 1 April 2008 (UTC)


Thanks for your analysis. Let me try to explain what you've missed. The problem, as usual, is too much set theory and not enough logic.

Logic is not based on set theory; it must be known before the set theoretic axioms can be dealt with. One can talk about propositions and implication and even about probability without ever using sets. The rules P(A or B)=P(A)+P(B)-P(A and B), P(A and B)+P(A and not B)=P(A) and 0<=P(A)<=1 relate propositions to probabilities without any set theoretic axioms.

You're well aware that material implication is not an adequate way of representing natural language conditionals, such as:

T:If the weather is good tomorrow then I will go outside

which is a sentence of the form "If A then B", with A being "The weather will be good tomorrow" and B being "I will go outside tomorrow".

I doubt that you will attempt to argue that P(A|T) differs from P(A). A person who makes the decision T does not affect the weather. He increases the probability that he will go outside tomorrow, though: P(B|T)>P(B). Please resist the urge to interpret T, A and B as sets when they are propositions. Thinking about set theory will only cause confusion.

The question is: How do the probabilities of propositions change when the decision T is made?

The answer is: P(A|T)=P(A), P(B|T)=P(B or A), P(AB|T)=P(A|T)=P(A).

Note that the contrapositive (S, say), "If I don't go outside tomorrow then the weather won't be good", has a different meaning when interpreted in the way that natural language is usually interpreted. It suggests that going outside affects the weather. To a person who does not know enough about the words "weather" and "go outside", or to a person who believes in rain dances, the following would be the natural assignment of probabilities:

P(I won't go outside tomorrow|S)=P(I won't go outside tomorrow)
P(The weather won't be good tomorrow|S)=P(The weather won't be good tomorrow OR I won't go outside tomorrow)

which are equivalent to:

P(I will go outside tomorrow|S)=P(I will go outside tomorrow)
P(The weather will be good tomorrow|S)=P(The weather will be good tomorrow AND I will go outside tomorrow)
or P(B|S)=P(B) and P(A|S)=P(AB)

These changes are exactly what the sentence: "If I don't go outside tomorrow then the weather won't be good" suggests. It suggests that "I will go outside tomorrow" is to be thought of as a *necessary* condition of good weather, while the sentence "If the weather is good tomorrow then I will go outside" suggests that good weather is (after the sentence has been understood) to be considered a *sufficient* condition of going outside.

Now S and T have different effects on the probabilities of A and B, despite the fact that they are contrapositives.

Neither of these is material implication, "B or not A". When "B or not A" is given, the probability of A decreases and the probability of B increases:

P(not A|B or not A)=P(not A and "B or not A")/P(B or not A)=P(not A)/P(B or not A) > P(not A)
P(B|B or not A)=P(B)/P(B or not A) > P(B)

That is, material implication is not able to accurately capture the changes in probabilities which occur when sentences of the form "If A then B" are understood.

If we substitute material implication willy-nilly, then we would come to the conclusion that when the sentence "If the weather is good tomorrow then I will go outside" is first understood, the probability that the weather will be good decreases. This is obviously false.

So, when you said that set-based logic and truth tables constrain us so that we must conclude that "If A then B" and "If not B then not A" are exactly the same, you are correct insofar as we substitute material implication willy-nilly. However, doing this leads to error; the sentence "If the weather is good tomorrow then I will go outside" is clearly not equivalent to "I will go outside tomorrow or the weather will not be good" since they have different effects on the probabilities of propositions.
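The numerical contrast is easy to verify; a minimal sketch using an invented joint distribution over A ("good weather") and B ("go outside"):

    # Invented joint over (A, B); no prior connection between weather and going out.
    P = {(True, True): 0.2, (True, False): 0.3,
         (False, True): 0.2, (False, False): 0.3}

    def prob(event):
        return sum(p for o, p in P.items() if event(o))

    A = lambda o: o[0]
    B = lambda o: o[1]
    M = lambda o: B(o) or not A(o)   # material implication "B or not A"

    # Ordinary conditioning on the material conditional lowers P(A) and raises P(B):
    p_A_given_M = prob(lambda o: A(o) and M(o)) / prob(M)
    p_B_given_M = prob(lambda o: B(o) and M(o)) / prob(M)
    assert p_A_given_M < prob(A) and p_B_given_M > prob(B)

By contrast, the T-update described above would leave P(A) at exactly its prior value while raising P(B) to P(B or A); that difference is the whole point of the objection.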

If you take the sentence "If it's a raven then it's black" and understand the if-then construction in the same way as it is understood in the natural-language sentence "If the weather is good tomorrow then I will go outside", then you will see that it is not equivalent to the sentence "If it's not black then it's not a raven."

That is, the two decisions:

Whatever is currently believed to be a raven will henceforth be believed to be black.
Whatever is currently believed to be non-black will henceforth be believed to be a non-raven.

are very different decisions, and a failure to notice the difference leads to the paradox.

...

Now about the so-called Bayesian solution. You say that the usual Bayesian methodology can deal with induction by considering the hypothesis that "All ravens are black" as something which gains credibility when black ravens are seen. Okay, let's suppose there are N ravens. Before we can go any further, we need to have an assignment of probabilities to various propositions (by the way, if you've read as much of Good's writings as you claim to have then you'll be aware that Bayesians talk about the probabilities of propositions, not the probabilities of sets).

Suppose that the propositions "Raven 1 is black", "Raven 2 is black" and so on are independent. Let's call the propositions B1, B2 and so on.

Then:

P(B1 and B2 and B3)=P(B1)P(B2)P(B3)

and so on, so P(B1|B2 and B3)=P(B1). That is, no matter how many of the other ravens you find to be black, the probability that "Raven 1 is black" does not change.

Now you will certainly be entitled to interrupt me at this point to tell me that by making the propositions independent at the beginning, I have made it inevitable that no induction can happen. Then I can ask you: What assignment of probabilities implements the "Bayesian" form of induction? There is no way to start off thinking that the blackness of one raven is independent of the blackness of another and, through observing the colour of many ravens, come to the conclusion that the blackness of one raven is not independent of the blackness of another.

You might say that we can introduce the hypothesis, H, that all ravens are black. But then what is not H?

P(B1|H)=1
What is P(B1|not H)? P(B1 and B2|not H)?

Not H cannot correspond to the hypothesis that the colour of one raven is independent of the colour of another, because that does not exclude the possibility that all ravens are black:

P(H|not H)=P(B1 and B2 and ... and BN|not H)=P(B1)P(B2)...P(BN) > 0

There is no natural or unique probability assignment to the propositions B1, B2, and so on which implements induction in the sense that P(B1|B2 and B3 and ... BN)>P(B1). If there was, then you would be able to tell me what probabilities are assigned to B1, B2, B1 and B2, B1 and B2 and ... BN and so on, but you haven't done it, and neither has anybody else. Until such a probability assignment has been specified, the claim that a Bayesian form of induction has been presented is simply false.

Saying that not H = not B1 or not B2 or ... or not BN doesn't help. That is completely consistent with the propositions B1, B2 and so on being independent; that is, it doesn't specify an assignment of probabilities and it doesn't give you a way to implement induction. Latexallergy (talk) 22:32, 1 April 2008 (UTC)
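The independence point is easy to check by brute force; a minimal sketch with three ravens and an invented per-raven probability:

    from itertools import product

    # Prior: each of N ravens is independently black with probability p (invented).
    N, p = 3, 0.7
    joint = {}
    for outcome in product([True, False], repeat=N):
        q = 1.0
        for is_black in outcome:
            q *= p if is_black else (1 - p)
        joint[outcome] = q

    def prob(event):
        return sum(q for o, q in joint.items() if event(o))

    # Under independence, observing that ravens 2..N are black tells you
    # nothing about raven 1: P(B1 | B2 and ... and BN) = P(B1).
    p_cond = prob(lambda o: all(o)) / prob(lambda o: all(o[1:]))
    assert abs(p_cond - p) < 1e-12

No amount of conditioning on the other ravens moves P(B1) away from p; if induction is to happen, the dependence has to be built into the prior.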

To simplify it a bit - suppose that there are only two objects in the universe, object 1 and object 2. We know nothing at all about what the word "logomorphic" means. As far as we know, each object is just as likely to be non-logomorphic as it is to be logomorphic. The probability that each object is logomorphic is 1/2, initially, because of our ignorance.

Now, we learn that object 1 is logomorphic.

What is the new probability that object 2 is logomorphic?

If mere loyalty to the Bayesian jihad endows us with the ability to use induction, then there should be some reason for us to assign a probability greater than 1/2 to P(object 2 is logomorphic|object 1 is logomorphic).

However, regardless of whether we fight alongside the Bayesians or we fight against them,

P(object 2 is logomorphic|object 1 is logomorphic) = P(object 2 is logomorphic AND object 1 is logomorphic)/P(object 1 is logomorphic)

and P(object 2 is logomorphic AND object 1 is logomorphic) is not constrained in any way by any Bayesian doctrine. Joining the Bayesian club does not give us the ability to perform induction, because it does not tell us what probability to assign to P(object 2 is logomorphic|object 1 is logomorphic).

In order to know the value of P(object 2 is logomorphic|object 1 is logomorphic), we must specify it ourselves, by specifying the prior value of P(object 2 is logomorphic AND object 1 is logomorphic). Bayes does not tell us how to induct. We must do it ourselves, by specifying the prior.
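To see that the prior is doing all the work, compare the independent prior with an exchangeable one in which both objects share an unknown chance theta of being logomorphic, theta uniform on [0,1]. This hierarchical prior is my illustrative choice, not part of the argument above:

    from fractions import Fraction

    # Two objects; both priors below give P(object i is logomorphic) = 1/2.

    # (a) Independent prior: P(both) = 1/2 * 1/2, so conditioning does nothing.
    p_cond_indep = (Fraction(1, 2) * Fraction(1, 2)) / Fraction(1, 2)

    # (b) Exchangeable prior: theta ~ Uniform(0,1), objects i.i.d. given theta.
    # P(both) = integral of theta^2 dtheta = 1/3; P(one) = integral of theta = 1/2.
    p_cond_exch = Fraction(1, 3) / Fraction(1, 2)

    assert p_cond_indep == Fraction(1, 2)   # no induction
    assert p_cond_exch == Fraction(2, 3)    # induction: 1/2 -> 2/3

With the hierarchical prior, learning that object 1 is logomorphic raises the probability for object 2 from 1/2 to 2/3 (Laplace's rule of succession); with the independent prior it stays at 1/2.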


Latexallergy (talk) 03:01, 2 April 2008 (UTC)

I don't buy into much of that. There is an exact equivalence between propositions and set membership. The advantage of sets is that one can draw representative diagrams to check whether or not he has made an error, by simply identifying relevant domains and counting members. Probability estimates (actually we should be calling them "likelihoods") are specific to states of knowledge, so context must always be kept straight. P(A|T) could well differ from P(A), because this is not about causality, it is about knowledge. There could well be some correlation between P(A) and P(T); if I didn't think there was much chance of good weather, I might have made definite indoor plans (P(T)=0), whereas if I thought good weather was likely, I might have decided to flip a coin to decide whether to go out (P(T)=0.5).
Using natural language such as "will go" etc. adds extra dimensions to the problem that are not inherent in the logical issue pertaining to the effect of different kinds of evidence.
Judging from some of your rhetoric, you have an anti-Bayesian bias. Prejudice about what the answer should be can lead one to make logical errors and stick to them. — DAGwyn (talk) 16:40, 2 April 2008 (UTC)

From our discussion so far, I am a better representative of the Bayesian point of view than you are. There is a good reason why Bayesians talk about the probabilities of propositions instead of sets, and this is actually central to the Bayesian way of thinking. If you need to look at pictures and use set theory as a mental crutch to help you think about probability then that is a deficiency of yours. By your own admission, you are confused by the concept of the probability of a proposition such as "Socrates is a raven" because in order to understand it you try to rephrase it in frequentist terms, talking about draws from domains. To somebody who has actually made himself familiar with Bayesian probability, there is nothing confusing at all about saying, for example, that the probability that Socrates is a raven is 1/2. It means that, based on the information we have available (which is represented by a proposition and not a set), we have no more reason to think that Socrates is a raven than that he isn't. To attempt to construct some scenario which involves something being drawn from some distribution or from some set is an enterprise entirely alien to the Bayesian way of thinking.

The original Bayesians made great advances in probability theory - Good, Jaynes, Jeffreys and so on. Unfortunately the word "Bayesian" is derived from a proper noun and so it lacks a definition. It is now just a symbol which people choose to fight for or against. It would really be better for probability theory if the word "Bayesian" was banned altogether. Whenever the word appears it removes people's attention from the mathematical questions and the people instead adopt the us-versus-them psychology more suitable to politics than mathematics. People who might otherwise be intelligent enough to pay attention to questions of mathematics start saying things like "Me Bayesian; you anti-Bayesian. Me hate you. Me think you wrong." This puts people into mindless zombie mode, after which they are engaged in the "Bayesian jihad" that I referred to, trying to fight everyone until everyone bows down before the symbol "Bayesian" and pledges allegiance to it.

You are entirely correct that we are concerned about knowledge and not causality. Causality is merely the thing which prevents you from asserting that a person who makes a decision, namely to go outside tomorrow if the weather is good, affects the weather. Of course he does not; we know that because we know about causality. In general, though, when we examine the meaning of the sentence "If A then B", we find that A is regarded as a *condition* and B is regarded as something which is *asserted* under that condition. When I say to you "If A then B", I am not trying to tell you that A is or isn't likely. I am trying to tell you that B is likely, and, in fact, certain, under the assumption that A is true. It is the probability of B which increases when we first accept "If A then B". The probability of A doesn't change.

There is an asymmetry in the expression "If A then B" which isn't present in the expression "B or not A". In the expression "If A then B", A is the antecedent and B is the consequent. In the expression "If not B then not A", not B is the antecedent and not A is the consequent. From an examination of the expression "B or not A", there is no way to tell what is the antecedent and what is the consequent. Substituting "B or not A" in place of "If A then B" erases the information about whether A is the antecedent or not B is the antecedent. This has nothing to do with causality. When you said that there is nothing inherent in the phrase "If A then B" which isn't to be found in "B or not A", you were incorrect - by looking at "If A then B", I can tell which is the antecedent and which is the consequent. I cannot do this with "B or not A". Please tell me whether or not you understand this. I really have no idea whether or not you do.

You say: "There could well be some correlation between P(A) and P(T); if I didn't think there was much chance of good weather, I might have made definite indoor plans (P(T)=0), whereas if I thought good weather was likely, I might have decided to flip a coin to decide whether to go out (P(T)=0.5)."

Indeed, all of this is true, except for the bit about there being a correlation between P(A) and P(T). If there is a correlation, it would be between A and T, not P(A) and P(T), which are scalars. You might have decided all or any of these things. But I hope you will agree that decisions occur at specific times. Before a decision is made, various propositions have various probabilities. After the decision has been made, the propositions have different probabilities. For example, I might decide that I will not go outside tomorrow, P(B)=0. If somebody asks me what is the probability that I will go outside tomorrow, I will say zero. I really believe that I won't go outside. Then I might change my mind and decide that I definitely will go outside tomorrow. After I make this decision, P(B)=1. The probabilities of propositions change when I make decisions. However, regardless of everything that has been said, I cannot affect the weather by making any decision. P(A|T)=P(A). Unless you contradict this, you are conceding my point.

The confusion that you blame on the occurrence of phrases like "will go" is a red herring. The only mathematical question is whether or not you understand the difference between:

P(A|T)=P(A), P(B|T)=P(B or A), P(A and B|T)=P(A|T)=P(A)

and

P(A|S)=P(AB), P(B|S)=P(B), P(A and B|S)=P(A|S)=P(A and B)

In the first case, the effect of conditionalizing on T makes A into a *sufficient* condition of B, in the sense that B1, B2, ... BN are the sufficient conditions of B if B implies and is implied by "B1 or B2 or ... or BN". In the second case, conditionalizing on S makes B into a *necessary* condition of A, in the sense that A1, A2, ... AN are the necessary conditions of A if A implies and is implied by "A1 and A2 and ... and AN". Conditionalizing on T adds A to the list of sufficient conditions of B: afterwards, B implies and is implied by "B1 or B2 or ... or BN or A"; conditionalizing on S adds B to the list of necessary conditions of A, so that afterwards, A implies and is implied by "A1 and A2 and ... and AN and B".

Note that saying "A is a sufficient condition of B" is equivalent to saying that "B is a necessary condition of A". However, *adding* A to the list of sufficient conditions of B, when it was not already a sufficient condition, has a different effect to *adding* B to the list of necessary conditions of A. The propositions T and S above accomplish those two different changes. Please tell me whether or not you understand this. You don't need a mathematical representation of the natural-language expression "will go"; I am showing you the mathematical representation of the natural-language expression "If A then B".

If you don't understand that, then at least tell me whether you understand that conditionalizing on S and conditionalizing on T, as shown above, have different effects on the probabilities of A and B.

You should also either say how one is to assign a prior probability to the proposition "Object 1 is logomorphic AND object 2 is logomorphic", or concede that you have failed to show a Bayesian form of induction. Induction is where I see that object 1 has some property and then I think it more likely that object 2 has that property, is it not? So if there's a Bayesian form of induction, P(Object 2 is logomorphic|Object 1 is logomorphic) should be greater than P(Object 2 is logomorphic).

Unless you can address this point, your claim that you have presented an inductive Bayesian solution to the raven paradox is false.

Latexallergy (talk) 04:54, 3 April 2008 (UTC)

Since you have misstated what I said, and have used perjorative terms about people who use Bayesian inference, I am not inclined to get into detailed analyses of your lengthy argumentation. I will point out that previously I noted where you didn't properly take into account state of knowledge (appropriate priors, if you will), using the same term for two different things. — DAGwyn (talk) 16:33, 3 April 2008 (UTC)

I think you mean "pejorative". I don't think you're nearly as offended as you claim to be. It seems quite evident to me that you've lost the argument, and pretending to be too offended to reply is your way of avoiding admitting that you were wrong.

Latexallergy (talk) 23:36, 3 April 2008 (UTC)

[edit] Bayes doesn't resolve

I tried to read the discussions above, and that was quite a bit to wade through. I couldn't see anyone making this specific objection, but I apologize if I missed it -- just point me to it, though I doubt any refutation of what follows would convince me.

Here we go: I just don't see how using Bayes's theorem resolves the paradox, and yes, I understand the initial paradox, which is: "if it's a raven, it's black" is logically equivalent to "if it's not black, it's not a raven"; therefore, evidence for one should be evidence for the other. But it's counterintuitive to us that seeing a non-black non-raven should make us raise the probability we assign to "if it's a raven, it's black".

So my problem: Bayes's theorem, at least as explained in the article, doesn't add any insight. Let T = "If it's a raven, it's black." Let X = "observation of a red [non-black] apple [non-raven]". So, I figure P(T|X), which I hope to compare to P(T) to see if Bayes makes me change my opinion. In the process, I have to compute P(X|T), the probability of observing a red apple, given that "if it's a raven, it's black".

At this point, the article says: if one selects an arbitrary object at random, then the probability of seeing a red apple is independent of the color of ravens: Pr(X | TU) = Pr(X | U). (btw, I think the U is redundant and should be removed for clarity) That looks to me like circular reasoning -- it's assuming what we were trying to establish. Why would the probability of seeing a red apple be independent of whether or not "if it's not black, it's not a raven" holds? I think X should be, in order to maintain consistency: "after selecting across the set of all non-black objects, seeing a non-raven". If T holds, then, by logical equivalence, "if it's not black, it's not a raven" holds, and we must assign 100% probability to X -- thus P(X|T) = 100% != P(X).

This is a minor point that has only a small effect on calculated estimates (due to the relative quantities in the various domains). Another such approximation was pointed out explicitly in the article. — DAGwyn (talk) 16:39, 3 April 2008 (UTC)

We're back where we started, lacking a reason for saying that the observation of a non-black non-raven enhances our probability estimate that "if it's a raven, it's black". All the Bayes example seems to do is work through the theorem, while assuming away the problem as an intermediate step. Anyone agree? MrVoluntarist (talk) 19:42, 2 April 2008 (UTC)

Hi, MrVoluntarist.
I think you're mostly right, but the article incorrectly claims that it makes no difference whether we select an apple at random or whether we select an object at random: If one selects an apple at random, or more generally if one selects an arbitrary object at random, then the probability of seeing a red apple is independent of the color of ravens.
The article is incorrect on this point. If we ignore my pet theory that I describe above, and suppose that the proposition "All ravens are black" is equivalent to the proposition "All non-black things are non-ravens", and both are equivalent to the proposition "Everything is either black or a non-raven", and we call all of these allegedly equivalent propositions T, then the effect of discovering or assuming that T is true is merely to remove all the non-black ravens from our probability space.
(To see this, consider that the set corresponding to the proposition "raven implies black" is the union of the set "black things" with the set "non-ravens", and this union is the complement of the set "non-black ravens".)
That is, (and here I will drop the U because you are right that it is useless, although including it demonstrates loyalty to a certain political movement):
P(a random object is a non-black raven|T)=0
and
P(a random non-raven is red|T)=P(a random non-raven is red)
and
P(a random object is a red non-raven|T)=P(a random object is a red non-raven)/P(a random object is not a non-black raven)
So the article is correct that P(a random apple is red|T)=P(a random apple is red). This is because removing the non-black ravens from the population has no effect on the proportion of apples which are red (supposing that no ravens are apples and hence removing ravens has no effect on the apples).
The article, however, is incorrect when it states that the probability that a random object is a red apple does not change when T is given. When we remove the non-black ravens from the population, the red apples make up a larger fraction of what remains. So P(a random object is a red apple|T)>P(a random object is a red apple).
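That counting argument can be checked with explicit numbers; a minimal sketch, with the universe of objects invented for illustration:

    # Invented universe: counts of each kind of object.
    counts = {"black raven": 10, "white raven": 2,
              "red apple": 30, "green apple": 20, "other": 38}

    def p(event_keys, universe):
        total = sum(universe.values())
        return sum(universe[k] for k in event_keys) / total

    # Conditioning on T ("all ravens are black") deletes the non-black ravens:
    given_T = {k: n for k, n in counts.items() if k != "white raven"}

    # P(a random object is a red apple) goes up once the white ravens are gone...
    assert p(["red apple"], given_T) > p(["red apple"], counts)

    # ...but P(a random apple is red) is unchanged, since no apples were removed.
    apples = ["red apple", "green apple"]
    assert (counts["red apple"] / sum(counts[k] for k in apples)
            == given_T["red apple"] / sum(given_T[k] for k in apples))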
There are well-written Bayesian approaches to the paradox. I would recommend Brandon Fitelson's paper, "How Bayesian Confirmation Theory Handles the Paradox of the Ravens", which is available at: http://fitelson.org/ravens.pdf . There's also Peter Vranas's article which has the best bibliography by far on the paradox: http://philsci-archive.pitt.edu/archive/00000688/00/hempelacuna.doc
I agree with you in general, though. The Bayesian approach doesn't actually resolve the paradox. It accepts the paradoxical conclusion that finding a red apple makes us more likely to think that all ravens are black.
Actually it shows that the degree of confirmation of the hypothesis is exceedingly slight, and explains why that is so. In a much smaller universe, the Bayesian approach would be essential to properly update estimates based upon acquired knowledge. — DAGwyn (talk) 16:39, 3 April 2008 (UTC)
My complaint against the Bayesian solution is more serious - it doesn't involve induction. When we discover that something is true, we simply remove certain objects (e.g. non-black ravens) from the list of things which we think exist and count what remains. That is called deductive reasoning. There is no generalization involved.
It explains induction by exhibiting the mechanism that makes it work. — DAGwyn (talk) 16:39, 3 April 2008 (UTC)
You might think that it is only my pet theory that the Bayesian formalism on its own doesn't implement induction. You might want to look at: http://books.google.com/books?id=CicvkPZhohEC&pg=PA262&lpg=PA262&dq=impossible&source=web&ots=JghULmM3f0&sig=_11SRAhgKqgGKyQHZ-dVv5nH0-A&hl=en . Carnap, whose writings you evidently haven't read, was concerned with finding a formalism which could implement induction precisely because the bare Bayesian formalism doesn't do it. If you intend to stick to your claim that the Bayesian formalism accomplishes induction without needing any additional formalism, you will have to claim that Carnap was hideously mistaken. Of course, you can avoid that by pretending that you're too offended to respond. Latexallergy (talk) 20:46, 3 April 2008 (UTC)
I believe I can write a much more coherent article, with appropriate references doing justice to the large literature which has developed over the last three quarters of a century, if anybody thinks it worthwhile.

Latexallergy (talk) 06:09, 3 April 2008 (UTC)

I have to say that DAGwyn's contributions to the article and the way that he responds to questions are significantly less than satisfactory. His contributions to the article contain a number of false statements and unjustified claims and his response to MrVoluntarist's insightful remarks was dismissive and incorrect. In fact, he did not address MrVoluntarist's point at all - Why would the probability of seeing a red apple be independent of whether or not "if it's not black, it's not a raven" holds? The response given, This is a minor point that has only a small effect on calculated estimates (due to the relative quantities in the various domains). Another such approximation was pointed out explicitly in the article, does not address the question at all. MrVoluntarist was pointing to a circularity in the argument presented in the article, and he is correct. DAGwyn's response - claiming that there is an approximation which he feels entitled to make without explicitly stating that he has done so, is not satisfactory. The article should be changed so that it doesn't make false statements.
Do you think your question has been adequately addressed, MrVoluntarist? Latexallergy (talk) 20:38, 3 April 2008 (UTC)
More misrepresentation. All I did to the article was to refine the previous text that described the Bayesian solution and to add some additional examples to show that the methodology works even in the "edge" cases.
The reason I am not responding in detail to your argumentation is that I recognize from its tone that I would be wasting my time. When I was younger and less experienced, I did engage in such arguments, and found out the hard way that consensus cannot be reached unless both sides share the same goal. An anti-Bayesian presenting the Bayesian argument in the article would be a sure way to prevent the actual argument from reaching the audience. Some of us don't consider that a legitimate editorial goal. — DAGwyn (talk) 23:57, 3 April 2008 (UTC)
My apologies; I had gotten the impression that you had written the entire Bayesian section of the article. I don't claim at all to be an anti-Bayesian; it was you who accused me of that and I reject that characterization. As I said, I think the word "Bayesian" just causes fights, and this is a perfect example of that. If I write an exposition of the Bayesian take on the paradox, incorporating the recent work done by prominent Bayesians, will you review it and give your criticisms of it in a calm manner? I don't care for the idea of having an edit war and from your responses so far it seemed quite likely to me that you would just erase whatever I write without reading it because you have identified me as an "anti-Bayesian". On the other hand, I did previously think that you would be jealous and protective of what is currently written in the Bayesian section because I thought that you wrote it all, so maybe you won't insist on immediately undoing any change. Latexallergy (talk) 00:58, 4 April 2008 (UTC)

The reason I am not responding in detail to your argumentation is that I recognize from its tone that I would be wasting my time. That, incidentally, is called an ad hominem attack, or "judging the man and not the argument". It's one of the oldest logical fallacies in the book. Latexallergy (talk) 20:38, 4 April 2008 (UTC)

That, incidentally, is called an ad hominem attack, or "judging the man and not the argument". It's one of the oldest logical fallacies in the book.
I don't think so. I have just reread the article on ad hominem; the fallacy refers to saying that someone's argument is wrong because of some personal failing of the person making the argument. Here, I read DAGwyn as saying only that he would be wasting his time to argue further, not that your argument is wrong because of some personal failing on your part. That is, he is bowing out, not refuting you. He is frustrated because (in his view, as I read him), you aren't engaging him in a fruitful way.
Anyway, that's the way I see it, for what it's worth. Bill Jefferys (talk) 02:10, 5 April 2008 (UTC)
Thanks for your insights, Bill. The beginning of the article says: An ad hominem argument, also known as argumentum ad hominem (Latin: "argument to the man", "argument against the man") consists of replying to an argument or factual claim by attacking or appealing to a characteristic or belief of the person making the argument or claim, rather than by addressing the substance of the argument or producing evidence against the claim. The process of proving or disproving the claim is thereby subverted, and the argumentum ad hominem works to change the subject.
That seems to me to be a pretty accurate description of DAGwyn's position. He sees me as an anti-Bayesian, and that alone is sufficient, in his eyes, to make discussion with me a waste of time. I had presented several arguments supporting the claim that the solution I presented was correct, but DAGwyn never addressed any of them and instead worked (successfully) to change the subject, by accusing me of insulting Bayesianism.
Again you have misrepresented what I said. It is that kind of thing that makes me disinclined to engage in an argument with you. — DAGwyn (talk) 18:18, 8 April 2008 (UTC)
Of course. Latexallergy (talk) 00:59, 9 April 2008 (UTC)
If you are right that he is bowing out and not refuting me, then it would appear that he does not claim any more that my solution is not correct, or that it is identical to the Bayesian solution, and so he should not object if I alter the article to include an explanation of it. Nobody, to my knowledge, disputes the correctness of my solution. The article, in any case, makes incorrect statements about the Bayesian approach which should be corrected anyway. I do not think it would be fair or reasonable for me to alter the article without first explaining here what changes I intend to make and checking that nobody has any objections.
I propose that the proposed resolutions be grouped into three sections, since there are exactly three (logically) possible responses to the paradox. One can reject either of the two premises (one premise being that "all ravens are black" is equivalent to "all non-black things are non-ravens", and the other premise being that observing an object with property A and property B supports the hypothesis that all objects with property A have property B), or one can accept the conclusion that observing a red apple does support the hypothesis that all ravens are black. Each of the three possible positions has been taken at various times by various authors. There have also been several so-called Bayesian solutions presented. I think a representative sample of each of the three positions would be a much more systematic and encyclopaedic way of presenting the information.
Right now, the article practically presents a poorly written exposition of one particular Bayesian approach as the unique correct solution when this is very far from being agreed upon by modern philosophers and logicians. This amounts to favouring a specific point of view over others, in effect telling the reader that he is not entitled to judge for himself because the author of the encyclopaedia article has made the decision about which solution is correct.
Does anybody object? Latexallergy (talk) 22:42, 5 April 2008 (UTC)
The whole point of the Bayesian section of the article is that it is presented as a different approach from other "solutions" that philosophers/logicians have proposed. Presenting significant alternative POVs is certainly in line with the WP:NPOV guidelines. If you want to subdivide the Bayesian POV into sub-POVs, that is okay, so long as you provide references to reputable sources for them. I'm skeptical that there is significant variation among Bayesian solutions, since all of them will be based on Bayes' law and will be based on the difference in the relative sizes of the domains. — DAGwyn (talk) 18:18, 8 April 2008 (UTC)
Maher's article, at http://www.jstor.org/pss/188737 , presents a refutation of this. In section 4.3, "The Relative Rarity of Ravens", he says "One popular response to the paradox has been to note that we know there are far fewer ravens than there are non-black things and to argue that, given this knowledge, A gets much less confirmation from a non-black non-raven than from a black raven. [...] But Corollary 1 showed that, given any sample proposition as background evidence, a non-black non-raven confirms A just as strongly as a black raven does, according to all popular measures of confirmation [...] this response to the paradox cannot be correct."
The popular measures of confirmation he proves this for are the "difference measure", P(A|X)-P(A), the "ratio measure", P(A|X)/P(A) and the "likelihood ratio", P(X|A)/P(X|not A).
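For concreteness, the three measures Maher discusses can be written down directly; a minimal sketch with invented probabilities for a hypothesis A and a piece of evidence X:

    # Invented probabilities for a hypothesis A and evidence X.
    p_A = 0.30              # prior P(A)
    p_X_given_A = 0.50      # P(X | A)
    p_X_given_notA = 0.25   # P(X | not A)

    p_X = p_X_given_A * p_A + p_X_given_notA * (1 - p_A)
    p_A_given_X = p_X_given_A * p_A / p_X               # Bayes' theorem

    difference_measure = p_A_given_X - p_A              # P(A|X) - P(A)
    ratio_measure = p_A_given_X / p_A                   # P(A|X) / P(A)
    likelihood_ratio = p_X_given_A / p_X_given_notA     # P(X|A) / P(X|not A)

    # All three agree on the direction of confirmation: X confirms A exactly
    # when P(X|A) > P(X|not A).
    assert difference_measure > 0 and ratio_measure > 1 and likelihood_ratio > 1

Maher's corollary concerns the relative *size* of these quantities for black ravens versus non-black non-ravens; the sketch only shows what the measures are, not his theorem.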
Another way of seeing what's wrong, apart from reading Maher's theorems, is to consider what would happen if the domains *weren't* dramatically different in size. If it's really true that the reason we *think* the conclusion is absurd is just because there are so few ravens compared to non-ravens, then that is like saying that if the number of non-ravens were comparable to the number of ravens, we would not see a paradox at all.
Now, it's difficult to imagine that half of the objects in the universe are ravens, but we could restrict ourselves to humans and replace "raven" with "man" and "non-raven" with "woman" and "black" with "tall" and "non-black" with "short". Then the paradox goes: "all men are tall" is equivalent to "all short people are women". So finding a short woman provides evidence that all men are tall. (Here I am taking "short" to mean "not tall".)
I agree that the paradox is better analyzed by bringing the relative domain sizes close to each other, which is why I suggested it previously. A general problem with all such arguments, pointed out in the article (although for the particular example it was quantitatively insignificant), lies in the selection process. To select a short person, one would operationally have to examine some number of the members of the universe until the first short member is seen. In that process, it is fairly likely that one would observe at least one tall man, which presumably does provide evidence in favor of the original hypothesis! (I suppose one could consider using some "shortness filter" that prevents observation of the sex. The point is, in most formulations of such problems too much is left unsaid, and when different analysts interpret the unsaid conditions differently, they get into disagreements.) — DAGwyn (talk) 15:56, 10 April 2008 (UTC)
The paradox hasn't gone away. It's just as bizarre to give support to "all men are tall" by observing a short woman as it is to give support to "all ravens are black" by observing a white shoe.
Note that this is true even if the number of short people is thought to be comparable to the total number of people. However, if the number of short people is known exactly, then observing that a woman is short leaves fewer remaining short people which could be men. One possible counterexample to the proposition "No short people are men" has been ruled out, and ruling out counterexamples is called deductive reasoning. So if the total number of short people is known, then you can use deductive reasoning to get from "This person is a short woman" to "There are fewer short people remaining who could possibly be men."
If you don't know the number of short people, though, you cannot do this. Finding that a particular person is a short woman would increase your estimate of how many people are short. The number of possible counterexamples in this case is estimated, not known, so no deductive reasoning based on ruling out counterexamples is possible. Latexallergy (talk) 01:32, 9 April 2008 (UTC)
(I am, incidentally, grateful, that you do not object strenuously to my editing the article.) Latexallergy (talk) 09:58, 9 April 2008 (UTC)
Just so long as the presentation of the Bayesian argument is a faithful recounting of that position and is kept distinct from criticism of that argument. — DAGwyn (talk) 15:56, 10 April 2008 (UTC)
I've rewritten the proposed resolutions section. Let me know if there's anything you find objectionable. Latexallergy (talk) 11:48, 29 April 2008 (UTC)

[edit] Where's the paradox?

Shouldn't it be simple to prove that there is no paradox? There are albino ravens. Thus, not all ravens are black. Thus, not all things that are not black are not ravens. 05:12, 8 April 2008 (UTC)

Yes, we know that, but that's not where the paradox lies. All that does is prove that the statement, All Ravens are Black, is false -- which, as I said, we already know. The paradox actually lies in the fact that logic seems to indicate that we can prove the statement false by looking at unblack objects instead of looking at ravens. And a lot of people don't buy that. Hence the use of the word, paradox. -- Derek Ross | Talk 05:58, 8 April 2008 (UTC)
Yes, for purposes of discussing the paradox it is best to assume that we don't (yet) know whether or not all ravens really are black, so that "all ravens are black" is a plausible hypothesis. Once we know for sure one way or the other, it is a degenerate (special) case and is not as illuminating. — DAGwyn (talk) 19:31, 8 April 2008 (UTC)
Alright, but I'm still not seeing a paradox... it is not inductive logic to take a single case and argue from that case for a general rule pertaining to all cases. Actual inductive logic a) would need a larger sample size to make any prediction whatsoever, and b) could at best only argue for a likelihood, not a certainty. The only way to use inductive logic to prove it for a certainty would be to look at ALL unblack objects, not merely a single one, which is of course an impossibility. Jarandhel (talk) 02:24, 9 April 2008 (UTC)
The single observed case is considered a "confirming instance", and it does not prove the hypothesis; however, it is generally considered to lend some credence to the hypothesis. (I.e. increase the likelihood that the hypothesis is true.) The paradox is that observing a non-raven doesn't seem like it should have any bearing on the hypothesis, yet it apparently does support the contrapositive, which is logically equivalent to the original hypothesis. Another way of seeing the paradox is that observing a red apple apparently supports both the hypotheses "All ravens are black" and "All ravens are white", which are mutually contradictory (if any ravens exist). — DAGwyn (talk) 15:45, 10 April 2008 (UTC)
To clarify that last thought, observing a red apple supports the general hypothesis, "All Ravens are non-Red", of which the hypotheses, "All Ravens are Black" and "All Ravens are White", are two particular specialisations. I discussed this in more detail earlier on this page (in 2004 and 2005). -- Derek Ross | Talk 16:08, 10 April 2008 (UTC)
Then the paradox is that the observation supports (adds evidence for) two contradictory hypotheses, which seems (to some people) to be something that shouldn't be able to happen. — DAGwyn (talk) 23:58, 2 May 2008 (UTC)
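That double support is easy to exhibit numerically. In the toy independence model sketched below (my own construction, not anything from the literature), a sampled red non-raven raises the probability of "all ravens are black" and of "all ravens are white" simultaneously:

```python
import math
from itertools import product

# Toy universe (assumed for illustration): each of N objects is a
# raven or a non-raven and, independently, black, white, or red,
# with the arbitrary probabilities below.
N = 4
p = {("raven", "black"): 0.05, ("raven", "white"): 0.02,
     ("raven", "red"): 0.03, ("non-raven", "black"): 0.30,
     ("non-raven", "white"): 0.20, ("non-raven", "red"): 0.40}

worlds = list(product(p, repeat=N))

def P(event):
    return sum(math.prod(p[k] for k in w) for w in worlds if event(w))

all_black = lambda w: all(c == "black" for t, c in w if t == "raven")
all_white = lambda w: all(c == "white" for t, c in w if t == "raven")
red_nonraven = lambda w: w[0] == ("non-raven", "red")   # e.g. a red apple

pX = P(red_nonraven)
for name, H in (("all ravens are black", all_black),
                ("all ravens are white", all_white)):
    posterior = P(lambda w: H(w) and red_nonraven(w)) / pX
    print(f"{name}: prior {P(H):.4f} -> posterior {posterior:.4f}")
```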
When you generalize from a sample size which is too small, like a single case, it's called a hasty generalization, which is considered a misuse of induction. How much evidence is enough for the generalization not to be hasty is a question which lacks an objective answer. You're certainly right that induction always involves a loss of certainty. It can get remarkably close to certainty, though - when we drop an object by removing whatever supports it, we are almost certain that it will fall, even though we have seen only a very tiny fraction of unsupported objects. Similarly, with the induction that is imagined to take place in the paradox, we imagine that somebody has seen only a tiny fraction of the ravens, but becomes almost certain that the next raven he sees will be black - just as certain as we are that an unsupported object will fall.
When we imagine somebody who has never seen a raven becoming almost certain that all ravens are black, on the basis of the examination of a tiny fraction of non-black non-ravens, then we see that such a person must have made a mistake in his reasoning. And yet he is apparently not using any reasoning which we do not use ourselves when we come to the conclusion that unsupported objects in general fall. Latexallergy (talk) 02:44, 9 April 2008 (UTC)
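For a rough sense of how quickly confidence can grow from a small sample, one standard model (my choice here; neither comment above commits to it) is Laplace's rule of succession: with a uniform prior on the fraction of ravens that are black, seeing n black ravens and no non-black ones gives P(next raven is black) = (n+1)/(n+2).

```python
# Rule of succession: after n black ravens and no non-black ones,
# a uniform prior on the black fraction gives (n + 1) / (n + 2).
for n in (1, 10, 100, 10_000):
    print(f"{n:6d} seen -> P(next is black) = {(n + 1) / (n + 2):.6f}")
```

Note that this covers only "the next raven is black"; under the same prior, the probability that all of m as-yet-unseen ravens are black is (n+1)/(n+m+1), which grows much more slowly when m is large.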
I think this is off the point, but indeed it is a component of the paradox. The observation of a green apple should not lead to the conclusion "all ravens are black" with certainty, but it is a piece of evidence supporting such a hypothesis. I never thought it was a paradox, and I basically agree with the Carnapian approach. People usually equate "a confirmation of truth" with "evidence for a hypothesis", and when in this problem we have the latter, we think it is being offered as the former, and reject it erroneously.
Just to quote something from Einstein: "No amount of experimentation can ever prove me right", and compare it to the usual view that "we proved relativity by seeing that bending of light!". We see that we use certain experiments to reduce the likelihood of theories being false, and although there is often nothing that could grant us absolute certainty, we take the experimental results to be "proofs" of the theories, and it is this way of thinking that we have to get around in order to resolve the paradox. 129.67.38.26 (talk) 16:17, 2 May 2008 (UTC)