Talk:Correlation does not imply causation

ATTENTION: This page was moved after a vote at Talk:Correlation implies causation/Page title.

Suggested Format

The "general pattern" section should be reorganized around the standard threats to causal inference:

1) omitted (unobserved) variables (possibly with a link to the weak page on Mediator variable)

2) selection bias

3) simultaneous causation (reverse causation)

4) measurement error

It would be nice to have vivid examples of each type of problem. Cristo00

Additional information

Moved from the main article (articles are not for discussion!!)

But I thought it important to note that even though a logical fallacy, there was a stronger but deeper link between causation and correlation. If you believe Reichenbach's principle, then you believe that robust correlation implies SOME causation, just not necessarily direct links.

An earlier version of this page offered two examples "that show that it is sometimes quite difficult to judge correlations":

  • Statistics prove that most car accidents happen between vehicles driving at rather low speeds. Few accidents take place at, let's say, 100 mph. Does this mean that it is safer to drive fast? Of course not: most accidents take place within 25 miles of the driver's home, usually at a moderate speed; ergo most accidents happen at moderate speeds.

Yes indeed. Although mostly this is a common failure to take base rates into account when doing correlations (or any other kind of inference). The prediction is that once you separate out the fact that most driving happens at low speeds, the correlation between speed and accidents will change.

Note that you could make another faulty inference given the explanation. Most accidents do take place within 25 miles of home. Does this mean it is more dangerous to drive near your house than far away? It might mean this. It might mean that people become complacent on familiar roads, and are more alert and safer when travelling. But those seem unlikely, and it is far more likely that it is another base rate effect: by far most driving happens near home than away. After all, every successful trip both starts and ends at home.
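The base-rate point above can be made concrete with a small sketch. The mileage and accident figures below are invented purely for illustration (they are not real traffic statistics); the only point is that raw accident counts and accidents per mile driven can rank the two speed bands in opposite order.

```python
# Hypothetical figures, invented for illustration only.
# "miles" is millions of miles driven in that speed band.
data = {
    "low speed": {"miles": 900, "accidents": 1800},
    "high speed": {"miles": 10, "accidents": 60},
}

def accidents_per_million_miles(band):
    """Accident rate once exposure (miles driven) is taken into account."""
    return band["accidents"] / band["miles"]

rates = {name: accidents_per_million_miles(band) for name, band in data.items()}

for name, band in data.items():
    print(f"{name}: {band['accidents']} accidents, "
          f"{rates[name]:.1f} per million miles")
```

With these made-up numbers, most accidents happen at low speed simply because most driving happens at low speed, while the per-mile rate is higher at high speed.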

  • the correlation between reserve parachute deployment failures and death is quite high; if the main parachute fails and then the reserve parachute fails, the parachutist almost always dies. However it is the sudden deceleration when the parachutist hits the ground that causes the death, not the parachute failure. The parachute failure leads almost inevitably to death, but does not cause it.(1)

Actually I strongly disagree here. The reserve failure most certainly does cause the death! A good way to kill your (parachutist) enemies is to disable their chutes. Granted, the ground is a more proximate cause of death, but there will always be a more proximate cause of death. For example, even the impact with the ground is not the most proximate cause. The impact of your organs with your skeleton handles that. Etc.

The correlation between reserve chute failures and death is in fact a good indication of a causal link. Not that we needed statistics to find that one. :-)

Even though there are ways to go more securely from (many) correlations to (some) causal structure, unmeasured common causes can always interfere with those attempts. And most social scientific methods do not even try to do proper causal inference. They merely apply regression measures (correlations) and then make causal inferences from there. All statistics textbooks warn (overly much) against inferring causation from correlation, but in practice correlation methods are almost always used to establish causation. And this practice commonly leads to faulty conclusions drawn from scientific research.

This is in contrast to experimental science with proper controls, where you hold everything else constant and wiggle A. When you do this (and only then), you see B wiggle. That is a pretty foolproof way to establish causation. Lacking perfect control, a randomized clinical study is the best alternative, because it is the best guarantee that other causes are averaged out between the treatment and the non-treatment population.
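A rough simulation of the contrast drawn above, under stated assumptions: a hidden common cause `z` drives both A and B, and A has no effect on B at all. The variable names, the sample size and the noise model are all choices made for this sketch, not anything taken from the article.

```python
import math
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

random.seed(0)
n = 5000

# Hidden common cause (assumed, unobserved by the researcher).
z = [random.gauss(0, 1) for _ in range(n)]

# B depends on Z only; A never influences B in this simulation.
b = [zi + random.gauss(0, 1) for zi in z]

# Observational setting: A also depends on Z, so A and B move together.
a_observed = [zi + random.gauss(0, 1) for zi in z]

# Randomized setting: A is assigned by coin flip, independent of Z.
a_randomized = [random.choice([0.0, 1.0]) for _ in range(n)]

r_observational = pearson(a_observed, b)
r_randomized = pearson(a_randomized, b)

print(f"observational correlation: {r_observational:.2f}")
print(f"randomized correlation:    {r_randomized:.2f}")
```

In the observational data the correlation is clearly positive even though A does nothing to B; once A is assigned at random, the correlation should be close to zero, because the hidden cause is balanced across the two groups.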

The previous version included links to Problem of induction, and physics along with a 1-paragraph discussion that I feel blurred the important distinction between experimental correlations and observational correlations. For example, I do not think that moving from Newton's laws to the Theory of Relativity is a good example of problems with correlation. There is more on that shift no doubt under Philosophy of science, especially topics like theory choice, confirmation, induction, and paradigm-shift.

This page should probably be broken up.

The discussion in this page is a bit tortured. I've revised the section on "determining causation" to emphasize counterfactual reasoning. This should help readers think about conditions under which correlation provides persuasive evidence of causation. 09:14, 28 August 2006 (UTC)

The Latin name for this fallacy is post hoc ergo propter hoc: literally, "After this, therefore because of this."

(1)The United States Parachute Association (http://www.uspa.org/) terms this type of fatality "Impact with ground"

Political Issue

Since this is an encyclopedia, perhaps the bit about gun control could be changed to a more neutral example. As it stands, the article appears to oppose gun control.

Regards, Rajeev.

I agree. A more neutral example would be better, and I'll change it (sooner or later) if there are no objections. I think it is worse than just being political; the argument seems to assume that either gun ownership or crime is the cause, and the other is correlated whilst ignoring the possibility of a common cause. In fact, none of the examples are very inspiring. Rls 23:49, 24 Aug 2004 (UTC)
Changed it. I don't really like my example, but feel free to edit. Sir Elderberry 00:51, 2 May 2006 (UTC)

Art imitates penguins (or was it the other way around)?

Another example illustrating this fallacy was a study which found that British arts funding levels had an extremely close correlation with Antarctic penguin populations. Neat factoid, but an encyclopedia is not a factoid collection. Please provide a source and/or an explanation for this, as with the other examples. For all our readers know we completely made this up. Removing it for now. 82.92.119.11 20:34, 24 Dec 2004 (UTC)

Vague reasoning

But if there was a common cause, and you had that data as well, then often you can establish what the correct structure is. Likewise (and perhaps more usefully) if you have a common effect of two independent causes.

What does "that data" refer to? Some sort of data about the "common cause", I presume, but what exactly? I can't figure out what these sentences are supposed to mean. 82.92.119.11 20:38, 24 Dec 2004 (UTC)

Reichenbach's principle is closely tied to the Causal Markov Condition used in Bayesian networks. The theory underlying Bayesian networks sets out conditions under which you can infer causal structure, when you have not only correlations, but also partial correlations. In that case, certain nice things happen. For example, once you consider the temperature, the correlation between ice-cream sales and crime rates vanishes, which is consistent with a common-cause (but not diagnostic of that alone).

This is presumably obvious if you know what Bayesian networks are. But let's assume I am just a reader interested in logic—do I care that "certain nice things happen"? What exactly does it mean to "consider" the temperature? Enter the data into the sample? And how does that make the relation "vanish"? If we want to mention Bayesian networks, we should put it in better context for non-technically inclined readers. 82.92.119.11 20:43, 24 Dec 2004 (UTC)
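One way to read "consider the temperature" is as computing a partial correlation: the correlation between ice-cream sales and crime that is left after the linear effect of temperature has been removed from both. The sketch below uses simulated data in which temperature is the only link between the two series; the numbers, the noise model and the variable names are assumptions made for illustration.

```python
import math
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def partial_corr(xs, ys, zs):
    """First-order partial correlation of xs and ys, controlling for zs."""
    r_xy = pearson(xs, ys)
    r_xz = pearson(xs, zs)
    r_yz = pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

random.seed(1)
n = 5000

# Simulated world: temperature drives both series, which are otherwise unrelated.
temperature = [random.gauss(20, 5) for _ in range(n)]
ice_cream = [t + random.gauss(0, 5) for t in temperature]
crime = [t + random.gauss(0, 5) for t in temperature]

raw = pearson(ice_cream, crime)
controlled = partial_corr(ice_cream, crime, temperature)

print(f"raw correlation:                  {raw:.2f}")
print(f"partial correlation given temp.:  {controlled:.2f}")
```

The raw correlation is substantial, while the partial correlation should be close to zero. As the comment above says, this is consistent with a common cause but does not by itself prove one.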

Expert attention

Someone familiar with statistics needs to give some concrete examples and a clearer, more detailed explanation so that non-scientific readers can understand this concept. -- Beland 02:40, 28 Feb 2005 (UTC)

195.42.89.34 08:56, 26 December 2005 (UTC) Perhaps it is not easy with statistics. I saw a book, whose English title would be something like "Paradoxes in mathematical statistics", with several examples that are hard to believe. One example is testing new drugs. Testing a new drug in two hospitals independently showed that the drug is effective. But when the patient counts from the two hospitals are added together, the drug appears counter-effective overall. Hard to believe, but... OK, I still want to contribute: a common Russian joke about statistics states that it has been proved that 100% of men who died of cancer had eaten cucumbers. We all know now how harmful cucumbers are.

PS: By the way, another possible example is games. Do violent, cynical games provoke violent, cruel behaviour, or do cruel people prefer such games? See the history of AD&D or the later PC game scandals.
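The two-hospital drug result described above is usually called Simpson's paradox. A minimal sketch with illustrative counts follows; the hospital labels and patient numbers are chosen only to reproduce the effect and are not data from any trial discussed here.

```python
# Illustrative counts: (recovered, total) per hospital and treatment arm.
hospitals = {
    "Hospital 1": {"drug": (81, 87), "no drug": (234, 270)},
    "Hospital 2": {"drug": (192, 263), "no drug": (55, 80)},
}

def rate(recovered, total):
    return recovered / total

# Recovery rates within each hospital.
per_hospital = {
    name: {arm: rate(*counts) for arm, counts in arms.items()}
    for name, arms in hospitals.items()
}

# Recovery rates after pooling the two hospitals.
pooled = {}
for arm in ("drug", "no drug"):
    recovered = sum(h[arm][0] for h in hospitals.values())
    total = sum(h[arm][1] for h in hospitals.values())
    pooled[arm] = rate(recovered, total)

for name, rates in per_hospital.items():
    print(f"{name}: drug {rates['drug']:.1%}, no drug {rates['no drug']:.1%}")
print(f"Pooled: drug {pooled['drug']:.1%}, no drug {pooled['no drug']:.1%}")
```

In each hospital the drug arm recovers more often, yet the pooled drug arm recovers less often. The reversal arises because most drug patients in this example were treated in the hospital with the lower recovery rate overall, so the pooled comparison mixes the drug's effect with the difference between hospitals.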

Hume

There should be more on David Hume, who essentially said that 'correlation is causation' - it is impossible to see causation, all we can know is correlation. - 05:31, 10 November 2005 (UTC)

Actually, there is a way of seeing causation. Conduct an experiment (however horribly impractical and expensive it may be), and you can determine causation, as opposed to comparing sets of data to one another. The statement made by Hume would apply to the Chemical "x" causes Cancer situation. You can see that large concentrations of Chemical "x" might have a correlation with higher cancer rates by comparing data on cancer rates in certain areas with data on concentrations of Chemical "x" within those areas, but that essentially tells you nothing about true causation, though it provides possibilities.
The way to know causation for certain is to actually conduct an experiment with this chemical, by exposing people to it (and controlling other possible variables) and checking for the emergence of cancer from the direct effect of this chemical. That is how you can know causation, by direct experimentation and observation, as opposed to checking various statistics and lining them up for examination. I will admit, however, that Hume is right in a way. Most experimentation of the sort that would provide meaningful proof of causation for statisticians is either unethical or extremely impractical. In all practical terms, David Hume's statement can be assumed to have a grain of truth. It's clear, however, that in simple experimentation and observation, Hume's logic is incorrect; say you were to find a very high correlation between pedaling a bike harder and the bike going faster. This would be, of course, based on two specific sets of data you studied: the speed at which the bike was pedaled, and also the rates of speed that the bike travelled at. You can assume correlation doesn't imply causation here, but by direct experimentation, and observation of your own, it's safe to assume that pedaling harder will make the bike travel faster. Of course, these are no-brainers (and probably worded falsely somewhere... I'm nothing but an amateur). The point is, Hume's logic does indeed apply in the grand scale, but his generalization in smaller cases is more a matter of formality as opposed to the actual case of assuming causation. The preceding unsigned comment was added by 70.95.202.104 (talk • contribs).
Even with numerous experiments, all you actually see is correlation, maybe even 100% correlation, but you can't actually see causation. I'm speaking in a philosophical sense here, not in the day-to-day practical science sense. I'm guessing the article on causation, maybe also Popper's falsifiability, deals with some of this Philosophy of Science. - Matthew238 22:25, 22 August 2006 (UTC)

Monty Python example

The Monty Python example is flawed. As everyone knows, after Sir Bedevere's explanation, the accused woman herself says "It's a fair cop", confessing to being a witch. JIP | Talk 17:52, 31 May 2006 (UTC)

Unorganised and confusing -> more systematic

The article as it stood was (is) very unorganised and confusing, trying to explain through loose examples. I've tried to make things more systematic in the first paragraph. The later paragraphs should perhaps point to cases (1)-(4) to make things easier to understand.

Also, does "Teenage girls eat lots of chocolate, teenage girls are most likely to have acne, therefore, chocolate causes acne" give an example of a correlation? I'm not sure it does, and if not, the example should be removed.

I'm also not convinced that (4) is actually a possible outcome of a correlation. Help me out :) Narssarssuaq 16:31, 3 July 2006 (UTC)

I've removed the following because it doesn't contain a correlation:

Health and income

I really don't like the income and health correlation example. I think it is biased towards rich countries. In rich countries it may be true that income does not have much meaning in terms of health because everyone has a good diet. But in poor countries the correlation between health and income may well be indicative of good income causing good health. If you have a 1300 calorie diet, a higher income may well mean that you will have a 1600 calorie diet instead. Although neither diet is sufficient, I think the person with 1600 calories will be healthier.

Another example:
Teenage girls eat lots of chocolate.
Teenage girls are most likely to have acne.
Therefore, chocolate causes acne.
This argument, and any of this pattern, is an example of a false categorical syllogism. One observation about it is that the fallacy ignores (4), the possibility that the correlation is coincidence. We can pick an example where the correlation is as statistically "robust" as we please, but we still cannot assume one factor causes the other. If chocolate-eating and acne were strongly correlated across cultures, and remained strongly correlated for decades or centuries, it may not be a mere coincidence. However, in this particular example, the last statement is a logical fallacy because it ignores the possibility that a third factor may be the cause of eating chocolate and having acne (e.g. being young). See joint effect.
Someone correct me if I'm wrong about this. Narssarssuaq 16:50, 3 July 2006 (UTC)

Cannabis and possible other 'examples'

I think bringing in subjects like this is introducing POV bias and implications into the article. We should be able to demonstrate what a logical fallacy is without bringing up morally contentious issues as examples. Things like 'going to bed with shoes on' are great demonstrative examples, and I don't see why we can't continue along those lines with the rest. Even the myopia one isn't really horrible because I doubt you're seeing a lot of people out there with a strong interest in demonstrating causative links between lights being on and myopia.

On the other hand, bringing in a subject like cannabis use here is just opening a door to controversy as well as being a little divisive with regards to the implications of doing so in an article like this. The section is "examples of logical fallacies", and the topic we're given is a statistical link between cannabis use and mental illness. The section does not cite a specific case study (nor probably should it, this article isn't about a large debate like that), but rather makes a strongly implied generalisation that studies like this are or may be committing logical fallacies. For all we know the studies go to great lengths to show that there is a causative effect behind the observed statistical correlations.

Ask yourself this: If most studies about links between cannabis use and mental illness didn't involve logical fallacies (as they supposedly do here), would this be a good example or a confusing/misleading one? I think it would clearly be the latter, and therefore the presence of this example is based on the underlying assumption that a statement about causal link between cannabis use and mental illness is something likely to be a logical fallacy. Do we know this? Have we done surveys of studies, looked at whether or not they are mostly guilty of logical fallacies, or do they mostly take this into account? Frankly, doing so is way beyond the scope of this article anyway.

The fact is, by putting claims like this in the section that demonstrates the fallacy, we are implying without evidence that the statement itself usually is false by way of the fallacy, which is not NPOV, and going to lengths to demonstrate it is in fact NPOV (such statements statistically usually are fallacious) is out of the scope of the article. I simply don't see what we gain by having this here; we can do just as well explaining the topic while using entirely neutral examples. Honestly even the myopia one has problems really, but at least there the topic itself isn't a morally controversial one. --Rankler 00:51, 28 September 2006 (UTC)

I too agree that any possible controversial examples would not be appropriate, or maybe I should say best-suited, for this article. Instead of using a topic that may hinder groups of people from interpreting this article correctly, simple and neutral topics would probably be best to use. - Dozenist talk 01:33, 28 September 2006 (UTC)

NPOV and cannabis

Hi. I read over the section listed as a POV violation and couldn't really see it. I'm not a smoker of cannabis, nor do I really support legalisation, so I don't think it's an issue of my own engrained bias speaking for me.

The article is quite clear that the entire section is "maybe" and "possibly", without stating that the statement is false due to fallacy (any more than any other fallacy in this section). Remember, just because a statement falls into cum hoc ergo propter hoc doesn't ever mean it's necessarily a 'false' statement. That, in fact, would be a case of cum hoc ergo propter hoc itself: just because a lot of statements that infer causation from correlation are false (for instance, wearing shoes to bed and headaches) doesn't mean that they all are (for instance, cigarettes and lung cancer).

Anyway, put short, the unsubstantiated fear that some Joe Average might read this article and get confused and immediately assume that pot is/is not harmful/helpful/related/unrelated to mental/physical/spiritual illness/well-being is not, by itself, reason enough to change the example given - which cannot easily be replaced with another, toned-down example, since the entire progression of the article goes from simple examples to more complex, real-worldish examples. In fact, using this argument to justify a rewrite is itself a fallacy - quite ironic, given the subject matter at hand. --151.200.252.164 09:08, 8 October 2006 (UTC)

I re-examined it and found it to be a bit defensive at the very beginning - as stoner stuff tends to be - and removed a bit of the "ALLEGED NATURE OF THE ALLEGED ALLEGED (POSSIBLY UNTRUE) UNVERIFIED [citation needed] [citation needed] [citation needed] " stuff in the beginning. Consensus? --151.200.252.164 09:13, 8 October 2006 (UTC)

Are the tags still needed?

Are the tags at the start of the article ({{Cleanup-date}}, {{expert}} and {{sources}}) still needed? Seems to me that they aren't and can be removed. Any Objections? Rami R 08:07, 18 October 2006 (UTC)

Objection. I do not believe that an expert has weighed in on this subject. There are many points without reference as well, so both the cleanup and sources templates apply. Chris53516 13:10, 18 October 2006 (UTC)
I sought out an expert (Prof. Yaacov Ritov). This was his input. He has, however, suggested that a philosopher of science look at the article (mostly because of the terminological discussion in the beginning, and maybe there is a need for extended historical references beyond Hume). So the expert tag can stay. But I'm not sure what needs to be referenced. Could you show me an example? Rami R 07:32, 22 October 2006 (UTC)


Godwin's Law?

Is there any relevance to Godwin's Law in the "See also" section, Or should it be removed? Rodo2 08:04, 25 October 2006 (UTC)

Definitely no relevance. I am removing it.--Dylan Lake 23:10, 25 October 2006 (UTC)

Shoes example

The example about waking up with a headache after sleeping with shoes on is really an example of post hoc; shouldn't the first example be more specific to this article? --Tardis 06:52, 31 October 2006 (UTC)

Remove Monty Python Example?

As much as I love Monty Python, the Monty Python example in this article is confusing, and I don't think it helps to illustrate the "Correlation does not imply causation" point. I'm thinking about replacing it with a simpler example from popular culture/news/etc. Would anyone care to comment? Claus Aranha 08:55, 31 October 2006 (UTC)

I agree with Claus that the Monty Python example should be removed; it is quite long-winded and not particularly helpful. Unless I see a specific example, though, I am not in favour of putting something else there. I think the Simpsons example above is fine. Grumpyyoungman01 10:48, 31 October 2006 (UTC)
I also agree - in fact having read the article, I came to the talk page specifically to see if anyone else had objected to it yet. It seems much less clear than the Simpsons example, and is longer. I'm not sure if the "Popular Culture" section merits such a long example, lest it start to dominate the article. Bobstay 19:26, 7 November 2006 (UTC)

Remove "Popular culture" section including Simpsons example

I think the examples in popular culture are crap and are not encyclopedic. I agree with the change made by the anonymous user that removed the section. – Chris53516 (Talk) 15:06, 9 November 2006 (UTC)

That anonymous user was a vandal, have a look at contributions from that ip and you will see that they were all reverted. The reason that ip address gave for deleting the Simpsons example was that it was 'irrelevant to the article', which is not true. What reasons do you have for wanting to delete the Simpsons example? So far you have said because it is 'crap' and because it is 'unencyclopedic', that is circular reasoning. Grumpyyoungman01 22:50, 9 November 2006 (UTC)
It's not encyclopedic because it is unnecessary in conveying the meaning of the article. When was the last time you read Encyclopedia Britannica and read something about a cartoon in a logic article? – Chris53516 (Talk) 03:45, 10 November 2006 (UTC)
And if you had bothered to look further at the contributions of that contributor, you might have noticed multiple edits which decidedly aren't vandalism, since often anon IPs span multiple users. The text is irrelevant since even though it gives an example of the subject, there were already equally valid examples earlier in the article which A) aren't as long (since they don't require a contextual description), and B) aren't related to a pop culture reference, which is inherently more encyclopedic - having a "Pop culture" section is irrelevant because, as a logical argument, it invariably will appear in many popular pop culture media. Not signing since this IP changes several times a day, 18:32, 10 November 2006 (UTC)
Thanks for the support, but I think we'd all appreciate if you would sign in when your IP address changes a lot. – Chris53516 (Talk) 19:16, 10 November 2006 (UTC)
Where can I get one of those rocks that repels Tigers? How much? Just kidding. If anyone deletes this, they are morons. This is a discussion. As for the reference, it is a poignant example. You may not think it is "relevant," but it points out an example of this fallacy that has been used in front of millions of people.
Not to stray too far off topic, but I don't plan on actually setting up an account here (I remember reading an essay in the WP namespace with some reasons, but darned if I can find it right now). 22:59, 10 November 2006 (UTC)
I think that Chris is right when he/she says that a popular culture section is unencyclopedic. People coming to the article are not interested in references in popular culture because that by itself is irrelevant. However, I think that the example itself is a good one; it is easy to understand and I think that it "hits the spot". I am in favour of removing the popular culture section, but placing the Simpsons example elsewhere so that it can stand on its own merits, not on the irrelevancy that it appeared in a popular television programme. Grumpyyoungman01 22:24, 10 November 2006 (UTC)
Granted. My original edit summary referred to the section, not the example (though I obviously disagree with including the example), though perhaps my argument wasn't that apparent. I do apologize for not consulting the Talk page first (the type of article seemed like one of those where discussion on the Talk page ends up with inaction in the article, so I skipped it), and for the somewhat aggressive tone of my last post - I just hate being unilaterally labeled "vandal" after supplying (IMHO) justification for a bold edit). 22:59, 10 November 2006 (UTC)
Sorry for calling you a vandal. Unilateral action (without consulting the talk page) under the be bold principle is good. You just need to be prepared to be reverted if someone doesn't like it. No problem there, my "bold" edits are reverted all the time and consensus is usually reached fairly promptly. So far we only have three contributors to this discussion; we need some more for a consensus. Grumpyyoungman01 05:05, 11 November 2006 (UTC)
The reason I don't like the Simpsons example is more mathematical. It illustrates a very limited form of correlation where the two random variables can just take binary values: bear patrol/no bear patrol, bears/no bears or rock/no rock, tigers/no tigers. Correlation is normally used in situations where you have continuous random variables, like the pirates example. As such it is a poor, possibly misleading example, and may actually be an example of a different logical fallacy. --Salix alba (talk) 09:00, 11 November 2006 (UTC)

This is sort of related, so I'll ask here: is the Flying Spaghetti Monster example (and attendant image) really necessary? It's funny, but not a very good example (not to mention a much more limited audience has heard of FSM compared to the Simpsons - which I also think should be removed). VirogIt's notmy fault! 05:15, 12 November 2006 (UTC)

Correlation does not imply causation includes a misnomer in the word imply

The usage section is misleading. If we look at Logical implication, which gives a strict mathematical meaning to the word imply, we see that it is equivalent to If P then Q. This must hold for all possible P, that is, all cases where there is a correlation between two variables. Further, the source cited is an email discussion which does not satisfy WP:RS. --Salix alba (talk) 11:39, 11 November 2006 (UTC)

You are talking about a mathematical meaning of the word imply. The word imply does not have that same meaning in ordinary usage. Yeah, it is a dodgy source, but I included it because it argues the case nicely. I think the section is unambiguous and I would have also thought uncontroversial, thus not in need of a citation. The correlation between diminishing numbers of pirates and levels of CO2 in the atmosphere does imply a causation, not in some hokey mathematical way, but in an Oxford dictionary way. Many readers would be coming from an Oxford dictionary school of thinking and not a scientific "let's take a common word and give it a subtle but significantly different meaning for our own purposes" school. Speaking of the Oxford dictionary, here is a definition of the word "imply" from the Macquarie dictionary. There are two conflicting definitions: one follows the form from logical implication and the other follows what I was arguing for.
1. To involve as a necessary circumstance
3. To indicate or suggest, as something naturally to be inferred
Now of course usually the mathematical definition would be used in an article on an aspect of logic, but the phrase "correlation does not imply causation" is widely used by lay people, who, I think, use a lay definition of "imply", that is, definition 3. But I have no evidence for that last point. My current thoughts are that ambiguities exist and the article needs a section to clear up these ambiguities. I agree that the section as it currently stands is misleading when using the logical definition of imply, but the opposite (such as the article before my edit to "Usage") is equally misleading. Grumpyyoungman01 12:25, 11 November 2006 (UTC)
Brilliant research, Grumpyyoungman01! We'll have to mention this ambiguity of "imply" in the article --however, I don't think the correct concept is misnomer, as one of the concepts under the term imply's umbrella is actually the correct one. This leads to the question: Does a wrong term need to be used for it to be a misnomer, or is it sufficient to use a wrong concept that shares the same term? I think the first is correct. ---I added some stuff in an attempt to clarify, but I'm in serious doubt if the use of the concept prove in the paragraph is totally correct. In that case, "Correlation does not prove causation" would be true, and thus follows(?) that a totally precise and foolproof formulation of the fallacy would be "Correlation proves causation". I'm looking forward to some informed discussion here. Again, thanks to those who brought this up. Narssarssuaq 23:59, 11 November 2006 (UTC)
Also, given that "suggests causation" is true, that doesn't necessarily... uh, imply(involve as necessary circumstance) "suggests the direction of causation". Narssarssuaq 01:39, 12 November 2006 (UTC)
That was a good edit, Narssarssuaq. I have removed the dispute tag as the consensus is that imply is not a misnomer. Grumpyyoungman01 22:15, 12 November 2006 (UTC)

I think it's important to give the technical mathematical definition of the term. When someone versed in mathematics uses the word imply, they mean the logical implication form of the meaning. From that page:

The truth table associated with the material conditional if p then q (symbolized as p → q) and the logical implication p implies q (symbolized as p ⇒ q) is as follows: (snip)

Following Humpty Dumpty

"When I use a word," Humpty Dumpty said, in rather a scornful tone, "it means just what I choose it to mean -- neither more nor less."

when a statistician uses the word they will mean precisely this form. --Salix alba (talk) 23:28, 12 November 2006 (UTC)
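For readers who do not have the snipped table to hand, the standard truth table for the material conditional can be generated with a short sketch. The helper name `implies` is made up for this illustration; it encodes the usual definition that "p implies q" is false only when p is true and q is false.

```python
def implies(p, q):
    """Material conditional: false only when p is true and q is false."""
    return (not p) or q

# Enumerate all four combinations of truth values.
rows = [(p, q, implies(p, q)) for p in (True, False) for q in (True, False)]

print("p      q      p -> q")
for p, q, result in rows:
    print(f"{str(p):<6} {str(q):<6} {result}")
```

Read this way, "correlation implies causation" is refuted by a single case with correlation and no causation, which is the strict sense being argued for in the comment above.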

Disagreement about "suggests"

I still disagree that "correlation suggests causation" because of the lack of a time-order sequence of events in correlation. Often correlations are of two events that occur at the same time, and in order for one to cause the other, it must precede the other. If one event does not precede the other, there should be no implication or suggestion of causality. – Chris53516 (Talk) 14:38, 13 November 2006 (UTC)

This may exclude events that cause an instantaneous response. For example, a CO2 emission at once increases the greenhouse effect, but there is still an undisputed causal arrow. In this, and many other examples of causation, the cause doesn't really precede the effect, at least not in a strong sense of the word. I'd like to hear your viewpoints on this if you disagree. Narssarssuaq 15:34, 13 November 2006 (UTC)
Even if the interval of precedence may have to be measured in nanoseconds, the cause still precedes the effect. In such cases where the cause and effect appear to us to be co-existing, we should have doubt about causality until further evidence is available. For your example, I would say that the release of the CO2 is the cause, not the CO2 itself. Therefore, there was an event of the release of CO2 that preceded a rise in the greenhouse effect. As the event unfolded, the greenhouse effect changed in proportion. Besides, the greenhouse effect is very complicated, and the release of more pollutants would take some time to affect global warming. (I disagree that this example is undisputed. I have heard accounts that have disputed the effects of CO2 emissions. Of course, I don't agree, but the fact is that it is disputed.)
I digress. My point is that unless there is solid proof of one event preceding another, we should not assume or even think of causality. It's like what politicians do. Our economy may have causes in the far past, but politicians will claim their bill passed last month or last year improved the economy. This is an example of where the two events have coincided: The bill was passed as the economy was improving, not before.
I think that, as humans who search for patterns, we are prone to seek patterns (e.g., causal patterns) and see them where they simply do not exist. It's like seeing that face on Mars. (If you shift the viewing angle, it doesn't look like a face anymore.) Or like seeing patterns in the stars. Therefore, I think the section that states that causality is suggested by coinciding events is simply misleading and lends itself to the mistakes we make as pattern-seeking humans. People will read it and say, "So one event could cause the other; therefore, the number of pirates probably does affect global warming."
I believe that most events are caused by multiple events, since no event is in a vacuum. Therefore, saying A causes B ignores the other potential factors. In science, I would like to think we try to pinpoint which causes are stronger, not which cause is the only cause.
Chris53516 (Talk) 16:08, 13 November 2006 (UTC)
First of all, you're in part talking about the Regression fallacy. Secondly, the point is that the article rules out, in every single case, that correlation implies causation when "imply" is used in the strong, logical sense. Sometimes, however, if only very rarely, causation MAY be suggested, however weakly, by a correlation. The point in this article is just that a correlation doesn't NECESSARILY IMPLY a causal relationship or a certain causal arrow. So while "Correlation suggests causation" may usually be false, as you're saying, it's beyond the scope of this fallacy.
Thirdly, if an event is caused by multiple events, it may still just be made up of the sum of the parts. For example, the greenhouse effect is made up of CO2 emissions AND methane emissions AND NOx emissions AND CFC emissions etc. Your comments may fit into a philosophical debate of holism vs reductionism. However, encyclopedic activity and all science are based upon the analysis of a system, where everything outside the system is considered constant or irrelevant. Sometimes, this works perfectly well, at least from a pragmatic point of view; at other times, one will want to make a more complex analysis by bringing more parameters into the picture. This has the advantage that it yields a better result, but the disadvantage that it takes longer. Bringing in an infinite number of parameters would give total understanding, but would take infinitely long and thus would be the stupidest experiment ever. :)
After giving it a little thought, I agree that a CO2 molecule is emitted (if only nanoseconds) before it starts trapping infrared light or whatever they do to enhance the greenhouse effect. Thanks for pointing that out. However, if a CO2 emission lasts for, say, an hour, at a macroscopic level there will be a simultaneous increase in greenhouse effect this hour, excepting only the first nanoseconds or so, and adding the first nanoseconds after the emission stops.
In sum, I feel that you bring up a number of interesting issues, but right now I don't quite see what may be added or removed in the article from what you suggest. Narssarssuaq 17:32, 13 November 2006 (UTC)
I know. You're preaching to the choir regarding research scope. Someone once said something like "All things are probably related, but a study designed to examine the relationship among all things would fall apart." Any better ideas for a page title? – Chris53516 (Talk) 17:56, 13 November 2006 (UTC)
See also Fallacy of the single cause! Some of the fallacies listed on Wikipedia are apparently logical "shortcuts" that may actually work at some pragmatic level. Also, it's worth noting that committing fallacies is perfectly OK in humour. And it may also be in social chit-chat, where the main point often is to agree on something, and not necessarily to find truths. But that's a different story :) Narssarssuaq 18:20, 13 November 2006 (UTC)

Requested move to Correlation does not imply causation

I've started a requested move to Correlation does not imply causation at /Page title. --Salix alba (talk) 09:40, 20 November 2006 (UTC)

Removing the "Pirates vs global warming" chart

I have removed the graph which shows a correlation between the number of pirates and global warming (Image:Pchart.jpg). Even though it is trying to make a good (and fun!) point, this graph is of such low quality that we should not put it on one of Wikipedia's pages. The X-scale, in particular, is terrible:

  • Most importantly, the points are not in order (35000, then 45000, then 20000)! Any variable can be shown to be related to another variable with this kind of trick. (It may not be a deliberate choice by the person who made the graph; Excel is particularly keen on plotting graphs such as this, and in most cases will not order your data.)
  • It is not linear (the difference between 45000 and 20000 is drawn the same as the difference between 400 and 17!), thus increasing the feeling that there is an almost linear relationship between the variables.
  • It is reversed compared to common usage, thus giving the impression that there is a positive correlation.

Making a good graph would not be too hard; however, it would probably not be as impressive in terms of showing correlation. Schutz 13:48, 27 November 2006 (UTC)
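The three axis problems listed above can be checked mechanically. The following is only a minimal sketch: the function name `axis_problems` is made up for illustration, and the tick list contains just the five values quoted in the comment (the chart's other ticks are omitted, since they are not given here). It assumes the ticks are drawn at equally spaced positions, in the order listed.

```python
def axis_problems(ticks):
    """List problems with tick labels drawn left to right at equal spacing.

    Hypothetical helper for illustration only.
    """
    pairs = list(zip(ticks, ticks[1:]))
    increasing = all(a < b for a, b in pairs)
    decreasing = all(a > b for a, b in pairs)
    problems = []
    # Labels that go up and then down are not in any numeric order.
    if not (increasing or decreasing):
        problems.append("not in order")
    # Equal spacing on paper should mean equal numeric steps.
    steps = {abs(b - a) for a, b in pairs}
    if len(steps) > 1:
        problems.append("not linear")
    # A strictly decreasing axis runs opposite to common usage.
    if decreasing:
        problems.append("reversed")
    return problems


# Only the tick values quoted in the comment above, in the quoted order.
quoted_ticks = [35000, 45000, 20000, 400, 17]
print(axis_problems(quoted_ticks))
```

A well-formed axis such as `[0, 10, 20, 30]` yields an empty list under this check.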

I agree with you, but I would like to point out that no data set will ever truly be completely linear. Data will go up and down. Regressions estimate a possible explanatory line that lies between differing points. But perhaps I'm preaching to the choir. – Chris53516 (Talk) 14:29, 27 November 2006 (UTC)
You are completely right, of course — which is the reason why the same point could probably be made using this data without doctoring it (and rearranging the x-axis qualifies as such), even if the graph would not look as impressive. Schutz 14:50, 27 November 2006 (UTC)
Yes, a correlation doesn't necessarily imply a linear relationship. Only a perfect correlation demands a linear relationship, and perfect correlations are just special cases. As far as I can remember from statistics class. Narssarssuaq 16:56, 27 November 2006 (UTC)
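A small numerical sketch of the point about perfect correlation, assuming the correlation meant here is Pearson's r. The data are invented for illustration (x = 1..5, one exactly linear series and one curved series), and `pearson` is a hand-written helper, not a library call.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


xs = [1, 2, 3, 4, 5]
linear = [2 * x + 1 for x in xs]  # exactly linear in x
curved = [x ** 2 for x in xs]     # monotone but not linear

print(pearson(xs, linear))  # 1 up to floating-point rounding
print(pearson(xs, curved))  # high (about 0.98) but strictly below 1
```

The curved series still correlates strongly with x, so a high r alone does not show that the relationship is a straight line; only r of exactly 1 or -1 does.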

Redirect from "Cum hoc ergo proptor hoc"

Is there a reason this redirect doesn't exist? Mnc4t 17:10, 8 December 2006 (UTC)

You misspelled it. It's propter,[1] not proptor. However, if you think that's a common misspelling, you can make the redirect. – Chris53516 (Talk) 17:24, 8 December 2006 (UTC)
Never mind, I made it. – Chris53516 (Talk) 17:29, 8 December 2006 (UTC)

Flying spaghetti monster example

This section is not well explained and will communicate nothing unless one already knows about the flying spaghetti monster pirate/global warming joke. Needs to be rewritten. --Xyzzyplugh 11:56, 10 December 2006 (UTC)

I do like this section, as it illustrates the point that it's possible to get a correlation between two wholly unconnected sets of data. --Salix alba (talk) 22:00, 10 December 2006 (UTC)
Except that I would like to actually see the data these correlations are based on or, more to the point, the value of these correlations. There was this ugly, incorrect graph, which we removed recently, but I have not seen anything else. It is very likely that there is a correlation here (the number of pirates went down and the average temperature went up over the last few centuries), but if we are not able to present the data and/or show what this correlation actually is, then this example is not worth much and should be removed. Schutz 22:33, 10 December 2006 (UTC)
Yes, the pirates/global warming example is an amusing and fitting one, it's just not being explained properly as of now. --Xyzzyplugh 13:41, 12 December 2006 (UTC)
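To illustrate what "showing the value of the correlation" could look like, here is a hedged sketch. The numbers are entirely invented placeholders and are NOT real pirate counts or temperatures, since no actual data has been presented in this discussion. The only property they share is that one series falls and the other rises over the same six time steps.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Invented placeholder series, one value per time step (not real data).
falling_series = [90, 75, 70, 50, 45, 20]
rising_series = [14.0, 14.1, 14.3, 14.2, 14.5, 14.8]

r = pearson(falling_series, rising_series)
print(round(r, 2))  # strongly negative
```

Any two series that merely trend in opposite directions over time will give a strongly negative r like this, which is why the figure by itself says nothing about one causing the other.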

Fact templates

The numerous {{fact}} templates are probably a bit redundant with the main "citations needed" template at the beginning of that particular section. I move to take them out (to allow for easier reading) and trust that the first, enormous template will alert editors to try to find sources for the statements about Mr. Hume (or take out the section altogether, if everybody else here understands it as well as I do). 64.90.198.6 22:40, 12 December 2006 (UTC)

The {{fact}} templates are there to point out which statements need citations. That's the point of the template, and it's good to point out to the reader which statements are without source. — Chris53516 (Talk) 14:11, 13 December 2006 (UTC)