Talk:Birthday paradox/Archive 1
Old Discussion
I think the birthday paradox is one of those problems where you have to be careful not to run into limited precision floating point problems, but I'm not sure. Anyone know? Martin
The problem doesn't seem to be overly sensitive to limited precision. Here's a bc program:
 for (scale = 1; scale < 10; scale++) {
   prod = 1
   for (i = 1; prod > 0.5; i++) {
     prod = prod*((365-i)/365)
   }
   print "scale=", scale, " i=", i, " prod=", prod, "\n"
 }
(in bc, the variable scale is the number of digits used after the decimal point). The output is
 scale=1 i=6 prod=.5
 scale=2 i=20 prod=.48
 scale=3 i=23 prod=.481
 scale=4 i=23 prod=.4915
 scale=5 i=23 prod=.49260
 scale=6 i=23 prod=.492690
 scale=7 i=23 prod=.4927016
 scale=8 i=23 prod=.49270264
 scale=9 i=23 prod=.492702751
which means that the correct answer i=23 is produced already with three-decimal-digit precision. The results are even better if one uses prod*(1-i/365) instead of prod*((365-i)/365). AxelBoldt 20:55, 19 Feb 2004 (UTC)
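A rough cross-check of the experiment above, sketched in Python rather than bc. It runs the same loop once in double-precision floats and once in exact rational arithmetic. The function name `first_n_below_half` is made up for this illustration, and the sketch assumes 365 equally likely birthdays.

```python
from fractions import Fraction

def first_n_below_half(one, days=365):
    """Smallest group size whose no-match probability is at most 1/2.

    `one` selects the arithmetic: 1.0 for floats, Fraction(1) for exact rationals.
    """
    prob_no_match = one              # a single person cannot share a birthday
    n = 1
    while prob_no_match > one / 2:
        # person n+1 must avoid the n birthdays already taken
        prob_no_match *= 1 - (one * n) / days
        n += 1
    return n, prob_no_match

print(first_n_below_half(1.0))            # double-precision floats
print(first_n_below_half(Fraction(1)))    # exact rational arithmetic
```

If both calls report the same group size, limited precision is not what decides the answer at this scale.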
- The theory behind it was described in the American Mathematical Monthly in 1938 in Zoe Emily Schnabel's The estimation of the total fish population of a lake, under the name of capture-recapture statistics.
I'm not sure what "the theory behind" the birthday paradox is, but I'm sure that people like Pascal, Fermat, maybe even Cardano knew the formula or could have derived it with ease. They would probably have considered questions like "you throw a fair die 3 times; how likely is it that you get three different numbers?" AxelBoldt 20:11, 19 Feb 2004 (UTC)
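For reference, the die question mentioned above works out as follows, assuming a fair six-sided die and independent throws. The first throw may show anything, the second must avoid one face, and the third must avoid two:

```latex
P(\text{three different numbers})
  = \frac{6}{6}\cdot\frac{5}{6}\cdot\frac{4}{6}
  = \frac{120}{216}
  = \frac{5}{9} \approx 0.556
```

The birthday computation has the same shape, with 365 faces and more throws.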
What is the intuition of most people in this case? I really don't know. It doesn't contradict my intuition. Andries 20:17, 16 May 2004 (UTC)
- Most people find the probability of a match surprisingly high. That's why it was given the (not quite correct) name birthday paradox. -- Nunh-huh 20:24, 16 May 2004 (UTC)
- I believe most people have two problems with this: underestimating the number of matches that don't include themselves, and underestimating the multiplicative growth of each additional person added to the total. I imagine the thinking is similar to this: 365/2 = ~180 people will give me 50% coverage of birthdates. They disregard the fact that 180 birthdates chosen randomly are likely to have many collisions. Clearly, 180 is wildly out of line with the 23 persons actually needed for a 50% probability.
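The two effects described above can be put into numbers with a small Python sketch. It assumes 365 equally likely, independent birthdays; the function names are invented for this illustration.

```python
def expected_distinct_birthdays(people, days=365):
    """Expected number of different dates covered by `people` random birthdays."""
    # A given day is missed by everyone with probability (1 - 1/days)**people.
    return days * (1 - (1 - 1 / days) ** people)

def possible_pairs(people):
    """Number of pairs that could match, including pairs that exclude 'me'."""
    return people * (people - 1) // 2

print(expected_distinct_birthdays(180))   # well below 180: many collisions
print(possible_pairs(23))                 # 253
```

The first figure shows that 180 people do not cover 180 distinct dates on average, and the second shows how many chances for a match 23 people already have.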
As far as I can see, the series of inequalities at the bottom of the page is wrong. The wrong step is the inequality that goes from the sum to the integral. If you actually do compute the sum (which isn't that hard to do), you get 1 − (n − 1) / 730; if you evaluate the integral you get 1 − n / 730, but 1 − (n − 1) / 730 > 1 − n / 730, not the other way around!
So perhaps the editors should either remove this part from the article or replace the < by ~ (approximately equal). (Or I could be wrong; it wouldn't be the first time.)
But if I'm actually right, then Paul Halmos has no right to be so smug about "tools that all students of mathematics have access to"...
Fun,
Sten
- If one quotes from Halmos, one should quote exactly. If Halmos got the math wrong, a separate comment should say so. I think the reasoning can be fixed fairly easily, and I would not do it by saying "approximately equal"; I would use inequalities ("<" or "≤"). Unfortunately I will have to wait until tomorrow to look at Halmos' book. Michael Hardy 18:04, 7 Jul 2004 (UTC)
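The comparison Sten describes can be checked mechanically. The sketch below assumes, since the article's formula is not reproduced on this page, that the sum in question is the average of 1 − k/365 over k = 0, …, n − 1 and that the integral is the average of 1 − x/365 over [0, n]; those readings are consistent with the two values quoted above. Function names are illustrative only.

```python
from fractions import Fraction

def mean_of_terms(n, days=365):
    """(1/n) * sum_{k=0}^{n-1} (1 - k/days), computed exactly."""
    return sum(1 - Fraction(k, days) for k in range(n)) / n

def mean_of_integral(n, days=365):
    """(1/n) * integral_0^n (1 - x/days) dx = 1 - n/(2*days)."""
    return 1 - Fraction(n, 2 * days)

n = 23
print(mean_of_terms(n), mean_of_integral(n))
print(mean_of_terms(n) > mean_of_integral(n))
```

Because 1 − x/365 is decreasing, the left-endpoint sum overestimates the integral, which is the direction Sten reports.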
Notations for logarithms
I'm very surprised by the very last paragraph of the article. I have a mathematical education, so I'm not a "non-mathematician". I don't know about logarithms in the English tradition, but in Russian `ln' stands for the natural logarithm only, `lg' stands for the base 10 logarithm and `log' can be a general logarithm with arbitrary base (the base is specified as an index; if it is missing then the base typically doesn't matter, for instance in expressions like O(log n)).
I think it would be fair to remove the reference to "non-mathematicians" as offensive. I accept that `log' can mean the natural logarithm in the English tradition, but I think it's fair to take into account that for some (mathematically educated) people it is exactly `ln' that stands for the natural logarithm.
- I agree and have amended accordingly. Please feel free to comment and/or revert. -- ALoan (Talk) 09:57, 16 Jul 2004 (UTC)
- Since it's not part of the Halmos quote I won't be militant about it. If the Halmos quote had gone on a bit longer one would have seen him using "log" to mean natural log. Nowadays both "ln" and "log" are commonly understood by mathematicians to mean natural log. When Halmos wrote his autobiography, only about 20 years ago, he expressed contempt for the practice of many non-mathematicians of using "ln" rather than "log" for natural log, and said no mathematician had ever done that. By 1984, the date of publication, his claim was exaggerated. Nonetheless, it is still not unusual today to find mathematicians using "log" for natural log. Michael Hardy 19:59, 16 Jul 2004 (UTC)
Direct Solution
Does anyone know how to derive a direct solution to the birthday paradox? (That is: in the article, the probability that there is no match is derived and then subtracted from one; I want to see a method that gets the probability of a match without solving for the other one first.) I know that the article's method is correct, but I would be intrigued by the alternative solution because I attempted (and failed) to solve it that way. If someone knows it, I'd appreciate it if they posted it in the article. Superm401 04:58, 15 Jan 2005 (UTC)
- This method of switching to the complement is really a pretty and quite powerful trick; it's next to impossible to solve the problem without it. You'd have to distinguish between and deal separately with numerous cases: exactly one matching pair, exactly two matching pairs, exactly three matching pairs, ..., exactly one matching triple, exactly two matching triples,..., one matching quadruple,..., exactly one matching pair and one matching triple, ... etc. (and each of those cases is harder than the single case you have to solve when you switch to the complement). In the end, you'd have to do superhuman algebra to simplify your huge sum down to our tiny little formula. AxelBoldt 03:50, 26 Apr 2005 (UTC)
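For small cases the "direct" count can at least be done by brute force, which gives a sanity check on the complement trick without any case analysis. This Python sketch enumerates every assignment of birthdays for a toy calendar; the function names and the 7-day calendar are illustrative assumptions, and the enumeration is only feasible for tiny inputs.

```python
from itertools import product
from math import perm

def match_probability_direct(people, days):
    """Count, by brute force, the assignments in which some birthday repeats."""
    with_match = sum(
        1
        for birthdays in product(range(days), repeat=people)
        if len(set(birthdays)) < people
    )
    return with_match / days ** people

def match_probability_complement(people, days):
    """One minus the probability that all birthdays are distinct."""
    return 1 - perm(days, people) / days ** people

print(match_probability_direct(4, 7))
print(match_probability_complement(4, 7))
```

The enumeration lumps together all the cases listed above (pairs, triples, two pairs, and so on), which is exactly what the complement does in one step.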
Leap Years
The article states that for 366 or more people, the probability is 100%. But what if someone was born on Feb 29th in a leap year? Does this need a correction? Plus, how is the probability for n=366 affected by the chance that one or more people were born in a leap year?
- The article makes it clear that a year is assumed to have 365 days, plus other variables (distribution of birth dates through the year, incidence of twins and other multiple births, etc) are ignored. -- ALoan (Talk) 10:47, 17 Feb 2005 (UTC)
- Perhaps a section on the complexities introduced by considering leap years would be informative. Actually, in a school setting (where students tend to be born in a narrow range relative to the 4-year leap year cycle) you would need to take into account the demographics of the population. While this would make the problem too messy to solve in any clean way, it would illustrate the fragile nature of a closed-form solution. -- Jake 01:22, 18 October 2005 (UTC)
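A numerical treatment of the leap-day question is still possible even without a clean closed form. The Python sketch below handles unequal day probabilities by accumulating elementary symmetric polynomials; it assumes independent birthdays and, as a deliberately crude model, that Feb 29 is one quarter as likely as any other day (century rules and the demographic effects Jake mentions are ignored). All names are invented for this illustration.

```python
from math import factorial

def match_probability(people, day_weights):
    """Probability of a shared birthday when days have unequal weights."""
    total = sum(day_weights)
    # elem[k] accumulates the k-th elementary symmetric polynomial
    # of the individual day probabilities.
    elem = [1.0] + [0.0] * people
    for weight in day_weights:
        p = weight / total
        for k in range(people, 0, -1):
            elem[k] += p * elem[k - 1]
    # P(all distinct) = people! * e_people(p_1, ..., p_D)
    return 1 - factorial(people) * elem[people]

uniform_year = [1] * 365
with_leap_day = [4] * 365 + [1]   # assumed model: Feb 29 at quarter weight

print(match_probability(23, uniform_year))
print(match_probability(23, with_leap_day))
```

With equal weights the function reduces to the usual formula, so the two printed values show how much the extra day shifts the result under this model.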
Proposal to exclude Paul Halmos from this article
The entire "An (non-fatal) error in Halmos' argument (whose general idea is right)" section (and, if that explanation is needed, the entire section preceding it with the argument) is really long-winded, and almost seems like original research rather than something that really belongs in an article like this one. I'm very tempted to remove both sections, or at least the latter...
I did remove the C program example, though. Completely superfluous. Nice programming, perhaps, but just not needed. Daniel Quinlan 08:54, Feb 23, 2005 (UTC)
- It would be unfortunate if the section on Halmos' argument were removed. The reason why that section is there is explained in that section; why don't you attempt to say what you disagree with in that? "Original research" would be things appearing for the first time here in this article, whereas Halmos' comments were published a couple of decades ago, and the article says so, so I find this "original research" comment absurd. Michael Hardy 00:10, 24 Feb 2005 (UTC)
- Well, at present, that section says "one guy said this; he was slightly wrong; here is what he should have said" whereas it would be better, I think, to jump to the conclusion and present the mathematical (as opposed to numerical) view, and then explain that the analysis follows that published by Halmos, but that he was slightly wrong. It seems rather unbalancing to have such a large section of the article dedicated to such a seemingly unimportant part of the topic, but perhaps I underestimate its importance? Could you explain why it is important? -- ALoan (Talk) 11:42, 24 Feb 2005 (UTC)
- "A seemingly unimportant part of the topic"?? My initial reaction is that the whole topic is unimportant except because of that section.
- But OK, let's look at it from the point of view of those who've never seen it before; such people do exist. The topic does have pedagogical value apart from the section that mentions Halmos. But one does not remain in that early stage of learning forever; the point of Halmos' view is that there's more to this problem than what you learned in childhood when you first saw it; there is such a thing as thinking mathematically, and as Halmos says, the tools involved are things that every student of mathematics should know. Michael Hardy 18:51, 24 Feb 2005 (UTC)
After writing the above I looked at the article with a view toward making that section shorter, along the following lines: I would give the fully correct argument, stating that it was an adaptation of the one from Halmos' autobiography, and quote only those parts of Halmos that say why he considers that point of view important. But the quoted paragraph of Halmos is such a beautiful example of how to write that that would not be easy. Please read Halmos' words CAREFULLY. If after that you don't appreciate it, I'll just diagnose you as a vulgar caveman. Michael Hardy 19:09, 24 Feb 2005 (UTC)
- It is a nice quote, and it says some important things in a clear way, but it is less about the topic of this article (the birthday paradox) and more about mathematics in general. Or perhaps I am just a vulgar caveman... -- ALoan (Talk) 20:25, 24 Feb 2005 (UTC)
It is very directly on the topics of this article; it differs from the merely numerical computation of the probabilities by looking at the topic of this article from the point of view of "mathematics in general". Michael Hardy 20:57, 24 Feb 2005 (UTC)
Well, it's 2 against 1, so in the spirit of "be bold", I nuked the sections. I encourage someone to replace it with a more direct proof. Daniel Quinlan 07:40, Mar 2, 2005 (UTC)
- OK, I hope you won't take it personally if I say this proves illiterates outnumber the rest of us — I'll fix the vandalism when I have time. Maybe I'll make a separate article titled an example of Paul Halmos's beautiful writing and link to it. Michael Hardy 23:01, 2 Mar 2005 (UTC)
Now I've restored that section. I hope I've made it dull enough so that the persons who wanted to delete it will find it at least tolerable. Michael Hardy 02:58, 3 Mar 2005 (UTC)
I also find the Halmos quote, even in this "dull" version, vastly redundant, fairly obfuscated and downright showy. (I didn't even bother reading the original version.) What I don't like about it is:
1. It's too high in the article. If we really have to live with it, it should be made an internal or external link, or at the very least pushed at the bottom, possibly in microscopic size, as a historical/anecdotal note.
2. The title, and the whole section to be honest, is misleading at the very least. "Numerical" can never be opposed to "mathematical". I could accept opposing "numerical" to "analytical", but even so the section does present numerical calculations.
3. The section does not make it clear whether the error in the original argument is the one explained at the bottom of the section, or it has been silently corrected in the derivation.
4. As the last sentence points out, the section is not an alternative derivation of results previously obtained, which by the way are not "numerical", but perfectly analytical. It's not about having a different view on this problem, and even less a different approach to its solution; it's about picking this problem to illustrate a perhaps interesting, but unrelated concept.
Wouldn't this section be more relevant in Halmos' page?
As an aside, people are entitled to disagree, and calling them illiterates, cavemen, and vandals because of that is not what I expect from an admin. And by the way, it's now three against one. --PizzaMargherita 20:31, 2 October 2005 (UTC)
- Note, the above was posted by User:195.137.39.109 and then edited by User:PizzaMargherita. I'm a bit confused too. — Ambush Commander(Talk) 21:49, 2 October 2005 (UTC)
- Yeah, that's still me. I got myself an account in the meantime. By the way, as a result of my recent streamlining of the article, this "Halmos" section has been pushed down because it was in the way. It was not my intention to act on this issue without civilly waiting for feedback on my comment above.
In response to PizzaMargherita's criticisms: I've changed the title of the section. However:
- I think it's too low in the article, and if "it's three against one" I think that's only because the page as a whole has been neglected by most Wikipedians concerned with mathematics.
- I think this attitude is so lame (especially of an administrator aspiring bureaucrat) that I'm considering reporting it. Nobody agrees with me, therefore they are not representative of the entire population (or, in your even less noble words, illiterate cavemen). What next, hiring sock puppets to support your argument? Get over it.--PizzaMargherita 23:49, 2 October 2005 (UTC)
- There is no error explained at the bottom of the section. The article says nothing specific about what the error was. (As you can see for yourself.) In trying to guess what you think is an error explained at the bottom of the section, I can only speculate that you read the last paragraph: Halmos' derivation shows only that at most 23 people are needed to ensure a birthday match with even chance; since we don't know how sharp the given inequalities are, the argument leaves open the possibility that n = 22 could also work. It would not have occurred to me that anyone might think that this explains any error, unless it was because of the word "However". I've deleted that word.
- No, it would not make more sense in the page on Paul Halmos. It's obviously not among his most important writings, and the article about him is short as it stands. When more is added, this little thing should be a long way down the queue. However, don't you think this article is better if the reader finds out something beyond what he or she learned in high school -- that this isn't necessarily only a secondary-school-level topic? Why limit the whole article to things that any freshman who thinks the problem through would figure out?
Michael Hardy 22:33, 2 October 2005 (UTC)
- Because Wikipedia should be complete, but also minimal. I'm sure Paul Halmos has expressed his view on many other things, not only on how the solution of the birthday problem should be taught to students (dear me, I can't believe I just wrote that). He may also have voiced concerns about Disk algebra and even the Archbishopric of Salzburg, but you don't go around the Wikipedia littering those articles with his view, do you?
- Conversely, I'm pretty sure that he was not the only eminent person to have said something vaguely related to this problem. But I can't see anybody else adding those people's 2 cents to this page, can you?
- As you point out, not even Paul Halmos' page would welcome this section, so I stand corrected. I'm afraid its home, if anywhere on the web, is on your private website, where you are more than welcome to have an entire page bragging about how you (think you have) discovered a mistake in Paul Halmos' reasoning.--PizzaMargherita 23:49, 2 October 2005 (UTC)
- I agree that the quotation from Halmos is a nice piece of writing, and I strongly sympathise with his point of view, but I don't think it belongs in this article in its entirety. Definitely the mathematical argument should be retained, with appropriate attribution, and in fact I think it should be moved closer to the top of the article. The only part of the quotation which is really relevant is "the inequalities can be obtained in a minute or two, whereas the multiplications would take much longer, and be much more subject to error, whether the instrument is a pencil or an old-fashioned desk computer", and I think something along these lines is definitely worth mentioning. Perhaps "The significance of the following argument is that it enables one to estimate the cut-off point without needing to perform a whole series of multiplications" or something similar instead. Dmharvey Talk 00:25, 3 October 2005 (UTC)
- I am strongly against moving this section anywhere other than further down or in the bin, as it would break (and confuse) the logical and linear (at least to me) exposition as it stands up to that point.--PizzaMargherita 01:49, 3 October 2005 (UTC)
I am not the one who discovered the error. You can see who it was if you read this discussion page.
Your reasoning makes no sense:
- He may also have voiced concerns about Disk algebra and even the Archbishopric of Salzburg, but you don't go around the Wikipedia littering those articles with his view, do you?
That is ridiculous!! No, one should not put every eminent person's opinion on every subject into every article. But that is irrelevant here. Halmos' idea was (obviously!) not included simply because he's an eminent person who said something about this. It's included because it's relevant; it's something the reader could be expected to be glad to know, NOT about Halmos, but about the birthday paradox. As I said, it's the only part of the article that takes the math beyond the point where any secondary-school student who thinks it through would take it. One reads an article like this in order to learn more than one would figure out for oneself. Michael Hardy 00:54, 3 October 2005 (UTC)
- Sorry for missing who spotted the error (it was in another section of this page). However, all my other arguments still stand, and I still respectfully disagree.--PizzaMargherita 01:49, 3 October 2005 (UTC)
... oh, and how is it that you find this part "obfuscated"?? Halmos always writes with beautiful clarity. And this is a good example of that. Michael Hardy 01:21, 3 October 2005 (UTC)
- Unfortunately you don't seem to follow the example.--PizzaMargherita 01:49, 3 October 2005 (UTC)
Here's something I didn't even notice the first time through your writing: "it's about picking this problem to illustrate a perhaps interesting, but unrelated concept." "Unrelated"?? That's absurd!!! This is obviously not about picking the birthday paradox to illustrate something unrelated; this is obviously about how some standard topics from the first couple of years of undergraduate mathematics can illuminate a topic that might otherwise appear to be only about concrete numbers. Michael Hardy 01:25, 3 October 2005 (UTC)
- No matter how many exclamation marks you use, how many times you say "obviously", and how many times you say that comments that disagree with you are "absurd", this section remains in my (and others') opinion largely unrelated. Certainly much less related than gazillions of things that could be said about this problem/paradox.--PizzaMargherita 01:49, 3 October 2005 (UTC)
So are you claiming that anything that treats the problem by any means other than secondary-school-level mathematics is "unrelated"? Michael Hardy 02:03, 3 October 2005 (UTC)
- Actually, not necessarily "unrelated", but more like "inappropriate". The mathematics sections of Wikipedia have this problem: how formal do you get before you've gone too far and made the article nonencyclopedic? For instance, the first incarnation of Database normalization was a real pickle to read: all of it was formal definitions of each normal form. As the article progressed, these formal definitions were removed and replaced with explanations in more "layman" terms. This article doesn't seem to have this problem: it appears to be clear enough to the ordinary person, but it comes down to the question: is the formal proof too burdensome for the purposes of an encyclopedia? — Ambush Commander(Talk) 02:17, 3 October 2005 (UTC)
- "So are you claiming that anything that treats the problem by any means other than secondary-school-level mathematics is " unrelated"? " No. I have absolutely no problems with the level of mathematics in the section, which frankly IMO is not much different from the level of the rest of the article. The problem is that this section is about pedagogy of mathematics. It does pick this problem (like it could have picked another one) to illustrate the point, but it adds virtually no value to this article, and is therefore... how shall I put it... inappropriate (thanks Ambush Commander).--PizzaMargherita 06:50, 3 October 2005 (UTC)
Most Wikipedia articles on mathematics, on which hundreds of mathematicians have worked, are far less "layman"-oriented than any part of this. Would you delete all of those? This article, including the section on Halmos' observations, is limited to lower-division undergraduate material. So if it goes at all beyond secondary-school level, it should not be here -- it should just get deleted? This should be an encyclopedia of high-school math? Lower-division undergraduate math is insufficiently "layman"-oriented for inclusion in an encyclopedia? Michael Hardy 03:03, 3 October 2005 (UTC)
- I recommend removing the Halmos quote from that section. It appears to me to be simply espousing his POV beliefs about how mathematics should be done (even though I tend to agree with him), rather than adding anything immediately relevant to this article. In this context it does seem "showy", and would better be placed in an article about Halmos himself rather than here. - Gauge 03:17, 3 October 2005 (UTC)
I'm glad I write about topics that most people don't understand. Less controversy that way. ;-) FWIW, Michael Hardy, Dm Harvey and Gauge are some of the senior mathematicians on WP, so I take their opinions seriously. And Paul Halmos was famous, and so surely his ideas count for something too. linas 04:20, 3 October 2005 (UTC)
- You are asked for your opinion here. Do you have one?--PizzaMargherita 06:50, 3 October 2005 (UTC)
Here are some of my thoughts:
- I like the section, and I think it most definitely should stay in the article. It seems important, relevant and appropriate.
- I like the new title "A conceptual rather than computational view" much better than the old one. So thanks to Pizza and Michael for working together successfully on that.
- I have no problem with the Halmos quote. I find it to be a relevant quote explaining the importance of this argument in helping to provide a fuller understanding of the birthday problem.
Paul August ☎ 04:23, 3 October 2005 (UTC)
- So you find it relevant/appropriate/related/important. Could you please explain why it is more so than the "Applications" section (for instance), thus deserving a higher position in the article?--PizzaMargherita 06:50, 3 October 2005 (UTC)
- Well I can't explain why this section is more important than the applications section, since I haven't formed an opinion on that. However I don't think there is necessarily a 1-1 correspondence between position and importance. Paul August ☎ 16:08, 3 October 2005 (UTC)
May I suggest that we are actually debating two separate issues here. We might all be better served by splitting the conversation into two pieces as follows.
On the merits or otherwise of including Halmos's mathematical argument
I've already stated that I think Halmos's argument should stay. Wikipedia can be both a general-audience encyclopaedia, and an encyclopaedia aimed at people with more background (in this case mathematical), as long as we are careful how we organise the material. The relationship between Derivative and Derivative (generalizations) is an excellent example. We would be doing many of our readers a grave disservice by leaving out Halmos's discussion. Dmharvey Talk 13:21, 3 October 2005 (UTC)
- I strongly agree with what Dmharvey has said here. Paul August ☎ 16:08, 3 October 2005 (UTC)
- Agree that the mathematical argument should stay, in one form or another. - Gauge 03:44, 6 October 2005 (UTC)
- (Well done for identifying the two separate issues.) I think that "We would be doing many of our readers a grave disservice by leaving out Halmos's discussion" is an overstatement. Ok, I don't rule out completely that there may be a little value in the mathematical argument, but it needs a lot of work for this value to be brought out. In that sense, the title and the opening sentence don't make it clear at all what the section is all about, and (sorry to repeat myself) why it's relevant there. Why would I read it? Why would I skip it? It all sounds insipid and bombastic at the same time. I will try to come up with a more "to-the-point" version that preserves the math.--PizzaMargherita 17:41, 3 October 2005 (UTC)
- I have mixed feelings about the value of including the argument in this article. I agree with Halmos that it is an opportunity to see some interesting and valuable mathematical techniques. For example, in computational complexity theory such methods are vital; likewise in analysis. My reservations concern whether this is an appropriate place to be doing that teaching. I lean in favor of retaining the material, because I don't think it will hurt the article for a general audience, and it may help some.
- That said, the exposition was a mess. Especially troubling was the mix of equalities and inequalities, and the lack of almost any explanation of the substitutions. I have rewritten the section — at the cost of more length — to try to explain what we're doing, why it's worthwhile, and to do it more cleanly. I hope the mathematicians like it, and am especially interested in PizzaMargherita's response to the rewrite. --KSmrqT 21:09, 3 October 2005 (UTC)
- "The exposition was a mess"? Your re-written version is written as if you're trying to explain some really basic parts of advanced secondary-school level math to students at that level, and even if that's appropriate, it hardly means that it's a "mess" if it's presented in a manner intended to be convenient for people who know math well. As it was written, it wasn't so far from the way Halmos wrote it, and Halmos has a well-deserved reputation as a clear expositor. A "mix of equalities and inequalities", such as
- etc., is a pretty standard way of writing things, because it's efficient and avoids undue wordiness. (The "substitutions" were explained, albeit only after the sequence of equalities and inequalities.) Michael Hardy 23:27, 3 October 2005 (UTC)
- (Indented Hardy response.) Perhaps you'd rather wait and let PizzaMargherita rewrite the "insipid and bombastic" version, or delete it? ;-)
- But let me address one subtlety, because I think it's important. If we mix equalities and inequalities in multiline form,
    A = B
      < C
      = D,
- a quick skim suggests that A = D, especially with many lines and lengthy expressions, so that a reader may look at the first and last lines for a summary. Such a misunderstanding is less likely in the case of brief one-line form,
- A = B < C = D.
- What we saw was a mix of equalities and inequalities, and also a mix of one-line and multiline forms. It was not a considerate exposition even for readers who know mathematics well. Could we slog our way through it? Yes, but that doesn't make it good writing. IMHO. --KSmrqT 01:07, 4 October 2005 (UTC)
- Ah, the joys of original sources, or nearly so. Digging back through the history, I find that Michael Hardy introduced the section purely as a longer quotation from Halmos, but gave it a heading more inflammatory than the text. Later, Halmos' error was pointed out, and various attempts were made to correct it. Except for the error I like Halmos' version even more than mine, and prefer either one over most of the intervening "fixes", which are not so appealing. --KSmrqT 06:31, 4 October 2005 (UTC)
- I don't have time to read the review (or should I call it "preemptive strike"? ;-)) carefully at the moment. Just a few comments.
- 1. The explanation is certainly clearer now, though at times I think it's excessive (e.g. (e^a)^b = e^(ab)). Some of you got the idea that I find the section too advanced. It's not that; the maths is not difficult. It was just badly presented.
- 2. There is considerable overlap with the previous sections. In particular:
- 2a. "Calculating the probability" would do with a more explicit mention of the fact that p and p̄ are probabilities of two complementary events, and this should not be repeated in Halmos' section. The same symbols are used, therefore what they refer to is understood.
- 2b. "Calculating the probability" would also do with the "product" formula of the probability p̄ (i.e. a non-expanded version), which should be referred to in Halmos' section.
- 2c. The sections above are generic in d; why should this section not be? "Because Halmos..." I don't care what Halmos originally wrote. This is the WP, not his book, and we are writing it. Once we say that this section is adapted from such and such source (and this is all we should write about it IMO), we can write all we want. And by the way, for the same reason, we should use only one form of "log/ln" across the article, no, across the WP. By the way, which convention does WP adopt, if any?
- 3. I'm starting to think that the same result can be readily derived from the Taylor expansion approximation of p̄ in the "Calculating the probability" section above. In other words, the fact that e^(−(n^2 − n)/(2d)) is an upper bound of p̄, and not only an approximation, could possibly be derived directly by looking at the Taylor series.
- --PizzaMargherita 07:04, 4 October 2005 (UTC)
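The claim that the exponential expression is a true upper bound, and not only an approximation, can be checked numerically. The Python sketch below compares the exact no-match probability with e^(−(n^2 − n)/(2d)) for every group size; it assumes d equally likely days, and the function names are made up for this illustration.

```python
from math import exp

def no_match_probability(n, d=365):
    """Exact probability that n people all have different birthdays."""
    prob = 1.0
    for k in range(1, n):
        prob *= 1 - k / d
    return prob

def exponential_bound(n, d=365):
    """The candidate upper bound exp(-(n^2 - n) / (2d))."""
    return exp(-(n * n - n) / (2 * d))

for n in (10, 22, 23, 50):
    print(n, no_match_probability(n), exponential_bound(n))
```

A check like this does not replace a proof, but it shows whether the bound ever fails and how tight it is near n = 23.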
- Ok, I modified the title and intro to something more to-the-point. Given the feedback on Halmos and the pedagogic aspect, I confined it to the footnote. Comments welcome.--PizzaMargherita 07:04, 4 October 2005 (UTC)
- Points 2a, 2b and 2c are done (since I didn't hear any objections). As for point 3, it actually applies. I don't know how it escaped me for so long, but the whole argument can be explained by the 1 − x < e^{−x} inequality, given the explanations of the approximation that we have already given in the first section. I find this approach to the same result (the upper bound) more direct and linear than what we have at the moment. I appreciate that switching to this simpler argument would get rid of most of Halmos' stuff, so please voice your opinion.
- To be clearer, my proposal is:
- That's all that's needed.--PizzaMargherita 06:59, 6 October 2005 (UTC)
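The displayed formula of the proposal is not preserved on this page. As a hedged reconstruction only, here is how an argument based on the inequality 1 − x < e^{−x} (valid for x > 0) would run, using the article's symbols d for the number of days and p̄(n) for the no-match probability, with n ≥ 2:

```latex
\bar p(n) \;=\; \prod_{k=1}^{n-1}\left(1-\frac{k}{d}\right)
  \;<\; \prod_{k=1}^{n-1} e^{-k/d}
  \;=\; e^{-\frac{1}{d}\sum_{k=1}^{n-1} k}
  \;=\; e^{-\frac{n(n-1)}{2d}}
```

For d = 365 the right-hand side drops below 1/2 as soon as n(n − 1) > 730 ln 2 ≈ 505.997, and 23 · 22 = 506, so the bound guarantees that 23 people suffice. It does not by itself rule out n = 22.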
- Using generic d seems unnecessary clutter here, so I do object to that. I strongly object to the latest proposal, and also to a prior edit removing introductory sentences, both for the same reason. Namely, the point of this section is not merely to derive a bound (we've already done that).
- No, we haven't.--PizzaMargherita 09:55, 6 October 2005 (UTC)
- If that were the game, we could kill the entire article except for a sentence that says "and the answer is: 23". What is the point being made by Paul Halmos, Michael Hardy, Dmharvey, Paul August, Gauge, and me (and others, I think)? Let's try an experiment: Please state your understanding of why we want this section. (Hint: It was explicitly stated in those sentences you removed.) --KSmrqT 08:24, 6 October 2005 (UTC)
- Please do me the courtesy of leaving my talk page words intact, as they were written, with signature. Thanks.
- I just moved the sentence in a new section.--PizzaMargherita 08:53, 8 October 2005 (UTC)
- Now, what part of "I strongly object" was not clear?! Why did you ask for opinions on your proposal if you were going to ignore them? Your edit flies directly in the face of my insistence that the point is not merely to derive a bound. (Whether we've done it or not is irrelevant.) I'm quite serious when I ask you to state your understanding, because your behavior indicates that we are not on common ground. --KSmrqT 02:11, 8 October 2005 (UTC)
- I have stated my understanding in a concise way in the version that you have removed, as well as in this talk page. I ignored your objection (the only one) because it made it clear that you had completely missed the central mathematical argument - which you may find irrelevant (your missing the argument, not the argument itself), but I do not. So you're right, we were not on common ground, but you seem to have caught up. (Out of interest, was it thanks to my explanations?) I believe that the consensus is to drop the pedagogic part and concentrate on the mathematical contents. So why use an "important inequality" like the one about the arithmetic and geometric means, when it's clearly unnecessary?
- A list of things that I don't like about the section as it stands:
- The title "Implications of inequalities" is too vague. The expression "upper bound" should appear in it.
- The opening "For variations of the birthday scenario in broader contexts, a different flavor of argument is essential." is even more vague, and IMHO redundant, as it doesn't bring in any value.
- "The general idea is..." to find an upper bound. "If our final expression..." Our writing should be concise.
- As I said, the arithmetic/geometric inequality can go without affecting the mathematical argument
- If we have to have 365 instead of d, then we should keep "2x365", as opposed to "730". We don't like numerical stuff, remember?
- Several explanations are so verbose they are distracting, and so detailed they are patronising. Why don't we explain the meaning of "+" in every article?
- Use of "log" vs "ln" is inconsistent with the rest of the article. Or vice versa.
- --PizzaMargherita 08:53, 8 October 2005 (UTC)
- Concensus is not voting, but coming to a common understanding. Ignoring my objection is not concensus. And if we were voting, we would not count your vote as many times as you have stated your opinion; I count about half-a-dozen voices disagreeing with you throughout this discussion, often vehemently. Most voices support pedagogy; it's yours that persistently opposes it. We love mathematics, we teach it, we understand the importance of the inequalities and the flavor of argument using them. Yet for that view you show only contempt. You did not respect Michael Hardy when he said it, you did not respect several other mathematicians when they said it, and you ignore me when I say "I strongly object". And by the way, I suggest you read what's been said more carefully, because I was not asking about your understanding of the derivation of the bound; I asked "Please state your understanding of why we want this section", which is another matter altogether. The correct answer is, we want it for the pedagogy, which is exactly the opposite of your view of the concensus. Apparently you do not respect our view, so you ignore it and substitute your own. That's not how we like to do things at Wikipedia, nor for that matter in most of the real world.
- If you are willing to accept pedagogy, rather than merely deriving a bound, as a goal of the section, then we can discuss refinements; otherwise, we are wasting our time talking at cross purposes. I suggest you accept, because, frankly, the only real justification for the entire article is pedagogy. --KSmrqT 05:27, 9 October 2005 (UTC)
- "Ignoring my objection is not concensus." You are right, it's not. And it's not even "consensus", for that matter. It's assuming that people who have actually understood what we have decided to write about (the mathematical argument, see the title of this subsection and the quotations below), and are therefore entitled to write about it, did not have any problems with my proposal.
- "I count about half-a-dozen voices disagreeing with you throughout this discussion, often vehemently." See, if they don't type I have trouble seeing and hearing them from my workstation.
- "Most voices support pedagogy; it's yours that persistently opposes it." Interesting point of view. Who said the following? (Hint: not me.)
- "I don't see any reason that this particular article should discuss pedagogy in this level of generality where most of our mathematics articles do not."
- "I agree that Halmos is talking about pedagogy, and to that extent it is out of place, here."
- "[the whole Halmos section] seems like original research rather than something that really belongs in an article like this one."
- "it would be better, I think, to jump to the conclusion and present the mathematical (as opposed to numerical) view"
- "I encourage someone to replace [the section] with a more direct proof"
- (my favourite) "My reservations concern whether this is an appropriate place to be doing that teaching."
- Rest assured that when most people disagree with me, I am (unlike other people) capable of recognising it and accepting it (see generalisation section below).
- "You did not respect Michael Hardy when he said it". Uhm, perhaps because he showed no respect to who was opposing his view? I refer you to the beginning of this talk section.
- "If you are willing to accept pedagogy, rather than merely deriving a bound, as a goal of the section, then we can discuss refinements". Most refinements (my 7 points above) can be discussed even without an agreement in this sense.
- --PizzaMargherita 09:47, 9 October 2005 (UTC)
- 8. Another thing that I think should be mentioned ("essential" for a broader-flavoured understanding with different blah-blah-contexts in variations of arguments, or something like that) is that the formula we end up with is the same as the approximation in the previous section, which therefore is not only an approximation, but also (guess what?) an upper bound.--PizzaMargherita 06:17, 10 October 2005 (UTC)
- Would it be fair to say that nobody objected to the fact the arithmetic/geometric inequality is not needed, and the same result can be achieved without it? If so, I shall proceed and take it out.--PizzaMargherita 11:41, 24 October 2005 (UTC)
- Hoping to meet everyone's tastes, I'll change the title of the section to "An upper bound and a different perspective" and will remove the introductory sentence. PizzaMargherita 23:50, 1 November 2005 (UTC)
- If nobody objects, I'll effect the changes proposed in the points 3, 5, 6, 7, 8 above. The text is now out of sync with the derivation. PizzaMargherita 12:32, 10 November 2005 (UTC)
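The upper-bound claim in points 3 and 8 can be checked numerically. The following is a minimal sketch, assuming d equally likely days and taking the bounded quantity to be the probability that all n birthdays differ. The function names are illustrative and do not come from the article.

```python
import math


def prob_all_different(n, d=365):
    """Exact probability that n people all have different birthdays."""
    p = 1.0
    for k in range(1, n):
        p *= 1 - k / d
    return p


def exponential_bound(n, d=365):
    """The expression e^{-n(n-1)/(2d)} discussed in the thread."""
    return math.exp(-n * (n - 1) / (2 * d))


if __name__ == "__main__":
    # Print both values side by side for a few group sizes.
    for n in (10, 23, 50):
        print(n, prob_all_different(n), exponential_bound(n))
```

For n = 23 the bound is already just below 1/2, because 253/365 is slightly larger than ln 2, so the bound alone shows that 23 people suffice.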
On the merits or otherwise of including Halmos's quotation
I've already stated that I don't think Halmos's quotation should remain in full. I don't see any reason that this particular article should discuss pedagogy in this level of generality where most of our mathematics articles do not. Dmharvey Talk 13:21, 3 October 2005 (UTC)
- I think that Halmos' quote is doing two things at once. I agree that Halmos is talking about pedagogy, and to that extent it is out of place, here. But at the same time he is also explaining how his argument helps to provide a fuller understanding of the birthday problem, and I think this latter point is wholly relevant here. I don't really see how we can get rid of the former and keep the latter. Perhaps it might be better to remove this quote to a note at the bottom. I will be bold and do this, and see what folks think. Feel free to revert if anybody doesn't like it. Paul August ☎ 16:08, 3 October 2005 (UTC)
- As far as I'm concerned, the pedagogic part should go. The aspects that are not irrelevant are inappropriate. If we want to attribute the mathematical argument above to Halmos - fine, but I wouldn't spend more than one sentence (or perhaps just a mention of the name) on that. As I said, I'll give it a go. Putting a footnote may be the right direction, thanks Paul.--PizzaMargherita 17:52, 3 October 2005 (UTC)
- The footnote seems to be a good compromise if some people insist on keeping the quotation. Personally I am not bothered by how the article reads now. - Gauge 03:48, 6 October 2005 (UTC)
Generic d-formulae vs. specific 365-formulae
Using generic d seems unnecessary clutter here, so I do object to that.
- Let's see: "d" = 1 character, "365" = 3 characters. Where's the clutter again?--PizzaMargherita 09:55, 6 October 2005 (UTC)
- Wait, I may understand what you mean: you don't like p(n;d), you'd rather see p(n). I agree on that part, it's awkward to carry it around, and it's evident from the formula that d is a parameter. I propose we define it as p(n;d) the first time around and we explicitly say that in what follows it's understood that by p(n) we mean p(n;d). This is notwithstanding the proposal below, to give the 365-specific solution at the very top of the article.--PizzaMargherita 09:55, 6 October 2005 (UTC)
I strongly object to having d instead of 365 in the introduction. The introduction should be completely accessible to anyone who can multiply fractions. The best thing about this problem is its accessibility; changing it to d loses some of that. (It also should not use product notation.) Dmharvey Talk 12:02, 6 October 2005 (UTC)
- The introduction has been that way for some time now... you mean the introduction of the article, right? In that case, I can see where you are coming from, as I agree that we shouldn't scare people away. We could give the first formula for 365 and right after state the more general result and carry on with d. What do you think? I'm sure those interested won't have a lot of trouble substituting d with 365... not any more than understanding Taylor series truncations or the probability of complementary events. And the importance and applications of this problem have nothing to do with birthdays. That's why I think that general d-results will be more useful to people who land here.--PizzaMargherita 12:24, 6 October 2005 (UTC)
Any competent mathematician can generalize the argument from 365 to "d" easily enough themselves. The point is that this article is about the birthday paradox, meaning that the focus should be on 365. If PizzaMargherita wants to write another article on generalizations of this approach and their applications, that is fine with me, but changing 365 to "d" here just adds another hurdle for the interested reader. - Gauge 03:21, 8 October 2005 (UTC)
- Indeed! Paul August ☎ 03:54, 8 October 2005 (UTC)
Fine. So if you like the proposal above I'll do that when I have some time.--PizzaMargherita 09:03, 8 October 2005 (UTC)
Though I disagree with Gauge when he says that "changing 365 to d here just adds another hurdle for the interested reader". The reader may well come from Birthday attack or Hash tables, and she would want to find the generic formula right there. I don't find the "hurdle" of substituting incredibly difficult. I'll do (or rather, restore) a "generalisation" section.--PizzaMargherita 09:03, 8 October 2005 (UTC)
Near matches
Is the table in Near matches really correct? It doesn't match with my calculated probabilities for near matches, nor my empirical tests. Also, unless I'm doing something seriously wrong, one is not required to use the inclusion/exclusion principle to calculate this.
For k=5, 8 ppl are required (59.3158%), k=6, 7 ppl (55.0188%), k=7, 7 ppl (60.6174%). --Yarin 20:43, 6 August 2005 (UTC)
- Could somebody give the details of these calculations to evaluate? --Neshatian 12:12, 9 May 2006 (UTC)
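One way to reproduce the figures quoted above without inclusion/exclusion is sketched below. It assumes the standard closed form for near matches on a circular calendar of m days, where the probability that no two of n people have birthdays within k days of each other is (m − nk − 1)! / (m^(n−1) (m − n(k+1))!). Whether this is the calculation Yarin actually used is an assumption; the function names are illustrative.

```python
from math import prod


def near_match_probability(n, k, days=365):
    """Probability that at least two of n people have birthdays within
    k days of each other (k = 0 is the ordinary birthday problem).

    Assumes uniform birthdays on a circular calendar.
    """
    if n * (k + 1) > days:
        # Not enough room to keep everyone more than k days apart.
        return 1.0
    # (days - n*k - 1)! / (days - n*(k+1))! expands to n - 1 factors.
    no_match = prod((days - n * k - i) / days for i in range(1, n))
    return 1.0 - no_match


def smallest_group(k, days=365, threshold=0.5):
    """Smallest n whose near-match probability exceeds the threshold."""
    n = 1
    while near_match_probability(n, k, days) <= threshold:
        n += 1
    return n


if __name__ == "__main__":
    for k in (5, 6, 7):
        n = smallest_group(k)
        print(k, n, near_match_probability(n, k))
```

Under these assumptions the sketch gives 8, 7 and 7 people for k = 5, 6 and 7, with probabilities that agree with the percentages quoted above.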
Editing proposal
I propose moving the computer code to a separate article that would serve as an appendix, to which this article would link. That would help keep this article readable. Michael Hardy 02:58, 3 Mar 2005 (UTC)
- A good idea - please would you resurrect the deleted versions from the edit history when you do. -- ALoan (Talk) 10:35, 3 Mar 2005 (UTC)
- I'd prefer to leave the current programs in the article, I think they're more useful to many readers than the math discussion. Daniel Quinlan 09:47, Mar 5, 2005 (UTC)
- I disagree, a pseudocode rendition would be the most useful while a version in each of 5 languages is just a mess. I'm removing all but one. --Gmaxwell 17:31, 6 Jun 2005 (UTC)
- Well, leaving the python version was obviously going to cause trouble :-) I had a stab at writing a Wikipedia:wikicode equivalent. Apart from not knowing how to correct 'print', it seems correct. Richard W.M. Jones 22:25, 6 Jun 2005 (UTC)
- Should the various code versions not be moved to Wikisource - they seem to have adopted that solution at Monty Hall problem. IIRC there are half a dozen versions in various languages in the edit history.-- ALoan (Talk) 30 June 2005 21:47 (UTC)
blog link
http://inclinedtocriticize.blogdrive.com/archive/240.html
Why is that there? It's neither particularly enlightening nor detailed.
Good now
I like the current article quite a bit, I'd prefer to leave the current computer examples, I think they are at least as accessible by the average reader as the mathematics, if not more accessible. So, nice job, Michael. That being said, I'm not impressed by your repeated insults, though. Not cool. Daniel Quinlan 09:47, Mar 5, 2005 (UTC)
Why 100%
The main page assumes no leap years. Therefore, once all 365 days have been taken, the next person has to be on one of the days already used. Hence, 100%. Superm401 | Talk 01:47, Jun 4, 2005 (UTC)
Merge
I would do the merge, provided there is no copyvio. Superm401 | Talk July 1, 2005 12:13 (UTC)
Mistake in the first equation?
My maths may be poor, but the first equation to calculate the probability p that the n birthdays are different seems wrong. The equation given is:
However according to the BIDMAS rules the addition for the last fraction will happen before the subtraction, so the equivalent equation is:
And I think it should be:
Any comments?
365 − n + 1 means (365 − n) + 1, not 365 − (n + 1). That is universally standard. What in the world is "BIDMAS"?? A programming language? Or is it one of those mnemonics by which children learn math conventions? Michael Hardy 22:48, 15 July 2005 (UTC)
- Hmm, something interesting, why isn't it: :? — Ambush Commander(Talk) 23:48, July 15, 2005 (UTC)
- The article still needs to be changed to remove the ambiguity. The average reader cannot possibly be expected to work out the order of evaluation from a convention that they know nothing about, regardless of whether it is technically correct or not. Lee J Haywood 06:56, 16 July 2005 (UTC)
- I've added parentheses, does that resolve the ambiguity? — Ambush Commander(Talk) 13:15, July 16, 2005 (UTC)
- If adding parentheses, it should be 365 − (n − 1), not (365 − n) + 1 because the latter is not following the logic. And I seriously cannot imagine how a person (even that average reader) cannot know about order of evaluation of plus and minus operations, sorry. --Paul Pogonyshev 17:14, 16 July 2005 (UTC)
- Michael, BIDMAS = Brackets, Indices, Division, Multiplication, Addition, Subtraction. That's how people are often taught to evaluate expressions (maybe it's restricted to the UK?), the mistake being that division and multiplication evaluate left to right together, followed by addition and subtraction evaluate left to right together.
- I've taught math at five different universities, including one where nearly all students were academically weak and also including MIT, the others being somewhere in between, and I've done quite a lot of private tutoring of both academically weak and fairly gifted students, and I've never heard of BIDMAS or PEMDAS. I have heard that some people use mnemonics in attempting to learn mathematics, but I've always ignored that. One confusion I fear might happen as a result of these rules is that people might think that a^b^c means (a^b)^c, whereas in fact a^b^c = a^(b^c). Michael Hardy 18:17, 24 July 2005 (UTC)
- First of all, just because someone originally learned with a mnemonic device doesn't mean they will never understand the fundamental concepts. PEMDAS was useful in introducing me to order of operations. I also disagree that it will cause the error described above. On the contrary, PEMDAS is perfectly correct even when taken literally with no thought. We first notice there are no parentheses. Then, we move on to exponents. The exponent of the first power, moving left to right, is b^c. Hence, we evaluate that first, getting (b^c). We then see (moving left to right) another power, a^(b^c). We evaluate that, getting the right answer. Superm401 | Talk 04:13, July 25, 2005 (UTC)
Well, on my browser, the a appears (below and) to the left of the b, and the b appears (below and) to the left of the c. And that's how I've always seen it written. So I really don't understand your argument about that. Michael Hardy 20:47, 25 July 2005 (UTC)
- Of course it is. Just as 3 is the base of 3^5 because is below and to the left, a is the base of a^b^c because it is below and to the left. The base is evaluated after the exponent. That's what I'm trying to say. Superm401 | Talk 21:25, July 25, 2005 (UTC)
- Your rule is right if the base is evaluated after the exponent, but you said left-to-right, and left-to-right appears to suggest evaluating the base first. Michael Hardy 23:34, 3 October 2005 (UTC)
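The two conventions debated in this section can be illustrated with a short sketch. It relies on Python's own parsing rules, which happen to follow the standard mathematical conventions here; the concrete values 2, 3, 2 and n = 23 are arbitrary choices for illustration.

```python
a, b, c = 2, 3, 2

# Stacked exponents group from the top down: a ** b ** c is a ** (b ** c).
tower = a ** b ** c
top_down = a ** (b ** c)      # 2 ** 9
bottom_up = (a ** b) ** c     # 8 ** 2

# Addition and subtraction share one precedence level and group left to right,
# so 365 - n + 1 means (365 - n) + 1, which also equals 365 - (n - 1).
n = 23
last_factor = 365 - n + 1
grouped_left = (365 - n) + 1
misread = 365 - (n + 1)

print(tower, top_down, bottom_up)
print(last_factor, grouped_left, misread)
```

The misreading 365 − (n + 1) differs from the intended value by 2, which is the ambiguity the added parentheses were meant to remove.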
Merge
The link in the merge template on the page refers here, but I don't think anyone has actually started to talk about it yet. I don't think that page should be merged. I'm doubtful about whether it should even be kept. It uses obscure abbreviations, doesn't explain itself, and the table may be a copyvio. Furthermore, I don't think the idea of a birthday distribution is nearly as common as the "paradox" itself. I'd like to remove the merge tag. What do others think? Superm401 | Talk 17:26, July 16, 2005 (UTC)
- It looks like a copyvio of http://www.mathcad.com/library/LibraryContent/puzzles/soln28/exact28.html --Audiovideo 14:19, 19 July 2005 (UTC)
Empirical test
The empirical test in the article is not a simulation. It is just a ready-made formula. Here is a real test (in C#):
 Random rnd=new Random();
 int total_pairs=0;
 int tries=1000;
 for (int t=0;t<tries;t++)
 {
     //pick random birthday for every person
     ArrayList persons=new ArrayList();
     for (int i=0;i<23;i++)
         persons.Add(rnd.Next(365));
     int same_birthdays=0;
     //check for birthday pairs
     for (int a=0;a<23;a++)
         for (int b=0;b<23;b++)
         {
             if ((int)persons[a]==(int)persons[b] && a!=b)
                 same_birthdays++;
         }
     if (same_birthdays>0)
         total_pairs++;
 }
 Console.WriteLine("chance of same birthday pair for 23 persons: "+(double)total_pairs/tries*100+"%");
Exe 22:05, 31 July 2005 (UTC)
- Your edit has been reverted by User:Richard W.M. Jones. Although he described the edit as "User:Exe managed to slip an obvious POV change to using C# - reverted to using wikicode", he failed to drop a notice on the talk page. Perhaps we can compromise and change the title of the section to something else? — Ambush Commander(Talk) 00:30, August 5, 2005 (UTC)
Reverse problem
I have a couple of problems with the "Reverse Problem" section, but I'm not too sure how to fix them.
- I think there is a little notation inconsistency, in that p and n are both functions and variables. I don't know how to put it, but I have the feeling that in the same sentence p is both a given point and a function.
- Two problems are stated, but one solution is given. I think it would be nicer to state only one problem, and perhaps mention that it's a quantile kind of problem. By the way, in my [Mood, Graybill, Boes] the quantile is defined in a completely different way than in the WP article.
- The q-th quantile of a random variable X or of the corresponding distribution is defined as the smallest number ξ that satisfies F_X(ξ) ≥ q.
- I guess this is a comment for that other page...--PizzaMargherita 20:27, 3 October 2005 (UTC)
links removed
I just removed two links from the article:
- http://www.teamten.com/lawrence/puzzles/birthday_paradox.html (almost no content)
- http://science.howstuffworks.com/question261.htm (full of ads; not very meaningful, too)
feel free to flame me, if you think that was a bad idea --J.N. 15:48, 24 October 2005 (UTC)
Klamkin (1967)
The birthday problem for such non-constant birthday probabilities was tackled in [Klamkin 1967]. What are the results presented in this paper? In particular, it is reasonable that for non-constant birthday probabilities, the probability of two birthdays on the same date is higher than in the case of constant probabilities. Is this result proved in that paper? Does someone know a reference where this proof can be found? --NeoUrfahraner 08:30, 7 November 2005 (UTC)
- I'm pretty certain that you are right: non-constant birthday probabilities lead to strictly higher collision probabilities (except for a couple of trivial exceptional cases, where the collision probabilities are the same). I haven't seen Klamkin's paper, but it seems likely that he proves this result there. AxelBoldt 16:00, 8 November 2005 (UTC)
- I found a proof in D. Blom, AMM, 1973. See references. --NeoUrfahraner
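The claim can also be checked numerically for any particular distribution. The sketch below is an illustration and not taken from either paper: it uses the fact that the probability of n distinct birthdays equals n! times the degree-n elementary symmetric polynomial of the day probabilities. The skewed distribution (100 days twice as likely as the rest) is a made-up example.

```python
from math import factorial


def collision_probability(n, day_probs):
    """Probability that at least two of n people share a birthday when
    day i has probability day_probs[i] (the probabilities must sum to 1)."""
    # e[j] holds the elementary symmetric polynomial of degree j
    # over the day probabilities processed so far.
    e = [1.0] + [0.0] * n
    for p in day_probs:
        for j in range(n, 0, -1):
            e[j] += e[j - 1] * p
    # n! * e[n] is the probability that all n birthdays are distinct.
    return 1.0 - factorial(n) * e[n]


uniform = [1 / 365] * 365

# Hypothetical skew: the first 100 days are twice as likely as the others.
weights = [2] * 100 + [1] * 265
skewed = [w / sum(weights) for w in weights]

if __name__ == "__main__":
    print(collision_probability(23, uniform))
    print(collision_probability(23, skewed))
```

For this example the skewed distribution gives a higher collision probability than the uniform one, in line with the result discussed above.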
birthday distribution
- "it becomes relevant that due to the way hospitals work, more children are born on Mondays and Tuesdays than on weekends."
Er, really? Do hospitals suppress labor if it occurs on the weekend? -VJ 08:11, 5 January 2006 (UTC)
- I took this to mean that labor is more often induced near the beginning of the week, for some reason. --RCS talk 05:40, 18 January 2006 (UTC)
It does look dubious. Can anybody back that with some reference? I would feel more comfortable if this made its way to the childbirth article first. PizzaMargherita 07:04, 18 January 2006 (UTC)
- Here are some German links: http://www.welt.de/data/2005/09/30/782511.html http://www.kinderwelten.de/cgi-bin/websql?sqlid=18349&nid=1019 They say the reason is e.g. that Caesarean sections become more popular. --NeoUrfahraner 10:24, 19 January 2006 (UTC)
- Here is a better one in English with data from England and Wales: http://www.statistics.gov.uk/downloads/theme_health/HSQ9book_V1.pdf There is an obvious seven day cycle, with fewer births on Sundays compared with births on other days of the week. The average number of births on a Sunday during 1979 was 1,373, with a standard deviation of 60, whereas the overall daily average in 1979 was 1,748, with a standard deviation of 211. (Page 7). --NeoUrfahraner 11:24, 19 January 2006 (UTC)
Thanks :) PizzaMargherita 23:41, 21 January 2006 (UTC)
Suggestion for another table
This article could address another very common curiosity: the expected range of birthday matches for certain group sizes. To phrase it another way: present a table showing group size X and the expected collision range N to M, where the odds are even that at least N and at most M birthdates will recur among those X people. Alternately, a link to a "birthday paradox calculator" applet including this feature (among others) would be similarly useful (this is different from the current link to a Mac-OS birthday grapher application). -- Mike 17:15, 11 January 2006 (UTC)
Increment numPeople before or after
A while back User:205.170.235.246 changed the empirical test so that the increment would be performed after all operations, rather than before. I reverted the edit, on the grounds that: "revert with reluctance: I think that the increment operator is supposed to be before the logic. It's a bit weird, I agree." In this edit, the anon has done it again.
So, I hacked out a quick test using PHP this way:
 // append ?alt to URL to use anon's method
 $days = 365;
 $numPeople = 1;
 $prob = 0.0;
 while ($prob < .5) {
     if (!isset($_GET['alt'])) $numPeople++;
     $prob = 1 - (1-$prob) * (1-($numPeople-1) / $days);
     echo "Number of people: $numPeople<br />";
     echo "Prob. of same birthday: $prob<br /><hr /><br />";
     if (isset($_GET['alt'])) $numPeople++;
 }
and I determined that whether or not the increment was present didn't make a difference in the final result. However, when it was at the back (anon's method), a probability when there was 1 person was given (obviously 0).
The reason, however, why this is weird, is because in most cases, the increment is indeed performed after the loop body, as this for statement illustrates:
 $days = 365;
 $prob = 0.0;
 for ($numPeople = 1; $prob < .5; $numPeople++) {
     $prob = 1 - (1-$prob) * (1-($numPeople-1) / $days);
     echo "Number of people: $numPeople<br />";
     echo "Prob. of same birthday: $prob<br /><hr /><br />";
 }
We can also alleviate the aforementioned concerns by beginning the loop with $numPeople = 2, starting with the first "logical" case.
In the end, I am more for a solution that uses for rather than while. Any comments? — Ambush Commander(Talk) 22:19, 27 January 2006 (UTC)
- I'm with you. I would change it to a for. Superm401 - Talk 01:44, 28 January 2006 (UTC)
- Changed. — Ambush Commander(Talk) 20:37, 28 January 2006 (UTC)
Make it really empirical
 int tries=1000;
 srand((unsigned)time(0));
 int total_pairs=0;
 for (int t=0;t<tries;t++)
 {
     //pick random birthday for every person
     std::vector<int> persons;
     for (int i=0;i<23;i++)
         persons.push_back(365*rand()/(RAND_MAX + 1.0));
     int same_birthdays=0;
     //check for birthday pairs
     for (int a=0;a<23;a++)
         for (int b=0;b<23;b++)
         {
             if (persons[a]==persons[b] && a!=b)
                 same_birthdays++;
         }
     if (same_birthdays>0)
         total_pairs++;
 }
 std::cout<<"chance of same birthday pair for 23 persons: "<<(double)total_pairs/tries*100<<"%";
It is a real test, simulation, using rand(), not just a math formula. Changed to C++ because someone said that writing in c# is pov. exe 15:54, 28 January 2006 (UTC)
- I agree that the current formula is not really an empirical test. It is based on mathematical theory, and while a useful tool to generate the probabilities for certain numbers of people, it is by no means empirical.
- But how is C++ any less POV than C#? — Ambush Commander(Talk) 20:36, 28 January 2006 (UTC)
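For comparison, the same kind of simulation can be sketched more compactly. This is an illustration and not a proposal for the article's language; it assumes uniform birthdays over 365 days, and the function names, the default of 20000 tries and the fixed seed are arbitrary choices made so the result is repeatable.

```python
import random


def has_shared_birthday(group_size, days, rng):
    """Draw random birthdays one at a time; stop at the first repeat."""
    seen = set()
    for _ in range(group_size):
        day = rng.randrange(days)
        if day in seen:
            return True
        seen.add(day)
    return False


def estimate_match_probability(group_size=23, days=365, tries=20000, seed=0):
    """Fraction of simulated groups containing at least one shared birthday."""
    rng = random.Random(seed)
    hits = sum(has_shared_birthday(group_size, days, rng) for _ in range(tries))
    return hits / tries


if __name__ == "__main__":
    print("estimated chance for 23 persons:", estimate_match_probability())
```

Using a set avoids comparing every pair of persons, and the estimate should land close to the theoretical value of about 50.7% for 23 people.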
Whose common intuition?
Whose common intuition does it contradict? I asked my mother and my sister, neither of whom is particularly good at maths, whether in their opinion it is more likely that in a 23-person group all birthdays are different or at least two are equal. Both of them answered the latter. In addition, nobody on buying 23 trading cards of a set of 365 would expect them to be all different. --Army1987 21:04, 8 February 2006 (UTC)
- Likely the way you phrased the question. Although we should find a survey. — Ambush Commander(Talk) 21:29, 8 February 2006 (UTC)
OS X program
Hmm... is it linkspam? — Ambush Commander(Talk) 20:42, 11 February 2006 (UTC)
Mistake in binomial distribution
The Birthday_paradox#Binomial_distribution calculation is not valid at all. While it somehow approximates the real probability, it lacks the necessary rationality:
1. The paradox is talking about "at least two persons may have same birthday". This includes not only pairs, but also 3-person groups, 4-person groups, and so on up to all 23 persons. This calculation considers only pairs. The cumulative distribution of B(X,253,1/365) equals 1 at X=253 while forgetting all other n-groups.
2. The binomial distribution is valid only for independent events. These 253 possible pairs are not independent. For example, for people A, B, C, ..., a shared birthday for the pairs (A,B) and (B,C) implies a shared birthday for (A,C). For instance, Pr(X=252) should equal Pr(X=253) in this context, while the formula does not give that. --Neshatian 11:50, 3 April 2006 (UTC)
- Thanks to Michael Hardy, the section was removed. --Neshatian 15:05, 23 April 2006 (UTC)
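To see how far the independence assumption is off in this case, the sketch below compares the exact probability with an estimate that treats the 253 pairs as independent events, each matching with probability 1/365. That this was the calculation in the removed section is an assumption; the function names are illustrative.

```python
def exact_match_probability(n=23, days=365):
    """Exact probability that at least two of n people share a birthday."""
    all_different = 1.0
    for k in range(1, n):
        all_different *= 1 - k / days
    return 1 - all_different


def independent_pairs_estimate(n=23, days=365):
    """Estimate that (incorrectly) treats every pair as an independent trial."""
    pairs = n * (n - 1) // 2
    return 1 - (1 - 1 / days) ** pairs


if __name__ == "__main__":
    print("exact:            ", exact_match_probability())
    print("independent pairs:", independent_pairs_estimate())
```

For 23 people the two values are about 0.507 and 0.500, so the pair-based estimate is close but, as noted above, not a valid derivation.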
Empirical Test.
I know APPLESOFT BASIC, and some Pascal, so I can recognise this section as a program for a computer.
Consider the non-programmer. How would they know what the heck is going on - this should have an explanatory note.
Corrected Note 1
According to this Wolfram site (which cites a US census table - wish I could find the original source) birthday frequencies are actually skewed to the months of July, August and September (in that order). It makes sense intuitively because those months are roughly 9 months after the winter holidays (more time spent indoors, cheerful festive atmosphere etc.) Extremely interesting article though, I enjoyed it a lot --Cinexero 13:30, 15 June 2006 (UTC)