Talk:Boy or Girl paradox

Dice Example

Perhaps this example will help make things more clear for the doubters:

Suppose that I have two dice, and each has three of its sides marked with a “1” and three of its sides marked with a “2”. If I throw both dice and sum the result, I could end up with either 2 (if both dice rolled 1), 3 (if one die rolled 1 and the other 2) or 4 (if both dice rolled 2). If I roll the dice a large number of times and keep track of the sums, I will quickly begin to see that the frequency of the sum being 2, 3, or 4 follows a 1:2:1 distribution.

If (Dice A, Dice B) is the result of each roll, then the following sample space results:

(1,1) (1,2) (2,1) (2,2)

As you can see, there are four possible results of rolling the dice, but two of the possible results will give a sum of 3 while only one possible result could give a sum of 2 or 4. Thus a sum of 3 is twice as likely as a sum of 2 or 4. If you don’t believe me about the 1:2:1 distribution, try it yourself with real dice!

Now, suppose I have rolled my two dice and I tell you “One of my dice rolled a 1. What are the odds that the sum of my two dice will be 3?” You should answer 2/3, because you know that a sum of 3 is twice as likely as a sum of 2. We can eliminate (2,2) from the sample space and are left with only (1,1) (1,2) and (2,1). On the other hand, if I say “Dice A rolled a 1,” then the probability of my sum being 3 is only 50% because now we have eliminated both (2,2) and (2,1), leaving us with only (1,1) and (1,2). -10-21-06
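The counting in the dice example above can be checked mechanically. The following is a minimal Java sketch, not part of the original post; the class name DiceEnumeration and the three condition constants are illustrative choices. It lists the four equally likely outcomes (Dice A, Dice B) and counts how many give a sum of 3 with no condition, given "at least one die shows 1", and given "Dice A shows 1".

```java
// Exact enumeration of the two-dice example: each die shows 1 or 2 with equal probability.
final class DiceEnumeration {

    // Illustrative condition codes (not from the original post).
    static final int NO_CONDITION = 0;
    static final int AT_LEAST_ONE_SHOWS_1 = 1;
    static final int DIE_A_SHOWS_1 = 2;

    /** Returns {outcomes with the target sum, outcomes kept} under the given condition. */
    static int[] count(int condition, int targetSum) {
        int matching = 0;
        int total = 0;
        for (int a = 1; a <= 2; a++) {         // result of Dice A
            for (int b = 1; b <= 2; b++) {     // result of Dice B
                boolean keep = true;
                if (condition == AT_LEAST_ONE_SHOWS_1) {
                    keep = (a == 1 || b == 1);
                } else if (condition == DIE_A_SHOWS_1) {
                    keep = (a == 1);
                }
                if (!keep) {
                    continue;                  // outcome eliminated from the sample space
                }
                total++;
                if (a + b == targetSum) {
                    matching++;
                }
            }
        }
        return new int[] { matching, total };
    }

    static String asFraction(int[] c) {
        return c[0] + "/" + c[1];
    }

    public static void main(String[] args) {
        System.out.println("P(sum = 3)                      = " + asFraction(count(NO_CONDITION, 3)));
        System.out.println("P(sum = 3 | at least one 1)     = " + asFraction(count(AT_LEAST_ONE_SHOWS_1, 3)));
        System.out.println("P(sum = 3 | Dice A rolled a 1)  = " + asFraction(count(DIE_A_SHOWS_1, 3)));
    }
}
```

The fractions are left unreduced on purpose, so the denominator shows how many of the four outcomes survive each condition.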


As this is linked from MontyHallProblem we need to be very careful (as described there) as to how the question is phrased:

In a two-child family, one child is a boy. What is the probability that the other child is a girl?

The 'one child is a boy' is ambiguous because it doesn't explicitly explain that the other case 'one child is a girl' is being excluded.

For instance if a parent of a two-child family walks into a room accompanied with a boy (one of their children) is the probability that the other child is a boy or girl anything but 50/50? The answer is NO.

If I have a room full of randomly selected parents of two-child families and I ask all those with a boy (meaning one or more) to step forward, THEN and only then have I skewed the odds for the parents who have stepped forward to 1/3 (two boys) versus 2/3 (a boy and a girl).

Again, this is a very subtle point, and worth making explicitly. The fact that you are making a decision is very important in this problem.

This example from rec.puzzles.faq makes the 'question step' explicit http://www.faqs.org/ftp/faqs/puzzles/faq

2.3. ==> oldest.girl <== [probability] If a person has two children, and truthfully answers yes to the question "Is at least one of your children a girl?", what is the probability that both children are girls?

The answer is 1/3, assuming that it is equally likely that a child will be a boy or a girl. Assume that the children are named Pat and Chris: the three cases are that Pat is a girl and Chris is a boy, Chris is a girl and Pat is a boy, or both are girls. Since one of those three equally likely possibilities has two girls, the probability is 1/3.

You're welcome to clarify the article, but please keep in mind that the "ambiguity" here is just part of a very general phenomenon affecting all probability problems: giving yourself too much information skews the answer from the textbook result. For example, if I see a two-child family with two boys, then they certainly have at least one boy; nonetheless the probability that they have a girl is 0, not 1/2 or 1/3. I don't get 1/3 because I have more information than is in the statement of the question.
Likewise, in your example, if the parent walks into the room accompanied by a boy, I know that "the child here is a boy", which gives me strictly more information than "one of their children is a boy". With the extra information, it is no surprise that I arrive at a correct probability different from the textbook result.
My point is that the abstract question "In a two-child family, one child is a boy. What is the probability that the other child is a girl?" is unambiguous, even if it can be confusing. When translating it into a real-world situation, we must be careful not to introduce the wrong information, and the "question step" you identify is a good way of doing that. But strictly speaking, it is not necessary to analyze the problem. Melchoir 19:09, 5 April 2006 (UTC)
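The two situations contrasted in this thread (parents with at least one boy stepping forward, versus a parent walking in with one child who happens to be a boy) can be compared by simulation. The following is only a sketch under stated assumptions: boys and girls are equally likely, the sexes of the two children are independent, and the accompanying child is chosen at random. The class and method names (TwoProtocols, stepForward, accompaniedByBoy) and the seed are hypothetical, chosen for illustration.

```java
import java.util.Random;

// Monte Carlo comparison of two ways of learning that a two-child family has a boy.
final class TwoProtocols {

    /** "Step forward": keep every family with at least one boy; return the fraction that also has a girl. */
    static double stepForward(int families, Random rng) {
        int kept = 0;
        int withGirl = 0;
        for (int i = 0; i < families; i++) {
            boolean firstIsBoy = rng.nextBoolean();
            boolean secondIsBoy = rng.nextBoolean();
            if (firstIsBoy || secondIsBoy) {
                kept++;
                if (!(firstIsBoy && secondIsBoy)) {
                    withGirl++;
                }
            }
        }
        return (double) withGirl / kept;
    }

    /** "Accompanied": one child is shown at random; keep the family if that child is a boy. */
    static double accompaniedByBoy(int families, Random rng) {
        int kept = 0;
        int otherIsGirl = 0;
        for (int i = 0; i < families; i++) {
            boolean[] isBoy = { rng.nextBoolean(), rng.nextBoolean() };
            int shown = rng.nextInt(2);   // assumption: either child is equally likely to come along
            if (isBoy[shown]) {
                kept++;
                if (!isBoy[1 - shown]) {
                    otherIsGirl++;
                }
            }
        }
        return (double) otherIsGirl / kept;
    }

    public static void main(String[] args) {
        Random rng = new Random(2006);    // fixed seed only so that runs are repeatable
        int families = 1000000;
        System.out.println("Step forward, fraction with a girl:    " + stepForward(families, rng));
        System.out.println("Accompanied by boy, other is a girl:   " + accompaniedByBoy(families, rng));
    }
}
```

With a large number of families the first fraction should settle near 2/3 and the second near 1/2, which is the distinction both posters are describing.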

Independence

Both questions are the same.

Two independent elements can each have one of two values.

In the first case, one element is "the older child" being "boy", with the question of the probability of the independent second element, "younger child", being "girl".

In the second case, one element is "the identified child" being "boy", with the question of the probability of the independent second element, "other child", being "girl".

In both cases the form of the question is identical. In the second question there is no reason to separate "1 boy, 1 girl" into "1 boy, 1 older girl" and "1 boy, 1 younger girl"; it is only the unrelated first question that makes this separation seem justified.

Also, the sample space is not correct. Let the first character be the elder child and the second the younger, and let uppercase signify "identified".

Set = {Bb, bB, Bg, bG, Gb, gB, Gg, gG}, each with equal probability.

In problem 1, we are left with {Bb, Bg} and asked for P(Bg). In problem 2, we are really left with {Bb, bB, Bg, gB} and asked for P(Bg) + P(gB).

These are just 2 disproofs of this "paradox".

You are making the very common mistake of not realizing that many of the elements in your sample space are degenerate. For any couple that has had two children, there are four equally likely possibilities: they could have a boy and then another boy (probability 1/4), a boy and then a girl (again 1/4), a girl and then a boy (again 1/4) or a girl and then another girl (again 1/4). That is the only sample space that you need to concern yourself with. Your elements Bb and bB are really just degenerate cases of the element "they had one boy and then another boy", and Gg/gG are degenerate representations of "they had a girl and then another girl". Your elements Gb and gB are both really just cases of "they had a girl and then a boy."
Let’s assume that the parents will name their first boy Bob and their second boy Tom, and will name their first girl Jane and their second girl Jill. This should make it more clear that there are only four possible scenarios, each of which has a probability of 1/4:
-Bob has a younger brother Tom
-Bob has a younger sister Jane
-Jane has a younger brother Bob
-Jane has a younger sister Jill
If you try to add any more elements you will only succeed in rephrasing one of the already-existing elements, and will say something like “Tom has an older brother Bob.” However, that element is already accounted for in the original set.

Questioning

Well, this page is busy today. The recent addition is mistaken; the intent of the question is not to identify any boy at all. I can get a reference on that, but for now I'll revert and clean up the language. Melchoir 19:46, 5 April 2006 (UTC)

Small Change

I made a small change to the article to try and clear up a common problem. (One I had myself once.)

This doesn't seem to work

Given two variables, U and K, each representing a child: one (U) whose gender is unknown to us, and one (K) whose gender is known to us. This is the only difference between U and K; U may be older or younger than K without any relevance to the case.

U can be either a boy, or a girl. K can be either a boy, or a girl.

The following combinations are possible.

U is a boy, K is a boy. U is a boy, K is a girl. U is a girl, K is a boy. U is a girl, K is a girl.

Since we know the gender of K (in this case, let's assume K is a girl), two possibilities are eliminated: the two possibilities where K is a boy. That leaves us two possibilities, one of which has U as a girl, and one of which has U as a boy. Note again that age is of absolutely no relevance here: U can be either older or younger than K without changing a thing.

Essentially the problem appears to be the mistaken notion that, because you don't know the ages of the children, both G-B and B-G are possible. However, this is false: if both G-B and B-G are possible, then G-G needs to appear twice: once for G-G where the girl whose gender is known to us is the first G, and once for G-G where the girl whose gender is known to us is the second G.

Just my two cents.--Damian Silverblade 17:44, 13 April 2006 (UTC)

False. G-B and B-G are different, equally likely events that lead to a couple having one boy and one girl.
A couple could have a boy and then a boy, a girl and then a boy, a boy and then a girl, or a girl and then a girl. G-B and B-G are both valid because they both represent valid possibilities. If you had double elements for B-B, you are implying that there's more than one way for a couple to end up with two boys. In fact, there isn't - the only way for a couple to end up with two boys is to have a boy and then another boy. However, there is more than one way for a couple to end up with a girl and a boy - they could have a girl and then a boy, or they could have a boy and then a girl. That is why you have separate elements for B-G and G-B, but only one element for B-B and G-G.

Agreed - as I pointed out earlier how the question is phrased makes all the difference, there is a very strong argument that as the question is *currently* phrased the correct answer is 50/50.

Also agreed (or phrased alternately, the three possibilities listed in the sample space are not equally likely). This article is ridiculous. Can we make a motion for deletion? Topher0128 02:58, 12 July 2006 (UTC)

The article may be phrased badly, but it is indeed a well-known example, and I've seen it discussed in a book describing probability problems from a cognitive perspective. For now, I'll simply add the {{unreferenced}} tag. But it's not a joke. Melchoir 03:04, 12 July 2006 (UTC)

The phrasing of the first formulation of the question says nothing about the set of Boy/Girl being ordered, and yet the sample evaluation assumes that it is important. The maths are correct just for a different formulation. The second formalization does explicitly order the children(older/younger). The first should read something like

  • In a two-child family, one child is a boy. What is the probability that the oldest child is a girl?

and the samples would be, oldest then youngest, {BB, BG, GB, GG}. GG is not possible, since one is a boy, so three possibilities remain: {BB, BG, GB}. The girl is the oldest with probability only 1/3.

  • In a two-child family, the older child is a boy. What is the probability that the younger child is a girl?

This one is described correctly. Domhail 04:03, 12 May 2006 (UTC)

Rules of Conditional Probability?

Precisely. For example, in a tree diagram: the way the question is phrased, "In a two-child family, one child is a boy. What is the probability that the other child is a girl?", we are surely not only excluding the G-G branch from the conditional tree, we are also assuming that the G-B and B-G branches are identical, as the order in this question is irrelevant; therefore these branches should be merged and the answer remains 1/2. -- The preceding unsigned comment was added by 89.145.196.3 (talk • contribs) 15:34, 5 August 2006.

That is not a valid line of reasoning. You might as well say that the probability that a two-child family has two boys is 1/3, since there are three possibilities and the question does not care about order. In probability, there are no rules that deal with such notions as irrelevance and "merging branches". Melchoir 16:59, 5 August 2006 (UTC)

Coin Examples

(A) If I throw 2 coins and let you see one, have I given you any information about the 2nd (hidden) coin? Obviously not; its probability of heads or tails remains .5/.5.

(B) I throw 2 coins and look at them; you ask me whether there is at least one head, and I answer truthfully 'Yes' and show you that coin.

We've now arrived at the subtle (and counter-intuitive) case where the probability that the other coin is a tail is now 2/3. The reasoning is explained on the article page and can be verified using a simple computer program (or indeed by throwing coins yourself).

But again the actual 'questioning step' is critical in differentiating (A) and (B), and it is missing from the article page.

--Pajh 21:33, 21 April 2006 (UTC)


Ok, I just ran through several thousand simulations of B: two random tosses.
check if either of the two is a head.
in cases where one is a head, check if there is a tail present.
the probability comes out as 0.50

If you disagree with my method, can you correct me?

--Wes , 26 September 2006

What happens if you increase the number of coins in your program to ten? If you still get 0.5, I think there is something wrong with your program. If you get something else, please make a diagram of all the probabilities you found, from two coins up to ten coins. Does the diagram make sense? INic 20:53, 26 September 2006 (UTC)

Java Code solution

import java.util.Random;

// Simulates case (B): toss two fair coins, keep only the tosses that show
// at least one head, and count how often a tail is also present.
public class Coins {

    public static final boolean HEADS = true;
    public static final boolean TAILS = false;

    private static final int TOSSES = 10000;

    public static void simulate() {
        Random generator = new Random();
        int pairheads = 0;    // kept tosses where both coins are heads
        int tailpresent = 0;  // kept tosses with one head and one tail

        for (int count = 0; count < TOSSES; count++) {
            boolean coin1 = generator.nextBoolean();
            boolean coin2 = generator.nextBoolean();

            if (coin1 == HEADS || coin2 == HEADS) { // At least one head
                if (coin1 == HEADS && coin2 == HEADS) { // Both heads
                    pairheads++;
                } else {
                    tailpresent++;
                }
            }
            // Tosses with two tails are discarded: the truthful answer would be 'No'.
        }

        int total = pairheads + tailpresent;
        System.out.println("Pairs of heads = " + pairheads);
        System.out.println("Tail present = " + tailpresent);
        System.out.println("Total = " + total);
    }

    public static void main(String[] args) {
        simulate();
    }
}

Output

Pairs of heads = 2492
Tail present = 5034
Total = 7526

--Pajh 10:13, 28 September 2006 (UTC)
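INic's suggestion of going from two coins up to ten can also be checked without random numbers, by listing every outcome. The sketch below is an illustration only (the class name ManyCoins and the bitmask encoding are assumptions, not anything posted above). For n fair coins it counts the outcomes with at least one head, and among those the outcomes that also contain a tail.

```java
// Exact counts for n fair coins: among outcomes with at least one head,
// how many also contain at least one tail?
final class ManyCoins {

    /** Returns {outcomes with a head and a tail, outcomes with at least one head}. */
    static long[] count(int n) {
        long allHeads = (1L << n) - 1;   // bit i set means coin i shows heads
        long withHead = 0;
        long withHeadAndTail = 0;
        for (long mask = 0; mask <= allHeads; mask++) {
            if (mask == 0) {
                continue;                // all tails: the answer to "at least one head?" is No
            }
            withHead++;
            if (mask != allHeads) {
                withHeadAndTail++;       // at least one coin is a tail
            }
        }
        return new long[] { withHeadAndTail, withHead };
    }

    public static void main(String[] args) {
        for (int n = 2; n <= 10; n++) {
            long[] c = count(n);
            System.out.println(n + " coins: " + c[0] + "/" + c[1]
                    + " = " + ((double) c[0] / c[1]));
        }
    }
}
```

For two coins this gives 2 out of 3, in line with the simulated counts of 5034 out of 7526 in the output above, and the fraction moves towards 1 as coins are added.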

THIS ARTICLE IS LIES!!!!!

I don't understand how it can be 2/3. It's 1/2!!! I swear this article is completely wrong, someone delete it pls.

Just because there are 3 possibilities doesn't mean a 1 in 3 probability. Someone driving by my house can either shoot me or not, therefore there's a 50:50 chance the next car will shoot me. K. I honestly wouldn't be surprised if everyone who's contributed to this article is completely wrong. The intellectual talent of this place is about a 7th grade level. Seriously, I've had to completely rewrite some engineering articles because of the morons here.

Oh wait, I get it. It's like the Monty Hall dealie.

YES BUT NO

The 2-child family may be either : 2 boys (p = 1/4), 2 girls (p = 1/4), 1 boy and 1 girl (p = 1/2).

The rule is :

    p (A / B) = p (A and B) / p (B)
    Probability that A is true if B is true = Probability that A and B are true at the same time / Probability that B is true

The statement for A is clear :

    A = "One child is a girl"

There are 2 different statements for B :

    Case 1.     B = "(I know one of them,) he is a boy" -> p = 1/2
                        (probability for him to be a boy)
    Case 2.     B = "(I know that) at least one of the two of them is a boy" -> p = 3/4
                        (probability that a 2-child family has at least one boy)

Thus, 2 different statements for (A and B) :

    Case 1.     (A and B) = "The one I know is a boy, the one I don't know is a girl" -> p = (1/2).(1/2) = 1/4
    Case 2.     (A and B) = "One is a boy, one is girl" -> p = 1/2

And 2 different results :

    Case 1.     p (A / B) = (1/4) / (1/2) = 1/2  (= p (A) actually, B has no influence)
    Case 2.     p (A / B) = (1/2) / (3/4) = 2/3

Case 1 sounds ok to me. In case 1, we don't need to know that it's a 2-child family. I still find Case 2 very disturbing. The thing is : a simple, almost automatic, deduction leads from a "Case 1" B to a "Case 2" one. I wonder, is it good to know too much about something ?

--[Strahd] 5:35, 12 August 2006 (Orléans, France)

I guess the simple answer is that you can't use deduction in these problems. Melchoir 17:29, 12 August 2006 (UTC)
I agree --[Strahd] 19:57, 12 August 2006 (France)

YES BUT NO (II)

Trees !

Case 1 :
First, the child we know, then the other child : BB, BG, GB, GG.
The child we know is a boy : BB, BG.
The other one is a girl : BG.
Probability : 1/2
Case 2 :
First, the elder, then the younger : BB, BG, GB, GG.
One child is a boy : BB, BG, GB.
The other one is a girl : BG, GB.
Probability : 2/3.

In Case 2, I'd use the word "frequency" rather than "probability".

--[Strahd] 6:17, 13 August 2006 (Orléans, France)


Similar question

My parents only got two children. I'm a man. What is the probability that I have a brother? In other words, what is the probability that my sibling has the same sex that I have? INic 10:32, 25 August 2006 (UTC)

Let's look at the sample space, denoted by {You, Sibling}. Originally, the sample space is {B,B}, {B,G}, {G,B}, and {G,G}. Now that we condition on the fact that you (emphasis intended) are a man, the space has two elements: {B,B} and {B,G}. Thus 50%
The key as to why the 1/3 did not work is that you specified which ("I") was the man. If you had said "I note that (at least) one of my parents' two children is a man", then the 1/3 is correct.
But note if you said "One of my parents' two children picked at random is a man", then we're back to 50% again. Bayes' theorem can show this, but the heuristic here is to note the four possibilities above, and to pick one of the four "B"s at random. You are twice as likely to pick one from the {B,B} group as from either of the other two groups (50%, 25%, 25% respectively). Baccyak4H 20:08, 5 September 2006 (UTC)
An error is in your first statement. It should be: Now that we condition on the fact that you (emphasis intended) are a man, the space has 3 elements: {B,B} and {B,G} and {G,B}. Thus still 33%. [I do not know whether you are the younger or the older]. --Tauʻolunga 23:37, 5 September 2006 (UTC)
No, there is no error. The notation used was {You (INic), Sibling (of INic)}. Since INic is a man, the {G,B} outcome is indeed ruled out. One is not limited to age order to distinguish the two. The 50% situation could occur if you said "the taller one is a boy", or "the one whose first name comes first alphabetically is a boy", etc., or in this case, "the one with the Wikiname INic is a man."
The way to distinguish the two cases is: The 50% scenario occurs whenever the original statement about being a boy allows that if one knew only that and then were to meet the two children, and if the two children were indeed both boys, then it would be possible in principle to tell which of the two boys was referred to originally. Thus the older one, or "Aaron", or the taller one, or INic, or... (of course, if one was a girl the distinction is trivial). You might need some more info (ages, names, etc.), but in principle you can determine which one was mentioned. If this distinction is impossible in principle ("one is a boy"), then we are in the 1/3 case. Note the potential for ambiguity: when one says "one is a boy", it is easy to picture the problem poser looking at one particular child at random and making such an observation. This is not supposed to be the case, and is the reason for much of the language discussion/disambiguation in the article. Which is why I added "at least" to "one [child is a boy]."
But don't worry; there is a reason this paradox topic is included in WP: there is some very important subtlety that is not always easy to grasp. Baccyak4H 03:21, 6 September 2006 (UTC)

OK, let's say that when I say I'm a man, I only by that mean that "I note that (at least) one of my parents' two children is a man." Then the probability that I have a brother is 1/3 you say, right? But say that I by I'm a man only mean "My name is INic and that's a male name". Then the probability that I have a brother is 1/2 according to you, correct? Is the probability really dependent on how I say that I'm a man? INic 01:28, 7 September 2006 (UTC)

Of course it is. If you say "both of my parents' children are men" then the probability that you have a brother is 1. Melchoir 01:32, 7 September 2006 (UTC)
"Is the probability really dependent on how I say that I'm a man?"
Your rephrasing of your original statement does not say "you" are a man at all. "I note that (at least) one of my parents' two children is a man." allows for you being a woman and having a brother.
So long as one particular individual is identified somehow as being the man, in this case, the writer here at 01:28, 7 September 2006 (UTC) we are in the 1/2 case. In the case where you had a brother, I could tell in principle which man was you because you would be the one posting here at that time. But if you had said "I note that (at least) one of my parents' two children is a man.", there is no information there even in principle to distinguish whether you were referring to yourself or your brother, in that case.
The probability is dependent solely on whether one particular individual is described as "the man". If so, the prob. is 1/2, since we now can speak of the other particular individual. The chance that any particular individual (and the "other" particular individual in particular ;-)), is a man, is 1/2. The description can be of any sort, but it has to identify (somehow) a particular individual.

Aha, I think I see what you mean: the siblings should be viewed as an ordered pair according to some ordering criterion. Which criterion we use doesn't really matter, we can use whatever we want, right? Well, this is wrong I'm afraid. It does in fact matter what ordering criteria we use. INic 15:46, 8 September 2006 (UTC)

For example, you say that The 50% situation could occur if you said "the taller one is a boy". This is clearly false as it's far more likely that the boy is the tallest one in families with mixed siblings. (For adult siblings with common parents I'd guess this rate is close to 100%.) The property of being the tallest is simply not independent of gender. This means that if you know that the tallest one is a man the probability that he has a sister is close to 2/3. INic 15:46, 8 September 2006 (UTC)

In the example we discuss the ordering criterion is "the sibling that stated the question above." And I'm not sure if the event that I stated my question above is independent from the fact that I'm a man. Your answer implies that it's totally uncorrelated. I don't think it is. INic 15:46, 8 September 2006 (UTC)

You are right about my height example. Strictly speaking, any criterion used to identify one individual needs to be independent (probabilistically) of gender. I suppose one could claim that almost any criterion used would fail to meet this requirement in the strictest, most exacting sense (e.g., the distribution of child names). But many can be close, and if we state the problem making the obvious approximation, all is well.
About independence of your posting (verb) and your gender, I refer to the above technicality and say yes, if men are more (or less) likely to edit WP than women, then they are not independent. But this is starting to become nitpicky. Please assume probabilistic independence, and again, all is well.
"All models are wrong. But some are useful" - George Box
Baccyak4H 16:31, 8 September 2006 (UTC)

Nitpicky or not, independence makes all the difference, as you now somewhat reluctantly admit yourself. But the problems don't end here, I'm afraid. How would we test whether the ordering in question is independent of gender or not? You say we should "assume probabilistic independence," but why? We could as well assume probabilistic dependence, right? INic 10:54, 11 September 2006 (UTC)

To test independence you propose to estimate how often WP is edited by men. That is a good estimate if someone picked me at random from any WP contribution and asked me if I had only one sibling. But that wasn't what happened. Instead I picked myself. From what group I picked myself I have no idea! And yet we need to know the group to have a probabilistic model. It seems that whatever model we choose, it's as far from the truth as any other... INic 10:54, 11 September 2006 (UTC)

"independence makes all the difference" -- Not necessarily, one could have "mild" dependence (or more specifically, correlation), of some types, and still observe that having an individual identified in principle changes the probability of the genders for the pair. They may not be exactly 1/3 vs 1/2, but they would be different. But this inaccuracy is no more problematic than (say) using a gaussian model for data rounded to say three decimal places. Strictly speaking this is absurd. The entire sample space has probability zero. But you know what? In many many situations, it works! Imagine that.
Yes, all values between 1/3 and 1/2 are possible depending on how strong the correlation is. This inaccuracy is problematic here just because it allows for the whole range of possibilities. It's not at all obvious which value between, and including, 1/3 and 1/2 should be the correct answer. INic 02:20, 16 September 2006 (UTC)
"you propose to estimate how often WP is edited by men" Um, I proposed no such thing. I am indeed enlightened that you draw that conclusion.
I'm sorry if I misinterpreted you. May I ask you to enlighten me what you meant by talking about how common it is that men edit WP in this context, and how that fact is connected to the question of independence according to you? INic 02:20, 16 September 2006 (UTC)
"But that wasn't what happened." Of course not. The original problem can be described by a $\{\Omega, \mathcal{F}, P\}$, while yours by a particular (and now degenerate) $\omega \in \Omega$.
Yes, correct: to talk about probabilities we must have a probability space $\{\Omega, \mathcal{F}, P\}$ defined. However, all we have here is $\{\omega, \mathcal{F}, P\}$, where we have no idea to what $\Omega$ our $\omega$ belongs. The conclusion must be that, not only is the probability lost somewhere between 1/3 and 1/2, we are not even allowed to talk about probabilities in this case as the probability space is undefined. INic 02:20, 16 September 2006 (UTC)
Just like if one were to look, in the original problem, at what the gender of the sibling is, and after observing it, asking what it was. It being a boy happens with either probability 1, or 0. But (as you know) this is a different problem, so of course the answers could be different; no problem there.
No, this is not the same situation. INic 02:20, 16 September 2006 (UTC)
Anyway, onwards...I am a golfer, not a fisherman... Baccyak4H
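The dependence point raised in this exchange can be made concrete with a small model. What follows is a sketch under an assumption that appears nowhere in the discussion itself: a single parameter q, the probability that in a boy-girl family the identifying criterion (taller, poster of the question, and so on) picks out the boy. The class and method names are hypothetical. With q = 1/2 the criterion is independent of gender; with q = 1 it always picks the boy, as in the height example.

```java
// Probability that the identified child has a brother, given that the
// identified child is a boy, under a one-parameter dependence model.
final class CorrelatedIdentifier {

    /**
     * q is the ASSUMED probability that, in a family with one boy and one girl,
     * the identifying criterion selects the boy.
     */
    static double probBrother(double q) {
        double bothBoys = 0.25;              // two boys: the identified child is certainly a boy
        double mixedAndBoyPicked = 0.5 * q;  // one of each, and the criterion lands on the boy
        return bothBoys / (bothBoys + mixedAndBoyPicked);
    }

    public static void main(String[] args) {
        for (int tenths = 5; tenths <= 10; tenths++) {
            double q = tenths / 10.0;
            System.out.println("q = " + q + "  ->  P(brother) = " + probBrother(q));
        }
    }
}
```

Under this model the answer is 1/(1 + 2q), which runs from 1/2 at q = 1/2 down to 1/3 at q = 1. That matches the range mentioned above, and it shows that the answer is pinned down only once q is.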


OK, let's look at it differently.

Suppose, in a particular town, it is the fashion for parents to keep their babies' first pairs of bootees; and it also the tradition that bootees for boy babies are blue, bootees for girl babies are pink.

We ask the mothers of all two-child families to enter a room, and bring the first pair of bootees for each of their children.

Now, we ask all mothers to hold up a pair of blue bootees. Those that do not have blue bootees to hold up are asked to leave the room.

Next, we ask all mothers to hold up a pair of pink bootees. The question is, what proportion of mothers still in the room do hold up a pair of pink bootees?

The answer is two thirds.

I think this is equivalent to the original question.

Tim

Yes, it is. Baccyak4H 04:50, 8 October 2006 (UTC)

The question as phrased (even after a change) is still wrong. Wrong as in the answer to both questions as phrased is 50/50. The problem is still the phrasing 'at least one boy'. This still doesn't make it clear that an explicit step has been carried out excluding GG.

For example, I know my neighbour has 2 kids, she sends one round and it's a boy. Now I know she has 'at least one boy', what's the probability that the other child is a girl/boy = 50/50!

Read the rec.puzzles.faq this has been done to death and the questioning step is a necessity, otherwise change both answers on the mainpage to 50/50 as it is currently WRONG.

2.3. ==> oldest.girl <== [probability] If a person has two children, and truthfully answers yes to the question "Is at least one of your children a girl?", what is the probability that both children are girls?

http://www.faqs.org/ftp/faqs/puzzles/faq Pajh 15:46, 15 October 2006 (UTC)

Your discussion makes it clear why this article is needed to cover this paradox in the first place. It's easy to let one's intellectual guard down.
Simply put, your analysis is wrong. While you know your neighbor has at least one boy, you also know more than that. You know the particular child sent around to you is a boy. Thus, while you are indeed looking at the 50% scenario, it is not the same as the "at least one boy" scenario, as written (and intended).
The 2/3 answer to the bootee formulation is indeed correct (do a simulation if you want). The fact that those two scenarios are indeed different, over what might appear to be a very minor point, is why this is a paradox. Baccyak4H 18:33, 15 October 2006 (UTC)
I would like to comment on the question posed earlier in this chat.

"My parents only got two children. I'm a man. What is the probability that I have a brother? In other words, what is the probability that my sibling has the same sex that I have?"

I believe that this question is more like saying the elder child is a boy. Saying that "I am a man" is referring to a specific child and not to the other; the probability that the other child is a boy is 1/2. Saying that one of the two children is a boy could apply to either child, so the probability that both are boys is 1/3 for reasons stated on the main page. So I believe that the statement of this particular question does not lead to the latter of the two questions stated on the main page.

I'm new to Wikipedia, so I'm sorry if my etiquette is out of line. Thanks. blahb31 Blahb31 19:53, 24 December 2006 (UTC)

Rebuttal to Solution

Looking at the original problems (and original solutions), I have no difficulty understanding the first scenario. In the first problem, time is introduced as an initial condition. It says 'the older child is a boy'. Therefore it is acceptable to list the scenarios as:

(BB, BG, GB, GG)

This is because (using time only) the possibilities are: 1. a boy was born and then a boy was born 2. a boy was born and then a girl was born 3. a girl was born and then a boy was born 4. a girl was born and then a girl was born. Since you are acknowledging that the first child is a boy, you select the scenarios with B as the first letter, and that leaves you with a 50/50 chance of the second child being a girl.

For the second scenario, you have eliminated time from the initial condition (no one knows whether the boy is older or younger). You have two choices here, either to introduce time, or keep everything timeless. If you introduce time, you have the following scenarios (as previously used, the first letter will be the older child, and the letter that is upper case is the child that has been randomly mentioned in the statement 'has at least one...'):

(Bb, bB, Bg, bG, Gb, gB, Gg, gG)

The reason there are 8 scenarios here instead of 4 is because of one thing. It is because of the words 'has at least one...'. Since in problem 2 we are randomly pointing to one child, we have to include all possibilities of pointing as well as timing, which increases our scenarios from 4 to 8. The reason this was not done in problem 1 is because we already knew what we were pointing at (the first child is a boy). We could use these same 8 scenarios in problem one, but the information from the problem simply reduces it in the same way to the solution 50/50.

Back to problem 2: After selecting the scenarios where there is an uppercase B (as mentioned in the problem), we are left with:

(Bb, bB, Bg, gB)

This leaves the chances of having a girl at 50/50, which is counter to the original solution to problem 2. If you wish to keep time removed from the situation, then the order of birth does not matter. Therefore the only possible scenarios are that the parents will have 2 boys, 2 girls, or one of each.

(BB, BG, GG)

Again, since problem 2 says 'has at least one boy', we must remove the scenarios without a B, and so we are left with:

(BB, BG)

This again shows that the solution to problem 2 is 50/50, but is reached by keeping the problem independent of time. In conclusion it is important to note that the difference between the two problems is the inclusion of time and the inclusion of acknowledging a random child, which then dictates how you approach the solution. Either always include time, or never include it, but mixing it up will give you skewed answers, such as the 2/3. For example in the original solution to problem 2, only one selection was given for BB. This is an error because the problem says 'has at least one boy', which could be addressing either the first boy, or the second boy. Therefore you must include those possibilities in your scenarios, giving us the 8 listed above. This is why the original solution to problem 2 is flawed. AFpilot157 12:34, 31 October 2006 (UTC)

Hi AFpilot157. I'm glad you are thinking about these problems for yourself. Working on them is exceptionally helpful, and will increase your understanding of probability. However from the point of view of the encyclopedia I can assure you that the solution given in the article is the correct one. There are many sites that discuss mathematical probability and I would strongly suggest that you post your comments on one of those. They will be more than willing to discuss them. DJ Clayworth 21:10, 31 October 2006 (UTC)

Thank you for your response. Yes, I have studied probability at times, and I just thought I would throw my two cents in on this interesting problem. In all honesty, I would have no problem accepting the fact that I am wrong, but I truly would like to see in what way I am wrong. I know it is a long shot, but is it possible for the encyclopedia to be wrong? All I ask is for a counter argument, one that takes it all into perspective (which is interesting, because probability at times can change based on perspective). To further illustrate my point for problem 2 of the Boy or Girl scenario, I will go into more detail into what I mean by my solution. If we say that 'a family has 2 children, and has at least one boy' then we have to either introduce age into the scenario (giving us 8 possibilities), or remove age from the scenario (giving us 3 possibilities). In either case, the end result becomes 50/50. My question is about the following statement from the original solution:

"The main reason is that the second question does not assume anything about the age of the boy, he might be the older and he might be the younger sibling. Therefore the loose thought that there are only 3 possibilities (2 boys {BB}, 2 girls {GG} or a mix) does not take into account that the latter is twice as likely than the formers, because it can be either {GB} or {BG}."

Why is the latter (one of each) more likely than BB or GG? By accepting both orientations (GB and BG) is that not the same as putting them in order of birth? Since the children had to come one after the other (assuming no twins), you are saying that either the first one is a boy and the second one is a girl, or the first one is a girl and the second one is a boy. By doing that, are you not introducing age (time) into a problem that does not have age or time associated with it? I can see how many people would think that 'one of each' is twice as likely as 'two of a kind', but I believe there is a flaw in that thinking. The flaw is that we cannot look at this scenario like we look at genetics of plants and animals. Going back to an old high school lesson about tall plants and short plants, they always talked about TT, TS, ST, SS, when it comes to genetics, and the probability of getting a particular combination. The thing here is that when it comes to genetics, the tall and short traits joined all at the same time, with no particular order in which one came first. TS and ST were twice as likely as TT and SS, because everything happened all at once, not in a linear sequence like the birth of children (also, both parents of plants and animals contributed a T or an S to the offspring, unlike the B or G of children). Since time is naturally ingrained in this problem of boy and girl (because they occur one after the other), if we want to truly remove time from the situation, there really are only 3 choices: 2 boys, 2 girls, or one of each. If you take the set listed in the original solution (BB, GB, BG), the words come out as: (one boy and then one boy, one girl and then one boy, one boy and then one girl). No matter what, you cannot truly remove time from the situation (without reducing yourself to the 3 sets I listed in my solution), because the birth of children (which is sequential in this problem) is a fixed trait of the problem.
AFpilot157 21:07, 31 October 2006 (UTC)

It doesn't really matter what variable you use to distinguish the two genders (time or otherwise); the time of birth is simply a tool used to apply the calculus of conditional probability to find the correct answer. It turns out that to enumerate the sample space you will have to make some distinction anyway, somehow, in the sense that writing {BG} states that B comes after "{" and G comes before "}". Seems pedantic (maybe it is; I could do better with more time maybe), but try simulating the experiment: flip two coins 50 (say) times. You will get about twice as many flips where the two coins differ as when they are both heads (or tails). Here note that the "times" the two coins become readable is arbitrary, and can be assumed to be irrelevant. DJ Clayworth was right; the explanations here are, as counterintuitive as they may seem, correct.
For an example which operates under the same principle, consider a deal in the game of bridge. The game typically has some suit distribution among the four players where each has between 1 and 5 cards of any suit in their hand. No one would a priori be suspicious of that, yet the same logic that says that two boys is one outcome as is one of each gender, and no probabilistic difference can be inferred, would have you believe that a perfect deal (each player gets 13 cards of the same suit) is just as likely as whatever distribution of cards one typically sees. But do we ever see a perfect hand? Baccyak4H 03:37, 1 November 2006 (UTC)
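
The coin-flip experiment suggested above can also be run in software. The following is only a minimal sketch, assuming fair and independent coins; the function name `flip_two_coins`, the trial count, and the seed are illustrative choices and not part of the discussion.

```python
import random

def flip_two_coins(trials=50_000, seed=0):
    """Flip two fair coins `trials` times and tally the unordered outcomes."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    counts = {"two heads": 0, "one of each": 0, "two tails": 0}
    for _ in range(trials):
        # Each coin contributes 1 for heads, 0 for tails.
        heads = rng.randrange(2) + rng.randrange(2)
        if heads == 2:
            counts["two heads"] += 1
        elif heads == 1:
            counts["one of each"] += 1
        else:
            counts["two tails"] += 1
    return counts

counts = flip_two_coins()
print(counts)
```

With enough flips, the "one of each" tally should come out close to twice the "two heads" tally and twice the "two tails" tally, which is the 1:2:1 pattern the comment describes.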

I now understand exactly what you mean with your coin flip example. It is true that the solution seems counterintuitive, but it becomes easily acceptable once it is understood. It is only counterintuitive because we at first glance do not see a difference in what problem 1 and problem 2 are asking us. The fact is that there is a difference, however slight, in the wording that changes the whole meaning of the problem, and doing the mathematics/experiment can aid us in understanding that. AFpilot157 07:00, 1 November 2006 (UTC)

Two-stage game

There is nothing odd about this. It is basically just a two-stage game. In the first example the outcome at stage 1 is still to be determined. Hence the outcome is 1/2. In the second example the outcome at stage 1 has been determined, hence reducing the set of possible outcomes. What happens at stage 2 is independent (or assumed to be in the paradox) of what happened at stage 1. The confusion arises from the fact that the puzzle does not mention in which order the boy and girl are born. Assuming that the boy had been born first, the possible outcomes at stage 2 would be {BB} or {BG}, which would result in a 1/2 probability just as most people would assume. It is a trick question; the secret lies in the fact that the order has not been determined. It is as easy as that. No need for any long explanation. This is not original research, it is a well-known fact from simple game theory. It is a classic example of the difference between a 1 stage game and a 2 stage game. I am removing the original research tag. MartinDK 10:18, 10 November 2006 (UTC)
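
The effect of knowing or not knowing the birth order can be checked by enumerating the four equally likely families directly. This is a small sketch under the usual assumptions (boys and girls equally likely, births independent); the helper names `prob`, `has_girl`, `older_is_boy` and `at_least_one_boy` are made up for illustration.

```python
from fractions import Fraction
from itertools import product

# Each family is (older child, younger child); all four are equally likely.
families = list(product("BG", repeat=2))

def prob(event, given):
    """Conditional probability of `event` among families satisfying `given`."""
    kept = [f for f in families if given(f)]
    hits = [f for f in kept if event(f)]
    return Fraction(len(hits), len(kept))

def has_girl(family):
    return "G" in family

def older_is_boy(family):
    return family[0] == "B"

def at_least_one_boy(family):
    return "B" in family

p_older_known = prob(has_girl, older_is_boy)
p_order_unknown = prob(has_girl, at_least_one_boy)
print(p_older_known, p_order_unknown)  # 1/2 2/3
```

Fixing the order leaves {BB, BG}, while only knowing that there is at least one boy leaves {BB, BG, GB}, which is where the two different answers come from.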

What's actually going on

The first situation: In a family with 2 children, the older one is a boy. So, you set up the list using the variable B first, to represent that he is older. Your possibilities are then {BB BG}. Therefore, there is only a 1/2 probability that the other child will be a girl.

The second situation: In a family with 2 children, one is a boy. Now, you have to set this list up the same as in the previous situation, or there is no way to compare them. Let's first look at all the possibilities: {BB BB BG GB GG GG}. Why are there two {BB}'s and two {GG}'s? Because you know the gender of the child but not the age. You can have two boys, but there are two possibilities when you have that. The boy that is known can be older, or he can be younger. And since you don't know, you have to consider both of the options. Since you know that one of the children is a boy, you can eliminate the two {GG}'s. This leaves you with {BB BB BG GB}. The probability that the other child is a girl is 2/4, which is the same as 1/2. —The preceding unsigned comment was added by 168.103.88.39 (talk) 00:15, 9 December 2006 (UTC).

In analyzing the first situation, you have defined the first letter to be the oldest child. But you have violated that definition in the second situation by allowing one of the two "BB"s to have the youngest first. So your analysis is invalid (and in fact wrong): there should only be one "BB" in the second scenario, but we do not know whether the first (oldest) or second (youngest) was referred to. Thus the answer to the second situation is 1/3 and not 1/2. Baccyak4H (Yak!) 02:34, 9 December 2006 (UTC)
Actually, no (in response to the immediately preceding response from Baccyak4H). Consider tagging the second situation X_s to denote which specific individual was seen. This gives the set of possible combinations: {BB_s B_sB B_sG GB_s GG GG}. Note that the only two options that have no _s must be removed; they are inadmissible because the one thing we really do know is that we've seen something. This leaves {BB_s B_sB B_sG GB_s}. And exactly half of these have girls. P(G)=1/2. Al —The preceding unsigned comment was added by 82.41.204.120 (talk) 15:16, 13 December 2006 (UTC). UberPuppy 15:24, 13 December 2006 (UTC)
Amendment of the above, for completeness: technically {BB_s B_sB B_sG GB_s GG GG} should be {BB_s B_sB B_sG GB_s GG GG GB BG} where you DID NOT see the boy in the last two, but they are necessary for the identical question asked from the perspective of seeing a girl. In other words, before you know which gender you saw, there are actually 8 candidate combinations, and then the 4 inadmissible ones are removed once you know which gender you saw. UberPuppy 15:24, 13 December 2006 (UTC)
(changing your notation for convenience) {BBs BsB BsG GBs} is a correct enumeration of the sampling space. But in doing it that way, the four elements no longer have equal probability; the first two have exactly half the probability of either of the last two (remember {BB BG GB} have equal probs; you just split BB into equal halves). If you add up the relevant probs, you will get 1/3. Baccyak4H (Yak!) 15:45, 13 December 2006 (UTC)
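
The weighting described in the last reply can be written out explicitly. The sketch below simply encodes that reply's premise, that {BB BG GB} are equally likely and that BB is split into two equal halves; the dictionary name `weights` is illustrative.

```python
from fractions import Fraction

third = Fraction(1, 3)

# BB is one family type with probability 1/3, split into two tagged halves.
weights = {
    "BBs": third / 2,
    "BsB": third / 2,
    "BsG": third,
    "GBs": third,
}
assert sum(weights.values()) == 1

p_two_boys = weights["BBs"] + weights["BsB"]
p_girl = weights["BsG"] + weights["GBs"]
print(p_two_boys, p_girl)  # 1/3 2/3
```

Counting the four tagged outcomes as equally likely is what produces 1/2; giving the two BB halves 1/6 each gives 1/3 for two boys and 2/3 for a girl.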

A correction to the "Mistakes" section

The following is a quote from the mistakes section:

  The error here is that the first two statements are counted double. We do not know which brother is the older, 
  as that was not stated in the question. Call the brothers Tom and Harry.
  1. Harry has an elder brother Tom
  2. Tom has a younger brother Harry
  The second statement repeats the first and therefore should be removed.

The logic is flawed here. It assumes that whoever was making the argument knew the ages when making the argument, which makes no sense. I have a better version of that argument:

In this situation, let's call the known-male child Jeff and call the unknown child Pat. There are four possibilities:

Jeff has an older brother named Pat

Jeff has a younger brother named Pat

Jeff has an older sister named Pat

Jeff has a younger sister named Pat

Each is equally likely to happen, and the "statement that repeats the first and needs to be removed" is nowhere to be found. Therefore, it can only be half. E946 12:12, 4 January 2007 (UTC)


Fundamental flaw

Although I understand both parties' reasoning, the following fundamental flaw exists in the problem:

Let us work from the assumption that there are four possible permutations for the siblings' gender: BB BG GB GG

According to the most simple interpretation of the problem, if you go to a park and ask boys if they are from a two-child family, those answering 'yes' have a 2/3 chance of having a sister, because only the GG probability is eliminated.

Now, by asking the boy whether he is the oldest or youngest child, his chances of having a sister change to 1/2, regardless of his answer, because either BG or GB will be eliminated along with GG.

While both reasonings are mathematically correct, only the 1/2 solution has any bearing on reality. —The preceding unsigned comment was added by 196.216.16.10 (talk • contribs) 14:52, February 22, 2007 (UTC)

Nope, the 1/2 solution has no basis in reality. The arguments presented are clear. — Arthur Rubin | (talk) 15:23, 22 February 2007 (UTC)
In the park version, the correct answer is 1/2, even before you ask about age. When we list the four possibilities BB BG GB GG, we assume there is a way to define which child we mention first, and which one we mention last. In the original problem, it's natural to do this based on age. In the park problem, it is more natural to mention (say) the boy you ask first, and his sibling last. Then, of the four possibilities BB BG GB GG, two can be discarded as soon as we observe the gender of the boy, leaving us with BB and BG, and a 50% chance the sibling is a boy. The important thing is symmetry arguments, and there is no symmetry argument making 196.216.16.10's three possibilities BB BG GB equally likely in the park version.
If someone can express this more clearly, I'll be very happy to read it here!--Niels Ø (noe) 19:50, 22 February 2007 (UTC)
I think the fundamental problem is that the 1/2 people are looking at that individual family, while the 2/3 people are trying to look at two-child families as a whole. That other child has a 1/2 chance of being a girl no matter how many other families there are being tested. E946 03:53, 7 March 2007 (UTC)
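
The difference between looking at one boy and looking at families as a whole can be illustrated with a simulation that applies both selection rules to the same random families. This is a sketch only: it assumes fair, independent births, and it models the park version as meeting one of the two children uniformly at random, which is an assumption and not something stated in the thread. The function name `simulate` and its parameters are illustrative.

```python
import random

def simulate(trials=200_000, seed=1):
    """Return (P(girl | family has a boy), P(sibling is girl | met child is a boy))."""
    rng = random.Random(seed)
    families_kept = families_with_girl = 0
    boys_met = boys_with_sister = 0
    for _ in range(trials):
        family = [rng.choice("BG"), rng.choice("BG")]  # [older, younger]
        # Rule 1: keep every family that has at least one boy.
        if "B" in family:
            families_kept += 1
            families_with_girl += "G" in family
        # Rule 2: meet one child at random; keep the case only if it is a boy.
        met = rng.randrange(2)
        if family[met] == "B":
            boys_met += 1
            boys_with_sister += family[1 - met] == "G"
    return families_with_girl / families_kept, boys_with_sister / boys_met

by_family, by_boy = simulate()
print(by_family, by_boy)
```

Under these assumptions the first figure should land near 2/3 and the second near 1/2, so which answer applies depends on how the boy came to be known.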

Simple Explanation of Flaw in Logic

There is a 50% chance that the child who is known to be a boy is either older or younger. This represents itself as [B,x] or [x,B]. There is a 50% chance that the unknown child is a girl. This represents itself as ([B,g] or [B,b]) or ([g,B] or [b,B]). There are two equal chances for there to be two boys. They do not equal one another and one should not be thrown out. The Harry statement should be as follows:

  1. Harry has an elder brother Tom
  2. Harry has a younger brother Tom

These two statements are not the same and should not be thrown out.

Here is a Python script that correctly calculates the percentage that will be girls.

def boyorgirl():
    import random
    # Boy == 1
    # Girl == 0
    boys = 0
    girls = 0
    pairsOfChildren = 10000
    for k in range(pairsOfChildren):
        children = [-1, -1]
        # Choose which position (0 or 1) holds the child known to be a boy.
        boyPosition = random.randrange(0, 2)
        otherPosition = 1 - boyPosition
        children[boyPosition] = 1
        # The other child is drawn separately, boy or girl with equal chance.
        children[otherPosition] = random.randrange(0, 2)
        if children[otherPosition] == 0:
            girls += 1
        else:
            boys += 1
    print("If one child is a boy, what is the likelihood that the other child is a girl?")
    print("Testing a sample of " + str(pairsOfChildren) + " pairs of children reveals that " + str(girls) + " will be girls and " + str(boys) + " will be boys.")
    print("The percentage of girls is " + str(girls / (girls + boys) * 100.0) + "%.")

AlexEagar 02:57, 21 March 2007 (UTC)

I agree the explanation in the section "Mistakes" in the article is rather weak, but I cannot see what you want to replace it by, or whether you want to change the conclusion too. I do not read Python. You may have misunderstood the logic in that section: We cannot give "the boy" a name - we're not told about a particular boy in this problem; we're just told they have at least one boy. I will not vote below; I think you should clarify the change you have in mind.--Niels Ø (noe) 09:48, 21 March 2007 (UTC)
Either you take into account the position of the other child, in which case you have to take into account the known child {Bg, Bb, gB, bB}, or you don't take into account the position of the other child {Bg, Bb}, but you can't just choose to take into account the position of the other child if it is a girl, but not if it is a boy {Bg, gB, Bb}. If you are going to allow for {Bg, gB}, you have to also allow for {Bb, bB}, which are not the same. Really the age doesn't matter. There is one position that is already filled with a boy and there is one position open for either a boy or a girl, and that second position has a 50% chance of being either a boy or a girl regardless of who is older or younger. And that's not mentioning the fact that the worldwide ratio of boys/girls is 1.05 at birth and for children under 15 and 1.03 for people 15-64, which further throws off any estimates of actual children. And if you are not comparing children, but simply the statistics, then there is no question about age, or which occurred in which order; all you know is that given two possibilities {x, x}, one is known, be it {B, x} or {x, B}, which is really the same, and one is not known. Given one variable at 50%, you only have 50%. All the possibilities of children if you don't take age into account are {BB, BG, GG}. This is the same as {BB, GB, GG} because age is not a factor, thus which came first is not a factor. Take out the GG and you've got {BB, GB} or {BB, BG}, which is the same thing because age is not a factor. Only existence is a factor. The only way for age to become a factor is to state the age relative to another child. So long as no age has been stated, it is not a factor; only existence is a factor. The problem states that there exist two children and one is a boy. Thus there are two possibilities with no relation relative to position. One of the possibilities is taken, thus there is only one possibility available. -- AlexEagar 17:56, 23 March 2007 (UTC)
Wrong. In the original statement of the problem, we don't have a known child for that analysis to work. We know that one of the children is a boy, or equivalently, we eliminate GG from {BB, BG, GB, GG}. — Arthur Rubin | (talk) 18:53, 23 March 2007 (UTC)

Vote to Change Logic in this Article

I propose that the logic of this page be changed. Voting will be held from 03/20/2007 to 03/27/2007. The change, if approved, will be made on 03/28/2007. Vote by entering:
~~~~ Your Title or Position

Clarification of the change to be made as requested by Niels Ø: When I first posted my explanation, my opinion was that the 2/3 answer should be completely removed. But although I still don't agree with the answer, perhaps it would be best to say that there are two answers to the second question. One answer is that there is a 2/3 chance that the other child is a girl because the choices are {BG, GB, BB} and the other answer is that there is a 1/2 chance that the other child is a girl because the choices are {Bg, Bb, gB, bB} where the capital B is the boy who is known to be a boy. The 'Conclusion' and 'Mistakes' sections should be combined into a 'Opposing Views' section where both the 2/3 and 1/2 perspectives are compared. Let the proponents of each view state their arguments. The reader can decide whether the question is truly paradoxical such as the 2/3 answer suggests or whether it is not a paradox such as the 1/2 answer suggests. AlexEagar 16:43, 22 March 2007 (UTC)


All Those In Favor of the Change

  1. AlexEagar 02:57, 21 March 2007 (UTC) Software Engineer


All Those Who Oppose the Change

  1. Niels Ø (noe) 21:28, 22 March 2007 (UTC) -- there is only one correct solution, as presently given in the article.
  2. Arthur Rubin | (talk) 21:34, 22 March 2007 (UTC). (Use of "titles" or "positions" is frowned upon here in Wikipedia, but I have a Ph.D. in mathematics and have recently been working on some complicated statistical problems.) There is only one correct solution to the problem as presented. If the elder child is known to be a boy, the answer clearly becomes 1/2.
  3. AlexEagar 19:26, 23 March 2007 (UTC) Software Engineer -- Ok, I change my vote. I just changed how I'm calculating the percentage and sure enough, it is 2/3 chance assuming a 1:1 ratio of boys to girls. And about 65.5% chance given the 1.06:1 boy to girl ratio of children as stated on the CIA Factbook [1]. I still don't like how you present the argument, I'll offer another explanation when I get a chance.

I Can Admit I Was Wrong

Here's some Python code to calculate the 2/3 correctly.

def boyorgirl(isOneToOneRatio=True, pairsOfSiblings=100000):
    import random
    # Boy == 1
    # Girl == 0
    twoGirls = 0
    twoBoys = 0
    olderBoyYoungerGirl = 0
    olderGirlYoungerBoy = 0
    for k in range(pairsOfSiblings):
        pair = []
        for i in range(2):
            if isOneToOneRatio:
                # Boy with probability 1/2.
                isBoy = random.randrange(0, 2) == 1
            else:
                # Boy with probability 106/207.
                isBoy = random.randrange(0, 207) < 106
            pair.append(1 if isBoy else 0)
        # Classify the pair once both children have been drawn.
        if pair == [0, 0]:
            twoGirls += 1
        elif pair == [1, 0]:
            olderBoyYoungerGirl += 1
        elif pair == [0, 1]:
            olderGirlYoungerBoy += 1
        elif pair == [1, 1]:
            twoBoys += 1
    print("If one child is a boy, what is the likelihood that the other child is a girl?")
    print("For every " + str(pairsOfSiblings) + " pairs of siblings, " + str(twoGirls) + " will be both girls, " + str(twoBoys) + " will be both boys, " + str(olderBoyYoungerGirl) + " will be split with an older brother and a younger sister, and " + str(olderGirlYoungerBoy) + " will be split with an older sister and a younger brother.")
    mixed = olderBoyYoungerGirl + olderGirlYoungerBoy
    print("The percentage of pairs that have girls out of those that also have boys is " + str(mixed / (mixed + twoBoys) * 100.0) + "%.")

AlexEagar 19:38, 23 March 2007 (UTC)
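
As a cross-check on the simulation, the same quantity can be computed exactly. This is a sketch assuming the two births are independent with the same probability of a boy; the function name `girl_given_at_least_one_boy` is illustrative, and 106/207 is the per-birth boy probability that the script above uses in its non-1:1 branch.

```python
from fractions import Fraction

def girl_given_at_least_one_boy(p_boy):
    """Exact P(family has a girl | family has at least one boy) for two children."""
    q = 1 - p_boy            # probability that a single child is a girl
    mixed = 2 * p_boy * q    # boy-girl or girl-boy
    two_boys = p_boy * p_boy
    return mixed / (mixed + two_boys)

even = girl_given_at_least_one_boy(Fraction(1, 2))
skewed = girl_given_at_least_one_boy(Fraction(106, 207))
print(even)    # 2/3
print(skewed)  # 101/154
```

Here 101/154 is roughly 0.656, so a large simulation run with the skewed ratio should settle near that value, and near 2/3 with the even ratio.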