Talk:Boy or Girl paradox

WikiProject Mathematics
This article is within the scope of WikiProject Mathematics, which collaborates on articles related to mathematics.
Mathematics rating: Start Class Low Priority  Field: Probability and statistics

Dice Example

Perhaps this example will help make things more clear for the doubters:

Suppose that I have two dice, and each has three of its sides marked with a “1” and three of its sides marked with a “2”. If I throw both dice and sum the result, I could end up with either 2 (if both dice rolled 1), 3 (if one die rolled 1 and the other 2) or 4 (if both dice rolled 2). If I roll the dice a large number of times and keep track of the sums, I will quickly begin to see that the frequency of the sum being 2, 3, or 4 follows a 1:2:1 distribution.

If (Die A, Die B) is the result of each roll, then the following sample space results:

(1,1) (1,2) (2,1) (2,2)

As you can see, there are four possible results of rolling the dice, but two of the possible results will give a sum of 3 while only one possible result could give a sum of 2 or 4. Thus a sum of 3 is twice as likely as a sum of 2 or 4. If you don’t believe me about the 1:2:1 distribution, try it yourself with real dice!

Now, suppose I have rolled my two dice and I tell you “One of my dice rolled a 1. What are the odds that the sum of my two dice will be 3?” You should answer 2/3, because you know that a sum of 3 is twice as likely as a sum of 2. We can eliminate (2,2) from the sample space and are left with only (1,1), (1,2) and (2,1). On the other hand, if I say “Die A rolled a 1,” then the probability of my sum being 3 is only 50% because now we have eliminated both (2,2) and (2,1), leaving us with only (1,1) and (1,2). -10-21-06
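For anyone who would rather count than roll, here is a minimal Java sketch of the dice example above. It just enumerates the four equally likely outcomes and applies the two different pieces of information. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class DiceExample {

    // Each die shows 1 or 2 with equal probability, so the four ordered
    // outcomes (die A, die B) are equally likely.
    static List<int[]> sampleSpace() {
        List<int[]> outcomes = new ArrayList<>();
        for (int a = 1; a <= 2; a++) {
            for (int b = 1; b <= 2; b++) {
                outcomes.add(new int[] {a, b});
            }
        }
        return outcomes;
    }

    // P(sum is 3 | at least one die shows 1): keeps (1,1), (1,2), (2,1).
    static double sumIsThreeGivenSomeDieIsOne() {
        int kept = 0;
        int hits = 0;
        for (int[] o : sampleSpace()) {
            if (o[0] == 1 || o[1] == 1) {
                kept++;
                if (o[0] + o[1] == 3) {
                    hits++;
                }
            }
        }
        return (double) hits / kept;
    }

    // P(sum is 3 | die A shows 1): keeps only (1,1) and (1,2).
    static double sumIsThreeGivenDieAIsOne() {
        int kept = 0;
        int hits = 0;
        for (int[] o : sampleSpace()) {
            if (o[0] == 1) {
                kept++;
                if (o[0] + o[1] == 3) {
                    hits++;
                }
            }
        }
        return (double) hits / kept;
    }

    public static void main(String[] args) {
        System.out.println("At least one die is 1: " + sumIsThreeGivenSomeDieIsOne());
        System.out.println("Die A is 1:            " + sumIsThreeGivenDieAIsOne());
    }
}
```

The first figure is 2 out of 3 remaining outcomes and the second is 1 out of 2, matching the reasoning above.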


As this is linked from MontyHallProblem we need to be very careful (as described there) as to how the question is phrased:

In a two-child family, one child is a boy. What is the probability that the other child is a girl?

The 'one child is a boy' is ambiguous because it doesn't explicitly explain that the other case 'one child is a girl' is being excluded.

For instance if a parent of a two-child family walks into a room accompanied with a boy (one of their children) is the probability that the other child is a boy or girl anything but 50/50? The answer is NO.

If I have a room full of randomly selected parents of two-child families and I ask all those with a boy (implying one or more) to step forward, THEN and only then have I skewed the odds for the parents who have stepped forward to 1/3 (two boys) versus 2/3 (a boy and a girl).

Again this is a very subtle point, and worth making explicitly. The fact that you are making a decision is very important in this problem.
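The difference between the two procedures can be sketched with a small seeded simulation. This is only an illustration under stated assumptions: boys and girls are equally likely, and in the walk-in case the accompanying child is taken to be either of the two children with equal chance (the comment above does not say how that child is chosen). The class and method names are hypothetical.

```java
import java.util.Random;

class SelectionProtocols {

    // Walk-in: the parent arrives with one child picked at random (assumption).
    // Among trials where that child is a boy, returns the fraction in which
    // the other child is a girl.
    static double walkIn(long seed, int trials) {
        Random rng = new Random(seed);
        int kept = 0;
        int otherIsGirl = 0;
        for (int i = 0; i < trials; i++) {
            boolean firstIsBoy = rng.nextBoolean();
            boolean secondIsBoy = rng.nextBoolean();
            boolean showFirst = rng.nextBoolean();
            boolean shownIsBoy = showFirst ? firstIsBoy : secondIsBoy;
            boolean otherIsBoy = showFirst ? secondIsBoy : firstIsBoy;
            if (shownIsBoy) {
                kept++;
                if (!otherIsBoy) {
                    otherIsGirl++;
                }
            }
        }
        return (double) otherIsGirl / kept;
    }

    // Step forward: every parent with at least one boy steps forward.
    // Among those parents, returns the fraction who also have a girl.
    static double stepForward(long seed, int trials) {
        Random rng = new Random(seed);
        int kept = 0;
        int hasGirl = 0;
        for (int i = 0; i < trials; i++) {
            boolean firstIsBoy = rng.nextBoolean();
            boolean secondIsBoy = rng.nextBoolean();
            if (firstIsBoy || secondIsBoy) {
                kept++;
                if (!firstIsBoy || !secondIsBoy) {
                    hasGirl++;
                }
            }
        }
        return (double) hasGirl / kept;
    }

    public static void main(String[] args) {
        System.out.println("Walk-in with a boy, other is a girl: " + walkIn(1L, 100000));
        System.out.println("Stepped forward, also has a girl:    " + stepForward(1L, 100000));
    }
}
```

With enough trials the first figure should settle near 1/2 and the second near 2/3, which is the distinction being drawn here.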

This example from rec.puzzles.faq makes the 'question step' explicit http://www.faqs.org/ftp/faqs/puzzles/faq

2.3. ==> oldest.girl <== [probability] If a person has two children, and truthfully answers yes to the question "Is at least one of your children a girl?", what is the probability that both children are girls?

The answer is 1/3, assuming that it is equally likely that a child will be a boy or a girl. Assume that the children are named Pat and Chris: the three cases are that Pat is a girl and Chris is a boy, Chris is a girl and Pat is a boy, or both are girls. Since one of those three equally likely possibilities has two girls, the probability is 1/3.

You're welcome to clarify the article, but please keep in mind that the "ambiguity" here is just part of a very general phenomenon affecting all probability problems: giving yourself too much information skews the answer from the textbook result. For example, if I see a two-child family with two boys, then they certainly have at least one boy; nonetheless the probability that they have a girl is 0, not 1/2 or 1/3. I don't get 1/3 because I have more information than is in the statement of the question.
Likewise, in your example, if the parent walks into the room accompanied by a boy, I know that "the child here is a boy", which gives me strictly more information than "one of their children is a boy". With the extra information, it is no surprise that I arrive at a correct probability different from the textbook result.
My point is that the abstract question "In a two-child family, one child is a boy. What is the probability that the other child is a girl?" is unambiguous, even if it can be confusing. When translating it into a real-world situation, we must be careful not to introduce the wrong information, and the "question step" you identify is a good way of doing that. But strictly speaking, it is not necessary to analyze the problem. Melchoir 19:09, 5 April 2006 (UTC)

Independence

both questions are the same.

two independent elements can have one of two values each.

in the first case, one element is "the older child" being "boy", with the question of the probability of the independent, second element "younger child" being "girl".

the second case, one element is "the identified child" being "boy" with the question of the probability of the independent, second element "other child" being "girl"

in both cases the form of the question is identical. in the second question there is no reason to separate 1 boy, 1 girl into 1 boy 1 older girl and 1 boy 1 younger girl -- it is only the unrelated first question that makes this separation seem justified.

Also, the sample space is not correct. let character 1 be elder/younger, and let uppercase signify "identified".

Set = {Bb, bB, Bg, bG, Gb, gB, Gg, gG} each with equal probability.

in problem 1, we are left with: Bb, Bg, and asked for P(Bg) in problem 2, we are really left with: {Bb,bB,Bg,gB}, and asked for P(Bg) + P(gB)

these are just 2 disproofs of this "paradox".

You are making the very common mistake of not realizing that many of the elements in your sample space are degenerate. For any couple that has had two children, there are four equally likely possibilities: they could have a boy and then another boy (probability 1/4), a boy and then a girl (again 1/4), a girl and then a boy (again 1/4) or a girl and then another girl (again 1/4). That is the only sample space that you need to concern yourself with. Your elements Bb and bB are really just degenerate cases of the element "they had one boy and then another boy", and Gg/gG are degenerate representations of "they had a girl and then another girl". Your elements Gb and gB are both really just cases of "they had a girl and then a boy."
Let’s assume that the parents will name their first boy Bob and their second boy Tom, and will name their first girl Jane and their second girl Jill. This should make it clearer that there are only four possible scenarios, each of which has a probability of 1/4:
-Bob has a younger brother Tom
-Bob has a younger sister Jane
-Jane has a younger brother Bob
-Jane has a younger sister Jill
If you try to add any more elements you will only succeed in rephrasing one of the already-existing elements, and will say something like “Tom has an older brother Bob.” However, that element is already accounted for in the original set.

Questioning

Well, this page is busy today. The recent addition is mistaken; the intent of the question is not to identify any boy at all. I can get a reference on that, but for now I'll revert and clean up the language. Melchoir 19:46, 5 April 2006 (UTC)

Small Change

I made a small change to the article to try and clear up a common problem. (One I had myself once.)

This doesn't seem to work

Given two variables, U and K, each of these being a child: one (U) whose gender is unknown to us, and one (K) whose gender is known to us. This is the only difference between U and K; U may be older than or younger than K without any relevance to the case.

U can be either a boy, or a girl. K can be either a boy, or a girl.

The following combinations are possible.

U is a boy, K is a boy. U is a boy, K is a girl. U is a girl, K is a boy. U is a girl, K is a girl.

Since we know the gender of K (in this case, let's assume K is a girl), two possibilities are eliminated: the two possibilities where K is a boy. That leaves us two possibilities, one of which has U as a girl, and one of which has U as a boy. Note again that age is of absolutely no relevance here: U can be either older than or younger than K without changing a thing.

Essentially the problem appears to be the mistaken notion that, because you don't know the ages of the children, both G-B and B-G are possible. However, this is false: if both G-B and B-G are possible, then G-G needs to appear twice: once for G-G where the girl whose gender is known to us is the first G, and once for G-G where the girl whose gender is known to us is the second G.

Just my two cents.--Damian Silverblade 17:44, 13 April 2006 (UTC)

False. G-B and B-G are different, equally likely events that lead to a couple having one boy and one girl.
A couple could have a boy and then a boy, a girl and then a boy, a boy and then a girl, or a girl and then a girl. G-B and B-G are both valid because they both represent valid possibilities. If you had double elements for B-B, you are implying that there's more than one way for a couple to end up with two boys. In fact, there isn't - the only way for a couple to end up with two boys is to have a boy and then another boy. However, there is more than one way for a couple to end up with a girl and a boy - they could have a girl and then a boy, or they could have a boy and then a girl. That is why you have separate elements for B-G and G-B, but only one element for B-B and G-G.

Agreed - as I pointed out earlier how the question is phrased makes all the difference, there is a very strong argument that as the question is *currently* phrased the correct answer is 50/50.

Also agreed (or phrased alternately, the three possibilities listed in the sample space are not equally likely). This article is ridiculous. Can we make a motion for deletion? Topher0128 02:58, 12 July 2006 (UTC)

The article may be phrased badly, but it is indeed a well-known example, and I've seen it discussed in a book describing probability problems from a cognitive perspective. For now, I'll simply add the {{unreferenced}} tag. But it's not a joke. Melchoir 03:04, 12 July 2006 (UTC)

The phrasing of the first formulation of the question says nothing about the set of Boy/Girl being ordered, and yet the sample evaluation assumes that it is important. The maths are correct, just for a different formulation. The second formalization does explicitly order the children (older/younger). The first should read something like

  • In a two-child family, one child is a boy. What is the probability that the oldest child is a girl?

and the samples would be, oldest then youngest, {BB, BG, GB, GG}. GG is not possible, since one is a boy, so three possibilities remain: {BB, BG, GB}. The girl is the oldest in only one of them, so the probability is 1/3.

  • In a two-child family, the older child is a boy. What is the probability that the younger child is a girl?

This one is described correctly. Domhail 04:03, 12 May 2006 (UTC)

Rules of Conditional Probability?

Precisely. For example in a tree diagram: the way the question is phrased, "In a two-child family, one child is a boy. What is the probability that the other child is a girl?", we are surely not only excluding the G-G branch from the conditional tree, we are also assuming that the G-B and B-G branches are identical, as the order in this question is irrelevant; therefore these branches should be merged and the answer remains 1/2. --The preceding unsigned comment was added by 89.145.196.3 (talk • contribs) 15:34, 5 August 2006.

That is not a valid line of reasoning. You might as well say that the probability that a two-child family has two boys is 1/3, since there are three possibilities and the question does not care about order. In probability, there are no rules that deal with such notions as irrelevance and "merging branches". Melchoir 16:59, 5 August 2006 (UTC)
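The objection above can be checked by counting. This is a minimal sketch with hypothetical names, assuming boys and girls are equally likely: it groups the four equally likely birth orders by number of boys, which shows that the three "merged" outcomes are not equally likely.

```java
class UnorderedOutcomes {

    // counts[k] = how many of the four equally likely birth orders
    // (elder, younger) contain exactly k boys.
    static int[] countsByNumberOfBoys() {
        int[] counts = new int[3];
        boolean[] sexes = {true, false}; // true = boy, false = girl
        for (boolean elderIsBoy : sexes) {
            for (boolean youngerIsBoy : sexes) {
                int boys = (elderIsBoy ? 1 : 0) + (youngerIsBoy ? 1 : 0);
                counts[boys]++;
            }
        }
        return counts;
    }

    // P(one boy and one girl | at least one boy), using the counts above.
    static double mixedGivenAtLeastOneBoy() {
        int[] counts = countsByNumberOfBoys();
        return (double) counts[1] / (counts[1] + counts[2]);
    }

    public static void main(String[] args) {
        int[] counts = countsByNumberOfBoys();
        System.out.println("No boys:  " + counts[0] + " of 4");
        System.out.println("One boy:  " + counts[1] + " of 4");
        System.out.println("Two boys: " + counts[2] + " of 4");
        System.out.println("Mixed given at least one boy: " + mixedGivenAtLeastOneBoy());
    }
}
```

The merged "one of each" outcome carries two of the four birth orders, so it keeps twice the weight of "two boys" after the G-G case is removed.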
"In a two-child family, one child is a boy. What is the probability that the other child is a girl?" Technically, the answer to this question depends on how the information "one child is a boy" was obtained. A precise, unambiguous question would be:
  • Suppose that, from all families with exactly two children who are not twins, you select one parent at random, and you ask the parent: "Is at least one child a boy?" If the parent answers, "Yes," what is the probability that both children are boys?
I posted this version in the article on 4 Dec 2007, and it was deleted the same day by Dorftrottel. Italus (talk) 23:34, 4 December 2007 (UTC)

Coin Examples

(A) If I throw 2 coins and let you see one, have I given you any information about the 2nd (hidden) coin? - Obviously not, its probability of heads or tails remains .5/.5

(B) If I throw 2 coins and look at them, you ask me whether there is at least one head, and I answer truthfully 'Yes' and show you that coin.

We've now arrived at the subtle (and counter-intuitive) case where the probability that the other coin is a tail is now 2/3. The reasoning is explained in the article page and can be verified using a simple computer program (or indeed throwing coins yourself)

But again the actual 'questioning step' is critical in differentiating (A) & (B) and is missing from the article page.

--Pajh 21:33, 21 April 2006 (UTC)


Ok, I just ran through several thousand simulations of B: two random tosses.
check if either of the two is a head.
in cases where one is a head, check if there is a tail present.
the probability comes out as 0.50

If you disagree with my method, can you correct me?

--Wes , 26 September 2006

What happens if you increase the number of coins in your program to ten? If you still get 0.5 I think there's something wrong with your program. If you get something else, please make a diagram of all the probabilities you find for two coins up to ten coins. Does the diagram make sense? INic 20:53, 26 September 2006 (UTC)

Java Code solution

public class Coins {
    
    public static final boolean HEADS = true;
    public static final boolean TAILS = false;
    
    public Coins() {
    }
    
    public static final void simulate() {
        
        java.util.Random generator = new java.util.Random();
        int pairheads = 0;   // trials in which both coins are heads
        int tailpresent = 0; // trials with at least one head and also a tail
        for (int count=0;count < 10000; count++) {
            
            boolean coin1 = generator.nextBoolean();
            boolean coin2 = generator.nextBoolean();
            
            if (coin1 == HEADS || coin2 == HEADS) { // At least one head
                if ( coin1 == HEADS && coin2 == HEADS) // Both heads
                    pairheads++;
                else
                    tailpresent++;
            }
            
        }
        int total = pairheads + tailpresent;
        System.out.println("Pairs of heads = " + pairheads );
        System.out.println("Tail present = " + tailpresent );
        System.out.println("Total = " + total);
    }
    
    public static void main(String[] args) {
        simulate();
    }    
}

Output

Pairs of heads = 2492
Tail present = 5034
Total = 7526

--Pajh 10:13, 28 September 2006 (UTC)
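The suggestion above of going from two coins up to ten can also be checked exactly rather than by simulation. The sketch below uses made-up names and assumes fair, independent coins. It enumerates all 2^n outcomes for n coins and computes the probability that a tail is present given that at least one head is present.

```java
class ManyCoins {

    // Enumerates all 2^n equally likely outcomes of n fair coins
    // (bit set = head) and returns P(at least one tail | at least one head).
    static double tailGivenHead(int n) {
        int all = 1 << n;
        int withHead = 0;
        int withHeadAndTail = 0;
        for (int mask = 0; mask < all; mask++) {
            boolean hasHead = mask != 0;       // some bit is set
            boolean hasTail = mask != all - 1; // some bit is clear
            if (hasHead) {
                withHead++;
                if (hasTail) {
                    withHeadAndTail++;
                }
            }
        }
        return (double) withHeadAndTail / withHead;
    }

    public static void main(String[] args) {
        for (int n = 2; n <= 10; n++) {
            System.out.println(n + " coins: " + tailGivenHead(n));
        }
    }
}
```

For n coins this works out to (2^n - 2) / (2^n - 1), so two coins give 2/3 and the value climbs towards 1 as coins are added. The simulated run above (5034 of 7526, about 0.669) agrees with the two-coin figure.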

THIS ARTICLE IS LIES!!!!!

i dont understand how it can be 2/3. its 1/2!!! i swear this article is completely wrong someone delete it pls.

just because there are 3 possibilities doesnt mean a 1 in 3 possibility. someone driving by my house can either shoot me or not, therefore there's a 50:50 chance the next car will shoot me. K. I honestly wouldnt be surprised if everyone whos contributed to this article is completely wrong. The intellectual talent of this place is about a 7th grade level. Seriously ive had to completely rewrite some engineering articles coz of the morons here.

oh wait i get it. its like the monty hall dealie.

YES BUT NO

The 2-child family may be either : 2 boys (p = 1/4), 2 girls (p = 1/4), 1 boy and 1 girl (p = 1/2).

The rule is :

p (A / B) = p (A and B) / p (B)
Probability that A is true if B is true = Probability that A and B are true at the same time / Probability that B is true

The statement for A is clear :

A = "One child is a girl"

There are 2 different statements for B :

Case 1. B = "(I know one of them,) he is a boy" -> p = 1/2 (probability for him to be a boy)
Case 2. B = "(I know that) at least one of the two of them is a boy" -> p = 3/4 (probability that a 2-child family has at least one boy)

Thus, 2 different statements for (A and B) :

Case 1. (A and B) = "The one I know is a boy, the one I don't know is a girl" -> p = (1/2).(1/2) = 1/4
Case 2. (A and B) = "One is a boy, one is a girl" -> p = 1/2

And 2 different results :

Case 1. p (A / B) = (1/4) / (1/2) = 1/2 (= p (A) actually, B has no influence)
Case 2. p (A / B) = (1/2) / (3/4) = 2/3

Case 1 sounds ok to me. In case 1, we don't need to know that it's a 2-child family. I still find Case 2 very disturbing. The thing is : a simple, almost automatic, deduction leads from a "Case 1" B to a "Case 2" one. I wonder, is it good to know too much about something ?

--[Strahd] 5:35, 12 August 2006 (Orléans, France)

I guess the simple answer is that you can't use deduction in these problems. Melchoir 17:29, 12 August 2006 (UTC)
I agree --[Strahd] 19:57, 12 August 2006 (France)

YES BUT NO (II)

Trees !

Case 1 :
First, the child we know, then the other child : BB, BG, GB, GG.
The child we know is a boy : BB, BG.
The other one is a girl : BG.
Probability : 1/2
Case 2 :
First, the elder, then the younger : BB, BG, GB, GG.
One child is a boy : BB, BG, GB.
The other one is a girl : BG, GB.
Probability : 2/3.

In Case 2, I'd use the word "frequency" rather than "probability".

--[Strahd] 6:17, 13 August 2006 (Orléans, France)

Similar question

My parents only got two children. I'm a man. What is the probability that I have a brother? In other words, what is the probability that my sibling has the same sex that I have? INic 10:32, 25 August 2006 (UTC)

Let's look at the sample space, denoted by {You, Sibling}. Originally, the sample space is {B,B}, {B,G}, {G,B}, and {G,G}. Now that we condition on the fact that you (emphasis intended) are a man, the space has two elements: {B,B} and {B,G}. Thus 50%
The key as to why the 1/3 did not work is that you specified which ("I") was the man. If you had said "I note that (at least) one of my parents' two children is a man", then the 1/3 is correct.
But note if you said "One of my parents' two children picked at random is a man", then we're back to 50% again. Bayes' theorem can show this, but the heuristic here is to note the four possibilities above, and to pick one of the four "B"s at random. You are twice as likely to pick one from the {B,B} group as either from the other two groups (50%, 25%, 25% respectively). Baccyak4H 20:08, 5 September 2006 (UTC)
An error is in your first statement. It should be: Now that we condition on the fact that you (emphasis intended) are a man, the space has 3 elements: {B,B} and {B,G} and {G,B}. Thus still 33%. [I do not know whether you are the younger or the older]. --Tauʻolunga 23:37, 5 September 2006 (UTC)
No, there is no error. The notation used was {You (INic), Sibling (of iNic)}. Since INic is a man, the {G,B} outcome is indeed ruled out. One is not only limited to age order to distinguish the two. The 50% situation could occur if you said "the taller one is a boy", or "the one whose first name comes first alphabetically is a boy", etc., or in this case, "the one with the Wikiname INic is a man."
The way to distinguish the two cases is: The 50% scenario occurs whenever the original statement about being a boy allows that if one knew only that and then were to meet the two children, and if the two children were indeed both boys, then it would be possible in principle to tell which of the two boys was referred to originally. Thus the older one, or "Aaron", or the taller one, or INic, or... (of course, if one was a girl the distinction is trivial). You might need some more info (ages, names, etc.), but in principle you can determine which one was mentioned. If this distinction is impossible in principle ("one is a boy"), then we are in the 1/3 case. Note the potential for ambiguity: when one says "one is a boy", it is easy to picture the problem poser looking at one particular child at random and making such an observation. This is not supposed to be the case, and is the reason for much of the language discussion/disambiguation in the article. Which is why I added "at least" to "one [child is a boy]."
But don't worry; there is a reason this paradox topic is included in WP: there is some very important subtlety that is not always easy to grasp. Baccyak4H 03:21, 6 September 2006 (UTC)

OK, let's say that when I say I'm a man, I only by that mean that "I note that (at least) one of my parents' two children is a man." Then the probability that I have a brother is 1/3 you say, right? But say that I by I'm a man only mean "My name is INic and that's a male name". Then the probability that I have a brother is 1/2 according to you, correct? Is the probability really dependent on how I say that I'm a man? INic 01:28, 7 September 2006 (UTC)

Of course it is. If you say "both of my parents' children are men" then the probability that you have a brother is 1. Melchoir 01:32, 7 September 2006 (UTC)
"Is the probability really dependent on how I say that I'm a man?"
Your rephrasing of your original statement does not say "you" are a man at all. "I note that (at least) one of my parents' two children is a man." allows for you being a woman and having a brother.
So long as one particular individual is identified somehow as being the man, in this case, the writer here at 01:28, 7 September 2006 (UTC) we are in the 1/2 case. In the case where you had a brother, I could tell in principle which man was you because you would be the one posting here at that time. But if you had said "I note that (at least) one of my parents' two children is a man.", there is no information there even in principle to distinguish whether you were referring to yourself or your brother, in that case.
The probability is dependent solely on whether one particular individual is described as "the man". If so, the prob. is 1/2, since we now can speak of the other particular individual. The chance that any particular individual (and the "other" particular individual in particular ;-)), is a man, is 1/2. The description can be of any sort, but it has to identify (somehow) a particular individual.

Aha, I think I see what you mean: the siblings should be viewed as an ordered pair according to some ordering criterion. Which criterion we use doesn't really matter, we can use whatever we want, right? Well, this is wrong I'm afraid. It does in fact matter what ordering criteria we use. INic 15:46, 8 September 2006 (UTC)

For example, you say that The 50% situation could occur if you said "the taller one is a boy". This is clearly false as it's far more likely that the boy is the tallest one in families with mixed siblings. (For adult siblings with common parents I'd guess this rate is close to 100%.) The property of being the tallest is simply not independent of gender. This means that if you know that the tallest one is a man the probability that he has a sister is close to 2/3. INic 15:46, 8 September 2006 (UTC)

In the example we discuss the ordering criterion is "the sibling that stated the question above." And I'm not sure if the event that I stated my question above is independent from the fact that I'm a man. Your answer implies that it's totally uncorrelated. I don't think it is. INic 15:46, 8 September 2006 (UTC)

You are right about my height example. Strictly speaking, the assignment of any criterion used to identify one individual needs to be independent (probabilistically) of gender. I suppose one could claim that almost any criterion used would fail to meet this requirement in the strictest, most exacting sense (e.g., the distribution of child names). But many can be close, and if we state the problem making the obvious approximation, all is well.
About independence of your posting (verb) and your gender, I refer to the above technicality and say yes, if men are more (or less) likely to edit WP than women, then they are not independent. But this is starting to become nitpicky. Please assume probabilistic independence, and again, all is well.
"All models are wrong. But some are useful" - George Box
Baccyak4H 16:31, 8 September 2006 (UTC)

Nitpicky or not, independence makes all the difference, as you now somewhat reluctantly admit yourself. But the problems don't end here I'm afraid. How would we test if the ordering in question is independent of gender or not? You say we should "assume probabilistic independence," but why? We could as well assume probabilistic dependence, right? INic 10:54, 11 September 2006 (UTC)

To test independence you propose to estimate how often WP is edited by men. That is a good estimate if someone picked me at random from any WP contribution and asked me if I had only one sibling. But that wasn't what happened. Instead I picked myself. From what group I picked myself I have no idea! And yet we need to know the group to have a probabilistic model. It seems that whatever model we choose it's as far from the truth as any other... INic 10:54, 11 September 2006 (UTC)

"independence makes all the difference" -- Not necessarily, one could have "mild" dependence (or more specifically, correlation), of some types, and still observe that having an individual identified in principle changes the probability of the genders for the pair. They may not be exactly 1/3 vs 1/2, but they would be different. But this inaccuracy is no more problematic than (say) using a gaussian model for data rounded to say three decimal places. Strictly speaking this is absurd. The entire sample space has probability zero. But you know what? In many many situations, it works! Imagine that.
Yes, all values between 1/3 and 1/2 are possible depending on how strong the correlation is. This inaccuracy is problematic here just because it allows for the whole range of possibilities. It's not at all obvious which value between, and including, 1/3 and 1/2 should be the correct answer. INic 02:20, 16 September 2006 (UTC)
"you propose to estimate how often WP is edited by men" Um, I proposed no such thing. I am indeed enlightened that you draw that conclusion.
I'm sorry if I misinterpreted you. May I ask you to enlighten me what you meant by talking about how common it's that men edit WP in this context, and how that fact is connected to the question of independence according to you? INic 02:20, 16 September 2006 (UTC)
"But that wasn't what happened." Of course not. The original problem can be described by a $\{\Omega, \mathcal{F}, P\}$, while yours by a particular (and now degenerate) $\omega \in \Omega$.
Yes correct, to talk about probabilities we must have a probability space defined $\{\Omega, \mathcal{F}, P\}$. However, all we have here is $\{\omega, \mathcal{F}, P\}$ where we have no idea to what $\Omega$ our $\omega$ belongs. The conclusion must be that, not only is the probability lost somewhere between 1/3 and 1/2, we are not even allowed to talk about probabilities in this case as the probability space is undefined. INic 02:20, 16 September 2006 (UTC)
Just like if one were to look, in the original problem, at what the gender of the sibling is, and after observing it, asking what it was. It being a boy happens with either probability 1, or 0. But (as you know) this is a different problem, so of course the answers could be different; no problem there.
No, this is not the same situation. INic 02:20, 16 September 2006 (UTC)
Anyway, onwards...I am a golfer, not a fisherman... Baccyak4H
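The effect of dependence discussed in this thread can be put into a one-line formula under a simple, assumed model. Let q be the probability that, in a family with one boy and one girl, the identifying criterion (taller, poster of the question, and so on) picks out the boy. The parameter q and the names in the code are hypothetical and are not taken from the article. With boys and girls equally likely, P(brother | identified child is a boy) = (1/4) / (1/4 + q/2) = 1 / (1 + 2q).

```java
class IdentifierDependence {

    // q: assumed probability that the criterion identifies the boy in a
    // boy-girl family (q = 0.5 means the criterion ignores gender).
    // Returns P(sibling is a boy | identified child is a boy).
    static double brotherGivenIdentifiedBoy(double q) {
        double twoBoys = 0.25;             // identified child is certainly a boy
        double mixedAndBoyPicked = 0.5 * q; // mixed family, criterion picks the boy
        return twoBoys / (twoBoys + mixedAndBoyPicked);
    }

    public static void main(String[] args) {
        double[] qs = {0.5, 0.6, 0.75, 0.9, 1.0};
        for (double q : qs) {
            System.out.println("q = " + q + " -> " + brotherGivenIdentifiedBoy(q));
        }
    }
}
```

With q = 0.5 this gives 1/2, and with q = 1 (the criterion always picks the boy, as in the height example) it gives 1/3, so every value in between is reachable as q varies. A criterion that favoured girls (q below 0.5) would push the figure above 1/2 under the same model.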

OK, let's look at it differently.

Suppose, in a particular town, it is the fashion for parents to keep their babies' first pairs of bootees; and it is also the tradition that bootees for boy babies are blue, bootees for girl babies are pink.

We ask the mothers of all two-child families to enter a room, and bring the first pair of bootees for each of their children.

Now, we ask all mothers to hold up a pair of blue bootees. Those that do not have blue bootees to hold up are asked to leave the room.

Next, we ask all mothers to hold up a pair of pink bootees. The question is, what proportion of mothers still in the room do hold up a pair of pink bootees?

The answer is two thirds.

I think this is equivalent to the original question.

Tim

Yes, it is. Baccyak4H 04:50, 8 October 2006 (UTC)

The question as phrased (even after a change) is still wrong. Wrong as in the answer to both questions as phrased is 50/50. The problem is still the phrasing 'at least one boy'. This still doesn't make it clear that an explicit step has been carried out excluding GG.

For example, I know my neighbour has 2 kids, she sends one round and it's a boy. Now I know she has 'at least one boy', what's the probability that the other child is a girl/boy = 50/50!

Read the rec.puzzles.faq; this has been done to death and the questioning step is a necessity. Otherwise change both answers on the main page to 50/50, as it is currently WRONG.

2.3. ==> oldest.girl <== [probability] If a person has two children, and truthfully answers yes to the question "Is at least one of your children a girl?", what is the probability that both children are girls?

http://www.faqs.org/ftp/faqs/puzzles/faq Pajh 15:46, 15 October 2006 (UTC)

Your discussion makes it clear why this article is needed to cover this paradox in the first place. It's easy to let one's intellectual guard down.
Simply put, your analysis is wrong. While you know your neighbor has at least one boy, you also know more than that. You know the particular child sent around to you is a boy. Thus, while you are indeed looking at the 50% scenario, it is not the same as the "at least one boy" scenario, as written (and intended).
The 2/3 answer to the bootee formulation is indeed correct (do a simulation if you want). The fact that those two scenarios are indeed different, over what might appear to be a very minor point, is why this is a paradox. Baccyak4H 18:33, 15 October 2006 (UTC)
I would like to comment on the question posed earlier in this chat.

"My parents only got two children. I'm a man. What is the probability that I have a brother? In other words, what is the probability that my sibling has the same sex that I have?"

I believe that this question is more like saying the elder child is a boy. Saying that "I am a man" is referring to a specific child and not to the other; the probability that the other child is a boy is 1/2. Saying that one of the two children is a boy could apply to either child, so the probability that both are boys is 1/3 for reasons stated on the main page. So I believe that the statement of this particular question does not lead to the latter of the two questions stated on the main page.

I'm new to Wikipedia, so I'm sorry if my etiquette is out of line. Thanks. blahb31 Blahb31 19:53, 24 December 2006 (UTC)

Rebuttal to Solution

Looking at the original problems (and original solutions), I have no difficulty understanding the first scenario. In the first problem, time is introduced as an initial condition. It says 'the older child is a boy'. Therefore it is acceptable to list the scenarios as:

(BB, BG, GB, GG)

This is because (using time only) the possibilities are: 1. a boy was born and then a boy was born 2. a boy was born and then a girl was born 3. a girl was born and then a boy was born 4. a girl was born and then a girl was born. Since you are acknowledging that the first child is a boy, you select the scenarios with B as the first letter, and that leaves you with a 50/50 chance of the second child being a girl.

For the second scenario, you have eliminated time from the initial condition (no one knows whether the boy is older or younger). You have two choices here, either to introduce time, or keep everything timeless. If you introduce time, you have the following scenarios (as previously used, the first letter will be the older child, and the letter that is upper case is the child that has been randomly mentioned in the statement 'has at least one...'):

(Bb, bB, Bg, bG, Gb, gB, Gg, gG)

The reason there are 8 scenarios here instead of 4 is the words 'has at least one...'. Since in problem 2 we are randomly pointing to one child, we have to include all possibilities of pointing as well as timing, which increases our scenarios from 4 to 8. The reason this was not done in problem 1 is because we already knew what we were pointing at (the first child is a boy). We could use these same 8 scenarios in problem one, but the information from the problem simply reduces it in the same way to the solution 50/50.

Back to problem 2: After selecting the scenarios where there is an uppercase B (as mentioned in the problem), we are left with:

(Bb, bB, Bg, gB)

This leaves the chances of having a girl at 50/50, which is counter to the original solution to problem 2. If you wish to keep time removed from the situation, then the order of birth does not matter. Therefore the highest possible scenarios are that the parents will have 2 boys, 2 girls, or one of each.

(BB, BG, GG)

Again, since problem 2 says 'has at least one boy', we must remove the scenarios without a B, and so we are left with:

(BB, BG)

This again shows that the solution to problem 2 is 50/50, but is reached by keeping the problem independent of time. In conclusion it is important to note that the difference between the two problems is the inclusion of time and the inclusion of acknowledging a random child, which then dictates how you approach the solution. Either always include time, or never include it, but mixing it up will give you skewed answers, such as the 2/3. For example in the original solution to problem 2, only one selection was given for BB. This is an error because the problem says 'has at least one boy', which could be addressing either the first boy, or the second boy. Therefore you must include those possibilities in your scenarios, giving us the 8 listed above. This is why the original solution to problem 2 is flawed. AFpilot157 12:34, 31 October 2006 (UTC)

Hi AFpilot157. I'm glad you are thinking about these problems for yourself. Working on them is exceptionally helpful, and will increase your understanding of probability. However from the point of view of the encyclopedia I can assure you that the solution given in the article is the correct one. There are many sites that discuss mathematical probability and I would strongly suggest that you post your comments on one of those. They will be more than willing to discuss them. DJ Clayworth 21:10, 31 October 2006 (UTC)

Thank you for your response. Yes, I have studied probability at times, and I just thought I would throw my two cents in on this interesting problem. In all honesty, I would have no problem accepting the fact that I am wrong, but I truly would like to see in what way I am wrong. I know it is a long shot, but is it possible for the encyclopedia to be wrong? All I ask is for a counter argument, one that takes it all into perspective (which is interesting, because probability at times can change based on perspective). To further illustrate my point for problem 2 of the Boy or Girl scenario, I will go into more detail into what I mean by my solution. If we say that 'a family has 2 children, and has at least one boy' then we have to either introduce age into the scenario (giving us 8 possibilities), or remove age from the scenario (giving us 3 possibilities). In either case, the end result becomes 50/50. My question is about the following statement from the original solution:

"The main reason is that the second question does not assume anything about the age of the boy, he might be the older and he might be the younger sibling. Therefore the loose thought that there are only 3 possibilities (2 boys {BB}, 2 girls {GG} or a mix) does not take into account that the latter is twice as likely than the formers, because it can be either {GB} or {BG}."

Why is the latter (one of each) more likely than BB or GG? By accepting both orientations (GB and BG) is that not the same as putting them in order of birth? Since the children had to come one after the other (assuming no twins), you are saying that either the first one is a boy and the second one is a girl, or the first one is a girl and the second one is a boy. By doing that, are you not introducing age (time) into a problem that does not have age or time associated with it? I can see how many people would think that 'one of each' is twice as likely as 'two of a kind', but I believe there is a flaw in that thinking. The flaw is that we cannot look at this scenario like we look at genetics of plants and animals. Going back to an old high school lesson about tall plants and short plants, they always talked about TT, TS, ST, SS, when it comes to genetics, and the probability of getting a particular combination. The thing here is that when it comes to genetics, the tall and short traits joined all at the same time, with no particular order in which one came first. TS and ST were twice as likely as TT and SS, because everything happened all at once, not in a linear sequence like the birth of children (also, both parents of plants and animals contributed a T or an S to the offspring, unlike the B or G of children). Since time is naturally ingrained in this problem of boy and girl (because they occur one after the other), if we want to truly remove time from the situation, there really are only 3 choices: 2 boys, 2 girls, or one of each. If you take the set listed in the original solution (BB, GB, BG), the words come out as: (one boy and then one boy, one boy and then one girl, one girl and then one boy). No matter what, you cannot truly remove time from the situation (without reducing yourself to the 3 sets I listed in my solution), because the birth of children (which is sequential in this problem) is a fixed trait of the problem.
AFpilot157 21:07, 31 October 2006 (UTC)

It doesn't really matter what variable you use to distinguish the two genders (time or otherwise); the time of birth is simply a tool used to apply the calculus of conditional probability to find the correct answer. It turns out that to enumerate the sample space you will have to make some distinction anyway, somehow, in the sense that writing {BG} states that B comes after "{" and G comes before "}". Seems pedantic (maybe it is; I could do better with more time maybe), but try simulating the experiment: flip two coins 50 (say) times. You will get about twice as many flips where the two coins differ as flips where they are both heads (or both tails). Here note that the "times" the two coins become readable is arbitrary, and can be assumed to be irrelevant. DJ Clayworth was right; the explanations here are, as counterintuitive as they may seem, correct.
For an example which operates under the same principle, consider a deal in the game of bridge. The game typically has some suit distribution among the four players where each has between 1 and 5 cards of any suit in their hand. No one would a priori be suspicious of that, yet the same logic that says that two boys is one outcome as is one of each gender, and no probabilistic difference can be inferred, would have you believe that a perfect deal (each player gets 13 cards of the same suit) is just as likely as whatever distribution of cards one typically sees. But do we ever see a perfect hand? Baccyak4H 03:37, 1 November 2006 (UTC)

I now understand exactly what you mean with your coin flip example. It is true that the solution seems counterintuitive, but it becomes easily acceptable once it is understood. It is only counterintuitive because at first glance we do not see a difference in what problem 1 and problem 2 are asking us. The fact is that there is a difference, however slight, in the wording that changes the whole meaning of the problem, and doing the mathematics/experiment can aid us in understanding that. AFpilot157 07:00, 1 November 2006 (UTC)
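
For anyone who would rather not flip real coins, the experiment suggested above can be sketched in Python. This is only an illustration, assuming two fair and independent coins; the function name flip_pairs and its seed parameter are made up for this sketch and do not come from the article.

```python
import random

def flip_pairs(trials=100000, seed=None):
    """Flip two fair coins `trials` times and count the outcomes.

    Hypothetical helper for illustration; assumes fair, independent coins.
    """
    rng = random.Random(seed)
    counts = {"both heads": 0, "both tails": 0, "mixed": 0}
    for _ in range(trials):
        first = rng.choice("HT")
        second = rng.choice("HT")
        if first != second:
            counts["mixed"] += 1
        elif first == "H":
            counts["both heads"] += 1
        else:
            counts["both tails"] += 1
    return counts

counts = flip_pairs(trials=100000, seed=1)
print(counts)
# Among flips with at least one head, the share that also shows a tail:
print(counts["mixed"] / (counts["mixed"] + counts["both heads"]))
```

With enough trials the mixed outcomes should come out roughly twice as common as double heads, so the last figure should settle near 2/3 rather than 1/2.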

Two-stage game

There is nothing odd about this. It is basically just a two-stage game. In the first example the outcome at stage 1 is still to be determined. Hence the outcome is 1/2. In the second example the outcome at stage 1 has been determined, hence reducing the set of possible outcomes. What happens at stage 2 is independent (or assumed to be in the paradox) of what happened at stage 1. The confusion arises from the fact that the puzzle does not mention in which order the boy and girl are born. Assuming that the boy had been born first, the possible outcomes at stage 2 would be {BB} or {BG}, which would result in a 1/2 probability just as most people would assume. It is a trick question; the secret lies in the fact that the order has not been determined. It is as easy as that. No need for any long explanation. This is not original research, it is a well-known fact from simple game theory. It is a classic example of the difference between a 1 stage game and a 2 stage game. I am removing the original research tag. MartinDK 10:18, 10 November 2006 (UTC)

What's actually going on

The first situation: In a family with 2 children, the older one is a boy. So, you set up the list using the variable B first, to represent that he is older. Your possibilities are then {BB BG}. Therefore, there is only a 1/2 probability that the other child will be a girl.

The second situation: In a family with 2 children, one is a boy. Now, you have to set this list up the same as in the previous situation, or there is no way to compare them. Let's first look at all the possibilities: {BB BB BG GB GG GG} Why are there two {BB}'s and two {GG}'s? Because you know the gender of the child but not the age. You can have two boys, but there are two possibilities when you have that. The boy that is known can be older, or he can be younger. And since you don't know, you have to consider both of the options. Since you know that one of the children is a boy, you can eliminate the two {GG}'s. This leaves you with {BB BB BG GB}. The probability that the other child is a girl is 2/4, which is the same as 1/2. —The preceding unsigned comment was added by 168.103.88.39 (talk) 00:15, 9 December 2006 (UTC).

In analyzing the first situation, you have defined the first letter to be the oldest child. But you have violated that definition in the second situation by allowing one of the two "BB"s to have the youngest first. So your analysis is invalid (and in fact wrong): there should only be one "BB" in the second scenario, but we do not know whether the first (oldest) or second (youngest) was referred to. Thus the answer to the second situation is 1/3 and not 1/2. Baccyak4H (Yak!) 02:34, 9 December 2006 (UTC)
Actually, no (in response to the immediately preceding response from Baccyak4H). Consider tagging the second situation X_s to denote which specific individual was seen. This gives the set of possible combinations: {BB_s B_sB B_sG GB_s GG GG}. Note that the only two options that have no _s must be removed; they are inadmissible because the one thing we really do know is that we've seen something. This leaves {BB_s B_sB B_sG GB_s}. And exactly half of these have girls. P(G)=1/2. Al —The preceding unsigned comment was added by 82.41.204.120 (talk) 15:16, 13 December 2006 (UTC). UberPuppy 15:24, 13 December 2006 (UTC)
Amendment of the above, for completeness: technically {BB_s B_sB B_sG GB_s GG GG} should be {BB_s B_sB B_sG GB_s GG GG GB BG} where you DID NOT see the boy in the last two, but they are necessary for the identical question asked from the perspective of seeing a girl. In other words, before you know which gender you saw, there are actually 8 candidate combinations, and then the 4 inadmissible ones are removed once you know which gender you saw. UberPuppy 15:24, 13 December 2006 (UTC)
(changing your notation for convenience) {BBs BsB BsG GBs} is a correct enumeration of the sampling space. But in doing it that way, the four elements no longer have equal probability; the first two have exactly half the probability of either of the last two (remember {BB BG GB} have equal probs; you just split BB into equal halves). If you add up the relevant probs, you will get 1/3. Baccyak4H (Yak!) 15:45, 13 December 2006 (UTC)
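
The weighting described just above can be written out with exact fractions. This is only a sketch of that bookkeeping, assuming BB, BG and GB are equally likely once GG is ruled out; the dictionary name weights is made up for the illustration.

```python
from fractions import Fraction

# Conditional on "at least one boy", BB, BG and GB each have probability 1/3
# (older child written first). Tagging which boy was "seen" splits BB in two.
weights = {
    "BBs": Fraction(1, 6),  # two boys, younger one tagged
    "BsB": Fraction(1, 6),  # two boys, older one tagged
    "BsG": Fraction(1, 3),  # older boy, younger girl
    "GBs": Fraction(1, 3),  # older girl, younger boy
}

# The four tagged outcomes still form a complete sample space.
assert sum(weights.values()) == 1

p_two_boys = weights["BBs"] + weights["BsB"]
p_girl = weights["BsG"] + weights["GBs"]
print(p_two_boys, p_girl)  # prints: 1/3 2/3
```

Counting the four tagged outcomes as equally likely is what produces 1/2; keeping their unequal weights gives 1/3 for two boys.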

A correction to the "Mistakes" section

The following is a quote from the mistakes section:

The error here is that the first two statements are counted double. We do not know which brother is the older, as that was not stated in the question. Call the brothers Tom and Harry.

1. Harry has an elder brother Tom
2. Tom has a younger brother Harry

The second statement repeats the first and therefore should be removed.

The logic is flawed here. It assumes that whoever was making the argument knew the ages when making the argument, which makes no sense. I have a better version of that argument:

In this situation, let's call the known-male child Jeff and call the unknown child Pat. There are four possibilities:

Jeff has an older brother named Pat

Jeff has a younger brother named Pat

Jeff has an older sister named Pat

Jeff has a younger sister named Pat

Each is equally likely to happen, and the "statement that repeats the first and needs to be removed" is nowhere to be found. Therefore, it can only be half. E946 12:12, 4 January 2007 (UTC)

Fundamental flaw

Although I understand both parties' reasoning, the following fundamental flaw exists in the problem:

Let us work from the assumption that there are four possible permutations for the siblings' gender: BB BG GB GG

According to the most simple interpretation of the problem, if you go to a park and ask boys if they are from a two-child family, those answering 'yes' have a 2/3 chance of having a sister, because only the GG probability is eliminated.

Now, by asking the boy whether he is the oldest or youngest child, his chances of having a sister change to 1/2, regardless of his answer, because either BG or GB will be eliminated along with GG.

While both reasonings are mathematically correct, only the 1/2 solution has any bearing on reality. —Preceding unsigned comment added by 196.216.16.10 (talk • contribs) 14:52, February 22, 2007

Nope, the 1/2 solution has no basis in reality. The arguments presented are clear. — Arthur Rubin | (talk) 15:23, 22 February 2007 (UTC)
In the park version, the correct answer is 1/2, even before you ask about age. When we list the four possibilities BB BG GB GG, we assume there is a way to define which child we mention first, and which one we mention last. In the original problem, it's natural to do this based on age. In the park problem, it is more natural to mention (say) the boy you ask first, and his sibling last. Then, of the four possibilities BB BG GB GG, two can be discarded as soon as we observe the gender of the boy, leaving us with BB and BG, and a 50% chance the sibling is a boy. The important thing is symmetry arguments, and there is no symmetry argument making 196.216.16.10's three possibilities BB BG GB equally likely in the park version.
If someone can express this more clearly, I'll be very happy to read it here!--Niels Ø (noe) 19:50, 22 February 2007 (UTC)
I think the fundamental problem is that the 1/2 people are looking at that individual family, while the 2/3 people are trying to look at two-child families as a whole. That other child has a 1/2 chance of being a girl no matter how many other families there are being tested. E946 03:53, 7 March 2007 (UTC)

Simple Explanation of Flaw in Logic

There is a 50% chance that the child who is known to be a boy is either older or younger. This represents itself as [B,x] or [x,B]. There is a 50% chance that the unknown child is a girl. This represents itself as ([B,g] or [B,b]) or ([g,B] or [b,B]). There are two equal chances for there to be two boys. They do not equal one another and one should not be thrown out. The Harry statement should be as follows:

1. Harry has an elder brother Tom
2. Harry has a younger brother Tom

These two statements are not the same and should not be thrown out.

Here is a Python script that correctly calculates the percentage that will be girls.

import random

def boyorgirl():
    # Boy == 1
    # Girl == 0
    boys = 0
    girls = 0
    pairsOfChildren = 10000
    for k in range(pairsOfChildren):
        children = [-1, -1]
        # Place the known boy in a random position (older or younger).
        boyPosition = random.randrange(0, 2)
        otherPosition = 1 - boyPosition
        children[boyPosition] = 1
        # The other child is a boy or a girl with equal probability.
        children[otherPosition] = random.randrange(0, 2)
        if children[otherPosition] == 0:
            girls += 1
        else:
            boys += 1
    print("If one child is a boy, what is the likelihood that the other child is a girl?")
    print("Testing a sample of " + str(pairsOfChildren) + " pairs of children reveals that " + str(girls) + " will be girls and " + str(boys) + " will be boys.")
    print("The percentage of girls is " + str(float(girls) / (girls + boys) * 100.) + "%.")

AlexEagar 02:57, 21 March 2007 (UTC)

I agree the explanation in the section "Mistakes" in the article is rather weak, but I cannot see what you want to replace it by, or whether you want to change the conclusion too. I do not read Python. You may have misunderstood the logic in that section: We cannot give "the boy" a name - we're not told about a particular boy in this problem; we're just told they have at least one boy. I will not vote below; I think you should clarify the change you have in mind.--Niels Ø (noe) 09:48, 21 March 2007 (UTC)
Either you take into account the position of the other child in which case you have to take into account the known child {Bg, Bb, gB, bB} or you don't take into account the position of the other child {Bg, Bb} but you can't just choose to take into account the position of the other child if it is a girl, but not if it is a boy {Bg, gB, Bb}. If you are going to allow for {Bg, gB}, you have to also allow for {Bb, bB} which are not the same. Really the age doesn't matter. There is one position that is already filled with a boy and there is one position open for either a boy or a girl and that second position has a 50% chance of being either a boy or a girl regardless of who is older or younger. And that's not mentioning the fact that the worldwide ratio of boys/girls is 1.05 at birth and for children under 15 and 1.03 for people 15-64 which further throws off any estimates of actual children. And if you are not comparing children, but simply the statistics, then there is no question about age, or which occurred in which order, all you know is that given two possibilities {x, x}, one is known, be it {B, x} or {x, B}, which is really the same, and one is not known. Given one variable at 50%, you only have 50%. All the possibilities of children if you don't take age into account are {BB, BG, GG}. This is the same as {BB, GB, GG} because age is not a factor, thus which came first is not a factor. Take out the GG and you've got {BB, GB} or {BB, BG} which is the same thing because age is not a factor. Only existence is a factor. The only way for age to become a factor is to state the age relative to another child. So long as no age has been stated, it is not a factor, only existence is a factor. The problem states that there exist two children and one is a boy. Thus there are two possibilities with no relation relative to position. One of the possibilities is taken, thus there is only one possibility available. -- AlexEagar 17:56, 23 March 2007 (UTC)
Wrong. In the original statement of the problem, we don't have a known child for that analysis to work. We know that one of the children is a boy, or equivalently, we eliminate GG from {BB, BG, GB, GG}. — Arthur Rubin | (talk) 18:53, 23 March 2007 (UTC)
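
The elimination described in the reply above can be checked by brute-force enumeration. This is a small sketch that assumes the four families are equally likely (P(boy) = 1/2, independent births); the helper name probability is invented for the illustration.

```python
from fractions import Fraction
from itertools import product

# All four equally likely families, older child first (assumes P(boy) = 1/2
# and independent births).
families = ["".join(pair) for pair in product("BG", repeat=2)]

def probability(event, given):
    """P(event | given) over the equally likely families (illustrative helper)."""
    kept = [f for f in families if given(f)]
    return Fraction(sum(1 for f in kept if event(f)), len(kept))

has_girl = lambda f: "G" in f
print(probability(has_girl, given=lambda f: "B" in f))     # at least one boy: 2/3
print(probability(has_girl, given=lambda f: f[0] == "B"))  # older child is a boy: 1/2
```

The only difference between the two lines is the condition: "at least one boy" removes just GG, while "the older child is a boy" removes GB as well.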

Vote to Change Logic in this Article

I propose that the logic of this page be changed. Voting will be held from 03/20/2007 to 03/27/2007. The change, if approved, will be made on 03/28/2007. Vote by entering:
~~~~ Your Title or Position

Clarification of the change to be made as requested by Niels Ø: When I first posted my explanation, my opinion was that the 2/3 answer should be completely removed. But although I still don't agree with the answer, perhaps it would be best to say that there are two answers to the second question. One answer is that there is a 2/3 chance that the other child is a girl because the choices are {BG, GB, BB} and the other answer is that there is a 1/2 chance that the other child is a girl because the choices are {Bg, Bb, gB, bB} where the capital B is the boy who is known to be a boy. The 'Conclusion' and 'Mistakes' sections should be combined into a 'Opposing Views' section where both the 2/3 and 1/2 perspectives are compared. Let the proponents of each view state their arguments. The reader can decide whether the question is truly paradoxical such as the 2/3 answer suggests or whether it is not a paradox such as the 1/2 answer suggests. AlexEagar 16:43, 22 March 2007 (UTC)

All Those In Favor of the Change

  1. AlexEagar 02:57, 21 March 2007 (UTC) Software Engineer


All Those Who Oppose the Change

  1. Niels Ø (noe) 21:28, 22 March 2007 (UTC) -- there is only one correct solution, as presently given in the article.
  2. Arthur Rubin | (talk) 21:34, 22 March 2007 (UTC). (Use of "titles" or "positions" is frowned upon here in Wikipedia, but I have a Ph.D. in mathematics and have recently been working on some complicated statistical problems.) There is only one correct solution to the problem as presented. If the elder child is known to be a boy, the answer clearly becomes 1/2.
  3. AlexEagar 19:26, 23 March 2007 (UTC) Software Engineer -- Ok, I change my vote. I just changed how I'm calculating the percentage and sure enough, it is 2/3 chance assuming a 1:1 ratio of boys to girls. And about 65.5% chance given the 1.06:1 boy to girl ratio of children as stated on the CIA Factbook [1]. I still don't like how you present the argument, I'll offer another explanation when I get a chance.

I Can Admit I Was Wrong

Here's some python code to calculate the 2/3 correctly.

import random

def boyorgirl(isOneToOneRatio=True, pairsOfSiblings=100000):
    # Boy == 1
    # Girl == 0
    twoGirls = 0
    twoBoys = 0
    olderBoyYoungerGirl = 0
    olderGirlYoungerBoy = 0
    for k in range(pairsOfSiblings):
        pair = []
        for i in range(2):
            if isOneToOneRatio:
                isBoy = random.randrange(0, 2) == 1
            else:
                # 106 of the 207 equally likely values count as a boy.
                isBoy = random.randrange(0, 207) < 106
            pair.append(1 if isBoy else 0)
        # Classify the pair once both children have been generated.
        if pair == [0, 0]:
            twoGirls += 1
        elif pair == [1, 0]:
            olderBoyYoungerGirl += 1
        elif pair == [0, 1]:
            olderGirlYoungerBoy += 1
        else:
            twoBoys += 1
    mixed = olderBoyYoungerGirl + olderGirlYoungerBoy
    print("If one child is a boy, what is the likelihood that the other child is a girl?")
    print("For every " + str(pairsOfSiblings) + " pairs of siblings, " + str(twoGirls) + " will be both girls, " + str(twoBoys) + " will be both boys, " + str(olderBoyYoungerGirl) + " will be split with an older brother and a younger sister, and " + str(olderGirlYoungerBoy) + " will be split with an older sister and younger boy.")
    print("The percentage of pairs that have girls out of those that also have boys is " + str(float(mixed) / (mixed + twoBoys) * 100.) + "%.")

AlexEagar 19:38, 23 March 2007 (UTC)
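
As a cross-check on the simulation, the same quantity has a simple closed form. The sketch below assumes each child is independently a boy with probability p; the function name p_girl_given_boy is made up here, and reading the 1.06:1 ratio as p = 106/206 is an assumption of this sketch, not something taken from the script above.

```python
from fractions import Fraction

def p_girl_given_boy(p_boy):
    """P(family has a girl | family has at least one boy) for two children.

    Assumes each child is independently a boy with probability p_boy.
    P(one of each) = 2*p*(1-p) and P(two boys) = p*p, so the ratio
    2*p*(1-p) / (2*p*(1-p) + p*p) simplifies to 2*(1-p) / (2-p).
    """
    return 2 * (1 - p_boy) / (2 - p_boy)

print(p_girl_given_boy(Fraction(1, 2)))              # equal ratio: 2/3
# 1.06:1 read as 106 boys per 100 girls (an assumption of this sketch):
print(float(p_girl_given_boy(Fraction(106, 206))))
```

The closer p gets to 1, the smaller the chance of a girl, which is why a boy-heavy birth ratio pulls the answer slightly below 2/3.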

About the 2nd question and its vagueness

Assuming simple 50/50 G/B probabilities...

A boy living in a two-child family has a 50% chance of having a brother, or in other words if one child is a boy there's a 50% chance of having a second boy depending on the selection method.

In the even-division {BB BG GB GG} situation it's easy to notice the simple fact that 50% of the boys live in BB families. BG and GB families do indeed outnumber the BB families 2:1, i.e. there are twice as many such (GB, BG) families, but those families have half as many boys per family. In other words, out of, say, 600 two-child families with at least one boy you'd get (approx.) 400 BG/GB families and 200 BB families, with 400 boys living in BG/GB families and 400 in BB families.

Back to my original point...If the selection is done by:
A) Choosing a random boy living in a two child family (with at least one boy) there's a 50% chance he has a brother

Older child is a boy: BB, BG (but not GB or GG)
Younger child is a boy: BB, GB (but not BG or GG)

B) Choosing a random two-child family with at least one boy there's a 33% chance of a boy having a brother.

Older or younger child is a boy: BB, BG, GB

The article's question - A two-child family has at least one boy. What is the probability that it has a girl? - is a bit too vague to give it a specific answer, as it depends entirely on how the selection is made. For example, if you see one child of a two-child family and it happens to be a boy there's a 50% chance he has a brother (50% of boys live in BB families, case A: you 'choose' a random boy); on the other hand if you, for example, meet two-child families you'll notice that 66% of those families with a boy also have a girl (BB vs BG & GB, case B: you 'choose' a random family).

- G3, 01:21, 27 April 2007 (UTC)

Here's an illustration of the problem, from a top-down view, let's suppose there are 4 two child families living in some small village with 4 boys distributed among them in a 'perfect' configuration.
Families with two children:
-Jones (BB)
-Parker (BG)
-Smith (GB)
-Walker (GG)
Boys living in two-child families:
-Jack Jones (BB)
-James Jones (BB)
-Jules Parker (BG)
-John Smith (GB)
Now, depending on which list you choose your two-child family with at least one boy from you get two different probabilities...
A) Choosing a random boy first results in 4 boys - Jack, James, Jules & John - of which 50%, Jack & James, have a brother.
B) Choosing the family first you get three families - the Smiths, Parkers and Jones - of which 66% have a boy and a girl configuration.
-G3, 11:44, 27 April 2007 (UTC)
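
The two ways of counting in the village illustration can be spelled out in a few lines of Python. This is just a sketch of the counting, using the four families listed above; the variable names are invented for the illustration.

```python
from fractions import Fraction

# The four families from the illustration above (older child first).
village = {"Jones": "BB", "Parker": "BG", "Smith": "GB", "Walker": "GG"}

# B) Pick a family at random among those with at least one boy.
with_boy = [kids for kids in village.values() if "B" in kids]
family_view = Fraction(sum("G" in kids for kids in with_boy), len(with_boy))

# A) Pick a boy at random; each boy is listed with his family's children.
boys = [kids for kids in village.values() for child in kids if child == "B"]
boy_view = Fraction(sum("G" in kids for kids in boys), len(boys))

print(family_view, boy_view)  # prints: 2/3 1/2
```

The Jones family appears once in the family list but twice in the boy list, which is exactly where the two answers part ways.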

Sample space makes sense

So, I am not a mathematician, nor have I studied logic or probability. The reason I understand this is because of how the first question explains the sample space. Our sample space consists of 4 possibilities (given certain assumptions of course, i.e. no twins, or intersexuals), BB, BG, GB, and GG.

Question #1: Since the older child is a boy we can reduce our sample space. This makes sense to most people, and of course this may be that the paradox is how people interpret a priori knowledge. I think that for the most part everyone is getting that part.

Question #2: Age no longer matters, only that one of the children is a boy. Our new sample space is reduced to those sets (maybe the wrong term) that contain a boy: BB, BG, and GB.

What I am seeing a lot here is that people are not accepting the same sample space for the second question. I think that has to do with the way the article is written, because it immediately goes into people's wrong assumptions, and to be honest that was a little confusing to me. I had it, and then I began questioning it because I saw that people often got it wrong. As I was reading this discussion page Baccyak4H gave an example of tossing two coins in response to the Rebuttal to Solution above. I think that is a good way to expand on this in both questions.

We have two coins, both have a heads and a tails. They were minted in different years, and they always land flat.

Question #1: I flip two coins. The older coin is heads. What is the probability that the other coin is tails?

The possible outcomes of the two coins are (HH, HT, TH, TT). Even without the age of the coins taken into account, the possible outcomes are the same. Applying the age is also arbitrary, and this is important. If we decide that the older coin is the "first" coin in a set, then we get HH and HT. If we decide the older coin is the "second" coin we get HH and TH. It becomes apparent that TH and HT are not the same (I hope).

The first question is answered, but again, we mostly agree with that.

Question #2: Two coins are flipped, and at least one of them is heads. What is the probability that the pair contains a tails?

The possible outcomes for this are again (HH, HT, TH, TT). We don't worry about the age of the coins this time, so we just look at which pairs have a heads. We get HH, HT, and TH. TT does not have a heads. Of the three we have left (HH, HT, and TH), two of them have a tails.

I think a reason this is confusing can also be chalked up to the first question preceding it without examining why TH and HT are not the same (we use age as a convenient way to skim over that). The first question, in the format given, appears to impress prior knowledge upon the viewer, and really just skews things.

Question #3: I flipped a coin, it is heads. If I flip another coin, what is the chance that it will be tails? Our sample set here has only two possibilities (HH and HT [assuming we denote the first H as being the heads, of course]).

I am not sure (meaning I can't talk with authority) about first stage and second stage games, but when INic says he is a man, he is eliminating the possibilities, since we are not determining the probability at the same time. It is very similar to the first question in that it gives us extra info to work with.

Hmmm, I think that is all in order. I am looking forward to corrections and such. I am going to think about how to make the article easier for people to understand, since I personally don't know a lot of what is being said here (I don't know Bayesian math). Also, what does everyone think of presenting the two questions in a different way, by means of formatting? Maybe give more explanations to the first question before jumping into the second one. Maybe even not assume that the majority of the viewers are wrong, which tends to rub people the wrong way. maiki 11:55, 1 May 2007 (UTC)

About your question #2: That depends on how the roll with a head is selected. Let's say you do 4 rolls with two coins: HH, HT, TH & TT
If you catalogue all rolls like this..
Heads - 1, 1, 2, 3
Tails - 2, 3, 4, 4
...and you choose a random heads flip there's a 1/2 chance it's in a HH roll
If you catalogue rolls like this..
Roll 1 - HH
Roll 2 - HT
Roll 3 - TH
Roll 4 - TT
...and you choose a roll with a heads, you get that 2/3 of such rolls also have a tails.
The primary issue here is selection method:
-In case of a chance encounter - you flip a coin, you see a two child family with one child being gender identifiable (eg. the other child is at home) - the chance of the other party being of the same 'type' is 1/2: H* -> HT, HH or *H -> TH, HH.
-In case of selecting the favoured outcome first - at least one tails - and choosing the result from a list of results of wanted outcome - list of tails rolled - the chance of the other roll being the same is again 1/2: TT rolls have equal amount of tails to HT & TH rolls combined
-In case of selecting the favoured outcome first - at least one tails - and choosing the result from a list of results with wanted outcome - list of results with tails rolled - the chance of other roll being the same is 1/3: HT & TH rolls outnumber the TT rolls by 2:1.
More accurately, the question "A two-child family has at least one boy. What is the probability that it has a girl?" can only be accurately answered if we also know the answer to the question "How do we know a two-child family has at least one boy?" - While it is *absolutely* true that 2/3 of two-child families with a boy also have a girl (assuming simple division), it is also equally true that 1/2 of boys in two-child families live with a brother (ie. another boy). - G3, 01:27, 3 May 2007 (UTC)
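
The two cataloguing methods described above can be compared with a short Python sketch. It assumes the four rolls HH, HT, TH and TT are equally likely; the function names are made up for illustration.

```python
from fractions import Fraction

# The four equally likely results of rolling two coins together.
rolls = ["HH", "HT", "TH", "TT"]

def partner_matches_when_picking_a_flip(face):
    # Catalogue single flips: every (roll, position) whose coin shows `face`.
    flips = [(roll, i) for roll in rolls for i in (0, 1) if roll[i] == face]
    # Count the flips whose partner coin in the same roll shows the same face.
    matching = [(roll, i) for roll, i in flips if roll[1 - i] == face]
    return Fraction(len(matching), len(flips))

def partner_matches_when_picking_a_roll(face):
    # Catalogue whole rolls: every roll that contains `face` at least once.
    hits = [roll for roll in rolls if face in roll]
    # Count the rolls in which both coins show `face`.
    matching = [roll for roll in hits if roll == face * 2]
    return Fraction(len(matching), len(hits))

print(partner_matches_when_picking_a_flip("H"))  # 1/2
print(partner_matches_when_picking_a_roll("H"))  # 1/3
```

The same data gives 1/2 or 1/3 for "the other coin is the same" depending only on whether a flip or a roll is the thing being selected.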

I disagree with the solution

I am not going to say that the answer "2/3" is incorrect - because it isn't. However, it isn't the best answer to the problem, either. It makes a hidden assumption that cannot be made in general, and that leads to a logical contradiction if applied unilaterally.

The problem statement I will work with is: "Mrs. Jones has two children, at least one of which is a boy. What is the probability that she also has a girl?" The solution presented in the encyclopedia article is that there are four possible 2-child families, based on gender and birth order: {BB}, {BG}, {GB}, and {GG}. We'll assume each is equally likely. The additional fact that at least one of Mrs. Jones' children is a boy means that one of those families, {GG}, is not possible for Mrs. Jones. Since the other three family types exist in equal numbers, and two of them include a girl, the probability is 2/3 that Mrs. Jones also has a girl.

As I said, that answer is not incorrect. But I'm purposely using a double negative, because not being incorrect in one case is not the same as being correct in general. Suppose another question is asked: "Mrs. Smith has two children, at least one of which is a girl. What is the probability that she also has a boy?" If you answer 2/3, following the same logic, then there is something incorrect in your solutions. The only answer that is consistent with the first solution is 0; that there is no chance that Mrs. Smith has a boy.

To see that it is wrong (or at least that there is something wrong if you assume it is right), ask a third question: "Mrs. Grey has two children. What is the probability that her two children have different gender?" This has a trivial solution: 1/2. But, if 2/3 is the correct answer for both the Mrs. Jones problem and the Mrs. Smith problem, and if I write the gender of one of Mrs. Grey's children on a piece of paper WITHOUT showing it to you, then you have to answer 2/3 for the Mrs. Grey problem. Regardless of what I actually wrote, the above logic - if applied unilaterally - says that probability that the other child has a different gender from what I wrote is 2/3.

The hidden assumption being made in the solution concerns how you decide which child to tell about when they have different genders. While the family types {BB} and {BG} may EXIST in equal numbers, that does not mean that for every {BG} family you will say "Mrs. Jones has at least one boy." But in order to get the 2/3 answer, you have to assume that in every {BG} or {GB} case you will say "Mrs. Jones has at least one boy." Which means that if you say "Mrs. Smith has at least one girl," I can be 100% certain that Mrs. Smith actually has two girls.

Oh, and using the answer 0 for the Mrs. Smith problem means that the answer for the Mrs. Grey problem is: P({BG} or {GB}) = P({BG} or {GB}|described a boy)*P(described a boy) + P({BG} or {GB}|described a girl)*P(described a girl) = (2/3)*(3/4)+(0)*(1/4) = 1/2. The right answer.

The best solution to the Mrs. Jones problem can be derived by including, in the solution, a factor for how you selected which gender to describe. There is a 1/4 chance that a random Mrs. X is from any of the four possible families: {BB}, {BG}, {GB}, and {GG}. If she is from a {BB} family, there is a 100% chance that she will be described as having at least one boy. Similarly, if she is from a {GG} family, there is a 100% chance that she will be described as having at least one girl. But for the remaining two family types, unless you make the unfounded assumption that one gender is favored over the other, there is a 50% chance that either description will be used. A {BG} family may exist as often as any of the others, but if our Mrs. Jones is from a {BG} family, only half of the time will she be described as having at least one boy. The other half, she will be described as having at least one girl. If she is described as having at least one boy, the probability that she also has a girl is the chance of a mixed family being described that way divided by the total chance of that description: [P({BG})*1/2 + P({GB})*1/2] / [P({BB})*1 + P({BG})*1/2 + P({GB})*1/2] = (1/8 + 1/8) / (1/4 + 1/8 + 1/8) = (1/4)/(1/2) = 1/2.
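
The dependence on the reporting rule can be sketched in Python. This is only an illustration under stated assumptions: the four family types are equally likely, and q is a hypothetical parameter (not something named in the article) for the chance that a mixed family gets described as "at least one boy".

```python
from fractions import Fraction

def prob_girl_given_described_boy(q):
    # q: assumed chance that a {BG} or {GB} family is described as
    # "at least one boy" rather than "at least one girl".
    quarter = Fraction(1, 4)
    # Joint probability of (family type, described as having a boy).
    described_boy = {
        "BB": quarter * 1,  # two boys: always described as having a boy
        "BG": quarter * q,
        "GB": quarter * q,
        "GG": quarter * 0,  # two girls: never described as having a boy
    }
    mixed = described_boy["BG"] + described_boy["GB"]
    return mixed / sum(described_boy.values())

print(prob_girl_given_described_boy(Fraction(1, 2)))  # 1/2
print(prob_girl_given_described_boy(Fraction(1)))     # 2/3
```

With q = 1/2 (no bias toward either gender) the result is 1/2; with q = 1 (a boy is always the one mentioned) it is the article's 2/3.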

This is the best answer for the problem. It isn't the only answer - because the question, as it is posed, is ambiguous. You can solve that ambiguity by making an assumption about how to choose whether to tell about boys or girls. This answer is best, in the Bayesian Probability sense, because it doesn't assume you are biased toward boys or girls. - JeffJor 17:44, 8 June 2007 (UTC)

Do you suggest that for the Bayesian probabilist the correct answer is 1/2, but for the rest of us the correct answer is 2/3? If this is correct this would be a great test to determine if a person is a Bayesian or not. Very interesting. I'm not a Bayesian and I think 2/3 is the correct answer, so this test works for me at least. ;-) iNic 21:04, 8 June 2007 (UTC)

No. I'm saying that for the frequentist, the problem statement is ambiguous and there is no valid answer. That 2/3 can't be the answer to both the Mrs. Jones problem and the Mrs. Smith problem because, if it is, the probability of any two-child family having a boy and a girl must be 2/3. The only legitimate way to answer the question is to take a Bayesian approach, and that answer is 1/2. - JeffJor 00:27, 9 June 2007 (UTC)

Very interesting. I make pretty much the opposite analysis. In a frequentist setting everything works fine, as it should. The correct answer is 2/3 for both Mrs. Jones and Mrs. Smith, of course. No contradiction arises because it's two different experiments that we can't mix. However, in a Bayesian setting it's easy to derive a contradiction because we don't know what the experiment is. Bayesians typically claim they don't care about experiments anyway. In an attempt to avoid the paradoxes that naturally arise when you don't know what the experiment is, the typical Bayesian starts to analyze the psyche of the experimenter instead. This is what you do above. As long as you know how the psyche of the experimenter works in different situations you are all game. But as the experimenter obviously can have infinitely many states of mind, potentially, this problem isn't unambiguously defined unless the probabilistic preferences of the experimenter ("prior") are included as part of the problem statement. But in the way the problem is currently stated it's ambiguous and there is no valid answer, for the Bayesian. iNic 02:40, 9 June 2007 (UTC)

No. This issue does not involve "mixing" experiments, it involves how you interpret an ambiguous statement in the problem in each individual experiment. It revolves around the conditional probability that a mother of two has a boy and a girl, given that she has at least one boy. Write this P(B&G|B).

By your arguments, a frequentist will say P(B&G|B)=2/3. That same frequentist will say P(B&G|G)=2/3. It is a well known, provable result that P(B&G)=P(B&G|B)*P(B) + P(B&G|G)*P(G). If both conditional probabilities are 2/3, this reduces to P(B&G)=2/3*[P(B) + P(G)]. The contradiction does not depend on how you "mix" the probabilities P(B) and P(G), because no matter how you do it, they have to add up to 1.

The problem statement is ambiguous. Different experiments that all end up with the description "Mrs. Jones has at least one boy" can have different answers. If you met a Mrs. Jones at a social function for parents at a boys' prep school, then 2/3 is the right answer. If it is a girls' prep school and Mrs. Jones mentions she has a son, the answer is 100%. If another Mrs. Jones simply tells you "My oldest child is a boy," the correct answer is 1/2. There are lots of different experiments. A frequentist CANNOT GIVE AN ANSWER without knowing how it was determined that Mrs. Jones has at least one son. A Bayesian can, and that answer is 1/2. - JeffJor 13:09, 9 June 2007 (UTC)

No, the problem statement isn't ambiguous at all. At least not for a frequentist. The issue does not involve conditional probabilities as you think. It's much simpler than that. It is all about finding the sample space Ω for the problem and we're done. I never said that P(B&G|B)=2/3 as you claim I did. I simply said that P(B&G)=2/3 on Ω = {{B,G}, {G,B}, {B,B}} which is a completely different thing. And I also said that P(B&G)=2/3 when Ω = {{B,G}, {G,B}, {G,G}}. If you read the Kolmogorov axioms for probability theory you will notice that they are only valid when you have a probability space defined, and one of the key components in a probability space is the sample space Ω. This means that probability theorems are valid only within a specified probability space. To mix sample spaces (and thus probability spaces) in a formula the way you do above isn't allowed, and it's no wonder you get into absurdities when you do. iNic 02:30, 12 June 2007 (UTC)

I don't know how you can claim it isn't ambiguous, and at the same time claim the statement of the problem is not "What is P(B&G|B)?" That is how the problem statement "What is the probability that a two-child family has a boy and a girl, given that it has at least one boy?" is expressed in a formula. By denying the two are the same, you are admitting the ambiguity. Also, this problem has been called ambiguous by many people, including Martin Gardner when he wrote about it in his Mathematical Games column of Scientific American in May and October of 1959.

Yes, I know that probability theorems are only valid when you have a probability space defined for them. That's the ambiguous part here. Is the space described by the problem {{B,B}, {B,G}, {G,B}}, or {{B,B}, some subset of {B,G}, some subset of {G,B}}? The statement "A two-child family has at least one boy" is a necessary condition for that space to be {{B,B}, {B,G}, {G,B}}; but it is not a sufficient condition. It also describes {{B,B}, some subset of {B,G}, some subset of {G,B}}. The exact makeup of the space you need to answer the problem cannot be determined from the problem statement. You have to assume something additional - in this case, that the entirety of {G,B} and {B,G} are included.

Finally, although I briefly described the single experiment that leads to the contradiction before, I'll write it out more formally now. The probability that a two-child family has a boy and a girl, given no other information about it, is 1/2. Say I write down a gender that "at least one child" has, but don't show it to you. You have no additional information, and can't change your answer. It is still 1/2. But following the logic you have subscribed to, you could assume that if W is the gender I wrote down (you don't know what it is, but you know it is a discrete value), and N is the gender I did not write down, that the family is from the space Ω = {{W,W}, {W,N}, {N,W}}. If this assumption were true, the probability of a boy and a girl would be 2/3. But we know that probability is 1/2. Therefore, the assumption is not true.

I propose that this encyclopedia article be rewritten to reflect the fact that the problem is ambiguous, and cannot be answered without assuming information not presented. It can include references to cases where people have made such assumptions, but those are not true solutions. - JeffJor 12:26, 16 June 2007 (UTC)

No, a conditional probability isn't what you think it is. P(A|B) simply means the probability we pick A from a sample space given that we picked B from the same space. Both A and B have to be events defined on the sample space. You claim that B = 'has at least one boy' is an event, but it clearly isn't. It's a boundary condition. We get two pieces of information that combined give us our sample space ('at least one boy' & 'a two-child family') and none of these pieces can be treated as an event. You pick one of them as being a condition for your sample space ('a two-child family') and you treat the other one as an event. But why didn't you pick the last one as a condition and treated the first one as an event instead? Your sample space would have been all families with at least one boy, and you would condition on the "event" 'has exactly two children.' To use conditional probabilities in this way is of course nonsense but unfortunately a common mistake among Bayesians. If Martin Gardner at one time made the same mistake as you do here (as your comment implies) I feel sorry for him too. iNic 17:41, 17 June 2007 (UTC)
I honestly don't understand your talk about the subsets of {B, G} in your second paragraph. The subsets of {B, G} are {B, G}, {B}, {G} and Ø. They are of course all ruled out except {B, G} itself. iNic 17:41, 17 June 2007 (UTC)
In your third paragraph you make it very clear where your mistake is. You evidently do not discriminate between a boundary condition and an event (as explained above). If this is a general disease among Bayesians you might in fact have invented yet another paradox within Bayesian probability philosophy. But as this is WP:OR I'm afraid you can't add this discovery under the Bayesian solution section. iNic 17:41, 17 June 2007 (UTC)

I'm trying not to let this degenerate into name calling; but I know exactly what a conditional probability is. From the link you created, it is "the probability of some event A, given the occurrence of some other event B." The two events do have to be from a joint sample space; but B is not limited to being from the same subset of that space that A is from, which is what you are trying to do.

A "boundary condition," as you put it, is not as well defined a term as the others you have used here. If it were, I'm sure you would have linked to its definition like you did the others. But I can provide a definition I'm sure will satisfy you. It is some condition that defines a subset of one sample space to be used itself as a sample space. And since an event is a SET of outcomes, not just a single outcome - read the link you created - that subset can be called an event on the larger space. You are trying to work on the smaller space, and I am working the larger one that includes yours as a subset. In fact, the space I use is {{B,B}, {B,G}, {G,B}, and {G,G}}. It is called "the original sample space" in the article. If we can establish what set is meant by "a two child family with at least one boy," it clearly - by the definition you linked to - is an event in that space.

But, I agree you can call it a boundary condition that defines a smaller sample space. It is the exact content of that smaller sample space that is ambiguous, since it depends on how you applied that condition. If "at least one boy" is a priori knowledge, then the probability we seek is 2/3. But if it is a posteriori knowledge, then we have to know how it was obtained. And the point of my example is that assuming it is a priori knowledge is what causes the paradox. Because to be a priori knowledge, you have to know that you are treating "boy" differently from "girl," and that is not clear in the wording.- JeffJor

"But why didn't you pick the last one as a condition and treated the first one as an event instead?" Because that doesn't help toward identifying a solution. It could be done, but isn't very interesting here.

Please understand this: I am not a Bayesian. I am not a frequentist, either. Both schools of thought can be useful for different kinds of problems. And because this problem does not identify the condition "at least one boy" as a priori or a posteriori knowledge, the frequentist approach cannot work. That is the point of my paradox: assuming it is a priori knowledge means that a similarly-worded problem for girls would have to make the same assumption. That leads to the invalid conclusion that there is a 2/3 probability of a boy+girl family on the space {{B,B}, {B,G}, {G,B}, {G,G}}. It would also help if you would read what you criticize. I said Martin Gardner said the problem statement was ambiguous. To my knowledge, he never applied conditional probability in this fashion, and nothing in what I wrote implied it. - JeffJor 21:36, 17 June 2007 (UTC)

"I honestly don't understand your talk about the subsets of {B, G} in your second paragraph." Then you clearly don't understand the definition of an event, which is itself a set and can have subsets based on other conditions. I was trying to not be too wordy, and omitted those other conditions; but it was clearly implied. But, if you want, define a more specific set of events in the form BG1, where the "B" in the first position means the older child is a boy, the "G" in the second means the younger is a girl, and the "1" means the older child's gender is reported.

The EVENT that you are assuming is your smaller sample space is {{BB1}, {BB2}, {BG1}, {GB2}}. But it isn't clear that both {BG1} and {GB2} MUST be included. They CAN be, but we need to know how "at least one boy" is determined to say it MUST be. In other words, the statement is ambiguous.

The key word in the definition of an event in this case is occurrence. You constantly assume that one of the boundary conditions is equivalent to a reporting of the gender of one of the children as an event in space and time, i.e., something that occurs. It isn't. None of the boundary conditions are random events. The only randomness involved is biological in nature and the outcomes of these random events determine if a family has at least one boy or not.
You might not know you are a Bayesian but you surely reason like a Bayesian. A Bayesian attaches a probability measure to any statement whatsoever, whether the statement describes a random event (that occurs) or not. This approach easily leads to paradoxes and there are different Bayesian schools just because there are different ideas on how to cope with this. One of the ideas the Bayesians have invented is the notions of a priori-distributions and a posteriori-distributions. These concepts are exclusively Bayesian and have no meaning for a frequentist. That you claim the frequentist approach cannot work because we don't know what knowledge is "a priori" or not shows that you confuse frequentism with Bayesianism completely. iNic 21:06, 21 June 2007 (UTC)


I would like to point out some nonsense in the previous discussion...
  1. P(B&G)=P(B&G|B)*P(B) + P(B&G|G)*P(G) is clearly wrong, as there is no assertion that G = ~B.
  2. There is no difference in results, either for a frequentist or a Bayesian, between calculating P(A|B) as P(A ∩ B)/P(B) or as taking "B" as a "boundary condition".
But I agree there's an ambiguity as to whether the statement is "at least one child is a boy" (P=2/3) or "you are told at least one is a boy" (P is undetermined, but probably between 0 and 2/3, as you don't know the conditional probabilities (Bayesian) or frequencies (frequentist) of how the teller selects what to say, even given the underlying assumption that he/she is telling the truth.) — Arthur Rubin | (talk) 22:47, 17 June 2007 (UTC)
Let's say we have Ω = {{B, B}, {B, G}, {G, B}, {G, G}} and we define the event B as being the observation of a boy, and event A as the observation of a family with a boy and a girl. Then P(B) = ½ and P(A&B) = ¼ which makes P(A|B) = ½. This is clearly different from when we treat all the information we get as boundary conditions on the sample space (which is the correct thing to do), as we then have Ω = {{B, B}, {B, G}, {G, B}} where P(A) = ⅔. I do not understand the difference between "at least one child is a boy" and "you are told at least one is a boy." And I think we assume that if you're not a boy, you are a girl. iNic 00:21, 18 June 2007 (UTC)
Ah, yet another interpretation. You observe one child to be a boy, which gives the corresponding probability of 1/2 as you note. - Arthur Rubin | (talk) 13:46, 18 June 2007 (UTC)
Exactly. And failing to clearly distinguish between being told that at least one is a boy and observing that at least one is a boy is the cause of so much confusion here. (In the first case it is just a piece of information on the same level as the information that the family has exactly two children. It is not an event, no one picks anything from any sample space, no randomness is involved, nothing happens in space or time. It is simply a boundary condition for the sample space. In the second case, however, someone observes that one of the children in a family is a boy. This is an event, defined on a sample space, an event in space and time, with randomness involved. After all, we could as well have observed a girl instead. In this case the boy/girl thing is not a boundary condition for the sample space. It is not on the same level as the information given that the family has two children.) In a frequentist setting this distinction is very clear. However, in a Bayesian setting this distinction is very hard to maintain, if possible at all. iNic 15:52, 18 June 2007 (UTC)

Added a query

Ms. Smith has two children; I'll call them Whitney and Leslie. One of them is a boy. What is the probability that the other is a boy?
Ms. Smith has two children; I'll call them Whitney and Leslie. Whitney is a boy. What is the probability that Leslie is a boy?
Is the first version ambiguous? Does it give you any additional information? I'm just asking. Jackaroodave 16:23, 16 June 2007 (UTC)

For the second version, the probability is 1/2. The problem statement is unambiguous, and the probability that Leslie is a boy is independent of Whitney's gender. The first version is ambiguous, and the answer depends on how you obtained the information that "one of them is a boy." If you only know the gender of one specific child, and know nothing of "the other," the answer is 1/2. Even if you don't tell me the name of that specific child.

Some people even claim that the fact you asked about "the other child" rather than "does the family have a girl" is important. The encyclopedia article says this is a "confusing" form of the question. The only thing confusing about it is that it illustrates the ambiguity by implying you might have the information about only one specific child. Are you asking about a specific "other child," in which case the probability must be 1/2; or are you asking about where the family fits in the space {{B,B}, {B,G}, {G,B}}, in which case you need to know how you obtained the information that "one of them is a boy." - JeffJor 17:25, 17 June 2007 (UTC)

Thanks for your clarifying answer. It reminds me of the Monty Hall problem, where the answer hinges on what Monty knows and chooses to reveal. It seems reasonable to me to assume that someone who torments people with probability questions knows the whole score, but perhaps not. How's this? "I'm a pediatrician. Ms. Smith brought in her two children for routine physicals. I'll call them Whitney and Leslie. One of them is a boy. What is the probability that the other is a boy?" Jackaroodave 13:45, 18 June 2007 (UTC)

Edited for clarity

I edited this so that the original question is no longer unclear. (Albeit at the risk of being less "paradoxical...") I also removed the assumption that "the majority of people will be confused," which tends to, as somebody else said, "rub people the wrong way."

At first I added this but then I was worried about its accuracy: (in the mistakes section)

The answer to this question is very sensitive to how it is asked. The first mistake is as follows.

When posing the second question, one might often state, "You see a family out for a walk with one of their two children. The child walking is a boy. What is the probability that the other child is a girl?" The poser of this question may expect an answer of 2/3. However, this is not technically the correct answer.

The question posed in these terms seems to ask, "A random family goes for a walk with a random one of their two children. The child randomly chosen is a boy. What is the probability that the family's other child is a girl?" A random family among the four possible family types is indeed chosen. However, a random child is also chosen, which means the family with two boys is much more likely to be walking with a boy than either individual family with a boy and a girl.

Whenever the family with two girls is walking with a random child, it will not be walking with a boy. Thus, this family never meets the criteria for the question and may be discarded. The remaining families are

{BG, GB, BB}

However, only 1/2 of the time, when chosen randomly, will family BG or GB be walking with their boy. Thus, among random families walking with a random one of their two children, the BB family will be the family the observer sees 1/2 of the time, and the GB and BG family only 1/4 of the time each. In the BB family, the boy seen walking will, of course, have a brother. Thus 1/2 of the time the other child will be a boy. And the remaining 1/2 of the time, the boy seen walking will belong to a family with a girl. The answer to the question posed, then, is 1/2.
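
The walking-family reading can also be checked by simulation. The Python sketch below is only a rough illustration: it assumes each child is independently a boy or a girl with probability 1/2 and that the child taken on the walk is chosen uniformly at random. The function name, trial count and seed are arbitrary choices.

```python
import random

def simulate_walk(trials=100_000, seed=0):
    # A fixed seed keeps the run repeatable; any seed should give a similar value.
    rng = random.Random(seed)
    seen_boy = 0
    other_is_girl = 0
    for _ in range(trials):
        # A random two-child family, then a random child to take on the walk.
        family = [rng.choice("BG"), rng.choice("BG")]
        walker = rng.randrange(2)
        if family[walker] == "B":
            seen_boy += 1
            if family[1 - walker] == "G":
                other_is_girl += 1
    # Fraction of "boy seen walking" cases in which the child at home is a girl.
    return other_is_girl / seen_boy

print(simulate_walk())  # should come out close to 0.5
```

Being a simulation, the result is only approximate, but it should settle near 1/2 rather than 2/3.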

There are of course many possible mistakes relating to this question. ..and then on with mistakes section.. Milkshakeiii 20:30, 29 July 2007 (UTC)

A Simplified Answer?

This contribution offers a contradiction to the answer for question 2. For question 1 order is important and allows for only 2 possibilities, either BB or BG so 1/2 of the time a girl will be part of the family. With question 2 order does not matter which leaves only three possible combinations, not four as implied in the answer for question 1; either both are girls (GG) or both are boys (BB) or one of each (GB/BG). Therefore with the GG possibility removed, it leaves only the BB or GB/BG resulting in 1/2 as well. Seriously, am I missing something here? —Preceding unsigned comment added by Retepris (talkcontribs) 03:23, 23 November 2007 (UTC)

Yes, you are missing something: The GB/BG-case is twice as probable as the BB case.--Niels Ø (noe) (talk) 06:56, 23 November 2007 (UTC)
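
A small Python sketch of why the unordered cases are not equally likely, assuming the four ordered families BB, BG, GB and GG each have probability 1/4 (the names used are illustrative):

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Collapse the four equally likely ordered families into unordered types.
counts = Counter("".join(sorted(p)) for p in product("BG", repeat=2))
# counts now holds BB: 1, BG: 2, GG: 1, so the mixed type carries double weight.

def prob_girl_given_not_two_girls():
    # Drop GG and weight each remaining type by how many ordered families it covers.
    remaining = {kind: n for kind, n in counts.items() if kind != "GG"}
    return Fraction(remaining["BG"], sum(remaining.values()))

print(prob_girl_given_not_two_girls())  # 2/3
```

Treating BB, GB/BG and GG as three equally likely cases ignores that weighting, which is where the 1/2 comes from.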

Boy or Girl paradox - capitalization

Why is the word girl capitalized in the title of this article? — Carl (CBM · talk) 23:26, 4 December 2007 (UTC)

An easier way of looking at it

1st one
Two child family of unknown genders (gender of child 1, gender of child 2) = {BB, BG, GB, GG}
They are going to reveal one of the two children (child revealed, genders) = {1BB, 1BG, 1GB, 1GG, 2BB, 2BG, 2GB, 2GG}
The child revealed is a boy (sample space reduced) = {1BB, 1BG, 2BB, 2GB}
The chances of the family having a girl = {1BG, 2GB} / {1BB, 1BG, 2BB, 2GB} = 2/4 = 1/2

2nd one
Two child family of unknown genders (gender of child 1, gender of child 2) = {BB, BG, GB, GG}
They reveal they don't have 2 girls (sample space reduced) = {BB, BG, GB}
The chances of the family having a girl = {BG, GB} / {BB, BG, GB} = 2/3

Note that in my version of the 1st it doesn't matter which child is revealed, or why they were revealed (age, name, size, etc.); the only thing that matters is that the child was revealed and that the children do not suddenly change genders, {BG} to {GB} or vice versa. It also takes into account the possibility that the child revealed could have been a girl.

My version of the second is the equivalent of what it means, that there is no possibility of {GG}.

All the same I find these to be helpful ways of presenting the problems. I've seen, so far, two times that someone has phrased the problem as the 1st one but said the answer is the 2nd, and did so as a way of expressing that the first is a flaw of intuition when in fact it was a flaw in the way they phrased it.
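
The two reduced sample spaces listed above can be enumerated in Python as a quick check. This sketch assumes all four families are equally likely and, for the 1st version, that either child is equally likely to be the one revealed; the function names are made up for illustration.

```python
from fractions import Fraction
from itertools import product

# (gender of child 1, gender of child 2): BB, BG, GB, GG
families = ["".join(p) for p in product("BG", repeat=2)]

def first_version():
    # Pairs (index of revealed child, family), all eight equally likely.
    space = [(i, f) for i in (0, 1) for f in families]
    # Keep the cases where the revealed child is a boy.
    boy_revealed = [(i, f) for i, f in space if f[i] == "B"]
    has_girl = [(i, f) for i, f in boy_revealed if "G" in f]
    return Fraction(len(has_girl), len(boy_revealed))

def second_version():
    # The family only reveals that it does not have two girls.
    not_two_girls = [f for f in families if f != "GG"]
    has_girl = [f for f in not_two_girls if "G" in f]
    return Fraction(len(has_girl), len(not_two_girls))

print(first_version())   # 1/2
print(second_version())  # 2/3
```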

Deleted References

On 4 Dec. 2007, I posted the following references that were deleted the same day by Dorftrottel.

"One of the earliest discussions of this problem, by Martin Gardner, appeared in Scientific American [October 1959, p. 180]. On July 27, 1997, this problem appeared for the sixth time in Marilyn vos Savant's Parade magazine column [p. 6]. Previously, it had been discussed in her columns of March 30, 1997 [p. 16], December 1, 1996 [p. 19], and May 26, 1996 [p. 17]. This problem, involving two baby beagles instead of two children, had appeared originally in her columns of October 13, 1991 [p. 24], with a follow-up on January 5, 1992 [p. 22].

"Following the controversy of the Monty Hall problem, Ed Barbeau prepared two lengthy lists of references to the handful of paradoxes that are used to teach the concept of conditional probability. These references were published in The College Mathematics Journal [March 1993, pp.149-154; March 1995, pp. 132-134]. The fact that Marilyn recycled this paradox and toyed with her readers, without providing any references, is a glaring example of her unethical conduct."

The way in which Marilyn toyed with her readers is discussed at the following links:

http://www.geocities.com/SiliconValley/Circuit/1308/question.html

http://www.wiskit.com/marilyn/boys.html Italus (talk) 23:11, 4 December 2007 (UTC)

I would like to add that in the above articles, Ed Barbeau referred to this problem as the Second Sibling Paradox. Italus (talk) 21:11, 5 December 2007 (UTC)

Title query

Is this concept known by another name? I've looked for references via Google and, eliminating Wikipedia from the search, the main site discussing a "boy or girl paradox" is h2g2, which isn't exactly where I'd look for mathematics info. --Tom Tresser (talk) 16:13, 22 December 2007 (UTC)

Bold Boys

I don't understand the significance of some occurrences of Boy having been made bold in the tables in sections First & Second question.  --Lambiam 21:05, 7 March 2008 (UTC)

Third Question + Mistakes Major correction

I sat here and I stared and stared and read and worked out problems, and I immediately ran into a huge pitfall with the second question. I assumed I was doing exactly what the Mistakes section was saying people did, but it turns out the Mistakes section is itself a mistake.

First, like someone stated above:

  1. Harry has an elder brother Tom
  2. Tom has a younger brother Harry

Is way wrong. It almost needs to be

  1. Harry has an older brother 
  2. Harry has a younger brother (in addition to:)
  3. Harry has a younger sister
  4. Harry has an older sister

But now as you can see there is a problem, all four hold the same weight bringing the odds down to 1/2, which most would say is incorrect, however in this case it is correct. Turns out simply naming the known gender changes a whole lot, so in addition to just completely wiping that section, I'd like to introduce the third question:

"A random two-child family with at least one boy named Harry is chosen. What is the probability that it has a girl?"

Now here is where things get really sketchy. The first obvious problem is what if someone else is named Harry in the family? Adding the clause "Given that no one else is named Harry" simplifies things and at least should be an example taken care of before tackling the question sans-clause. Turns out the answer to this particular question is 1/2, as briefly demonstrated in my correction of the mistakes.

On the flipside, adding the clause "Given that if another brother is present, his name is also Harry" nullifies the uniqueness, bringing it back to the same as question 2, which is 2/3.

Now someone will probably have to check my math but I worked this out:

Suppose the probability that a boy is named Harry is b.

A family has:
  1. Two sons named Harry with probability x=(.5b)^2
  2. A son named Harry and a son not named Harry with probability y=2(.5b)(.5(1-b))
  3. A son named Harry and a daughter with probability z=2(.5b)(.5)
  4. No son named Harry with probability 1 minus the sum of the above three numbers.

Thus the probability that the family has a daughter given that they have a son named Harry is z/(x+y+z)=1/(2-(b/2)).

This matches our qualitative prediction above. If b=1 (i.e. all sons are named Harry so we are in the situation of Question 2) then the answer is 2/3. If b is close to 0, then the answer is close to 1/2.
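
Since a check of the math was asked for, here is a minimal Python sketch that evaluates x, y and z exactly and compares z/(x+y+z) with the closed form 1/(2-(b/2)). It takes the formulas above at face value (each child independently a boy with probability 1/2, each boy independently named Harry with probability b); the function name is illustrative.

```python
from fractions import Fraction

def prob_daughter_given_harry(b):
    # b: probability that a boy is named Harry (must be greater than 0).
    b = Fraction(b)
    half = Fraction(1, 2)
    x = (half * b) ** 2                    # two sons, both named Harry
    y = 2 * (half * b) * (half * (1 - b))  # one son named Harry, one son not
    z = 2 * (half * b) * half              # one son named Harry, one daughter
    return z / (x + y + z)

print(prob_daughter_given_harry(1))                 # 2/3
print(prob_daughter_given_harry(Fraction(1, 100)))  # 200/399
```

For b = 1 this reproduces the 2/3 of Question 2, and for small b the value approaches 1/2, in line with the formula; b = 0 is excluded because the condition "has a son named Harry" then has probability zero.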

Finally, a more detailed link regarding all this:

http://members.cox.net/srice1/random/child4answer.html

The site has a 4th question, but I find it completely unnecessary, personally. Hopefully someone can look all this over and we can look at getting it on the page soon. 70.63.193.180 (talk) 09:09, 6 April 2008 (UTC)

I've deleted every Tom, Dick and Harry from the text of the article. May they rest in peace.  --Lambiam 21:32, 6 April 2008 (UTC)