Chance News 36
Come, said Slartibartfast, you are to meet the mice. Your arrival on the planet has caused considerable excitement. It has already been hailed, so I gather, as the third most improbable event in the history of the Universe. What were the first two? Oh, probably just coincidences.
The Hitchhiker's Guide to the Galaxy
Sex and Cereal
A sure way to obtain column inches in the media for biological/medical research is to study sex selection and relate it to something unexpected. In this instance, 740 pregnant women were asked to fill out a survey which dealt with their physical characteristics, habits and most of all, daily dietary intakes.
The Independent: "Big breakfast is most important meal -- if you want a baby boy."
Reuters: "Skipping breakfast may mean your baby is a girl."
New Scientist: "Breakfast cereals boost chances of conceiving boys."
CNN.com: "Study shows bananas make baby boys."
New York Times: "Boy or Girl? The Answer May Depend on Mom's Eating Habits."
Choosing a provocative title doesn't hurt: "You are what your mother eats: evidence for maternal preconception diet influencing foetal sex in humans." According to the lead author, "If you want a boy, eat a healthy diet with a high calorie intake, including breakfast." From the New Scientist, "When the researchers divided the women into groups with high, medium and low intake of energy, they found that 56% of women in the high-energy group had boys, compared with 45% in [the] lowest group." Further, "Cereal intake had a bigger effect," producing 59% boys when eating one or more bowlfuls per day, "compared with only 43% who bore boys in the group eating less than a bowlful per week." The researchers tested many foods and found only cereal "significantly associated with infant sex."
1. Here is a wiki which looks at a different study, on mice, which also claims that nutrition affects the percentages of males and females. Which of the two is an experimental study and which is an observational study?
2. The current study was done in England and of the 740 mothers-to-be, 301 (approximately 40%) said they currently were smokers. Why would this fact cast doubt on the conclusions being applied to the United States?
3. Eating cereal for breakfast is a very American habit, duplicated in few countries; even those other countries, such as England where cereal is eaten for breakfast, have nowhere near the selection possibilities obtainable in the United States. Many industrialized countries eat little or no breakfast at all. What then should the male/female ratio be for these countries?
4. It is often said that many cereals are really candies in disguise. If so, should the mother-to-be "cut to the chase" and just have a candy bar for breakfast? If not, why not?
5. Instead of the customary .05 level, the researchers chose a p-value < .01 for determining statistical significance. Why did they lower the threshold?
6. The researchers keep referring to a "bowl of cereal." Why is this an exceedingly inexact measure?
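One consideration behind a stricter threshold is the number of foods examined. As a rough sketch, assuming (unrealistically) that the tests are independent and that no food has any real effect, the chance of at least one spurious "significant" result among m tests is 1 - (1 - alpha)^m. The function name and the values of m below are illustrative assumptions, not figures from the study.

```python
def familywise_error_rate(alpha: float, m: int) -> float:
    """Chance of at least one false positive among m independent null tests."""
    return 1.0 - (1.0 - alpha) ** m


# Hypothetical numbers of foods tested; the study's actual count is not assumed here.
for m in (1, 10, 50, 100):
    at_05 = familywise_error_rate(0.05, m)
    at_01 = familywise_error_rate(0.01, m)
    print(f"m={m:3d}  alpha=.05: {at_05:.3f}  alpha=.01: {at_01:.3f}")
```

Even at the .01 level the chance of a false alarm grows with the number of tests, so a lower threshold reduces the problem without removing it.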
Submitted by Paul Alper
Monty Hall Psychology
Professors Craig R. Fox and Jonathan Levav have written an interesting paper, Partition-Edit-Count: Naïve Extensional Reasoning in Judgment of Conditional Probability. They conducted several experiments to see how Duke University students interpreted and performed on probability problems, especially when given alternative phrasing. For example, in a distant version of the Monty Hall problem, they considered three pharmaceutical companies, A, B, and C. Half the students were told "the FDA will publish a report in which it will reveal which of the three drugs is most effective." The other half were told "the FDA will publish a report in which it will rank the three drugs from the most effective to least effective." All the students were then informed that an independent lab definitively found that A is more effective than C.
The first group of students was asked to find "the probability that the FDA will identify A as the most effective of the three." The second group was asked to find "the probability that the FDA's rankings will list A ahead of both B and C." The correct answer is 2/3 rather than 1/2 irrespective of the wording. In the first group, 10% of 67 obtained the correct answer; in the second group, 23% of 62 obtained the correct answer. Fox and Levav present an explanation of the thought processes at work based on partitioning, editing and counting.
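The 2/3 figure can be checked by brute-force counting. The sketch below assumes that, before the lab result is known, all six rankings of the three drugs are equally likely; that prior is an assumption of the illustration rather than something stated in the paper.

```python
from fractions import Fraction
from itertools import permutations

# Partition: the six possible rankings, most effective drug first.
rankings = list(permutations("ABC"))

# Edit: keep only the rankings consistent with "A is more effective than C".
consistent = [r for r in rankings if r.index("A") < r.index("C")]

# Count: in how many of the remaining rankings is A the most effective?
a_is_best = [r for r in consistent if r[0] == "A"]

probability = Fraction(len(a_is_best), len(consistent))
print(consistent)
print(probability)
```

Three rankings survive the edit (ABC, ACB, BAC) and A leads in two of them.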
1. Near the end of the paper the authors state: "Moreover, despite the fact that participants could have solved all three puzzles computationally by invoking Bayes theorem or the definition of conditional probability, a very small proportion of these respondents seemed to attempt a computational answer, and none of the participants who explicitly invoked a formula arrived at the correct solution." Use Bayes theorem to obtain the correct answer.
2. The allegation is that this particular problem is "a distant version of the Monty Hall problem." Show how A, B, and C relate to the goats, doors and car.
3. Fox and Levav offered prize money for participation in these problems. In particular, an MBA student was offered $20 for the above problem. For some other problems, $1 was offered to anyone in the Duke University student center. Explain the discrepancy.
4. The authors claim that it makes sense that the second wording, the one with the word "rank," would more likely lead to a correct six-fold partitioning (ordering of events such as ABC, ACB, etc.) and easy editing and counting. The first wording emphasizes "most effective," which has a three-fold partitioning (A most effective, B most effective, C most effective). Edit and count the second wording to come up with the correct 2/3. Edit and count the first wording to come up with the wrong answer, 1/2.
5. Comparing the difference in the proportion of successes for the two different wordings, the authors claim via a chi-square test that the value of chi-square is 3.5 leading to a p-value of about .06. Perform a chi-square test to duplicate their result. Perform a difference of proportions test using Fisher's exact test and show that the p-value is closer to .093. Why is the authors' p-value result of .06 incorrect?
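For readers who want to try question 5 numerically, here is a minimal sketch using only the Python standard library. It assumes the reported percentages correspond to 7 correct answers out of 67 and 14 out of 62 (the nearest whole counts); the paper's exact counts are not given above. The function names are my own, and the Fisher p-value uses the common two-sided convention of summing all tables no more probable than the observed one.

```python
from math import comb, erfc, sqrt


def chi_square_2x2(a, b, c, d, yates=False):
    """Pearson chi-square statistic and p-value (1 df) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:  # optional continuity correction
        diff = max(0.0, diff - n / 2)
    stat = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = erfc(sqrt(stat / 2))  # upper tail of chi-square with 1 df
    return stat, p_value


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):  # hypergeometric probability of x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    low, high = max(0, col1 - row2), min(row1, col1)
    observed = prob(a)
    return sum(prob(x) for x in range(low, high + 1)
               if prob(x) <= observed * (1 + 1e-9))


# Assumed counts: rows are the two wordings, columns are correct / incorrect.
table = (7, 60, 14, 48)
stat, p_chi = chi_square_2x2(*table)
p_fisher = fisher_exact_two_sided(*table)
print(f"chi-square = {stat:.2f}, p = {p_chi:.3f}")
print(f"Fisher exact p = {p_fisher:.3f}")
```

Passing `yates=True` shows how much the continuity correction alone moves the chi-square p-value toward the exact result.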
A currently popular medical model is that physical exercise by the elderly may help to prevent Alzheimer's disease. From the Minneapolis Star Tribune, April 17, 2008 we learn that "Each year about 15 percent of the people with MCI [mild cognitive impairment] develop Alzheimer's, compared with 1 to 2 percent of all people age 65 and older." The article discusses a Mayo Clinic study of 868 randomly chosen "people ages 70 to 89 [who] were asked to record their exercise habits when they were between 50 and 65." There were 740 normal people, 20% of whom "said they exercised one to two times per week" and 128 who had MCI, 13.4% of whom said they exercised one to two times per week.
1. Assume that 13.4% of 128 is 17. Do a chi-square test to obtain a p-value. Ignoring Fisher's exact test, do a test for the difference of proportions with and without assuming pooling. Compare the three p-values obtained and explain the discrepancies.
2. Fisher's exact test yields yet a different p-value. Is the customary "statistical significance" obtained? Why is this test better than the chi-square or the naïve difference of proportions with or without pooling?
3. What lurking (hidden) variables might exist in this study?
4. The results of this study were presented at a conference of the American Academy of Neurology but not in a peer reviewed journal. How does this affect your view of the worthiness of the paper? If the study were related to a commercial product promoting exercise, how would this affect your view of the worthiness of the study?
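The three calculations in question 1 can be set up as follows. This is a sketch under stated assumptions: 20% of 740 is taken as exactly 148 exercisers, 17 of 128 is used as the question instructs, no continuity correction is applied, and the normal approximation is used for both z-tests. Function names are illustrative.

```python
from math import erfc, sqrt


def two_sided_normal_p(z):
    """Two-sided p-value for a standard normal statistic."""
    return erfc(abs(z) / sqrt(2))


def chi_square_p(x1, n1, x2, n2):
    """Pearson chi-square (1 df, no correction) for two groups of successes."""
    a, b, c, d = x1, n1 - x1, x2, n2 - x2
    n = n1 + n2
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return stat, erfc(sqrt(stat / 2))


def z_test_p(x1, n1, x2, n2, pooled):
    """Difference-of-proportions z-test with a pooled or unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    if pooled:
        p = (x1 + x2) / (n1 + n2)
        se = sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    else:
        se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = (p1 - p2) / se
    return z, two_sided_normal_p(z)


# Assumed counts: 148 of 740 normal subjects and 17 of 128 MCI subjects exercised.
counts = (148, 740, 17, 128)
chi2, p_chi = chi_square_p(*counts)
z_pooled, p_pooled = z_test_p(*counts, pooled=True)
z_unpooled, p_unpooled = z_test_p(*counts, pooled=False)
print(f"chi-square = {chi2:.3f}, p = {p_chi:.4f}")
print(f"pooled z = {z_pooled:.3f}, p = {p_pooled:.4f}")
print(f"unpooled z = {z_unpooled:.3f}, p = {p_unpooled:.4f}")
```

The pooled z statistic squared equals the uncorrected chi-square statistic, so those two p-values agree; the unpooled standard error is smaller here and gives a noticeably different p-value.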
Submitted by Paul Alper
When a lower prize was bigger than the Jackpot in the UK 6/49 Lottery
This is a comment on the item in Chance News 35 about the Canadian 6/49 lottery draw in which 106 people matched 5 numbers only, yet 239 matched 5 + Bonus, resulting in a larger prize for the lower tier winners. The winning numbers were 23, 40, 41, 42, 44, 45 with Bonus 43, so we can all see how this came about!
In the UK 6/49 Lottery, there have been two occasions when the Match5+Bonus prize has exceeded the Jackpot prize: first, on Saturday 29 June 1999, the winning numbers were 2, 17, 18, 23, 30, 40, Bonus 43; here 46 shared the jackpot, each winning £152,431, while only 13 shared the Bonus prize, each getting £165,961.
Second, on Wednesday 30 August 2006, the winning numbers were 19, 21, 22, 38, 44, 49, Bonus 45; here 4 shared the jackpot, each winning £669,219, while ONE person scooped the entire Bonus pool, winning £862,395.
I offer separate explanations for both of these unusual phenomena. For the first, a very relevant piece of extra information is that for many years, Britain's bookmakers have operated their own "lottery", again of 6/49 format, but punters are offered fixed odds for matching 1, 2, 3, etc. numbers. And just three days before 29 June 1999, the six winning numbers in the bookmakers' draw were EXACTLY the same as those 29 June winning numbers. Plainly, too many UK punters thought that, by choosing that set of winning numbers from the bookmakers' lottery, they were making a suitable "random" choice for the following Lotto!
For the second, I believe this is pure random chance at work! Sales are lower on Wednesdays than on Saturdays, allowing a little more scope for such random effects. On average, 6 times as many tickets will share the Bonus prize as share the Jackpot, but given enough data, with smallish numbers (the average number sharing a Wednesday jackpot is 1-2), random chance will occasionally throw up more jackpot winners than Bonus winners - and sometimes, so many more that this "prize anomaly" may arise. If you wait long enough, even rare events are sure to happen!
(The same thing nearly happened on Wed 14 July 1999, when just two tickets shared the jackpot, but NO-ONE won the Bonus prize. The Bonus Pool of £1,327,021 got added to the original jackpot pool of £4,312,821 to give a total of £5,639,842, so both jackpot winners got £2,819,921, some £663,510 more because of this rule that rewards the already fortunate! To date, there have been more tickets sharing the Jackpot than the Bonus just 11 times, 7 of them on Wednesdays.)
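The factor of 6 comes from counting tickets: exactly one combination wins the jackpot, while six combinations contain five of the six main numbers plus the bonus ball. The second half of the sketch below is a rough model only: it assumes tickets are chosen uniformly at random, that the numbers of jackpot and Bonus winners are independent Poisson counts, and that the mean number of jackpot winners is 1.5 (a hypothetical value within the 1-2 range quoted above). As the 1999 draw shows, real ticket choices are not uniform.

```python
from math import comb, exp, factorial

total_tickets = comb(49, 6)      # all possible 6/49 selections
jackpot_tickets = 1              # the winning six numbers
bonus_tickets = comb(6, 5) * 1   # drop one main number, add the bonus ball
print(total_tickets, bonus_tickets // jackpot_tickets)


def poisson_pmf(k, mean):
    """Probability of exactly k events for a Poisson count with the given mean."""
    return exp(-mean) * mean ** k / factorial(k)


def prob_more_jackpot_than_bonus(jackpot_mean, max_count=60):
    """P(jackpot winners > Bonus winners) under the independent Poisson assumption."""
    bonus_mean = 6 * jackpot_mean
    total = 0.0
    for b in range(max_count):
        p_bonus = poisson_pmf(b, bonus_mean)
        p_jackpot_exceeds = max(
            0.0, 1.0 - sum(poisson_pmf(j, jackpot_mean) for j in range(b + 1))
        )
        total += p_bonus * p_jackpot_exceeds
    return total


# Hypothetical Wednesday draw with 1.5 jackpot winners on average.
print(prob_more_jackpot_than_bonus(1.5))
```

Under these assumptions the event is rare in any single draw but far from impossible over many hundreds of draws.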
Submitted by John Haigh
The Isle Royale predator prey study
The Ecological Study of Wolves on Isle Royale, now in its 50th year, is the longest running large mammal predator-prey study in the world. The researchers go to the Island every winter, when no one else is there, to estimate the number of wolves and moose currently on the Island, and their findings are presented in their annual Moose-Wolf Report. Their 2007-8 Report has just been made available here. You will find their estimate of the current moose-wolf populations and graphics of the history of the wolf-moose populations. The authors also give an explanation of the statistical method used to estimate the number of moose and much more. This study provides a good example to test the Lotka-Volterra predator-prey model. You will also find here an educational video of the history of the Isle Royale Moose-Wolf study that is fun to see and would be nice to show in a classroom.
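For classroom use, the Lotka-Volterra model can be simulated in a few lines. In its standard form the prey population x and predator population y satisfy dx/dt = alpha*x - beta*x*y and dy/dt = delta*x*y - gamma*y. The sketch below integrates these equations with a fourth-order Runge-Kutta step. The parameter values and starting populations are purely illustrative assumptions and are not fitted to the Isle Royale data.

```python
from math import log


def derivatives(moose, wolves, alpha, beta, delta, gamma):
    """Lotka-Volterra rates of change for prey (moose) and predators (wolves)."""
    d_moose = alpha * moose - beta * moose * wolves
    d_wolves = delta * moose * wolves - gamma * wolves
    return d_moose, d_wolves


def simulate(moose, wolves, alpha, beta, delta, gamma, dt=0.01, steps=5000):
    """Integrate the model with the classical fourth-order Runge-Kutta method."""
    args = (alpha, beta, delta, gamma)
    path = [(moose, wolves)]
    for _ in range(steps):
        k1 = derivatives(moose, wolves, *args)
        k2 = derivatives(moose + 0.5 * dt * k1[0], wolves + 0.5 * dt * k1[1], *args)
        k3 = derivatives(moose + 0.5 * dt * k2[0], wolves + 0.5 * dt * k2[1], *args)
        k4 = derivatives(moose + dt * k3[0], wolves + dt * k3[1], *args)
        moose += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        wolves += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
        path.append((moose, wolves))
    return path


def invariant(moose, wolves, alpha, beta, delta, gamma):
    """Quantity conserved along exact Lotka-Volterra trajectories."""
    return delta * moose - gamma * log(moose) + beta * wolves - alpha * log(wolves)


# Illustrative parameters only (arbitrary units), not estimates for Isle Royale.
params = dict(alpha=1.0, beta=0.1, delta=0.075, gamma=1.5)
path = simulate(10.0, 5.0, **params)
print("moose range:", min(m for m, _ in path), max(m for m, _ in path))
print("wolf range:", min(w for _, w in path), max(w for _, w in path))
```

The model predicts closed cycles with constant amplitude, so comparing its output with the report's population graphs is one way to see where the real system departs from the idealized equations.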
Submitted by Laurie Snell