# Chance News 32


## Quotation

You can prove any silly hypothesis by running a statistical test on tons of data.

Jim Albert

The Numbers Guy

Wall Street Journal. 7 December, 2007

## Forsooth

The following Forsooths are from the Dec 2007 issue of RSS NEWS.

The methodology behind the ICS survey is flawed. There were only 2000 respondents, a small number for any statistical survey, who were asked to nominate which firms of services they used and how that rated them.

*The Times*

22 October 2007

Presenter:

'In statistics, in data which are binomially distributed, individual values may be placed in one of two mutually exclusive categories such that the sum of the probabilities of occurring in the categories is what value?'

Answer given: 'Unity'

'No, it's one, or a hundred percent'

*University Challenge*, BBC2

22 October 2007

This Forsooth was suggested by Paul Alper

The fact is, analysts say, that for all that it has a secular constitution, Turkey remains a relatively conservative country. The word atheist has only recently appeared in Turkish, but "godless" still remains an insult here. "Only 2% of the people we interviewed said they didn't believe in God", says Ali Carkoglu, co-author of a 2006 study of religious attitudes.

"Given that we had a 2% margin of error that could mean nobody", he added. "In any case it takes considerable courage for a Turk to admit to a stranger that they are atheists."

*The London Independent*

30 November 2007

In Chance News 31 we had the forsooth:

Of Italy's 151 Series A players, 52 are non-white, with Inter fielding 19, Juventus 12, AC Milan 13, AS Roma 12 and Udinese 10. Messina has eight.

*The Times*

30 November 2005

Marcello Pagano comments that in Italy there is a saying: "L'aritmetica non è un'opinione" ("Arithmetic is not an opinion").

## Math too hard for this lottery

John Haigh, author of one of our favorite chance books, "Taking Chances: Winning with Probability", suggested this item.

The following story appeared in the December issue of the London Mathematical Society Newsletter.

From the Manchester Evening News 3 Nov 2007

A Lottery scratchcard – the Cool Cash game – was taken out of shops yesterday after some players failed to grasp whether or not they had won.

To qualify for a prize, users had to scratch away a window to reveal a temperature lower than the figure displayed on each card. As the game had a winter theme, the temperature was usually below freezing. But the concept of comparing negative numbers proved too difficult for some. Camelot received dozens of complaints on the first day from players who could not understand how, for example, –5 is higher than –6.

Tina Farrell, from Levenshulme, called Camelot after failing to win with several cards. The 23-year-old, who said she had left school without a maths GCSE, said: "On one of my cards it said I had to find temperatures lower than –8. The numbers I uncovered were –6 and –7 so I thought I had won, and so did the woman in the shop. But when she scanned the card the machine said I hadn't. I phoned Camelot and they fobbed me off with some story that –6 is higher – not lower – than –8 but I’m not having it. I think Camelot are giving people the wrong impression – the card doesn’t say to look for a colder or warmer temperature, it says to look for a higher or lower number. Six is a lower number than 8. Imagine how many people have been misled."

A Camelot spokeswoman said the game was withdrawn after reports that some players had not understood the concept.

Submitted by Laurie Snell

## David Kendall 1918-2007

The December London Mathematical Society Newsletter also reported the sad news that David Kendall, one of this century's greatest probabilists, has died at age 89. You can read about his contributions in the Times of London's obituary.

## What do economists know that lawyers don't

"Does Death Penalty Save Lives? A New Debate", Adam Liptak, *The New York Times*, November 18, 2007.

Recently a dozen or so studies by economists have shown that the death penalty has a deterrent effect. In one study, each execution was estimated to save five lives.

To economists, it is obvious that if the cost of an activity rises, the amount of the activity will drop.

The legal profession is not so sure.

But not everyone agrees that potential murderers know enough or can think clearly enough to make rational calculations. And the chances of being caught, convicted, sentenced to death and executed are in any event quite remote. Only about one in 300 homicides results in an execution.

The models used by economists are typically multiple linear regression models with adjustment for key covariates.

The studies try to explain changes in the murder rate over time, asking whether the use of the death penalty made a difference. They look at the experiences of states or counties, gauging whether executions at a given time seemed to affect the murder rate that year, the year after or at some other later time. And they try to remove the influence of broader social trends like the crime rate generally, the effectiveness of the criminal justice system, economic conditions and demographic changes.

Can you use a regression model here? The answers vary.

Critics say the larger factors are impossible to disentangle from whatever effects executions may have. They add that the new studies’ conclusions are skewed by data from a few anomalous jurisdictions, notably Texas, and by a failure to distinguish among various kinds of homicide.

The recent studies are, some independent observers say, of good quality, given the limitations of the available data. “These are sophisticated econometricians who know how to do multiple regression analysis at a pretty high level,” Professor Weisberg of Stanford said. The economics studies are, moreover, typically published in peer-reviewed journals, while critiques tend to appear in law reviews edited by students. The available data is nevertheless thin, mostly because there are so few executions.

There is additional commentary about death penalty deterrence on the Freakonomics blog.

### Questions

1. The bulk of executions in the United States have occurred in Texas. Why might this clustering of executions raise difficulty for the regression model?

2. Can a regression model remove the effect of all of the potential confounding variables that also influence crime rate? Can they remove enough of the effect of confounding to provide a plausible answer?

3. At the end of the Times article, a researcher speculates on how random assignment could allow a careful study of the deterrence effect of the death penalty. What would a randomized study of death penalty effects look like? What are some of the practical and ethical barriers to such a study?
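Question 1 can be made concrete with a toy calculation. The sketch below uses entirely made-up numbers (they are not data from any of the studies) and a simple one-predictor least-squares fit rather than the multiple regression the economists use. It only shows how a single jurisdiction with far more executions than the rest can dominate the fitted slope; the names `ols_slope`, `executions` and `murder_change` are hypothetical.

```python
def ols_slope(xs, ys):
    """Least-squares slope of y on x for a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Synthetic illustration only: eight jurisdictions with few executions
# and no relationship between executions and change in murder rate.
executions = [0, 0, 1, 1, 2, 2, 3, 3]
murder_change = [1, -1, -1, 1, 1, -1, -1, 1]

# One hypothetical high-execution jurisdiction with a falling murder rate.
outlier_executions, outlier_change = 40, -5

slope_without = ols_slope(executions, murder_change)
slope_with = ols_slope(executions + [outlier_executions],
                       murder_change + [outlier_change])

print("slope without the outlier:", slope_without)
print("slope with the outlier:   ", round(slope_with, 3))
```

In this toy data set the slope is zero among the eight low-execution jurisdictions and becomes clearly negative once the single extreme point is added, so the apparent "deterrent effect" rests on one observation.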

Submitted by Steve Simon

## The Numbers Guy

Carl Bialik writes a column called "The Numbers Guy" for the *Wall Street Journal*, where he "examines the way numbers are used, and abused". He also has a blog where he discusses his articles and readers comment on them. In the December 3, 2007, issue of the *Wall Street Journal*, his column was titled "Is a Carl Doomed to be a C Student? We Don't Think So". Here the Numbers Guy discussed a study purporting to show a name-letter effect, which Paul Alper discussed in Chance News 31. As the column's title suggests, Bialik shares Alper's skepticism about the results of the study.

On his website, Bialik also discusses the use of the phrases "statistical ties" and "statistical dead heats" in *The Wall Street Journal*, *The New York Times*, CNN, and other media in reporting on the Republican and Democratic contests. Bialik comments that statisticians do not like these terms and explains why.

### Questions

(1) What do you think news reporters mean by the expressions "statistical ties" and "statistical dead heat"?

(2) Why do you think statisticians do not like these terms?

Submitted by Laurie Snell

## Of Mice and Males

Authors are not responsible for what journalists write about a research article. Lacking knowledge of statistics, reporters tend to act like stenographers when they aren't extrapolating far beyond the limits of the research. Take a look at what the lay press had to say about "Experimental alteration of litter sex ratios in a mammal", which appeared in the *Proceedings of the Royal Society (B)*.

The Daily Mail:

Red meat and salty snacks are said to lead to boys while chocolate is thought to help to produce girls. Now science suggests the stories may be true: mice with low blood-sugar levels - a good indicator of a sugar-rich diet - produce more female than male offspring.

The Independent:

Boy or girl? Battle of the sexes Are you desperate for a daughter or dying for a son? The solution could lie in a mother's diet - before she even conceives.

New Scientist:

Findings lend credence to traditional beliefs that eating certain foods can influence the sex of offspring.

Discover:

The Biology of . . . Sex Ratios. Want a boy at all costs? The secret may lie in your glucose levels.

FoxNews.com:

Can what a mother-to-be eats influence the sex of her unborn baby? Maybe, says new research.

The research itself looks at a very important issue in biology: the influence of nutrition on reproductive strategy and the ensuing evolutionary advantage. To carry out their research, the authors had 20 female mice in a control group and 20 female mice in a treatment group, which was given "a steroid [DEX] that inhibits glucose transport and reduces plasma glucose concentrations." The original paper does not give a table recording, for each of the 40 mice, the litter size, the number of males and the arm of the study it was in. Instead, we have to rely on the given summary data: the average litter size for the control group is 10.45 with a standard error of 0.60, and the average litter size for the treatment group is 9.17 with a standard error of 0.62.

According to the article, "The sex ratio differed significantly between the treatment and control groups (rank-sum test: Z= -2.18, p=0.03), with DEX females giving birth to fewer sons (41.9%) than control females (53.5%)." With this information, it would appear that the control group produced a total of 10.45 * 20 = 209 mice, resulting in 209 * 0.535 = 112 males. The treatment group is more difficult to determine because two of the 20 "failed to conceive"; thus, if only 18 are relevant, the treatment group has 9.17 * 18 = 165 mice and 165 * 0.419 = 69 males. Using these numbers, a Minitab printout yields a p-value of 0.029 (Fisher's exact test, because of the relatively small samples), which is close to the "p=0.03" mentioned in the article.

Test and CI for Two Proportions

| Sample | X | N | Sample p |
|---|---|---|---|
| 1 | 112 | 209 | 0.535885 |
| 2 | 69 | 165 | 0.418182 |

Difference = p (1) - p (2)

Estimate for difference: 0.117703

95% CI for difference: (0.0165306, 0.218876)

Test for difference = 0 (vs not = 0): Z = 2.28 P-Value = 0.023

Fisher's exact test: P-Value = 0.029
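For readers without Minitab, the calculation can be approximated with the Python standard library. This is a minimal sketch, not the software's actual algorithm: it reconstructs the counts from the summary figures, uses an unpooled standard error for the Z test, and takes the two-sided Fisher p-value as the sum of all tables no more probable than the observed one (one common convention, assumed here). The function names are hypothetical.

```python
from math import comb, erf, sqrt

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def two_proportion_z(x1, n1, x2, n2):
    """Z statistic, two-sided p-value and 95% CI for p1 - p2 (unpooled SE)."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = (p1 - p2) / se
    p_value = 2 * (1 - normal_cdf(abs(z)))
    half_width = 1.959964 * se
    return z, p_value, (p1 - p2 - half_width, p1 - p2 + half_width)

def fisher_exact_two_sided(x1, n1, x2, n2):
    """Two-sided Fisher exact p-value: sum of tables no more likely than observed."""
    total = x1 + x2
    denom = comb(n1 + n2, total)

    def prob(k):
        # Hypergeometric probability of k successes in the first sample.
        return comb(n1, k) * comb(n2, total - k) / denom

    observed = prob(x1)
    lo, hi = max(0, total - n2), min(n1, total)
    return sum(prob(k) for k in range(lo, hi + 1)
               if prob(k) <= observed * (1 + 1e-9))

# Counts reconstructed from the summary data, as in the text.
control_total = round(10.45 * 20)               # 209 offspring
control_males = round(control_total * 0.535)    # 112 males
treated_total = round(9.17 * 18)                # 165 offspring
treated_males = round(treated_total * 0.419)    # 69 males

z, p_z, ci = two_proportion_z(control_males, control_total,
                              treated_males, treated_total)
p_fisher = fisher_exact_two_sided(control_males, control_total,
                                  treated_males, treated_total)
print(round(z, 2), round(p_z, 3), ci, round(p_fisher, 3))
```

Both tests treat every offspring as an independent observation, which is exactly the assumption questioned in the discussion that follows.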

### Discussion

1. No confidence interval for the difference in proportion of males is given in the article itself. Does the 95% CI suggest any guarantee for reduction in male mice? Male humans?

2. Regarding the treatment arm, the article states: "42%, two-tailed binomial test, p=.04." Using the summary data, Minitab reports

Test and CI for One Proportion

Test of p = 0.5 vs p not = 0.5

| Sample | X | N | Sample p | 95% CI | P-Value |
|---|---|---|---|---|---|
| 1 | 69 | 165 | 0.418182 | (0.341979, 0.497378) | 0.043 |

Does this 95% CI suggest any guarantee for reduction in the number of male mice? Male humans?

3. Thus far, offspring production has been treated as a Bernoulli process. That is, each offspring is considered to be independent. In other words, no use has been made of the number of female parents (20 in the control and 18 in the treatment arm). Using the summary data given in the article, Minitab obtains a somewhat different p-value for the difference in the mean number of males per litter: 0.05 rather than the 0.03 mentioned in the article, and thus a wider interval.

Two-Sample T-Test and CI

| Sample | N | Mean | StDev | SE Mean |
|---|---|---|---|---|
| 1 | 20 | 5.59 | 2.68 | 0.60 |
| 2 | 18 | 3.84 | 2.63 | 0.62 |

Difference = mu (1) - mu (2)

Estimate for difference: 1.750

95% CI for difference: (-0.000, 3.500)

T-Test of difference = 0 (vs not =): T-Value = 2.03 P-Value = 0.050 DF = 35

Ask a biologist whether or not the Bernoulli assumption is valid.

4. All of the above is from a frequentist point of view. What would Bayesians add to the discussion and why?

5. As noted, two of the 20 in the treatment arm failed to conceive while all 20 in the control arm did conceive. How does this affect your view of the results?
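The Minitab figures in items 2 and 3 can be roughly reproduced with the standard library. The sketch below makes several assumptions: the exact binomial p-value sums all outcomes no more likely than the observed one, the t-test is the unequal-variance (Welch) form with degrees of freedom truncated to an integer, and the t tail area is found by simple numerical integration. The inputs are the summary values shown in the printouts, and the function names are hypothetical.

```python
from math import comb, exp, lgamma, log, pi, sqrt

def binomial_two_sided(x, n, p0=0.5):
    """Exact two-sided binomial p-value for H0: p = p0."""
    def prob(k):
        return comb(n, k) * p0 ** k * (1 - p0) ** (n - k)
    observed = prob(x)
    return sum(prob(k) for k in range(n + 1)
               if prob(k) <= observed * (1 + 1e-9))

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_c) * (1 + t * t / df) ** (-(df + 1) / 2)

def t_two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3   # area between -|t| and |t|
    return 1 - central_area

def welch_from_summary(m1, se1, n1, m2, se2, n2):
    """Welch t statistic and degrees of freedom from means and standard errors."""
    v1, v2 = se1 ** 2, se2 ** 2
    t = (m1 - m2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Item 2: 69 males out of 165 offspring in the treatment arm.
p_binom = binomial_two_sided(69, 165)

# Item 3: mean males per litter, using the standard errors from the printout.
t_stat, df_exact = welch_from_summary(5.59, 0.60, 20, 3.84, 0.62, 18)
df_used = int(df_exact)   # truncated, matching DF = 35 in the printout
p_t = t_two_sided_p(t_stat, df_used)

print(round(p_binom, 3), round(t_stat, 2), df_used, round(p_t, 3))
```

The t-test treats the 38 litters, not the 374 offspring, as the units of analysis, which is why its p-value sits right at the conventional 0.05 boundary.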

Submitted by Paul Alper

## A coincidence

An almighty coincidence

Magazine, Issue 45, December 2007

John Haigh and Rob Eastaway

Coincidences can crop up in the strangest places. While Peter Hughes, an accountant from Birkenhead, was at a church service, his eye fell on the board showing the list of hymn numbers for the service. They were 16, 37, 428 and 590.

Mathematicians naturally look on numbers as a goldmine for distraction, maybe seeking to combine them to form an equation, or noting how many are primes. In this case, Peter observed something particularly curious: all ten digits appear exactly once! How likely is that, he wondered? His initial back-of-the-envelope estimate indicated that the chance might be about 1 in 6,000. He then sought our comments.

The authors first make a rather simplistic calculation, which leads to an estimate of 1 in 40,000 for the probability that all ten digits appear exactly once. Then they refine the calculation to take into account, for example:

(1) The number zero will arise less frequently than other digits, maybe 2/3 as often;

(2) Some hymns are far more, or less, popular than others;

(3) Christmas carols, doubtless numbered consecutively in a hymnbook, will never be selected for most of the year. Other blocks of hymns will be used only at baptisms, Easter, Harvest etc.;

(4) For hymnbooks in which , the frequencies of 1, 2 or 3 digit numbers will be different from those assumed;

(5) The numbers chosen can't be completely independent, as they are drawn from a finite population without replacement — and other factors may affect the independence assumption too.

They observe that these refinements affect the estimate in different ways and conclude that their estimate of 1 in 40,000 is still pretty good.

Finally, they ran a simulation using the Methodist Hymnbook (1933) as data. They selected four hymns at random with the carols deleted, and repeated this exercise ten million times. This led to an estimate of 1 in 40,800, not too different from their calculation. They remark:

This suggests that, if you attend 40 "ordinary" services per year, you might expect to wait about 1000 years before sharing Peter's experience.
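As a rough cross-check on the simple calculation, the sketch below counts exactly how many sets of four distinct hymn numbers use each of the ten digits once. It assumes, purely for illustration, a hymnbook numbered 1 to 999 in which every set of four hymns is equally likely; the 999 cutoff and the function names are not from the article, and none of the refinements (1) to (5) are modelled.

```python
from math import comb

ALL_DIGITS = (1 << 10) - 1  # bitmask with one bit for each digit 0-9

def digit_mask(n):
    """Bitmask of the digits of n, or None if any digit repeats."""
    mask = 0
    for ch in str(n):
        bit = 1 << int(ch)
        if mask & bit:
            return None
        mask |= bit
    return mask

def uses_each_digit_once(numbers):
    """True if the numbers together contain each digit 0-9 exactly once."""
    digits = "".join(str(n) for n in numbers)
    return sorted(digits) == list("0123456789")

def count_pandigital_sets(max_hymn, size=4):
    """Count sets of `size` distinct numbers in 1..max_hymn using every digit once."""
    # ways[j][m]: number of j-element sets seen so far whose digits are
    # all different and together form the bitmask m.
    ways = [dict() for _ in range(size + 1)]
    ways[0][0] = 1
    for n in range(1, max_hymn + 1):
        mask = digit_mask(n)
        if mask is None:
            continue
        # Go from larger sets to smaller so each number is used at most once.
        for j in range(size - 1, -1, -1):
            for used, count in list(ways[j].items()):
                if used & mask == 0:
                    key = used | mask
                    ways[j + 1][key] = ways[j + 1].get(key, 0) + count
    return ways[size].get(ALL_DIGITS, 0)

MAX_HYMN = 999  # assumed size of the hymnbook, for illustration only
favourable = count_pandigital_sets(MAX_HYMN)
probability = favourable / comb(MAX_HYMN, 4)

print(uses_each_digit_once([16, 37, 428, 590]))
print(favourable, "sets, about 1 in", round(1 / probability))
```

Under these assumptions there are 907,200 favourable sets, giving a chance of about 1 in 45,000, the same order of magnitude as the authors' estimates.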

Finally, they point out that Peter's email would have been just as interesting if he had said "All four numbers were prime numbers" or reported some other interesting pattern of numbers. The authors say that allowing for such alternatives would greatly increase the chance of a noteworthy coincidence, and they close with the remark:

Indeed, as David Wells' famous *Dictionary of curious and interesting numbers* makes clear, the vast majority of numbers between 1 and 1,000 are curious or interesting in their different ways, making the chance of a mathematician finding something of interest in next Sunday's hymn list something close to 100%!

Submitted by Laurie Snell