# Chance News 57

## Quotations

An undefined problem has an infinite number of solutions.
Robert A. Humphrey

In the space of one hundred and seventy-six years the Lower Mississippi has shortened itself two hundred and forty-two miles. That is an average of a trifle over one mile and a third per year. Therefore, any calm person, who is not blind or idiotic, can see that in the "Old Oolitic Silurian Period," just a million years ago next November, the Lower Mississippi River was upwards of one million three hundred thousand miles long, and stuck out over the Gulf of Mexico like a fishing-rod. And by the same token any person can see that seven hundred and forty-two years from now the Lower Mississippi will be only a mile and three-quarters long, and Cairo and New Orleans will have joined their streets together, and be plodding comfortably along under a single mayor and a mutual board of aldermen. There is something fascinating about science. One gets such wholesale returns of conjecture out of such a trifling investment of fact.

Mark Twain, Life on the Mississippi (Chapter 17), 1883

## Forsooths

According to a November 4, 2009, Wall Street Journal article, “Crisis Compels Economists To Reach for New Paradigm”, University of Chicago economist Robert Lucas stated in 2003:

[The] central problem of depression-prevention has been solved ... for many decades.

A 2009 blogger responded [1] to the article:

These guys build a Rube Goldberg machine, then come back to the brilliant decision that maybe common sense is best after all. That deserves a Nobel prize, minimum.

Submitted by Margaret Cibes

## Vampirical

The following quotation comes from an article by Gelman and Weakliem entitled "Of beauty, sex and power: Statistical challenges in estimating small effects":

This ability of the theory to explain findings in any direction is also pointed out by Freese (2007), who describes this sort of argument as "more 'vampirical' than 'empirical'--unable to be killed by mere evidence."

Gelman and Weakliem are criticizing research which putatively detects an effect merely because statistical significance is obtained on either side of zero or, in the case of the ratio of females to males, 50%. In particular, they contest the results of studies which claim that “beautiful parents have more daughters, violent men have more sons and other sex-related patterns.” They also analyze so-called Type M (magnitude) errors and Type S (sign) errors.

This is a Type M (magnitude) error: the study is constructed in such a way that any statistically-significant finding will almost certainly be a huge overestimate of the true effect. In addition there will be Type S (sign) errors, in which the estimate will be in the opposite direction as the true effect.
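A small simulation can make the two error types concrete. The sketch below is only an illustration: the true effect (0.3 percentage points) and the standard error (4.0 percentage points) are hypothetical values chosen for the example, not figures from the article, and the function name `type_s_and_m` is invented here. It draws many estimates around a small true effect and examines only those that reach statistical significance.

```python
import random


def type_s_and_m(true_effect, std_error, z_crit=1.96, n_sims=100_000, seed=0):
    """Simulate estimates ~ Normal(true_effect, std_error) and summarise
    only the estimates that are statistically significant."""
    rng = random.Random(seed)
    threshold = z_crit * std_error
    significant = []
    for _ in range(n_sims):
        estimate = rng.gauss(true_effect, std_error)
        if abs(estimate) > threshold:
            significant.append(estimate)

    power = len(significant) / n_sims
    if not significant:
        return power, float("nan"), float("nan")

    # Type S: a significant estimate whose sign is opposite to the true effect.
    wrong_sign = sum(1 for e in significant if e * true_effect < 0)
    type_s_rate = wrong_sign / len(significant)

    # Type M: how much larger, on average, a significant estimate is
    # than the true effect (the exaggeration ratio).
    mean_abs = sum(abs(e) for e in significant) / len(significant)
    exaggeration = mean_abs / abs(true_effect)
    return power, type_s_rate, exaggeration


# Hypothetical inputs, in percentage points (assumed for illustration only).
power, type_s_rate, exaggeration = type_s_and_m(true_effect=0.3, std_error=4.0)
print(f"share of estimates reaching significance: {power:.3f}")
print(f"share of significant estimates with the wrong sign: {type_s_rate:.3f}")
print(f"average exaggeration of significant estimates: {exaggeration:.1f}x")
```

Under these assumed inputs, every significant estimate must exceed about 7.8 percentage points in absolute value, so it necessarily overstates a 0.3-point effect many times over, and a large share of the significant estimates point the wrong way.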

### Discussion

1. As a long-term research project, determine via literature and art how the notion of “beautiful” has changed through the ages and across cultures.

2. The imbalance between baby daughters and baby sons produced by beautiful people somehow went from the original article’s (not statistically significant) 4.7% to 8% when dealing with the largest comparison (the most beautiful parents on a scale of 1 to 5) to 26% and finally to 36% via a typo in the New York Times.

3. The authors, based on their analysis, say “There is no compelling evidence that ‘Beautiful parents produce more daughters.’” Nevertheless, why did the original paper have so much appeal?

4. As a check, the authors used People magazine’s “list of the fifty most beautiful people” from 1995 to 2000 and tallied their offspring. There were “157 girls out of 329 children, or 47.7% girls (with a standard error 2.8%).” Instead of more females, fewer were produced.

5. The authors note “the structure of scientific publication and media attention seem to have a biasing effect on social science research.” Explain what they mean by a “biasing effect.”
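The figures quoted in item 4 can be checked with the usual formula for the standard error of a proportion. The sketch below uses only the counts given above; the 50% reference value is the one mentioned earlier in this section, and the z-score is added here purely as an illustration.

```python
import math

girls, children = 157, 329  # counts quoted in item 4

p_hat = girls / children
std_error = math.sqrt(p_hat * (1 - p_hat) / children)

# Distance from the 50% reference value, in standard errors.
z = (p_hat - 0.5) / std_error

print(f"proportion of girls: {100 * p_hat:.1f}%")
print(f"standard error: {100 * std_error:.1f}%")
print(f"z-score against 50%: {z:.2f}")
```

The observed proportion lies less than one standard error below 50%, so this check gives no evidence of an excess of daughters and no strong evidence of a deficit either.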

Submitted by Paul Alper for Halloween.

## How anyone can detect election fraud

Why Russians Ignore Ballot Fraud, Clifford J. Levy, The New York Times, October 24, 2009.

Russian Election Fraud?, Steven D. Levitt, Freakonomics Blog, The New York Times, April 16, 2008.

All it takes is a bit of common sense and a careful review of the data to expose election fraud, at least in Russia.

Soon after polls closed in regional elections this month, a blogger who refers to himself as Uborshizzza huddled away in his Moscow apartment and began dicing up the results on his computer. It took him only a few hours to detect what he saw as a pattern of unabashed ballot-stuffing: how else was it possible that in districts with suspiciously high turnouts in this city, Vladimir V. Putin’s party received heaps of votes?

Here's a specific example.

Overall turnout was 18 percent in one Moscow district, and United Russia garnered 33 percent. In an adjacent district, turnout was 94 percent, and the party got 78 percent.

This was done by a statistician in his spare time, with access only to publicly available records.

Uborshizzza, who by day is a 50-year-old medical statistician named Andrei N. Gerasimov, sketched charts to accompany his conclusions and posted a report on his blog. It spread on the Russian Internet, along with similar findings by a small band of amateur sleuths, numbers junkies and assorted other muckrakers.

A similar study of open election records in 2008 also yielded obvious evidence of fraud.

Analyzing official returns on the Central Elections Committee Web site, blogger Sergei Shpilkin has concluded that a disproportionate number of polling stations nationwide reported round numbers — that is, numbers ending in zero and five — both for voter turnout and for Medvedev’s percentage of the vote.

It wasn't just any numbers though, but the numbers on the high end of the distribution.

In most elections, one would expect turnout and returns to follow a normal, or Gaussian, distribution — meaning that a chart of the number of polling stations reporting a certain turnout or percentage of votes for a candidate would be shaped like a bell curve, with the top of the bell representing the average, median, and most popular value. But according to Shpilkin’s analysis, which he published on his LiveJournal blog, podmoskovnik.livejournal.com, the distribution both for turnout and Medvedev’s percentage looks normal only until it hits 60 percent. After that, it looks like sharks’ teeth. The spikes on multiples of five indicate a much greater number of polling stations reporting a specific turnout than a normal distribution would predict.
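The kind of check described in this passage, looking for an excess of results on multiples of five, can be sketched roughly as follows. Everything in the example is an assumption made for illustration: the turnout figures are synthetic (drawn from a bell curve and then partly "snapped" to round numbers), and the helper `share_on_multiples_of_five` is a hypothetical name, not anything taken from Shpilkin's analysis.

```python
import random


def share_on_multiples_of_five(percentages):
    """Fraction of results that, rounded to a whole percent, end in 0 or 5."""
    rounded = [round(p) for p in percentages]
    hits = sum(1 for r in rounded if r % 5 == 0)
    return hits / len(rounded)


rng = random.Random(42)

# Synthetic "clean" turnouts: a bell curve clipped to the 0-100 range.
clean = [min(100.0, max(0.0, rng.gauss(55, 10))) for _ in range(20_000)]

# Synthetic "tampered" turnouts: above 60 percent, about 30 percent of
# stations report the nearest multiple of five instead of the real figure.
tampered = [
    5 * round(p / 5) if p > 60 and rng.random() < 0.3 else p
    for p in clean
]

clean_share = share_on_multiples_of_five(clean)
tampered_share = share_on_multiples_of_five(tampered)

print(f"clean data: {clean_share:.3f} of stations on a multiple of five")
print(f"tampered data: {tampered_share:.3f} of stations on a multiple of five")
```

With whole-percent results, roughly one station in five should land on a multiple of five by chance, so a noticeably larger share is a signal worth investigating rather than proof of fraud.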

Sadly, though, the reaction of the Russian people has been a collective shrug.

There was none of the sort of outrage on the streets that occurred in Iran in June, when backers of the incumbent president, Mahmoud Ahmadinejad, were accused of rigging the election for him. Nor the international clamor that greeted the voting in Afghanistan, which last week was deemed so tainted that President Hamid Karzai was forced into a runoff. The apparent brazenness of the fraud and the absence of a spirited reaction says a lot about the deep apathy in Russia, where people grew disillusioned with politics under Communism and have seen little reason to alter their view.

This disillusionment is easily demonstrated in public polling.

Opinion polls ... showed that 94 percent of respondents believed that they could not influence events in Russia. According to another, 62 percent did not think that elections reflect the people’s will.

Submitted by Steve Simon

### Questions

1. Compare the reaction of the Russians to these results to the reactions in the United States to the anomalously high votes for Patrick Buchanan in Palm Beach County during the 2000 election. What explains the difference?

2. What other measures of publicly available election records might be used to detect fraud?

## Vaccine effectiveness

“Does the Vaccine Matter?”
by Shannon Brownlee and Jeanne Lenzer, The Atlantic, November 2009

This is a very long and detailed article about influenza in particular, vaccines in general, and related health and economic issues, including some historical information. Its focus is on skepticism in the biomedical community about vaccine effectiveness claims.

Since flu is seasonal and is more likely to “contribute to death” than to “kill people directly,” “researchers studying the impact of flu vaccination typically look at deaths from all causes during flu season, and compare the vaccinated and unvaccinated populations.”

Studies have found that “people who get a flu shot in the fall are about half as likely to die that winter—from any cause—as people who do not.” So people are advised to get vaccinated.

When researchers … included all deaths from illnesses that flu aggravates, like lung disease or chronic heart failure, they found that flu accounts for, at most, 10 percent of winter deaths among the elderly. So how could flu vaccine possibly reduce total deaths by half? [One researcher] says: “For a vaccine to reduce mortality by 50 percent and up to 90 percent in some studies means it has to prevent deaths not just from influenza, but also from falls, fires, heart disease, strokes, and car accidents. That’s not a vaccine, that’s a miracle.”

The 50-percent estimate is based on “cohort studies” of vaccinated versus unvaccinated people, studies which are “notoriously prone to bias,” due to “confounding factors … such as education, lifestyle, income, etc.”

When a medical investigator in Seattle started to question the 50-percent estimate:

People told me, “No good can come of [asking] this.” …. “Potentially a lot of bad could happen” for me professionally by raising any criticism that might dissuade people from getting vaccinated, because of course, “We know that vaccine works.”

In 2004 she and her colleagues began an investigation of whether “on average, people who get vaccinated are simply healthier than those who don’t, and thus less liable to die over the short term” (the “healthy user” effect). Based on 8 years of medical data on more than 72,000 people age 65-plus, they found:

[O]utside of flu season [author’s emphasis], the baseline risk of death among people who did not get vaccinated was approximately 60 percent higher than among those who did.

This suggested to the researchers that “the vaccine itself might not reduce mortality at all.”
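The arithmetic behind this argument can be laid out in a few lines. The sketch below uses only the two figures quoted above (flu accounts for at most 10 percent of winter deaths among the elderly; the unvaccinated have a baseline risk about 60 percent higher outside flu season). The variable names and the "perfect vaccine" scenario are assumptions introduced for illustration.

```python
# Figures quoted in the article.
flu_share_of_winter_deaths = 0.10   # "at most 10 percent"
baseline_excess_unvaccinated = 0.60  # "approximately 60 percent higher"

# Even a hypothetical vaccine that prevented every flu-related death
# could cut total winter deaths by no more than flu's share of them.
max_true_reduction = flu_share_of_winter_deaths * 1.0

# With no vaccine effect at all, the vaccinated group still shows a
# lower death rate, simply because its baseline risk is lower.
risk_ratio_no_effect = 1 / (1 + baseline_excess_unvaccinated)
apparent_reduction_no_effect = 1 - risk_ratio_no_effect

print(f"largest possible true reduction: {max_true_reduction:.1%}")
print(f"apparent reduction with zero vaccine effect: {apparent_reduction_no_effect:.1%}")
```

On these numbers the healthy-user difference alone would make the vaccinated group look 37.5 percent less likely to die, far more than the 10 percent ceiling on what the vaccine itself could deliver.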

What was the reaction in the scientific community?

The results were also so unexpected that many experts simply refused to believe them. [Her] papers were turned down for publication in the top-ranked medical journals. One flu expert who reviewed her studies for the Journal of the American Medical Association wrote, “To accept these results would be to say that the earth is flat!” When the papers were finally published in 2006, in the less prominent International Journal of Epidemiology, they were largely ignored by doctors and public-health officials. “The answer I got,” says [the researcher], “was not the right answer.”

A London-trained epidemiologist is so outspoken on this subject that he has become “something of a pariah” in his scientific community. He has reviewed all of the known studies on the effectiveness of flu vaccines, found them wanting, and recommends placebo-controlled studies. However, there are ethical issues associated with withholding potential relief from sick people or exposing at-risk people to the potentially harmful effects of a vaccine.

Submitted by Margaret Cibes

## Prediction model using game theory

“Forecast: Self-Serving”
by Nicholas Thompson, The New York Times, November 5, 2009

This is a book review of The Predictioneer’s Game: Using the Logic of Brazen Self-Interest to See and Shape the Future [2], by Bueno de Mesquita, an NYU politics professor, Hoover Institution fellow, and consultant.

According to the reviewer, Bueno de Mesquita uses game theory to “model human behavior, divine the future and improve incentive systems … based on the premise that people are selfish.”

Bueno de Mesquita believes that Mother Teresa’s incentive for good works was a desire for a heavenly reward, no different from the incentive of global terrorists, and that Belgian King Leopold II’s incentive to behave more kindly at home in Belgium than in the Congo was a desire to keep his home environment more peaceful.

Bueno de Mesquita hopes to “engineer better behavior” by use of his “Policon” analysis system.

His simulations rely on four factors: who has a stake; what each of these people wants; how much they care; and how much influence they have on others. He surveys experts on the topic, assigns numerical values to the four factors, plugs the data into a computer and waits for his software to spit out the future. ….
In a legal dispute involving a corporate client … and the United States attorney’s office, he gave all the possible outcomes a score on a scale from zero (one misdemeanor count) to 100 (multiple felony charges …). He then identified the crucial players in the game … and numerically scored their desired outcomes, their influence and their adamancy. His client entered the talks prepared to end at a position of around 60 …. But the modeling showed that negotiations … would end with … [a] final agreement … closer to 80 …. After running a long series of simulations, [he] came up with a new strategy. …. His model said that this strategy would lead to the case’s resolving at a point closer to 40 on his scale — which is indeed, he claims, how matters turned out.
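The review does not say how the software combines these scores, so the following is only a toy sketch and not the author's model. It assumes one simple rule, a weighted average of each player's preferred outcome with weight equal to influence times how much the player cares. The players, their scores, and the names `Player` and `weighted_forecast` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Player:
    name: str
    position: float   # preferred outcome on the 0-100 scale
    salience: float   # how much the player cares, from 0 to 1
    influence: float  # how much sway the player has, from 0 to 1


def weighted_forecast(players):
    """Average of positions, weighted by influence times salience.

    This rule is an assumption made for illustration only.
    """
    weights = [p.influence * p.salience for p in players]
    total = sum(weights)
    return sum(w * p.position for w, p in zip(weights, players)) / total


# Hypothetical players and scores, invented for this example.
players = [
    Player("player_a", position=60, salience=0.9, influence=0.5),
    Player("player_b", position=100, salience=0.8, influence=1.0),
    Player("player_c", position=40, salience=0.5, influence=0.4),
]

forecast = weighted_forecast(players)
print(f"forecast outcome on the 0-100 scale: {forecast:.1f}")
```

Changing a player's salience or influence and rerunning the function is a crude analogue of the repeated simulations described above, in which different strategies are tried to see where the outcome would settle.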

Using his “Policon” analyses, Bueno de Mesquita claims a “90 percent accuracy rate” in his CIA-declassified predictions, a vague claim according to the reviewer.

Bueno de Mesquita ends the book with some bold predictions, and the reviewer concludes:

[I]t’s hard not to feel the same sort of skepticism about the author that he feels toward Mother Teresa.

For a related book on applications of game theory, readers might enjoy NYU politics professor Steven Brams’ Biblical Games: Game Theory and the Hebrew Bible [3] (2002 edition, update from 1980 edition). The table of contents and sample pages are also available online [4].

Submitted by Margaret Cibes

## Some recent studies of potential interest

“Pacifiers Tied to Speech Disorders”
by Jeremy Singer-Vine, The Wall Street Journal, November 3, 2009

The author summarizes the results of five recent studies, including some “caveats” to consider before relying on the results for future decision-making.

(a) An observational study of 128 Chilean children [5]:

Result: Preschoolers with speech disorders were three times as likely as other children to have used a pacifier for at least three years … and thrice as likely to have started bottle-feeding before nine months of age.
Caveat: The infants' sucking behaviors were based on parental recollections rather than direct observation. A larger, randomized trial is needed to validate the findings ….

(b) A controlled study of mice [6]:

Result: Nicotine patches appear to promote the spread and re-growth of cancer tumors ….
Caveat: Mouse and human cancers can differ significantly.

(c) A controlled study of 391 women [7]:

Result: Women who lie down for 15 minutes after receiving artificial insemination appear to have a 50% higher chance of becoming pregnant ….
Caveat: The overall rate of pregnancy in this study was significantly lower than at many fertility centers ….

(d) A study of nearly 32,000 Swedish twins [8]:

Result: Genetic factors appear to explain much of the connection between heart disease and hip fractures ….
Caveat: Though the study enrolled many subjects, there were fewer than 400 cases in which an identical twin fractured a hip after his or her sibling was diagnosed with heart disease.

(e) A controlled study of 49 patients [9]:

Result: A three-day course of antibiotics was no less effective than the standard seven-day course for helping children recover from tonsillectomy …. Patients on the three-day course returned to a normal diet after an average of 5.7 days, while patients on the seven-day course took 6.0 days on average—a statistically insignificant difference.
Caveat: Enrolling more patients could have revealed significant differences that this small study missed. Pain in this study was not measured directly, but rather by the use of pain relievers.

### Discussion

1. Suppose that the study of Chilean children had been based upon a larger, randomized trial, including direct observations instead of parental recollections, and suppose the result had been a strong association between the length of use of pacifiers/bottles and the presence of speech disorders. How would you respond to a claim that increased use of pacifiers/bottles causes speech disorders in young children? Can you think of a possible alternate explanation for the association?
2. How many twin cases would you need to examine in order to be more confident of a genetic connection between heart disease and hip fractures?
3. Given the statistically non-significant results of the controlled study of 49 patients with tonsillectomies, would you feel more confident about the results if you found statistically significant differences based upon a controlled study of 4900 patients?

Submitted by Margaret Cibes