# Chance News 52

## Contents

- 1 Quotations
- 2 Forsooths
- 3 Wednesdays, hot weather and increased suicide rates
- 4 Two new anti-aging studies
- 5 The Flaw of Averages
- 6 Sabermetrics
- 7 "Strong Evidence" standards
- 8 A return to coin tossing
- 9 Supersizing
- 10 A solution to the probability puzzle
- 11 A bet with no losers?
- 12 More scientific fraud
- 13 Two new scientific journals?
- 14 Predicting Hall of Famers
- 15 “Dappers” and “schnooks”
- 16 Cellphone driving risks

## Quotations

Correlation coefficients are now about as ubiquitous and unsurprising as cockroaches in New York City.

Stephen Jay Gould, *The Mismeasure of Man*, second edition, 1996, page 286.

## Forsooths

Growing up in China [Yale Assistant Professor of Genetics] Jun Lu might have pursued a career in math if his father, a mathematician, hadn't advised against it.

His reasoning: Math can be done with little more than a pen, paper and your mind. For 2,000 years, thinkers have had those tools to contemplate the questions of mathematics. "Any questions left behind by them are probably very hard to address," Lu said.

But biology, his father said, capitalized on advances in technology. His son could have the chance to explore a new frontier.

*The Hartford Courant*, July 12, 2009

A Princeton professor of economics/public affairs, who is also vice chairman of the Promontory Interfinancial Network and former vice chairman of the Federal Reserve Board, argues that the economy has bottomed out.

[T]here is a reasonable chance—not a certainty, mind you, but a reasonable chance—that the second half of 2009 will surprise us on the upside. …. [T]his seemingly high growth scenario … [would follow] directly from the arithmetic of hitting bottom. …. Eventually those huge negative numbers [associated with the decline of some GDP components] must turn into (at least) zeroes … [and] … almost certainly do better. …. None of these events are probabilities; they are all certainties. The only issue is timing, about which we can only guess.

A mathematician, who is a former “algorithm manager” at a technology firm, has left his firm to “devote himself to his favorite shower-time epiphany – that what the world needs … is a museum devoted to math.” He came up with the following formula, where E = expertise, CP = computational power, C = capital, R = risk, A = altruism, O = obsession, and I = indifference:

Math Museum = {Delta-O + [E(CP + C) + A] – R} / I

*The New Yorker*, August 3, 2009

## Wednesdays, hot weather and increased suicide rates

“National Study Finds Highest Rate Of Suicide On Wednesdays”

by Arielle Levin Becker, *The Hartford Courant*, July 11, 2009

A University of California at Riverside study, published in *Social Psychiatry & Psychiatric Epidemiology*, found, surprisingly, that suicides are more likely to occur on Wednesdays than on any other day of the week. The study’s results were based on the 131,636 suicides in U.S. death records for the period 2000-2004.

One chart [3] gives the percentages of all U.S. adult suicides by day of the week: Sunday 11.8%; Monday 14.3%; Tuesday 12.7%; Wednesday 24.6%; Thursday 11.1%; Friday 11.2%; Saturday 14.4%.

A social worker suggested that workplace stress might mount up through the week, with weekend relief seeming too far away by Wednesday. A reader [4] blogged that a person in crisis might be upset to find that his/her therapist has Wednesday off, while he/she has to work. One of the researchers advised mental health workers to schedule more patient appointments on Wednesdays.

The researchers also found that summer and spring were more common seasons for suicide than fall or winter, another counterintuitive finding. A second chart [5] gives the percentages of all U.S. adult suicides by season: Autumn 23.8%; Winter 24.4%; Spring 25.8%; Summer 26%.

A third chart [6] shows the trend in the number of U.S. adult suicides (eyeballed approximations): 24,200 in Year 2000; 25,100 in Year 2001; 29,000 in Year 2002; 26,100 in Year 2003; 26,900 in Year 2004.

A *Hartford Courant* analysis of the 966 reported adult suicides in Connecticut for the period 2001-2004 showed less variation in rates among the days of the week. “Most suicides – 16.7 percent — occurred on Tuesday, while 16.4 percent occurred on Monday and 14.5 percent on Wednesday. Thursday had the lowest occurrence, 12.1 percent.”

However, Connecticut records agreed with the national study with respect to summer and spring being the more common seasons for suicides. A Connecticut mental health worker hypothesized that:

“People think it’s normal to be depressed in the winter. Spring is the time of year when people are supposed to be rejuvenated and outside and enjoying themselves, and if you’re not, it makes you feel comparatively worse than everybody else, which may make you feel more hopeless,” he said.

**Discussion**

1. With respect to the national study, do you think that the differences among days of the week and/or among seasons of the year were statistically significant?

2. Were you surprised that the Connecticut analysis showed less variation than the national study, in rates among the days of the week, from a statistical point of view?
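As a starting point for both questions, here is a minimal sketch of a chi-square goodness-of-fit calculation and a standard-error comparison. It is not the study's own analysis. It assumes the chart percentages apply to all 131,636 records, that every day (and every season) would be equally likely under the null hypothesis, and it uses the standard 5% critical values for 6 and 3 degrees of freedom. The function names are illustrative only.

```python
import math

# Percentages as quoted from the article's charts.
DAY_PCT = {"Sun": 11.8, "Mon": 14.3, "Tue": 12.7, "Wed": 24.6,
           "Thu": 11.1, "Fri": 11.2, "Sat": 14.4}
SEASON_PCT = {"Autumn": 23.8, "Winter": 24.4, "Spring": 25.8, "Summer": 26.0}

N_NATIONAL = 131_636    # assumption: the charts cover every record in the study
N_CONNECTICUT = 966

# Standard 5% critical values of the chi-square distribution, keyed by df.
CRITICAL_5PCT = {3: 7.815, 6: 12.592}


def chi_square_uniform(percentages, n):
    """Goodness-of-fit statistic against equal expected counts per category."""
    values = list(percentages)
    total = sum(values)                 # rescale so the shares sum to 1
    expected = n / len(values)          # equal counts under the null hypothesis
    return sum((v / total * n - expected) ** 2 / expected for v in values)


def proportion_se(p, n):
    """Standard error of a sample proportion p based on n observations."""
    return math.sqrt(p * (1 - p) / n)


chi_days = chi_square_uniform(DAY_PCT.values(), N_NATIONAL)
chi_seasons = chi_square_uniform(SEASON_PCT.values(), N_NATIONAL)
print(f"days:    chi-square = {chi_days:.0f}, 5% critical value = {CRITICAL_5PCT[6]}")
print(f"seasons: chi-square = {chi_seasons:.0f}, 5% critical value = {CRITICAL_5PCT[3]}")

# Sampling noise in a single day's share if every day had probability 1/7.
for label, n in [("national", N_NATIONAL), ("Connecticut", N_CONNECTICUT)]:
    print(f"{label}: standard error = {100 * proportion_se(1 / 7, n):.2f} percentage points")
```

With samples this large, even small differences in the national percentages exceed the critical values. A single day's share in a sample of 966, by contrast, carries a standard error of roughly one percentage point, which is worth keeping in mind when comparing the Connecticut figures with the national ones.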

## Two new anti-aging studies

“Two Mammals' Longevity Boosted”

by Keith J. Winstein, *The Wall Street Journal*, July 9, 2009

In the journal *Nature*, anti-aging researchers from Maine, Michigan, and Texas reported on a study that found that a chemical (Wyeth’s rapamycin) used to treat organ transplant patients increased the life span of mice. Because the chemical suppresses the immune system, humans are advised against taking the drug to prolong their lives.

Mice given rapamycin -- starting when they were 600 days old, or roughly the equivalent of 60 human years -- lived longer on average than mice who didn't get the drug. Their "maximal life span" -- meaning the age at which 10% of the mice were still alive -- increased to 1,245 days for females, compared with 1,094 days for those not fed the drug, or a 14% increase. For males, the maximal life span was 1,179 days, a 9% increase over the 1,078 days for those not fed the drug.
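As a quick check of the arithmetic, the quoted percentage gains follow from the reported maximal life spans:

$$
\frac{1245 - 1094}{1094} \approx 0.138 \ \text{(females)}, \qquad \frac{1179 - 1078}{1078} \approx 0.094 \ \text{(males)},
$$

which round to the reported 14% and 9%.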

In an upcoming issue of *Science* magazine, University of Wisconsin scientists will publish results of a study that shows that reducing the calorie intake of monkeys extends their lives. A person who has seen the study's results said that "after 20 years, only 20% of the calorie-restricted monkeys had died, compared with half of the monkeys on a normal diet."

The Wisconsin study, which began in 1989 with 30 monkeys and added 46 more in 1994, is an effort to test calorie restriction in an animal genetically closer to humans. Researchers have known since the 1930s that eating 30% fewer calories than normal lengthens the life span of mice. Half the monkeys were given a normal diet, and half had their food intake cut back by 30% at roughly age 10.

A British gerontologist commented, "Aging is, unequivocally, the major cause of death in the industrialized world and a perfectly legitimate target of medical intervention."

A blogger [7] provided the following hypothetical dialogue.

Joe: Do you want to live to 100?

Pete: Don't ask me; ask the guy who's 99.

## The Flaw of Averages

Sam L. Savage

John Wiley & Sons, published June 2009.

Sam is a Consulting Professor of Management Science at Stanford University, and a Fellow of the Judge Business School at the University of Cambridge. He comes from a family of famous statisticians: his father, Leonard Jimmie Savage, and his uncle, I. Richard Savage.

Jimmie Savage is best known for his book *The Foundations of Statistics*, but we all gained from his work. During the Second World War he was a member of the Statistical Research Group based at Columbia University. In addition to solving military problems, the group developed Wald's sequential probability ratio test, which led to optimal stopping, which in turn led to my thesis. His discovery of the work of Louis Bachelier on stochastic models for asset prices led to the random walk becoming fundamental to mathematical finance. I. Richard Savage was also a well-known statistician, and in his memorial we read:

Savage is one of a few mathematical statisticians of his generation who chose to pursue the application of statistical principles and concepts to problems of public policy.

Now Sam is one of the few mathematical statisticians of his generation who chose to pursue the application of statistics and probability principles and concepts to problems of public policy.

Now on to the book. Sam starts with an explanation of the Flaw of Averages. A humorous example of this involves the statistician who drowned crossing a river that was on average 3 feet deep. It is pretty clear from this what Flaw of Averages means, but we are also invited to go to flawofaverages.com. Here we read that "Plans based on average assumptions are wrong on average!"

The book assumes no statistical background, but for those with statistical training the author claims he can repair the damage within the first few chapters. This type of humor appears throughout the book. We find here other examples of the Flaw of Averages including this one:

Suppose you are on time to your appointments on average, and so is your partner. When you go someplace together, the Flaw of Averages ensures that you will be late on average. In fact the FOA pretty much explains why everything is behind schedule, beyond budget, and below projection.
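One way to see why the couple is late on average is sketched here, under assumptions that are not in the quoted passage. Let $X$ and $Y$ be the amounts by which you and your partner are late (negative when early), each with mean zero, and suppose you leave together only when the later of the two is ready. Then

$$
\max(X, Y) = \frac{X + Y}{2} + \frac{|X - Y|}{2}, \qquad E[\max(X, Y)] = \frac{E|X - Y|}{2} > 0
$$

whenever $X$ and $Y$ are not always equal. For instance, if the two delays are independent and normal with standard deviation $\sigma$, the expected joint delay is $\sigma/\sqrt{\pi}$, or about $0.56\,\sigma$.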

Sam gives many more serious applications of the FOA and illustrates the use of computer tools such as spreadsheets and simulations. An example of this is your retirement portfolio. Assume that your retirement fund is $200,000 and your life expectancy is 20 years, and that you invest your money in a mutual fund with a good record. Its annual return has fluctuated from year to year, with an average annual return of 8%. Your adviser calculates that you can withdraw $21,000 a year and will exhaust your funds in exactly 20 years. This is illustrated below by the first figure.

Of course, there is no reason to believe that each year's return will be exactly 8%. So you look at what your resources will be with some random variation in the return.

The lower figure shows a dozen potential wealth paths simulated at random by a computer. Here we see that about 50 percent of these (those with dark lines) leave you without any money before you die.
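The kind of simulation behind such wealth paths can be sketched as follows. This is not the book's spreadsheet model. The 15% year-to-year standard deviation, the normal distribution of returns, the end-of-year withdrawal convention and the function names are all assumptions made here for illustration. Under this convention the flat 8% path runs dry in year 19, close to the adviser's 20-year figure, and the simulation estimates how often a random path runs dry even sooner.

```python
import random

START = 200_000       # retirement fund, from the text
WITHDRAWAL = 21_000   # annual withdrawal, from the text
MEAN_RETURN = 0.08    # average annual return, from the text
RETURN_SD = 0.15      # assumption: volatility is not given in the text
MAX_YEARS = 60        # assumption: cap on the number of simulated years


def exhaustion_year(returns, start=START, withdrawal=WITHDRAWAL):
    """Year in which the balance first reaches zero, or None if it never does.

    Assumed convention: the fund grows for a year, then the withdrawal is taken.
    """
    balance = start
    for year, annual_return in enumerate(returns, start=1):
        balance = balance * (1 + annual_return) - withdrawal
        if balance <= 0:
            return year
    return None


def share_running_out_early(n_paths, sd=RETURN_SD, seed=1):
    """Fraction of random paths exhausted sooner than the flat average-return path."""
    rng = random.Random(seed)
    flat_year = exhaustion_year([MEAN_RETURN] * MAX_YEARS)
    early = 0
    for _ in range(n_paths):
        returns = [rng.gauss(MEAN_RETURN, sd) for _ in range(MAX_YEARS)]
        year = exhaustion_year(returns)
        if year is not None and year < flat_year:
            early += 1
    return early / n_paths


print("flat 8% path runs dry in year", exhaustion_year([MEAN_RETURN] * MAX_YEARS))
print("share of random paths running dry sooner:", share_running_out_early(10_000))
```

Changing `RETURN_SD` shows how strongly the answer depends on the assumed volatility. With `sd=0.0` every path coincides with the adviser's plan and none runs out early.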

I confess that when I planned my retirement I thought only of the first figure and of nothing like the second figure! Even worse, we did not emphasize the flaw of averages in our book.

Laurie Snell

To be continued

## Sabermetrics

“Baseball Veers Into Left Field”, by Austin Kelley, July 17, 2009

This article describes some results of researchers in the field of “sabermetrics,” apparently the “science of baseball analysis,” in which “no topic is too small or hypothesis too unlikely” to be investigated due to the huge database available. Phil Birnbaum edits the statistical-analysis newsletter for the Society for American Baseball Research (SABR).

From Wayne State University researchers, we learn that:

[M]ajor-league players who have nicknames live 2 ½ years longer, on average, than those without them.

[B]aseball players live a little longer than average folks.

[P]layers who debut at a young age have a shorter life expectancy than their slower-developing teammates.

[P]layers with "positive" initials have a longer lifespan than those with initials like "P.I.G."

[S]outhpaws are a little shorter than right-handers.

[M]ajor-leaguers are more likely to die on their birthdays than chance would predict.

From a pair of researchers, at University of California-Berkeley and Yale University, we have:

[P]layers whose first or last name begins with "K" strike out more than those without "K" initials.

From another pair of researchers, at Pennsylvania State University and Washington University, we are told:

Democrats support the designated-hitter rule more than Republicans.

The website directs readers to some articles:

"The Etiology of Public Support for the Designated Hitter Rule” [8]

"Moniker Madness: When Names Sabotage Success” [9]

"The ‘K’ study for real” [10]

"Underestimating the Fog” [11]

Baseball fans see also "No Casual Fans at World Series of Baseball Trivia" for late-breaking news about results of the baseball trivia contest held at SABR's annual convention.

## "Strong Evidence" standards

“The Final Report of the National Mathematics Advisory Panel”

U.S. Department of Education, 2008

This report is the result of two years of work by a panel of mathematicians, educators, and foundation representatives, formed in 2006 “with the responsibilities of relying upon the ‘best available scientific evidence’ and recommending ways ‘… to foster greater knowledge of and improved performance in mathematics among American students.’” Their main focus in seeking national standards was on the “delivery system in mathematics education,” especially the high school algebra sequence and pre-secondary math courses leading up to it.

In compliance with the President’s Executive Order forming the group, the Panel’s assertions and recommendations were based on the “highest-quality evidence available from scientific studies.” The Panel reviewed “more than 16,000 research studies and related documents. Yet, only a small percentage of the 16,000 research studies reviewed met the standards of evidence and could support conclusions.”

The Panel placed strongest confidence in studies that “test hypotheses, that meet the highest methodological standards (internal validity), and that have been replicated with diverse samples of students under conditions that warrant generalization (external validity)….” The Subcommittee on Standards of Evidence developed relatively detailed standards for rating the quality of potential evidence as strong, moderately strong, suggestive, inconsistent, or weak. This section of the report begins on page 81 of the document.

Here is an excerpt relative to “Strong Evidence”:

All of the applicable high quality studies support a conclusion (statistically significant individual effects, significant positive mean effect size, or equivalent consistent positive findings) and they include at least three independent studies with different relevant samples and settings or one large high quality multisite study. Any applicable studies of less than high quality show either a preponderance of evidence consistent with the high quality studies (e.g., mean positive effect size) or such methodological weaknesses that they do not provide credible contrary evidence. Factors such as error variance and measurement sensitivity clearly influence the number of studies needed to support a conclusion ….

See the United States Coalition for World Class Math [12] for more information about efforts to mitigate the decline in math performance among U.S. students, as well as its ranking of individual states with respect to math performance.

## A return to coin tossing

In Chance News 50 we discussed a talk given by statistician Peter Donnelly. In this talk Peter asked us to consider two patterns of heads and tails, HTT and HTH, that would occur if we tossed a coin a sequence of times. He asked us if we thought that, on average, HTT would take longer to appear than HTH, or if HTH would take longer, or if they would take about the same number of tosses. He said that most people would think they would on average occur at about the same time. He said that this is wrong and in fact HTT would, on average, take less time to occur than HTH. This seems strange because it is obvious that they both have an equal chance of being the first to occur. However, it is strange but true. It turns out the expected number of tosses until the pattern HTT occurs is 8 and the expected time until HTH occurs is 10.

### Discussion

Suppose that Mary and John play a game in which Mary chooses HTH and John chooses HTT, and the person whose pattern comes up first wins. Then this is a fair game, even though the expected time until John's pattern first appears is less than the expected time until Mary's pattern first appears. How is this possible?

### Additional reading

Martin Gardner, Mathematical games, Sci. Amer. 10, 120-125. Here you will find an elegant combinatorial solution to this coin tossing problem, due to John Conway. This article is also included in Gardner's book "Time Travel and Other Mathematical Bewilderments" and in some of his other books.

Introduction to Probability, Grinstead and Snell, pp 428,430, 432.
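The waiting times of 8 and 10 tosses, and the fairness of the head-to-head game, can be checked with a short sketch of Conway's overlap rule plus a simulation. The function names are illustrative, and the patterns are assumed to have equal length and to be played with a fair coin.

```python
import random


def correlation(a, b):
    """Conway's overlap count: add 2**k whenever the last k tosses of a equal the first k of b."""
    return sum(2 ** k for k in range(1, min(len(a), len(b)) + 1)
               if a[-k:] == b[:k])


def expected_wait(pattern):
    """Expected number of fair-coin tosses until the pattern first appears."""
    return correlation(pattern, pattern)


def odds_b_beats_a(a, b):
    """Conway's rule: the odds that b appears before a are (aa - ab) : (bb - ba)."""
    return (correlation(a, a) - correlation(a, b),
            correlation(b, b) - correlation(b, a))


def simulate_race(a, b, trials, seed=0):
    """Fraction of simulated games in which pattern a appears before pattern b."""
    rng = random.Random(seed)
    width = max(len(a), len(b))
    wins_a = 0
    for _ in range(trials):
        recent = ""
        while True:
            recent = (recent + rng.choice("HT"))[-width:]   # keep only the latest tosses
            if recent.endswith(a):
                wins_a += 1
                break
            if recent.endswith(b):
                break
    return wins_a / trials


print("expected wait for HTT:", expected_wait("HTT"))
print("expected wait for HTH:", expected_wait("HTH"))
print("odds HTT before HTH:", odds_b_beats_a("HTH", "HTT"))
print("simulated share of games won by HTH:", simulate_race("HTH", "HTT", 100_000))
```

The overlap count hints at the answer to the discussion question. HTH can overlap with itself (its final H can begin a new HTH), so its occurrences tend to cluster and its average waiting time is longer, while in a race both patterns reach HT together and the next toss decides the winner.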

## Supersizing

“XXXL”, by Elizabeth Kolbert, *The New Yorker*, July 20, 2009

This article starts out describing some results of the CDC’s National Health and Nutrition Examination Surveys, which have been carried out since the 1950s. According to the CDC’s surveys, the percentage of overweight American adults (“body-mass index greater than twenty-seven”) took a giant leap in the 1980s. CDC researchers published their results in *JAMA* in 1994:

- First survey, early 1960s: 24.3%
- Second survey, early 1970s: 25%
- Third survey, late 1970s: 25.4%
- Fourth survey, 1980s: 33.3%

Men are now on average seventeen pounds heavier than they were in the late seventies, and for women that figure is … nineteen pounds. The proportion of overweight children, age six to eleven, has more than doubled, while the proportion of overweight adolescents, age twelve to nineteen, has more than tripled. ….

“If this was about tuberculosis, it would be called an epidemic,” another researcher wrote in an editorial accompanying the report.

Reviewing five books about the recent surge in weight of many Americans (*The End of Overeating*, *Fat Land*, *Mindless Eating*, *The Fat Studies Reader*, *Globesity*), the article's author describes some hypotheses about our eating habits, as well as the results of some psychological experiments performed to try to identify possible psychological causes.

One story has Ray Kroc, of McDonald’s, questioning why they could sell more French fries if they “supersized” them.

Kroc pointed out that if people wanted more fries they could always order a second bag.

“But Ray,” [a McDonald’s Board member] is reputed to have said, “they don’t want to eat two bags – they don’t want to look like a glutton.” ….

The result is that as French-fry bags get bigger, so, too, do French-fry eaters.

## A solution to the probability puzzle

In Chance News 50 we gave the following probability puzzle:

Find three random variables X, Y, Z, each uniformly distributed on [0, 1], such that their sum is constant. Since each random variable has expectation 1/2, the sum must in fact be 3/2.

The following solution was provided by Roger Pinkham:

If {z} denotes the fractional part of z, then supposing U is uniform on [0, 1] put X = {2U}, Y = 1 - {U + 1/2}, and Z = 1 - U. It is straightforward to check that X, Y, and Z have the required uniform distributions, and that X + Y + Z = 3/2.
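The check can also be run numerically. The sketch below is not part of Pinkham's solution, and its function names are illustrative. It evaluates the three variables exactly on a grid of rational values of U, confirms that the sum is always 3/2, and measures how far each variable's sorted values stray from an evenly spaced (uniform) grid.

```python
import math
from fractions import Fraction


def frac(z):
    """Fractional part {z} of z."""
    return z - math.floor(z)


def pinkham_triple(u):
    """The three variables X = {2U}, Y = 1 - {U + 1/2}, Z = 1 - U for a given U."""
    x = frac(2 * u)
    y = 1 - frac(u + Fraction(1, 2))
    z = 1 - u
    return x, y, z


def max_cdf_gap(values):
    """Largest distance between the sorted values and the midpoints (i + 1/2)/n."""
    n = len(values)
    return max(abs(v - Fraction(2 * i + 1, 2 * n))
               for i, v in enumerate(sorted(values)))


N = 1000
grid = [Fraction(k, N) for k in range(N)]        # exact rational stand-ins for U
triples = [pinkham_triple(u) for u in grid]

print("sum always 3/2:", all(x + y + z == Fraction(3, 2) for x, y, z in triples))
for name, column in zip("XYZ", zip(*triples)):
    print(name, "largest gap from a uniform grid:", float(max_cdf_gap(column)))
```

Using `Fraction` keeps the arithmetic exact, so the sum can be compared with 3/2 without floating-point tolerance.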

Congratulations Roger!

Laurie Snell

## A bet with no losers?

“Using the Lottery Effect to Make People Save”

by Jason Zweig, *The Wall Street Journal*, July 18-19, 2009

The author discusses a Michigan pilot program to increase people’s savings rates.

[P]sychologists have long known that people tend to overestimate the odds of rare events. Applying that behavioral insight, [a] finance professor … has devised a clever program called "Save to Win." Launched earlier this year for members of eight credit unions in Michigan, it is a cross between a certificate of deposit and a raffle ticket. Members who put $25 or more into a Save to Win one-year CD are entered into a monthly "savings raffle" for prizes up to $400, plus one annual drawing for a $100,000 jackpot. Only Michigan residents are eligible to participate.

This CD pays between 1% and 1.5% annual interest and has attracted about $3.1 million in new deposits. One new customer described her experience:

"The teller said somebody else she told about it won, … so I said, 'Well, you must be good luck then.' I thought it was a good idea, because earning interest means you win anyway. So I put down the minimum, $25." This past week, [the customer] won $400. She plowed the $400 back into her Save to Win account, getting a second shot at winning the $100,000 grand prize.

A credit union president stated, “You are sort of betting, but there's no losing."

Bloggers [13] commented:

Apply that same logic to the $250 SS stimulus check. Instead of sending $250 to everyone (which is mostly used to pay bills), give $2500 to every tenth person. Much more spending and stimulus will result.

One thing at least is certain - the credit union wins by offering a lower rate.

## More scientific fraud

“Defining Data Down”

Book review of Eugenie Samuel Reich’s *Plastic Fantastic*

by John Derbyshire, *The Wall Street Journal*, July 24, 2009

The book describes a late 1990s-early 2000s science fraud by a postdoctoral German physicist at Bell Labs, who was “seeking to persuade organic materials, like plastics,” instead of inorganic substances, like silicon, “to exhibit behaviors useful in electronics.”

The prize-winning physicist’s results were widely published, in academic physics journals as well as in more popular science magazines. While his work had its skeptics, it wasn’t until 2002 that fraud was suspected, after which a Bell Labs committee confirmed that lab results had been faked and data fabricated. The man lost his job and his Ph.D. and returned to Germany.

A key moment in the denouement came in April 2002 when a Bell Labs researcher noticed that two of [the physicist’s] papers from two years before, one in *Science* and one in *Nature*, had reported outputs that were identical, even down to the electrical “noise”, from two quite different devices. Two years, in two journals claiming a combined readership of nearly two million, and nobody noticed?

The physicist was skillful in maintaining support from his managers, possibly because he was “an amiable employee … and a cheap one, delivering striking results from a minimum of resources.” An inadequate peer-review process on the part of editors or fellow scientists appears also to have been at fault.

No one seems to have found a motive for the dishonesty. According to the article’s author, “Perhaps there is no better explanation, at last, than that he did it because he could.”

A blogger [14] commented:

I have witnessed the bias in how proposals are often reviewed by "peers" - who at times seem more concerned with where the investigator was from as opposed to the quality of the ideas.

## Two new scientific journals?

“On Navel Lint and Other Scientific Triumphs”

by Melinda Beck, *The Wall Street Journal*, July 21, 2009

Although there exist 5200 medical journals, author Beck recommends adding two more:

I’d call them *Duh!*, for findings that never seemed to be in doubt in the first place, and *Huh?*, for those whose usefulness remains obscure, at least to lay readers.

As background, she describes a number of findings from journals or conferences, one of which gave rise to the article’s title: “The more abdominal hair, the greater the tendency to collect belly-button lint.” She also provides details about other studies whose usefulness she questions.

Beck suggests that material for the journal *Duh!* might include a University of Cambridge obstetrician’s 2003 paper [15], in which the doctor “could find no randomized controlled trials testing whether parachutes prevent death and injuries in response to ‘gravitational challenge’ — *i.e.*, jumping out of aircraft.”

She suggests a deliberately humorous article for the journal *Huh?*. Apparently a Vienna University of Technology chemist was surprised when the journal *Medical Hypotheses* accepted his study “The Nature of Navel Fluff” [16].

Inspired by a question posed in the 2005 book, *Why Do Men Have Nipples?* [the chemist] theorized that belly-button lint is largely the result of abdominal hair channeling loose shirt fibers. To test his hypothesis, he collected 503 pieces of his own belly-button lint over three years, wearing different shirts. Then he shaved his abdominal hair and found that no more lint collected.

“Many people nominated me for an ig-Nobel prize,” [the chemist] wrote.

## Predicting Hall of Famers

“A Computer Cracks the Cooperstown Code”

by Tim Marchman, *The Wall Street Journal*, July 27, 2009

A computer scientist and an MIS researcher claim that election of a Baseball Hall of Famer “is an entirely predictable outcome based on a few statistics.” Their conclusion is based on a study of 1,592 players for the period 1950-2002.

[H]its, home runs and on-base plus slugging percentages are what count for hitters, while wins, saves, earned run average and winning percentage are what count for pitchers. All-Star Game appearances count for both, being especially valuable for hitters as they serve as a useful proxy for position.

A blogger comments [17]:

It's actually pretty easy to predict who will get into the Hall of Fame with just about any statistical method: 90% of the candidates are an easy decision. The single biggest factor is longevity and/or high totals. Even something unexciting and even uninformative - like at-bats for hitters - will be highly accurate as a predictor because you can't get a lot of them without getting a lot of everything else.

The blogger also cites David R. Tufte and John Topoleski’s article, “Are There Performance Thresholds Which Help Determine Election to the Baseball Hall of Fame,” *Proceedings of the American Statistical Association*, Section on Statistics in Sports, 2000, pp. 81-86.

## “Dappers” and “schnooks”

“Cocksure”, by Malcolm Gladwell, *The New Yorker*, July 27, 2009

The author describes examples of a phenomenon that psychologists call the “illusion of control.” This phenomenon obtains when “confidence spills over from areas where it may be warranted (‘I’m savvier than that schnook’) to areas where it isn’t warranted at all (‘and that means I’m going to draw higher cards’).”

[A psychologist] had subjects engage in a betting game against either a self-assured, well-dressed opponent or a shy and badly dressed opponent (in Langer’s delightful phrasing, the “dapper” or the “schnook” condition), and she found that her subjects bet far more aggressively when they played against the schnook. They looked at their awkward opponent and thought, I’m better than he is. Yet the game was pure chance: all the players did was draw cards at random from a deck, and see who had the high hand.

[Another psychologist] created a computer program that mimicked the ups and downs of an index like the Dow, and recruited, as subjects, members of a highly paid profession [investment bankers]. As the line moved across the screen, [the researcher] asked his subjects to press a series of buttons, which, they were told, might or might not affect the course of the line. At the end of the session, they were asked to rate their effectiveness in moving the line upward. The buttons had no effect at all on the line. But many of the players were convinced that their manipulation of the buttons made the index go up and up.

## Cellphone driving risks

Drivers and legislators dismiss cellphone risks. New York Times, 18 July 2009

U.S. withheld data on risks of distracted driving. New York Times, 21 July 2009

In study, texting lifts crash risk by large margin. New York Times, 27 July 2009

Matt Richtel

In July, the *Times* published a series of articles under the general title "Driven to Distraction", which reports both anecdotal evidence and research studies about the risks posed by drivers using cellphones. This topic has been under discussion for well over a decade. The first article cites a 2003 Harvard study which found that cellphones contributed to 2,600 fatal accidents annually, as well as 330,000 accidents involving "moderate or severe injuries". The article also links to an extensive 2005 bibliography on cellphone risks from the National Highway Traffic Safety Administration (NHTSA). More recently, the enormous popularity of texting has added a new dimension to the problem. The third article describes a recent study of truck drivers conducted by the Virginia Tech Transportation Institute, which found that when texting the drivers had a 23 times greater risk of an accident.

This is a collection of relatively long articles, which raise numerous issues that could be discussed in a statistics class. We were reminded of a classic 1997 article by Redelmeier and Tibshirani, "Is Using a Car Phone Like Driving Drunk?", which appeared in *Chance* magazine (vol. 10, no. 2). An electronic version is archived at the US Department of Transportation website. The article gives a very accessible discussion of the case-crossover design used in the authors' original research article (published in *The New England Journal of Medicine*).

Looking back, it is worth noting how many of the issues raised there are still current. For example, Redelmeier and Tibshirani were warned by colleagues that their investigation would anger the cellphone industry, which was already a powerful financial force. The second *Times* article asserts that the NHTSA refrained from publishing extensive research on cellphone risks because Congress did not want the agency to be seen as lobbying states for restrictions on the devices. Elsewhere in that article, industry representatives are quoted as arguing that it didn't make sense to say cellphones posed a risk given that overall accident rates were not increasing. The industry also argued that cellphones should not be singled out for restrictions, given the many other distractions now available to drivers. In fact, these two concerns had been anticipated in 1997 by the Redelmeier and Tibshirani article, which offers rebuttals. For example, they note that conversation with a passenger differs from a cellphone conversation, because passengers are aware of driving conditions and may even help by pointing out hazards, whereas cellphone conversations focus the participants on another environment.