Chance News 65

From ChanceWiki

June 28, 2010 to August 12, 2010


"All scientific work is incomplete - whether it be observational or experimental. All scientific work is liable to be upset or modified by advancing knowledge. That does not confer upon us a freedom to ignore the knowledge we already have or postpone the action that it appears to demand at a given time."

Sir Austin Bradford Hill, as quoted at Toxipedia.
Submitted by Steve Simon.

"His lectures were loud and entertaining. ...He took umbrage when someone interrupted his lecturing by pointing out some glaring mistake. He became red in the face and raised his voice, often to full shouting range. It was reported that on occasion he had asked the objector to leave the classroom. The expression 'proof by intimidation' was coined after Feller's lectures (by Mark Kac)."

Gian-Carlo Rota, Fine Hall in its golden age: Remembrances of Princeton in the early fifties.

"When, as a student in 1946, I decided that I ought to learn some probability theory, it was pure chance that led me to take the book Theory of Probability by Jeffreys from the library shelf."

William Feller, as quoted in “Not only defended but also applied”: A look back at Feller’s take on Bayesian inference, by Andrew Gelman and Christian P. Robert

Submitted by Paul Alper

“In the dark candlelit room where they swear allegiance to FIFA, coaches and commentators have agreed that if you are a soccer person, you have to say you don’t buy into stats.”

Mark Brunkhart, president of Match Analysis, as quoted in When it comes to stats, soccer seldom counts, New York Times, 8 July 2010.
Submitted by Bill Peterson

“Basically, I’m not interested in doing research and I never have been. I’m interested in understanding, which is quite a different thing. And often to understand something you have to work it out yourself because no one else has done it.”

David Blackwell, as quoted in David Blackwell, scholar of probability, dies at 91, New York Times, 16 July 2010.
Suggested by Priscilla Bremser

"It's better than Disneyland in terms of how you can take technologies and go after a resource that is thousands of years old and do so in an environmentally sound way."
-- Alaska Senator Lisa Murkowski to the Senate Energy Committee (7 months ago) on the science of deep-sea drilling

Predictable outcomes are unusual within ecological systems, while “unpredictable, chaotic events [are] usual."
--Environmentalist Carolyn Merchant

Quoted in “A Hole in the World”, The Nation, July 12, 2010
Submitted by Margaret Cibes

“When I teach a course on statistics (to about 300 second-year psychology students, all of whom have had several courses in introductory stats already) I start by waving a 20 euro bill around, telling them ‘if you can tell me exactly what a p-value is you get 20 euros’. Result: I always get to keep my money, because they have no clue what a p-value is.”

Blogger responding to “Odds Are, It's Wrong: Science fails to face the shortcomings of statistics”, Science News, March 27, 2010.

“The practitioner [of the religion of Statistics] engages in a ritual known as ‘hunting for p values.’ …. Once the calculations are completed, … the practitioner must be prepared to suffer the wrath of the angry gods of Statistics. If the p value is bigger than .05, he will not be allowed to publish. It may even mean running another experiment. If he is clever, the practitioner may find ways to modify the original data (leaving out numbers that are obviously wrong is the most common practice) and invoke the gods again. …. Sometimes, however, no manipulation of the data short of outright fraudulent misrepresentation will produce a p value less than .05. The sensible practitioner will remember that we live in an unfair and irrational world and accept his defeat.”

David Salsburg, in “The Religion of Statistics as Practiced in Medical Journals”, The American Statistician, August 1985.
See also “Comment on ‘The Religion of Statistics’”, The American Statistician, August 1986.

“[T]he difference between ‘significant’ and ‘not significant’ is not itself statistically significant.”

Andrew Gelman and Hal Stern, in an article of the same title, The American Statistician, November 2006.

Submitted by Margaret Cibes


The following Forsooths are from the RSS NEWS June 2010

Labour's betrayal of British workers. Nearly every one of 1.67m jobs created since 1997 has gone to a foreigner.

Immigration was at the centre of the election campaign today as it emerged that virtually every extra job created under Labour has gone to a foreign worker.

Figures suggested an extraordinary 98.5 per cent of 1.67 million new posts were taken by immigrants.

The ONS figures show the total number of people in work in both the private and the public sector has risen from around 25.7 million in 1997 to 27.4 million at the end of last year, an increase of 1.67 million.

But the number of workers born abroad has increased dramatically by 1.64 million from 1.9 million to 3.5 million.

The English language currently comprises roughly a million words. Discounting new words that are added every day, and those occasionally lost to posterity, the possibility of forming a three-word combination is therefore a million cubed, or a quadrillion--that's followed by 216 zeros.

The Guardian, 21 August 2009

Submitted by Laurie Snell

“There has always been a question about just how much of a forecasting mechanism markets are. Hence the saying that stocks have correctly predicted 15 of the past nine recessions.”

“With Stocks, It’s Not the Economy”, TIME, August 2, 2010

Submitted by Margaret Cibes

Some responses to a Josephson Institute survey of American public and private high-school students, “The Ethics of American Youth: 2008”:

30 percent said that they had stolen from a store within the past year.

42 percent said that they sometimes lie to save money.
64 percent said that they had cheated on a test during the past year.
26 percent admitted that they had lied on at least one or two questions on the survey.
93 percent said that they were satisfied with their personal ethics and character.

77 percent said that when it comes to doing what is right, I am better than most people I know.

The 2008 survey had 29,760 respondents, although not all respondents replied to all questions. The website has links to the original questions and to demographic background data for every question.

Submitted by Margaret Cibes

Return on investment in college

Union, RPI rank high on education value.
by Caitlin Tremblay, Daily Gazette (Schenectady, NY), 30 June 2010, p. A5

The article reports that two local colleges (Union College and Rensselaer Polytechnic Institute) “are listed among New York state’s and the nation’s best colleges for making back the money spent on a bachelor degree, according to a study by the website …Payscale, a compensation research website, took the price of the schools’ degree and compared it to the average income of graduates to calculate a ‘return on investment.’ Only those with undergraduate degrees and full-time hourly or salaried jobs were included…Topping Payscale’s list are Massachusetts Institute of Technology (annual ROI of 12.6 percent), California Institute of Technology (12.6 percent) and Harvard University (12.5 percent).”

The website (which also explains the methodology) asserts that “A return on investment (ROI) calculation tells you what you get back for what you spend - and it's a great way to compare college costs…PayScale helps you figure out which school's tuition costs will return the biggest dividends for you after graduation.”

Discussion Questions

With the help of the website methodology description:

  1. Critique, from a statistical perspective, the use of the results of this study in comparing colleges with regard to assessing “what you get back for what you spend.”
  2. How might the validity of such a study be improved and, if implemented, how would this impact any reservations you might have about the conclusions that you might draw?
  3. Comment on any other aspects of the underlying methodology and how it might be improved.

Submitted by Gerry Hahn

Paul the octopus plumps for Spain

ESPN soccernet, 9 July 2010

Spain will defeat Netherlands in Sunday's World Cup final, according to the latest prediction from Paul the psychic octopus.

To intense media interest on Friday morning, Paul, who has an unblemished record in the tournament so far, picked Spain as the victors in the Johannesburg final and also predicted that Germany will defeat Uruguay in Saturday's third-place play-off.

The decision was welcomed in Spain - who were also tipped by Paul to defeat his home country, Germany, in the semi-finals - with Marca's website leading with the story of how el pulpo Paul predicted that Spain would be campeones.

Paul has achieved global fame after correctly predicting the results of all of Germany's games at the tournament in South Africa. In order to harness his powers, his keepers at Sea Life Oberhausen present Paul with the choice of two glass boxes, both containing a mussel but each bearing the flag of a different country.

The odds of Paul correctly predicting Germany's results so far are 1 in 64 and he proved correct once again when tipping Spain to beat Joachim Low's side in the semi-final, which they duly did thanks to a header from Carles Puyol.

Many German fans were unhappy with Paul's decision to plump for Spain and, fearing a backlash, Spanish Prime Minister Jose Luiz Rodriguez Zapatero has joked he will offer state protection to Paul!


Read more about Paul here and see if you would trust Paul in your bets.
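
The quoted odds of 1 in 64 are what results from treating each of Paul's picks as an independent fair coin flip. The sketch below makes that arithmetic explicit. It assumes six matches (inferred from 64 = 2^6; the article does not give the count) and ignores the possibility of draws; the function name is only illustrative.

```python
from fractions import Fraction

def chance_all_correct(n_matches, p_correct=Fraction(1, 2)):
    """Probability of n correct picks in a row if each pick is an
    independent guess that is right with probability p_correct."""
    return p_correct ** n_matches

# Assumed: six picks, each a 50-50 guess between two boxes.
six_in_a_row = chance_all_correct(6)
print(six_in_a_row)  # 1/64

# Each further correct pick halves the chance again.
eight_in_a_row = chance_all_correct(8)
print(eight_in_a_row)  # 1/256
```

A 1-in-64 event is not especially rare once one considers how many animals were asked to "predict" matches, which is one thing to weigh before trusting Paul with a bet.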

Submitted by Laurie Snell and suggested by Dan Rockmore

A golf oddity

“Paul Goydos and the Odds of Shooting 59”
by John Paul Newport, The Wall Street Journal, July 10, 2010

In a July 8, 2010 golf tournament, Paul Goydos shot a 59, only the fourth such score in 612,489 rounds on the PGA Tour.

Those odds of 153,123 to 1 compare with 2,139 to 1 for baseball no-hitters and 21,084 to 1 for perfect games during the same period. A 59 is 1/300th as likely as a hole-in-one on the PGA Tour; [it is] 1/25th as likely as a double eagle.
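
The quoted odds can be roughly reproduced from the two counts given above. This is a minimal sketch; the variable names are ours, and the article does not say how its figure of 153,123 was rounded, so small differences from the values below should be expected.

```python
rounds_played = 612_489   # PGA Tour rounds, from the article
rounds_of_59 = 4          # rounds scored at 59, from the article

# "1 in N" frequency and "N to 1 against" odds are slightly different things.
one_in = rounds_played / rounds_of_59
odds_against = (rounds_played - rounds_of_59) / rounds_of_59

print(f"one round in {one_in:,.2f}")
print(f"odds against: {odds_against:,.2f} to 1")

# Implied frequencies if a 59 is 1/300th as likely as a hole-in-one
# and 1/25th as likely as a double eagle, as the article states.
hole_in_one_one_in = one_in / 300
double_eagle_one_in = one_in / 25
print(f"hole-in-one: about one round in {hole_in_one_one_in:,.0f}")
print(f"double eagle: about one round in {double_eagle_one_in:,.0f}")
```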

Submitted by Margaret Cibes

Tuesday’s child

The famous nursery rhyme proclaims: “Tuesday’s child is full of grace.” Well, factoring in Tuesday for a birth date as discussed in Chance News 64 and Andrew Gelman’s blog produced a flood of comments.

Without Tuesday muddying the waters, the well-known answer to

I have two children.

One is a boy.
What is the probability that I have two boys?

is 1/3, rather than 1/2 as many are prone to say. William Feller in his famous book (Introduction to Probability Theory and Its Applications, Volume I, Third Edition, page 117) says the value of 1/2 is the solution to a much simpler problem: “A boy is chosen at random and comes from a family with two children; what is the probability that the other child is a boy?” He explains why: The 1/3 “might refer to a card file of families,” while the 1/2 “might refer to a file of males. In the latter, each family with two boys will be represented twice, and this explains the difference between the two results.”

Many of the comments focused on the intuitively irrelevant aspect of Tuesday, and yet a careful laying out of the sample space indicates that the day of the week for the birth of a boy turns out to be relevant. Some of the comments tried to explain the cognitive dissonance by referring to similarities to the so-called Monty Hall Problem, in the sense that available information needs to be accounted for.

With Tuesday thrown into the mix, the answer to

I have two children.

One is a boy born on Tuesday.
What is the probability that I have two boys?

surprisingly, turns out to be 13/27, which is close to 1/2, the answer to the simpler problem.
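
Both answers can be checked by laying out the sample space explicitly. The sketch below assumes that sexes and days of the week are equally likely and independent across the two children, and that the statement is read as "at least one child is a boy (born on Tuesday)." The helper names and the numbering of days are ours.

```python
from fractions import Fraction
from itertools import product

BOY, GIRL = "B", "G"
TUESDAY = 1  # arbitrary label for one of the seven days, 0..6

# A child is a (sex, day) pair; a family is an ordered pair of children.
children = list(product((BOY, GIRL), range(7)))
families = list(product(children, repeat=2))   # 14 * 14 = 196 families

def prob_two_boys(condition):
    """P(both children are boys | condition), all families equally likely."""
    kept = [fam for fam in families if condition(fam)]
    both = [fam for fam in kept if all(sex == BOY for sex, _ in fam)]
    return Fraction(len(both), len(kept))

def at_least_one_boy(fam):
    return any(sex == BOY for sex, _ in fam)

def at_least_one_tuesday_boy(fam):
    return any(child == (BOY, TUESDAY) for child in fam)

print(prob_two_boys(at_least_one_boy))          # 1/3
print(prob_two_boys(at_least_one_tuesday_boy))  # 13/27
```

Of the 196 equally likely families, 27 contain at least one Tuesday boy, and 13 of those have two boys.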

Consider a different physical situation, where “boy” now represents a successful knee operation and “girl” now represents an unsuccessful knee operation--we have, after all, but two knees. Ignoring the “Tuesday” aspect, knowing there is a successful knee operation implies a 1/3 chance of two successful knee operations. But this seems especially the wrong-way round because knowing of an unsuccessful knee operation implies a 2/3 chance of a successful knee operation.

When “Tuesday” is added to knee replacement, the implication is closer to 1/2. In fact, if we recorded the time to the nearest minute of the day, rather than to the particular day of the week, we would be much closer still to 1/2. But that is bothersome too because this allows for manipulation of the data keeping/presentation merely by tacking on what might be deemed a "spurious" variable that can take on many values.
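
The drift toward 1/2 can be made explicit. If the added variable takes n equally likely values, the same counting argument gives (2n - 1)/(4n - 1): there are (2n)^2 - (2n - 1)^2 = 4n - 1 families with at least one "tagged" boy, of which n^2 - (n - 1)^2 = 2n - 1 have two boys. The sketch below assumes that model; the function name is ours.

```python
from fractions import Fraction

def prob_two_boys_given_tagged_boy(n_values):
    """P(two boys | at least one boy carries a specific tag), when the tag
    takes n_values equally likely values independently of sex."""
    return Fraction(2 * n_values - 1, 4 * n_values - 1)

for label, n in [("no tag", 1), ("day of week", 7), ("minute of day", 24 * 60)]:
    p = prob_two_boys_given_tagged_boy(n)
    print(f"{label:>13}: {p} = {float(p):.5f}")
```

With n = 1 this recovers 1/3, with n = 7 it gives 13/27, and with n = 1440 the value is within 0.0001 of 1/2.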


1. Expanding on Feller’s explanation, what is the proper “card file” to use here?

Submitted by Paul Alper

2. Perhaps we really need to be asking, “How did we obtain the information regarding Tuesday?” If you tell me that your license plate number is some random license-plate-like number, I won’t find it surprising. But if I ask you whether your license plate number is some random number that I thought up and you say “yes”, I’ll be highly surprised. (The license plate example is based on one that Richard Feynman is said to have used when teaching physics.)

The same sort of thing may be happening here in a more subtle way. As Paul pointed out, if we add “minute of the day” to the information and if we use the same logic, the probability approaches 0.5.

But the boy had to be born at some particular time, so if the father merely supplies whatever that time happened to be, is he giving me any useful information? On the other hand, if I ask whether one of the children was a boy born on Tuesday, his “yes” does supply information.
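
The distinction between the two ways of obtaining the information can be checked by exact enumeration. The sketch below compares two hypothetical protocols, neither of which is specified in the original puzzle: in protocol A the questioner asks "Is at least one of your children a boy born on Tuesday?" and hears "yes"; in protocol B the father picks one of his two children at random and volunteers that child's sex and birth day. Equal likelihood and independence of sexes and days are assumed throughout.

```python
from fractions import Fraction
from itertools import product

TARGET = ("B", 1)  # "a boy born on Tuesday"; day 1 is an arbitrary label

children = list(product("BG", range(7)))
families = list(product(children, repeat=2))  # 196 equally likely families

def two_boys(fam):
    return fam[0][0] == "B" and fam[1][0] == "B"

def protocol_a():
    """Questioner asks about the target; father truthfully answers yes."""
    yes = [fam for fam in families if TARGET in fam]
    return Fraction(sum(two_boys(fam) for fam in yes), len(yes))

def protocol_b():
    """Father describes one child chosen at random; it happens to be the target."""
    says_target = Fraction(0)
    says_target_and_two_boys = Fraction(0)
    for fam in families:
        weight = Fraction(1, len(families))
        # Chance that the randomly chosen child matches the target description.
        p_report = Fraction(sum(child == TARGET for child in fam), 2)
        says_target += weight * p_report
        if two_boys(fam):
            says_target_and_two_boys += weight * p_report
    return says_target_and_two_boys / says_target

print(protocol_a())  # 13/27
print(protocol_b())  # 1/2
```

Under protocol B the day of the week carries no information about the other child, which matches the intuition that the boy "had to be born at some particular time."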

Submitted by Emil Friedman

Placing great stock in stock software

“Letting the Machines Decide”
by Scott Patterson, The Wall Street Journal, July 13, 2010

The author describes a small ($7 million) New York hedge fund, Rebellion, that has been using an artificial-intelligence program, “Star,” to invest in stocks since 2007. Its conservatively traded portfolio has beaten the S&P 500 by an average of 10% per year (after fees).

Run by a “small team of twentysomething math and computer whizzes,” Star bases its buy/sell/hold recommendations on about 30 factors and more than 10 years of historical market data and adjusts its strategy on its own when the portfolio is underperforming.

The company claims that a Rebellion human trader always follows Star’s recommendations. One member of the Rebellion team stated that, even when worried about a Star artificial-intelligence recommendation,

I’ve learned not to question the AI [artificial intelligence].

One blogger commented, “I hope they have the plug on a short leash so it can be pulled at a moment’s notice.” Another stated, “[T]he AI is only as good as the person designing it, and humans make mistakes.” On the other hand, a third blogger felt that “the biggest advantage to AI is the fact that it is not emotional, which can trip up many investors.”

Submitted by Margaret Cibes

Loss aversion and streaks

Author Michael Shermer pens a column for Scientific American and has written several useful books on "skepticism" and science.

In his recent book (The Mind of the Market, Times Books, 2007) on how evolution has shaped human economic behavior, we find the following passage (p. 93):

Gamblers, for example, are highly sensitive to losses, but not in the way you might think. They tend to follow a losing hand by placing bigger bets, and turn conservative after a winning hand by placing smaller bets. One rationale for this strategy is "double up to catch up"--no matter how many losses in a row, if you double the bet each time, you will get back all of your money when you eventually do win. But most gamblers tend to underestimate the number and length of losing streaks.
...More important, gamblers also tend to underestimate the number and length of winning streaks and lose out on the reward of placing larger bets during them. Of course, even with an optimal betting strategy that plays to win every hand, and keeping loss aversion in check, if you play long enough you will lose because of the slight edge to the house built into the rules of the game. But casinos make even more money than the house percentage would predict because of our loss aversion.
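
The "double up to catch up" strategy can be worked out exactly once the bankroll is finite. The sketch below is only an illustration under assumptions the passage does not state: an even-money bet won with probability 18/38 (as on American roulette), a starting bet of 1 unit, and a bankroll that allows at most a fixed number of consecutive doubled bets. The function name is ours.

```python
from fractions import Fraction

def doubling_outcome(p_win, max_bets):
    """One 'double up to catch up' cycle: bet 1, 2, 4, ... until the first
    win or until max_bets bets have all lost.
    Returns (probability of busting, loss if bust, expected net result)."""
    q = 1 - p_win
    p_bust = q ** max_bets            # every bet in the cycle loses
    loss_if_bust = 2 ** max_bets - 1  # 1 + 2 + ... + 2**(max_bets - 1)
    # Any win recovers all earlier losses plus 1 unit.
    expected = (1 - p_bust) * 1 - p_bust * loss_if_bust
    return p_bust, loss_if_bust, expected

p_win = Fraction(18, 38)  # assumed house edge; not from Shermer's text
for n in (3, 6, 10):
    p_bust, loss, expected = doubling_outcome(p_win, n)
    print(f"max {n:2d} bets: P(bust) = {float(p_bust):.4f}, "
          f"loss if bust = {loss}, expected net = {float(expected):+.4f}")
```

Algebraically the expected net result is 1 - (2q)^n, which is negative for any n whenever the losing probability q exceeds 1/2, so a deeper bankroll makes busting rarer but the expected loss per cycle larger.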

Discussion Questions

1. What do you think Shermer means here? Is it possible for players to detect when they are in the midst of "winning streaks" while playing randomly-determined games, and win more money as a result?

2. Do casinos win extra money from players who simply run out of funds before their luck can even out? Assuming each individual play has the same house edge and the bet amount is the same, would the casino care if one person placed a series of 100 bets rather than 100 people placing 1 at a time?

Submitted by Greg Bart

A penchant for perfection

An amateur golfer from Stockton, California recently made local headlines by sinking his 16th career hole-in-one at a tournament in Reno. Rod Souza Sr., a 60-year-old retired fire department captain, attributes his success to two things: "frequency of play and luck". According to Golf Digest, the odds of an amateur golfer hitting a hole-in-one on a given hole are 1 in 12,750.

Discussion Questions:

1) In the article, Mr. Souza tells reporters that he has been playing golf for over 40 years, averaging 3-4 rounds per week during this time. Suppose that he averages 3.5 rounds per week and that a round consists of 18 holes. If Mr. Souza were relying on "luck" alone, how many holes-in-one should he expect to have after 40 years of playing golf?

2) What is the probability that Mr. Souza would have made 16 or more holes-in-one if he were relying on chance alone?

3) Based on your calculation in question 2, do you think Mr. Souza has shown an unusually high "penchant for perfection"?
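
One way to set up the calculations in questions 1 and 2 is sketched below. It takes the question's premises at face value (every one of the 18 holes counts as an independent 1-in-12,750 chance) and additionally assumes 52 playing weeks per year and a Poisson approximation to the binomial count; those last two choices are ours, not the article's.

```python
import math

rounds_per_week = 3.5
weeks_per_year = 52        # assumed: no weeks off
years = 40
holes_per_round = 18
p_ace = 1 / 12_750         # Golf Digest figure quoted above

holes_played = rounds_per_week * weeks_per_year * years * holes_per_round
expected_aces = holes_played * p_ace

def poisson_tail(lam, k):
    """P(X >= k) for X ~ Poisson(lam), via the complement of the lower sum."""
    below = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return 1 - below

p_16_or_more = poisson_tail(expected_aces, 16)

print(f"holes played:        {holes_played:,.0f}")
print(f"expected aces:       {expected_aces:.2f}")
print(f"P(16 or more aces):  {p_16_or_more:.3f}")
```

The Poisson approximation is reasonable here because the number of holes is very large and the per-hole probability very small; an exact binomial calculation should give nearly the same tail probability.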

Submitted by John Mayberry

Three recent John Paulos stories

How Much Oil's Spilling? It's Not Rocket Science
Five or Six Reasons Why Parity Puzzles Are Fun
Medical Statistics Don't Always Mean What They Seem to Mean

Choose one of these stories and see if you agree with it.
Submitted by Laurie Snell

Statistical frustration

The headline of Gina Kolata’s New York Times article is “Spinal-Fluid Test Is Found to Predict Alzheimer’s.” The abbreviated reprint in the Minneapolis Star Tribune is “Spinal test can detect Alzheimer’s accurately.” Someone who is not connected with the study in the Archives of Neurology says, “This is what everyone is looking for, the bull’s-eye of perfect predictive accuracy.”

Naturally, a closer look is less positive. The only numbers in the abbreviated reprint were: “The new study included more than 300 patients in their 70s, 114 with normal memories, 200 with memory problems and 102 with Alzheimer’s disease.” That is, a non-NYT reader would know only how many were in each arm of the study and would have no idea of the numerical results.

The NYT reader would find an additional paragraph (actually six in all): “Nearly every person with Alzheimer’s had the characteristic spinal fluid protein levels. Nearly three quarters of people with mild cognitive impairment, a memory impediment that can precede Alzheimer’s, had Alzheimer’s-like spinal fluid proteins. And every one of those patients with the proteins developed Alzheimer’s within five years. And about a third of people with normal memories had spinal fluid indicating Alzheimer’s. Researchers suspect that those people will develop memory problems.”

Discussion Questions

1. The word “accuracy” is ambiguous because of the different types of errors in medical testing, Prob(no disease|test+) and Prob(disease|test-). Noting that no efficacious treatment exists at present, which error seems more serious?

2. From the results stated in the NYT, does “bull’s-eye of perfect predictive accuracy” seem warranted?

3. The Minneapolis Star Tribune cut out the final six paragraphs of Kolata’s NYT article. Give a justification and a criticism for the abbreviation.

Submitted by Paul Alper

Surrogate markers

Evidence-based medicine in disguise: Beware the surrogate!
by Michael Kirsch, MD Whistleblower blog, 1 August 2010

Dr. Michael Kirsch is a gastroenterologist who writes an interesting blog. His masthead proclaims: [masthead image not reproduced]


In the post referenced above he writes, “A surrogate marker is an event or a laboratory value that researchers hope can serve as a reliable substitute for an actual disease.” He believes “A common practice and serious flaw in medical research is to rely upon a surrogate marker when studying a disease.” Surrogate markers are relied upon because “It is much easier and cheaper for researchers to measure surrogates than actual disease events.” Further, “Surrogate research is valid if the surrogate truly represents the disease. Often, this assumption is questionable or outright false.”

Much of evidence-based medicine depends on statistics, and the trick is to measure what is consequential rather than what is merely convenient (that is, a surrogate marker); unfortunately, that can be difficult. His first example is medication to lower cholesterol:

What could be simpler than measuring blood cholesterol levels? In contrast, it would be a very tough slog to show that a cholesterol-lowering drug reduced heart failure or mortality rates. With a surrogate, medical studies can be completed much more rapidly, in contrast to studying actual diseases, which can take a decade or more to complete. By then, the findings may no longer be relevant. Surrogate research is also much less expensive to perform.

Surrogate results have flashy marketing appeal because their findings can be expressed in catchy headlines that extrapolate the actual conclusions.

He even criticizes his own field of gastroenterology:

[G]astroenterologists remove colon polyps with enthusiasm and zeal. Polyps are not diseases. They are surrogates for colon cancer. We hope and believe that when we remove pre-cancerous polyps that we are reducing your risk of colon cancer. Interestingly, there is no double-blind placebo controlled trial (the gold standard of medical research) that establishes that colonoscopy reduces colon cancer. Just because it sounds logical, doesn’t mean that it’s true.

Discussion Questions

1. A reader of Kirsch’s blog wrote, “Perhaps there needs to be a distinction between surrogates, precursors, and prerequisites.” Read Kirsch’s response and give some of your own examples of possible distinctions among the three terms.

2. Surrogate markers arise in many fields besides medicine. For example, spirituality is not easy to measure and instead church attendance might be considered a surrogate marker. Defend and criticize church attendance as a useful surrogate marker. Put forward a better surrogate marker.

3. Marital fidelity is likewise difficult to assess. Suggest some surrogate markers for marital fidelity.

4. Intelligence, a complex characteristic, is often measured by the surrogate marker, an IQ test. Defend and criticize the IQ test as a surrogate marker for intelligence.

5. The field of nutrition is overflowing with surrogate markers which are often called “risk factors.” Suggest some risk factors which are tightly connected to a disease and some risk factors which are tenuously connected to a disease.

Submitted by Paul Alper