Chance News 26

From ChanceWiki


It is now proved beyond doubt that smoking is one of the leading causes of statistics.

Fletcher Knebel
Reader's Digest (December 1961)

Steve Simon provided the following quotation:

One of the naturalists had argued that On the Origin of Species was too theoretical, that Darwin should have just "put his facts before us and let them rest." In response, Darwin reflected that science, to be of any service, required more than list making; it needed larger ideas that could make sense of piles of data. Otherwise, Darwin said, a geologist "might as well go into a gravel-pit and count the pebbles and describe the colours." Data without generalizations are useless; facts without explanatory principles are meaningless.

Michael Shermer
Why Darwin Matters: The Case Against Intelligent Design (page 1)


The following Forsooths were in the April 2007 RSS News:

Britain has been basking in the early onset of spring with temperatures almost twice as warm as the same time last year.

Lucy Ballinger
Daily Mirror
12 March 2007

PHEW! Twice as warm as Corfu

It's not often we put Corfu in the shade weatherwise, especially at this time of the year. But while the Greek holiday spot could only manage a paltry 8C (46F) yesterday, Britons basked in the sun as temperatures reached 16C (60F) yesterday.

Stephen White
Daily Mirror
12 March 2007
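The trouble with "twice as warm" in the two items above is that the ratio of two temperatures depends on where the scale puts its zero. A minimal sketch of the arithmetic, using the 16C and 8C figures quoted above (the function names are illustrative only):

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


def c_to_k(celsius):
    """Convert degrees Celsius to kelvins."""
    return celsius + 273.15


def warmth_ratios(warm_c, cool_c):
    """Ratio of the two temperatures as expressed on three different scales."""
    return {
        "celsius": warm_c / cool_c,
        "fahrenheit": c_to_f(warm_c) / c_to_f(cool_c),
        "kelvin": c_to_k(warm_c) / c_to_k(cool_c),
    }


# Britain at 16C versus Corfu at 8C, as quoted in the Daily Mirror item.
ratios = warmth_ratios(16, 8)
for scale, ratio in ratios.items():
    print(f"{scale}: {ratio:.3f}")
```

Only the Celsius ratio equals 2. On the Fahrenheit scale the same two temperatures differ by a factor of about 1.3, and on the Kelvin scale, the only one of the three with a physically meaningful zero, by under 3 percent.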

He (Persi Diaconis) proved that it takes seven shuffles to perfectly randomize a pack of cards.

Justin Mullins
New Scientist
March 24-30, 2007, p 52

Contributed by Laurie Snell.
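The shuffling item is a forsooth because no finite number of riffle shuffles makes a deck perfectly random. The Bayer and Diaconis result measures how far the deck is from uniform using total variation distance. The sketch below assumes the standard Gilbert-Shannon-Reeds model of a riffle shuffle and the Bayer-Diaconis formula, under which a permutation with r rising sequences has probability C(2^k + n - r, n) / 2^(kn) after k shuffles. The function names are illustrative, not taken from any source.

```python
from fractions import Fraction
from math import comb, factorial


def eulerian_row(n):
    """Return [A(n,0), ..., A(n,n-1)], where A(n,m) is the number of
    permutations of n items with exactly m descents."""
    row = [1]  # the single permutation of one item has no descents
    for size in range(2, n + 1):
        prev = row
        row = []
        for m in range(size):
            left = prev[m - 1] if m >= 1 else 0
            right = prev[m] if m < size - 1 else 0
            row.append((size - m) * left + (m + 1) * right)
    return row


def tv_distance(n, k):
    """Exact total variation distance from uniform after k riffle
    shuffles of n cards (Gilbert-Shannon-Reeds model assumed)."""
    a = 2 ** k
    uniform = Fraction(1, factorial(n))
    counts = eulerian_row(n)  # counts[r - 1] permutations have r rising sequences
    total = Fraction(0)
    for r in range(1, n + 1):
        prob = Fraction(comb(a + n - r, n), a ** n)
        total += counts[r - 1] * abs(prob - uniform)
    return total / 2


for shuffles in range(1, 11):
    print(shuffles, round(float(tv_distance(52, shuffles)), 3))
```

For a 52-card deck the distance stays close to 1 for the first few shuffles, is roughly one third after seven, and keeps shrinking thereafter without ever reaching zero. Seven is where the deck starts to mix well, not where it becomes perfectly random.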

We were eleven people obtaining those 30.000 millions. I want the 11% that corresponds to me.

A politician of Madrid in a phone dialogue recorded by the police.
El Pais
20th October, 2006

Contributed by Carlos Silva.

Keith Crank, Assistant Director for Research and Graduate Education, writes:

Here is a possible entry under Forsooths in Chance News. It comes from a recent report of the Council of Graduate Schools, titled: Graduate Education The Backbone of American Competitiveness and Innovation. On page 22 of the document, it states, "..., while the majority of students who enter doctoral programs have the academic ability to complete the degree, on average only 50-60 percent of those who enter doctoral programs in the United States complete their degrees."

The report is available here. Given the members of the committee that prepared this report, you would think someone among them would realize that 50-60% is a majority.

From $355 million Mega Millions jackpot has Californians dreaming big, San Francisco Chronicle, March 5, 2007

"I realize I don't have a chance, but nobody's got a chance. So the way I look at it, I have a 50-50 chance -- either I win it or someone else wins it," reasoned Barrie Green, 60, after buying a single ticket Monday afternoon at the Merritt Restaurant and Bakery near his home in Oakland.

Contributed by Alan Shuchat

CANADA is to investigate claims that tens of thousands of native Indian and Inuit (First Nation) children died of tuberculosis at church-run residential schools in the early 20th century, and that their deaths were hushed up. … Their experiences were often brutal, and Canada is finalising a C$1.9 billion ($1.7 million) class-action settlement for 80,000 surviving former inmates…

New Scientist
May 5, 2007

Contributed by Paul Campbell

A history of smoking in the US

The Cigarette Century: The rise, fall, and deadly persistence of the product that defined America
Allan M. Brandt, 600 pp.
Basic Books, 2007, Amazon $23.76.

Allan Brandt is Professor of the History of Medicine at Harvard Medical School and a professor in the Department of History of Science at Harvard University. His book is a complete history of smoking in the U.S. It is divided into five chapters: Culture (how cigarettes came into American culture), Science (The Causal Conundrum), Politics (The Surgeon General report), Law (The trials of Big Tobacco), and Globalization (Exporting an Epidemic). While this book will clearly be the bible of smoking, you might want to start with some related videos.

For an overall picture of the book, you can watch a lecture Brandt gave about it here.

We are most interested in Brandt's chapter "The Causal Conundrum". For an introduction to it, we recommend watching video 11 ('The Question of Causation') of Against All Odds. (This is free, but you have to sign in.) The video was made while the pioneers who recognized the association of cigarette smoking with lung cancer were still alive, and it is great to see and hear their personal involvement. Here is a sample, in which Doctor Dwight Harken is speaking:

Dr. Wynder, then a student at St. Louis under Dr. Evarts Graham, came to see me and said "Camel Cigarettes cause cancer of the lungs". I couldn't believe it and, you know, you see what you look for and look for what you know -- and it never occurred to me that cigarettes caused cancer. So we went to see my patients and at that time I had quite a large practice. We discovered, to our amazement, that patients who had cancer of the lung were 17 times to 1 as apt to be two-pack-a-day smokers. So here was a fact trying to tell us something.

In 1950 Wynder, with the help of his teacher Dr. Graham, collected questionnaires regarding smoking habits from hospital patients and concluded that lung cancer was associated with smoking. This belief was supported by prospective and retrospective studies by the well-known statisticians Richard Doll and Bradford Hill. As Brandt observes, these new kinds of studies originated in the attempt to show that smoking caused lung cancer.

Neither source discusses in any detail the reasons that two famous statisticians, Joseph Berkson in the US and R. A. Fisher in the UK, were not convinced that smoking caused lung cancer. Berkson explained his reasons in his article: Smoking and lung cancer: some observations on two recent reports. J Am Stat Assoc 1958;53:28-38.

Here he argues that causation cannot be concluded from statistical studies that do not involve laboratory experiments or placebo-controlled clinical trials. He is also concerned that the studies from which causation is concluded found an association of smoking with a wide variety of other diseases, including those for which, unlike lung cancer, there was no reason to expect an association. In addition, he had reasons to believe that the studies were not carried out as carefully as they should have been, as he explained in an earlier paper: Smoking and cancer of the lung, Proceedings of the Mayo Clinic, Vol. 34, No. 13, pp. 367 to 385.

The best way to understand Fisher's concerns about the claim that smoking caused lung cancer is to read the six articles he wrote about smoking and lung cancer. These are numbers 269, 270, 274, 275, 276, and 276A in Collected Papers of R. A. Fisher, edited by J. S. Bennett, 1971-1974.

His first article is a letter to the British Medical Journal 2: 43 (1957), in which he takes issue with two statements in the journal: on p. 1418, that the hazards of cigarette-smoking "must be brought home to the public by all the modern devices of publicity", and, on p. 1519, "in the presence of the painstaking investigations of statisticians that seem to have closed every loophole of escape for tobacco as the villain in the piece." Concerning the first statement he writes: "This is just what some of us with research interests are afraid of. A common 'device' is to point to a real cause for anxiety, such as the increased incidence of lung cancer, and to ascribe it in urgent tones to what is possibly an entirely imaginary cause." Concerning the second statement he writes: "I believe I have seen the sources of all the evidence cited. I do see a great deal of other statisticians. Many would still feel, and I did about five years ago, that a good prima facie case has been made for further investigation. None think that the matter is already settled".

While Fisher agreed that the retrospective studies of Hill and Doll and others suggest that smoking is associated with lung cancer, he believed that more research was necessary to establish causation. He gives examples of further research that might be done. One of the Hill and Doll papers included a question about inhaling. It found fewer inhalers among the cancer patients than among the non-cancer patients. Fisher felt that this should be studied further. He remarks that if it could be shown that inhaling was in fact strongly associated with lung cancer, this would support causation. If not, one could not accept the simple theory that smoking causes cancer. The second area of research he suggested was to see whether there are genotypic differences between the different smoking classes. If so, he says, "we might expect differences in the type or frequency of cancer they display."

Fisher was a scientific consultant to the Tobacco Manufacturers' Standing Committee, set up in 1956 by the UK tobacco industry to assist research on the relationship between smoking and health and to make this information available to the public. Berkson was a consultant for the similar US Council for Tobacco Research. Brandt remarks that the establishment of the Council for Tobacco Research was a good idea for the industry because:

The call for new research implied that existing studies were inadequate or flawed. It made clear that there was "more to know," and it made the industry seem a committed participant in the scientific enterprise rather than a detractor.

What was the solution to Brandt's Causal Conundrum? For example, did the evidence justify saying that smoking caused lung cancer? In 1965 Sir Bradford Hill addressed the question of how to make such decisions in his Royal Statistical Society Presidential address. You can read this address here. In it Hill suggests nine aspects of association that we should especially consider before deciding that the most likely interpretation is causation. Note that he does not say that we can prove causation. He discusses these in detail, but here are the short descriptions provided here:

1. Strength (Is the risk so large that we can easily rule out other factors?)
2. Consistency (Have the results been replicated by different researchers and under different conditions?)
3. Specificity (Is the exposure associated with a very specific disease as opposed to a wide range of diseases?)
4. Temporality (Did the exposure precede the disease?)
5. Biological Gradient (Are increasing exposures associated with increasing risks of disease?)
6. Plausibility (Is there a credible scientific mechanism that can explain the association?)
7. Coherence (Is the association consistent with the natural history of the disease?)
8. Experimental Evidence (Does a physical intervention show results consistent with the association?)
9. Analogy (Is there a similar result to which we can draw a relationship?)

The Advisory Committee for the 1964 Surgeon General Report used criteria 1, 2, 3, 4, and 7 in concluding that cigarette smoking causes lung cancer. It seems that the 1964 report is no longer available from the Surgeon General web site, but the 1967 and later reports are available here. Of course, your library should have the 1964 report.

Discussion questions:

(1) Can a retrospective study distinguish between "smokers are more likely to get lung cancer than non-smokers" and "those who get lung cancer are more likely to be smokers"? Does it matter?

(2) Why do you think the Advisory Committee for the Surgeon General report did not use all the Hill criteria?
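As a starting point for question (1), the sketch below uses an entirely hypothetical 2x2 table (the counts are invented for illustration and come from none of the studies mentioned). It shows that the odds ratio is the same whether computed "forwards" (odds of disease given smoking) or "backwards" (odds of smoking given disease), and that it does not change when the investigator decides to sample ten times as many controls. The apparent risk ratio, by contrast, does change.

```python
def exposure_odds_ratio(a, b, c, d):
    """Odds of smoking among cases divided by odds of smoking among controls.

    a = smokers with lung cancer      b = smokers without (controls)
    c = non-smokers with lung cancer  d = non-smokers without (controls)
    """
    return (a / c) / (b / d)


def disease_odds_ratio(a, b, c, d):
    """Odds of lung cancer among smokers divided by odds among non-smokers."""
    return (a / b) / (c / d)


def apparent_risk_ratio(a, b, c, d):
    """Ratio of the 'risks' read naively off the rows of the table."""
    return (a / (a + b)) / (c / (c + d))


# Hypothetical counts, for illustration only.
table = (90, 60, 10, 40)
# Same cases, but the investigator samples ten times as many controls.
more_controls = (90, 600, 10, 400)

for label, counts in (("original", table), ("10x controls", more_controls)):
    print(label,
          round(exposure_odds_ratio(*counts), 3),
          round(disease_odds_ratio(*counts), 3),
          round(apparent_risk_ratio(*counts), 3))
```

Because a retrospective study fixes the numbers of cases and controls by design, the row proportions say nothing about absolute risk. The odds ratio is symmetric in the two directions, which is one reason it is the usual summary for such studies.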

Submitted by Laurie Snell

The Numbers Guy

Annette Georgey recently wrote to the Isolated Statisticians:

A friend just alerted me to a blog maintained by "The Numbers Guy," a columnist for the Wall Street Journal who writes about probability and statistics in the news. Although the WSJ online is available to subscribers only, the blog is available to all. It contains many great examples for the classroom, written in everyday English, such as the odds of a three-way tie in the TV game show "Jeopardy," understanding statistical significance in recent hormone studies, the Texas lottery, and more.

And don't forget statistician Andrew Gelman's wonderful blog, Statistical Modeling, Causal Inference, and Social Science.

Submitted by Laurie Snell

Lies, Damned Lies, and Drug War Statistics

A Critical Analysis of Claims Made by the Office of National Drug Control Policy
Matthew B. Robinson, Renee G. Scherlen
State University of New York Press, 2007

From the Back Cover:

Book Description:

This book critically analyzes claims made by the Office of National Drug Control Policy (ONDCP), the White House agency of accountability in the nation's drug war. Specifically, the book examines six editions of the annual National Drug Control Strategy between 2000 and 2005 to determine if ONDCP accurately and honestly presents information or intentionally distorts evidence to justify continuing the war on drugs.

The authors have performed a valuable service to our democracy with their meticulous analysis of the White House ONDCP public statements and reports. They have pulled the sheet off what appears to be an official policy of deception using clever and sometimes clumsy attempts at statistical manipulation. This document, at last, gives us a map of the truth.

Mike Gray
Author of Drug Crazy: How We Got into This Mess and How We Can Get Out

Robinson and Scherlen make a valuable contribution to documenting how ONDCP fails to live up to basic standards of accountability and


Ethan Nadelmann
Executive Director, Drug Policy Alliance

At Appalachian State University, Matthew B. Robinson is Associate Professor of Criminal Justice, and Renee G. Scherlen is Associate Professor of Political Science. Robinson is the author of several books, including Justice Blind? Ideals and Realities of American Criminal Justice, Second Edition.

Submitted by John Finn

Excluding car bombs from a measure of sectarian violence

Optimistic Iraq report omits bombs killing civilians, Nancy Youssef, McClatchy Newspapers, April 26, 2007.

According to a story widely circulated in the blogosphere, a U.S. report showing a sharp decline in sectarian violence in Iraq excluded any casualties associated with car bombs.

Car bombs and other explosive devices have killed thousands of Iraqis in the past three years, but the administration doesn't include them in the casualty counts it has been citing as evidence that the surge of additional U.S. forces is beginning to defuse tensions between Shiite and Sunni Muslims.

What's the rationale for this exclusion?

Experts who have studied car bombings say it's no surprise that U.S. officials would want to exclude their victims from any measure of success. Car bombs are almost impossible to detect and stop, particularly in a traffic-jammed city such as Baghdad. U.S. officials in Baghdad concede that while they've found scores of car bomb factories in Iraq, they've made only a small dent in the manufacturing of these weapons.

Critics of the Bush administration have another explanation.

"Since the administration keeps saying that failure is not an option, they are redefining success in a way that suits them," said James Denselow, an Iraq specialist at London-based Chatham House, a foreign policy think tank.

Submitted by Steve Simon

Infinite Regress, Turtles All the Way Down

No one doubts that weather forecasting is important. Most would concede that, with computer models run intensively on massive amounts of acquired data, weather forecasting has greatly improved. But would you be willing to pay $90,000 annually to "WSI, a firm that owns the Weather Channel and sells forecasts of its own to airlines and other weather-dependent companies" for its "new product called MarketFirst, a sort of forecast of the forecast"? The British magazine The Economist points out that WSI "has detected the subtle biases in both the American and European models" used by government agencies. WSI claims that "European weathermen, for example, underestimate temperatures for western America in spring and autumn." On the other hand, "American forecasters are prone to predict chillier temperatures than they should for the period from 11 to 15 days from the time of the forecast."

To illustrate just how flexible capitalism is, WSI is considering "various methods of selling it [MarketFirst], including releasing it earlier to certain customers for a higher fee." Further, "Another option would be to sell a forecast of the forecast of the forecast," or as someone put it, "Turtles all the way down."


Discussion questions:

1. Google the expression "Turtles all the way down" to determine what it refers to and why it is relevant to this wiki.

2. Ira Scharf is in charge of selling MarketFirst. According to The Economist, he concedes that "The more widespread MarketFirst becomes, the less useful it will be to its subscribers." Why would this be true?

3. "WSI claims that it [MarketFirst] is right 70 percent of the time." The Economist indicates that ordinary weather forecasts are less reliable. Discuss why "right 70 percent of the time" is a nebulous statement. Further, discuss how "right 70 percent of the time" compares with tossing a coin.
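One way into question 3 is to notice that "right 70 percent of the time" means little until we know how often the predicted outcome occurs anyway. The sketch below assumes a yes/no outcome (for example, "the official forecast will be revised warmer") that happens with some base rate p. The base rates used are hypothetical, and the three strategies are deliberately naive benchmarks, not descriptions of what WSI does.

```python
def accuracy_always_majority(p):
    """Accuracy of always predicting whichever outcome is more common."""
    return max(p, 1 - p)


def accuracy_coin_toss(p):
    """Accuracy of predicting by a fair coin, whatever the base rate."""
    return 0.5 * p + 0.5 * (1 - p)


def accuracy_probability_matching(p):
    """Accuracy of guessing 'yes' at random with probability p."""
    return p * p + (1 - p) * (1 - p)


# Hypothetical base rates, for illustration only.
for base_rate in (0.5, 0.7, 0.9):
    print(base_rate,
          round(accuracy_always_majority(base_rate), 2),
          round(accuracy_coin_toss(base_rate), 2),
          round(accuracy_probability_matching(base_rate), 2))
```

A coin toss is right half the time regardless of the base rate, so 70 percent beats it. But if the outcome in question occurs 70 percent of the time, a forecaster who always predicts it is also "right 70 percent of the time" while showing no skill at all.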

Submitted by Paul Alper

The media usually back the wrong horse

Covered in shame, Buttonwood, The Economist, 3rd May 2007.

This article discusses whether the media's coverage of a company's performance is indicative of how that company will perform in the future. It starts by mentioning some famous cases where newspapers got their predictions completely wrong. For example, Business Week's 1979 cover predicted 'The Death of Equities' when the Dow Jones index was at 800; today it is at 13,000. The Economist was also modest enough to remind its readers of its own infamous '$5 per barrel of oil' prediction in the late 1990s; these days oil rarely trades below $50 a barrel.

An academic study prompted this article. The study tests the belief that the media usually get their predictions wrong. It takes the 549 stories on individual companies that made the cover of Business Week, Fortune and Forbes over a 20-year period and groups them into categories according to whether the coverage was very positive, neutral or very negative. The share performance in the 500 days before each cover story is then compared with the performance in the following 500 days. (Shorter time horizons are also discussed in the paper.)

Prior to publication, the more positive the story, the more positive the share performance. After publication, the more positive the story, the more negative the share performance. The authors summarised this as

positive stories generally indicate the end of superior performance and negative news generally indicates the end of poor performance.

The Economist refers to this phenomenon as 'recency bias', the tendency to be excessively affected by the pattern of recent data. For example, brokers may subconsciously favour 'hot stocks' when making recommendations, since they believe clients will also favour such shares.

The academic study recommends a trading strategy: for those who are shorting a stock, that is betting that the price will fall, a cover exposé of that company is a good time to unwind any short position.


Discussion questions:

  • Is what The Economist calls 'recency bias' just another name for regression to the mean? What are the differences, if any?
  • The study observed that there were a lot more positive than negative stories. Can you think of reasons for this bias? How might this affect the results of the study?
  • Suppose the 'recency bias', or possibly regression to the mean, is a valid conclusion from this analysis. Assume you have $2 million to invest, and you can select companies that made the magazine cover, or some that did not. You can take "long" or "short" positions. How will you allocate your investments?
  • To what extent does The Economist's definition of recency bias match Wikipedia's, where it is known as 'chronological snobbery'?
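To explore the first question, here is a minimal simulation of regression to the mean. It assumes, purely for illustration, that each company's performance in a period is a persistent "skill" component plus independent noise. The parameter values and names are invented and have nothing to do with the study's data. Companies are selected on their "before" performance, as a magazine cover might select them, and their "after" performance is then compared.

```python
import random
from statistics import mean


def simulate(n_companies=10_000, skill_sd=1.0, noise_sd=2.0, seed=0):
    """Simulate before/after performance and summarise the extreme deciles."""
    rng = random.Random(seed)
    companies = []
    for _ in range(n_companies):
        skill = rng.gauss(0, skill_sd)            # persists across periods
        before = skill + rng.gauss(0, noise_sd)   # period before the cover story
        after = skill + rng.gauss(0, noise_sd)    # period after the cover story
        companies.append((before, after))

    # Rank on past performance only, as an editor choosing cover stories might.
    companies.sort(key=lambda pair: pair[0])
    tenth = n_companies // 10
    worst, best = companies[:tenth], companies[-tenth:]
    return {
        "best_before": mean(b for b, _ in best),
        "best_after": mean(a for _, a in best),
        "worst_before": mean(b for b, _ in worst),
        "worst_after": mean(a for _, a in worst),
    }


result = simulate()
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

In this model the best past performers do less well afterwards and the worst do better, with no media effect or investor psychology involved. Note, though, that pure regression to the mean only pulls both groups back toward the average. It would not by itself make the previously best group underperform the previously worst group, which is worth bearing in mind when comparing it with the pattern the study reports.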


Submitted by John Gavin.