Chance News (September-October 2005)
Why Medical Studies are Often Wrong
Why medical studies are often wrong; John Allen Paulos explains how bad math haunts health research
Who's Counting, ABCNews.com, 7 August 2005
In this installment of his online column, Paulos discusses a recent JAMA article about contradictions in health research (Ioannidis, J.P.A. Contradicted and initially stronger effects in highly cited clinical research. JAMA, July 14, 2005; 294:218-228). You can find an abstract of the study here.
The JAMA article followed up on 45 studies that appeared in JAMA, the New England Journal of Medicine, and the Lancet over the years 1990-2003. All led to widely publicized claims of positive effects for some medical treatment. Hormone replacement therapy for post-menopausal women is a prominent example. For seven of these studies, later research contradicted the original claims; for seven others, later research found the benefits to be substantially smaller than originally stated. Popular news accounts summarized these results by saying one third of medical studies are wrong (for example, see this Associated Press report)!
Paulos cites a number of reasons for the problems. A single study is rarely definitive, but headlines and soundbites usually don't wait for scientific consensus to develop. People fail to appreciate differences in quality of research. Experiments are stronger than observational studies; in particular, surveys that depend on patients' self-reporting of lifestyle habits can obviously be unreliable. These points echo responses made by the medical journals themselves. Finally, Paulos discusses some conflicting psychological responses to medical news. People can be overly eager to believe that a new treatment will work. On the other side of the coin, in what he calls the "tyranny of the anecdote," people also overreact to stories of negative side-effects, even though such incidents may be isolated.
(1) On the last point, Paulos writes:
A distinction from statistics is marginally relevant. We're said to commit a Type I error when we reject a truth and a Type II error when we accept a falsehood. In listening to news reports people often have an inclination to suspend their initial disbelief in order to be cheered and thereby risk making a Type II error. In evaluating medical claims, however, researchers generally have an opposite inclination to suspend their initial belief in order not to be beguiled and thereby risk making a Type I error.
Do you understand the distinction being drawn? To what hypotheses does this discussion refer?
(2) Should we wait for a subsequent analysis to see if the one-third figure stands up?
Do Car Seats Really Work?
Freakonomics: the seat-belt solution
New York Times, 10 July 2005
Stephen J. Dubner and Steven D. Levitt
Dubner and Levitt are the authors of Freakonomics: A Rogue Economist Explains the Hidden Side of Everything (HarperCollins, 2005), which raises a host of provocative questions, including "Why do drug dealers still live with their mothers?" and "What do schoolteachers and sumo wrestlers have in common?"
In the present article, Dubner and Levitt challenge the conventional wisdom on car seats. Their take-no-prisoners style is evident in the following quote: “They [car seats] certainly have the hallmarks of an effective piece of safety equipment: big and bulky, federally regulated, hard to install and expensive. (You can easily spend $200 on a car seat).” Indeed, regarding the third point, the National Highway Traffic Safety Administration (NHTSA) estimates that 80 percent of car seats are not installed correctly.
What then are the benefits? Here the authors cite another NHTSA statistic: “[Car seats] are 54 percent effective in reducing deaths for children ages 1 to 4 in passenger cars.” It turns out, however, that this compares riding in a car seat to riding with no restraint. Surely the relevant comparison, as suggested in the title of this article, is to riding with seat belts.
The authors concede that for children up to two years old, seat belts are not an option, so car seats logically offer some protection. But for children of ages 2 and older, federal Fatality Analysis Reporting System (FARS) data show no decrease in overall death rate for children riding in car seats compared with seat belts. Moreover, this conclusion does not change after controlling for obvious confounding variables such as vehicle size or number of vehicles involved in the accident.
But perhaps the potential benefit of car seats is being masked by the installation woes noted earlier. To check this, the article reports that Dubner and Levitt had an independent lab conduct crash tests, using both 3-year-old and 6-year-old dummies, to compare car seats to lap-and-shoulder seat belts. In 30 mile per hour crashes, the impact figures for 3-year-olds were “nominally higher” with seat belts; for 6-year-olds the figures were “virtually identical.” In addition, both restraint systems performed well enough against federal standards that no injuries would be expected.
Is there a housing bubble?
Be Warned: Mr. Bubble's Worried Again
The New York Times, August 21, 2005
Irrational Exuberance, second edition, Robert Shiller, Princeton University Press, 2005
According to the Times article, in December 1996 Shiller, while having lunch with Federal Reserve Chairman Alan Greenspan, asked him when the last time was that somebody in his job had warned the public that the stock market had become a bubble.
The next day, while driving his son to school, Shiller heard on the radio that stocks were plunging because Greenspan had asked in a speech whether "irrational exuberance" was infecting the markets. He told his wife, "I may have just started a worldwide stock-market crash." She accused him of delusions of grandeur.
What has come to be called the 2000 stock market crash did not, however, start until 2000. The first edition of this book was published in April 2000 and, according to the publisher, the stock market crash predicted in the book started one month later.
Just how reliable are scientific papers?
The Economist, 3 September 2005.
This article is available on line.
John Ioannidis, an epidemiologist, claims that 50% of scientific papers eventually turn out to be wrong.
While it is known that science is a Darwinian process, proceeding as much by refutation as by publication, no one had tried to quantify this issue until recently. The author sets out to understand how frequently highly cited studies are contradicted.
"There is increasing concern that, in modern research, false findings may be the majority or even the vast majority of published research claims," says researcher John Ioannidis in a related analysis, Most published research findings may be false, which appears in PLoS Medicine, an open access, freely available international medical journal. (The Public Library of Science (PLoS), which publishes PLoS Medicine, is a non-profit organization of scientists and physicians committed to making the world's scientific and medical literature a freely available public resource.)
Ioannidis examined 49 articles, published in widely read medical journals between 1990 and 2003, that were each cited at least 1,000 times. Of these, 14, about a third, were later refuted. Examples include the safety of hormone replacement therapy (it was safe, then it wasn't), vitamin E increasing coronary health (it did, then it didn't) and the effectiveness of stents in balloon angioplasty for coronary-artery disease (they are effective, but not as much as first claimed).
One source of error is unsophisticated reliance on 'statistical significance': when twenty randomly chosen hypotheses are each tested at the usual 5% level, one or more statistically significant results are likely even if none of the hypotheses is true. In fields like genetics, where thousands of possible hypotheses (genes that might contribute to a particular disease) are examined, many false positive results will routinely occur purely by chance.
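The arithmetic behind the "twenty hypotheses" remark can be sketched as follows. This is a minimal illustration, assuming independent tests, a 5% significance level, and that every tested hypothesis is in fact false; the function names are ours, not anything from the articles discussed.

```python
def prob_at_least_one_false_positive(num_tests: int, alpha: float = 0.05) -> float:
    """Chance that at least one of num_tests independent tests of a true
    null hypothesis comes out 'significant' at level alpha (assumed 5%)."""
    return 1.0 - (1.0 - alpha) ** num_tests


def expected_false_positives(num_tests: int, alpha: float = 0.05) -> float:
    """Expected number of spurious 'significant' results among num_tests
    tests when no real effect exists anywhere."""
    return num_tests * alpha


if __name__ == "__main__":
    # 20 matches the example in the text; 1000 stands in for a gene scan.
    for n in (1, 20, 1000):
        p = prob_at_least_one_false_positive(n)
        e = expected_false_positives(n)
        print(f"{n:5d} tests: P(at least one false positive) = {p:.3f}, "
              f"expected false positives = {e:.1f}")
```

Under these assumptions, twenty tests give roughly a 64% chance of at least one spurious "significant" result, and a scan of a thousand candidate genes would be expected to throw up about fifty.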
Other factors contribute to false results. One driving factor is sample size. "The smaller the studies conducted in a scientific field, the less likely the research findings are to be true," says Ioannidis. And another factor is effect size, such as drugs that work only on a small number of patients. Research findings are more likely to be true in scientific fields with large effects, such as the impact of smoking on cancer, than in scientific fields where postulated effects are small, such as genetic risk factors for diseases where many different genes are involved in causation. If the effect sizes are very small in a particular field, says Ioannidis, it is "likely to be plagued by almost ubiquitous false positive claims."
The author goes on to define a mathematical model to quantify sources of error. He concludes that even a large, well-designed study with little researcher bias has only an 85% chance of being right. A small, poorly performed drug trial with researcher bias has only a 17% chance of reaching the right conclusion. And over half of all published research is probably wrong.
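To give a feel for how such a model works, here is a rough sketch of a positive-predictive-value calculation. The formula is our reading of the expression in the PLoS Medicine paper, and the input values (power, pre-study odds, bias) are illustrative assumptions chosen by us; none of them is quoted in the Economist article.

```python
def ppv(power: float, prior_odds: float, bias: float = 0.0,
        alpha: float = 0.05) -> float:
    """Probability that a 'significant' finding is actually true.

    power      -- chance the study detects a real effect (1 - beta)
    prior_odds -- pre-study odds R that a tested relationship is real
    bias       -- fraction u of would-be negative results that get
                  reported as positive anyway (assumed bias model)
    alpha      -- significance level (assumed 5%)
    """
    beta = 1.0 - power
    # True relationships reported as positive: detected, or missed but biased.
    true_positives = power * prior_odds + bias * beta * prior_odds
    # Null relationships reported as positive: chance hits, or biased.
    false_positives = alpha + bias * (1.0 - alpha)
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    # Hypothetical scenarios; the labels and numbers are our assumptions.
    scenarios = [
        ("large, well-run trial, little bias", 0.80, 1.0, 0.10),
        ("small, poorly run trial, heavy bias", 0.20, 0.2, 0.80),
    ]
    for label, power, odds, bias in scenarios:
        print(f"{label}: PPV = {ppv(power, odds, bias):.2f}")
```

With these assumed inputs the two scenarios come out at roughly 0.85 and 0.17, in line with the figures quoted above. Lowering the power (smaller studies) or the pre-study odds (fields where few tested effects are real) pushes the value down, which is the pattern Ioannidis describes.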
The author's overall conclusion is 'Contradiction and initially stronger effects are not unusual in highly cited research of clinical interventions and their outcomes. The extent to which high citations may provoke contradictions and vice versa needs more study. Controversies are most common with highly cited nonrandomized studies, but even the most highly cited randomized trials may be challenged and refuted over time, especially small ones.'
In their related editorial, the PLoS Medicine editors discuss the implications of Ioannidis' analysis. They agree with him in some respects: "publication of preliminary findings, negative studies, confirmations, and refutations is an essential part of getting closer to the truth," they say. And the editors "encourage authors to discuss biases, study limitations, and potential confounding factors. We acknowledge that most studies published should be viewed as hypothesis-generating, rather than conclusive."
The original paper, Contradicted and Initially Stronger Effects in Highly Cited Clinical Research, appeared in the Journal of the American Medical Association in July 2005 and is available on line for subscribers. The abstract is available on line.
Dr Ioannidis's study focuses on medical research only. Would the same conclusions be applicable to other sciences such as physics or is there an inherent bias in his research?
The Economist article finishes by asking 'Is there a less than even chance that Dr. Ioannidis's paper is itself wrong?'
Submitted by John Gavin.