Chance News (September-October 2005)

Do car seats really work?

Freakonomics: the seat-belt solution
New York Times, 10 July 2005,
Stephen J. Dubner and Steven D. Levitt

Dubner and Levitt are the authors of Freakonomics: A Rogue Economist Explains the Hidden Side of Everything (HarperCollins, 2005), which raises a host of provocative questions, including "Why do drug dealers still live with their mothers?" and "What do schoolteachers and sumo wrestlers have in common?"

In the present article, Dubner and Levitt challenge the conventional wisdom on car seats. Their take-no-prisoners style is evident in the following quote: "They [car seats] certainly have the hallmarks of an effective piece of safety equipment: big and bulky, federally regulated, hard to install and expensive. (You can easily spend $200 on a car seat.)" Indeed, regarding the third point, the National Highway Traffic Safety Administration (NHTSA) estimates that 80 percent of car seats are not installed correctly.

What then are the benefits? Here the authors cite another NHTSA statistic: “[Car seats] are 54 percent effective in reducing deaths for children ages 1 to 4 in passenger cars.” It turns out, however, that this compares riding in a car seat to riding with no restraint. Surely the relevant comparison, as suggested in the title of this article, is to riding with seat belts.
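
To see why the comparison group matters, consider a hypothetical back-of-the-envelope calculation, sketched in Python below. Only the 54 percent figure comes from the article; the normalized baseline risk and the seat-belt effectiveness are invented for illustration.

 # Hypothetical illustration: why the baseline of a "percent effective"
 # claim matters. Only the 54% figure is from the NHTSA statistic quoted
 # above; the seat-belt effectiveness is an invented assumption.
 no_restraint = 1.0                     # normalized fatality risk, no restraint
 car_seat = no_restraint * (1 - 0.54)   # "54 percent effective" vs. no restraint
 seat_belt = no_restraint * (1 - 0.50)  # suppose belts alone were 50% effective
 print(car_seat, seat_belt)             # 0.46 vs. 0.50 -- nearly the same risk

If seat belts alone eliminated anywhere near half of the baseline risk, the impressive-sounding 54 percent figure would leave almost no advantage for car seats, which is consistent with the FARS comparison described next.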

The authors concede that for children up to two years old, seat belts are not an option, so car seats logically offer some protection. But for children ages 2 and older, federal Fatality Analysis Reporting System (FARS) data show no decrease in overall death rate for children riding in car seats compared with seat belts. Moreover, this conclusion does not change after controlling for obvious confounding variables such as vehicle size or number of vehicles involved in the accident.

But perhaps the potential benefit of car seats is being masked by the installation woes noted earlier. To check this, the article reports that Dubner and Levitt had an independent lab conduct crash tests, using both 3-year-old and 6-year-old dummies, to compare car seats to lap-and-shoulder seat belts. In 30 mile per hour crashes, the impact figures for 3-year-olds were “nominally higher” with seat belts; for 6-year-olds the figures were “virtually identical.” In addition, both restraint systems performed well enough against federal standards that no injuries would be expected.

Is there a housing bubble?

Be Warned: Mr. Bubble's Worried Again
The New York Times, August 21, 2005
David Leonhardt

Irrational Exuberance, second edition
Princeton University Press, 2005
Robert Shiller

According to the Times article, in December 1996 Shiller, while having lunch with Federal Reserve Chairman Alan Greenspan, asked him when was the last time that somebody in his position had warned the public that the stock market had become a bubble.

The next day, while driving his son to school, Shiller heard on the radio that stocks were plunging because Greenspan had asked in a speech whether "irrational exuberance" was infecting the markets. He told his wife, "I may have just started a worldwide stock-market crash." She accused him of delusions of grandeur.

What has come to be called the 2000 stock market crash did not begin until 2000. The first edition of this book was published in April 2000, and according to the publisher the stock market crash predicted in the book started one month later.

To be continued

Just how reliable are scientific papers?

The Economist, 3 September 2005. This article is available online.
Why Most Published Research Findings Are False (pdf version), John P. A. Ioannidis

John Ioannidis, an epidemiologist, claims that 50% of scientific papers eventually turn out to be wrong.

While it is known that science is a Darwinian process, proceeding as much by refutation as by publication, no one had tried to quantify this issue until recently. The author sets out to understand how frequently highly cited studies are contradicted.

"There is increasing concern that, in modern research, false findings may be the majority or even the vast majority of published research claims," says researcher John Loannidis in a related analysis,Most published research findings may be false, which appears in PLoS Medicine an open access, freely available international medical journal. (The Public Library of Science (PLoS), which publishes The PLoS Medicine is a non-profit organization of scientists and physicians committed to making the world's scientific and medical literature a freely available public resource.)

Ioannidis examined 49 articles that were cited at least 1,000 times in widely read medical journals between 1990 and 2003. Fourteen of them, about a third, were later refuted; examples include hormone replacement therapy safety (it was safe, then it wasn't), vitamin E increasing coronary health (it did, then it didn't), and the effectiveness of stents in balloon angioplasty for coronary-artery disease (they work, but not as well as first claimed).

One source of error is unsophisticated reliance on 'statistical significance': twenty randomly chosen hypotheses tested at the usual 5% level are likely to produce one or more statistically significant results by chance alone. In fields like genetics, where thousands of possible hypotheses (genes that might contribute to a particular disease) are examined, many false positive results will routinely occur purely by chance.
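
A quick sketch of the arithmetic behind that claim, in Python (the 5% significance level is the conventional assumption):

 # With 20 independent tests of true null hypotheses at the 5% level,
 # the chance of at least one "significant" result is 1 - 0.95^20.
 print(1 - 0.95 ** 20)  # about 0.64

 # The same point by simulation: 10,000 batches of 20 null tests.
 import random
 batches = 10_000
 hits = sum(any(random.random() < 0.05 for _ in range(20)) for _ in range(batches))
 print(hits / batches)  # also about 0.64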

Other factors contribute to false results. One driving factor is sample size. "The smaller the studies conducted in a scientific field, the less likely the research findings are to be true," says Ioannidis. Another factor is effect size, as with drugs that work on only a small number of patients. Research findings are more likely to be true in scientific fields with large effects, such as the impact of smoking on cancer, than in scientific fields where postulated effects are small, such as genetic risk factors for diseases where many different genes are involved in causation. If the effect sizes are very small in a particular field, says Ioannidis, it is "likely to be plagued by almost ubiquitous false positive claims."

The author goes on to define a mathematical model to quantify sources of error. He concludes that a large, well-designed study with little researcher bias has only an 85% chance of being right, while a small study of a poorly performing drug, conducted with researcher bias, has only a 17% chance of reaching the right conclusions. Over half of all published research is probably wrong.
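
For readers who want to see the model, here is a minimal Python sketch of the positive predictive value (PPV) formula from Ioannidis's PLoS Medicine paper. The parameter values are illustrative assumptions chosen to roughly reproduce the two figures quoted above; they are not necessarily the values used in the paper.

 # Sketch of the positive predictive value (PPV) model: the probability
 # that a claimed (statistically significant) finding is actually true.
 def ppv(R, alpha=0.05, beta=0.20, u=0.0):
     """R: pre-study odds that the tested relationship is real;
     alpha: Type I error rate; beta: Type II error rate (1 - power);
     u: bias, the fraction of non-findings reported as findings."""
     true_positives = (1 - beta) * R + u * beta * R
     false_positives = alpha + u * (1 - alpha)
     return true_positives / (true_positives + false_positives)

 # A well-powered, low-bias trial of a 1:1-odds hypothesis (assumed values):
 print(ppv(R=1.0, beta=0.20, u=0.10))  # about 0.85
 # A small, underpowered, heavily biased study of a long shot (assumed values):
 print(ppv(R=0.2, beta=0.80, u=0.80))  # about 0.17

Low pre-study odds, low power, and high bias each push the PPV down, which is the mechanism behind the conclusion that over half of published findings may be wrong.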

The author's overall conclusion is 'Contradiction and initially stronger effects are not unusual in highly cited research of clinical interventions and their outcomes. The extent to which high citations may provoke contradictions and vice versa needs more study. Controversies are most common with highly cited nonrandomized studies, but even the most highly cited randomized trials may be challenged and refuted over time, especially small ones.'

In their related editorial, the PLoS Medicine editors discuss the implications of Ioannidis' analysis. They agree with him in some respects: "publication of preliminary findings, negative studies, confirmations, and refutations is an essential part of getting closer to the truth," they say. The editors also "encourage authors to discuss biases, study limitations, and potential confounding factors. We acknowledge that most studies published should be viewed as hypothesis-generating, rather than conclusive."

The original paper, Contradicted and Initially Stronger Effects in Highly Cited Clinical Research, appeared in the Journal of the American Medical Association in July 2005 and is available online for subscribers. The abstract is also available online.

A related Guardian (September 8, 2005) article is Don't dumb me down. 'Statistics are what causes the most fear for reporters, and so they are usually just edited out, with interesting consequences. Because science isn't about something being true or not true: that's a humanities graduate parody. It's about the error bar, statistical significance, it's about how reliable and valid the experiment was, it's about coming to a verdict, about a hypothesis, on the back of lots of bits of evidence.'

In fact, the Guardian has a web page with weekly articles devoted to bad science. You are invited to make submissions: 'if you are a purveyor of bad science, be afraid. If you are on the side of light and good, be vigilant: and for the love of Karl Popper, email me every last instance you find of this evil. Only by working joyously together can we free this beautiful, complex world from such a vile scourge.'

Discussion

Dr Ioannidis's study focuses on medical research only. Would the same conclusions be applicable to other sciences, such as physics, or is there an inherent bias in his research?

The Economist article finishes by asking 'Is there a less than even chance that Dr. Ioannidis's paper is itself wrong?'

Submitted by John Gavin.

Paulos on errors in medical studies

Why medical studies are often wrong; John Allen Paulos explains how bad math haunts health research
Who's Counting, ABCNews.com, 7 August 2005

In this installment of his online column, Paulos considers the JAMA report about contradictions in health research (Ioannidis, J.P.A. Contradicted and initially stronger effects in highly cited clinical research. JAMA, July 14, 2005; 294:218-228). This research is well described above.

In the present article, Paulos cites a number of reasons for the problems. A single study is rarely definitive, but headlines and soundbites usually don't wait for scientific consensus to develop. People fail to appreciate differences in the quality of research: experiments are stronger than observational studies, and in particular, surveys that depend on patients' self-reporting of lifestyle habits can obviously be unreliable. These ideas echo points made by the medical journals themselves in response to news reports (see, for example, this Associated Press report).

Paulos also describes some conflicting psychological responses to medical news. People can be overly eager to believe that a new treatment will work. On the other side of the coin, in what he calls the "tyranny of the anecdote," people also overreact to stories of negative side-effects, even though such incidents may be isolated.

DISCUSSION QUESTION:

On the last point, Paulos writes:

A distinction from statistics is marginally relevant. We're said to commit a Type I error when we reject a truth and a Type II error when we accept a falsehood. In listening to news reports people often have an inclination to suspend their initial disbelief in order to be cheered and thereby risk making a Type II error. In evaluating medical claims, however, researchers generally have an opposite inclination to suspend their initial belief in order not to be beguiled and thereby risk making a Type I error.

Do you understand the distinction being drawn? To what hypotheses does this discussion refer?
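
For readers who want to experiment with the two error types, here is a minimal simulation sketch in Python (not from Paulos's column; the null hypothesis, effect size, and sample size below are invented for illustration):

 import random
 import statistics

 random.seed(0)

 def significant(sample, alpha_z=1.96):
     """Two-sided one-sample z-test of H0: mean = 0, with known sd = 1."""
     n = len(sample)
     z = statistics.fmean(sample) / (1 / n ** 0.5)
     return abs(z) > alpha_z

 sims, n = 5_000, 25
 # Type I: H0 is true (no effect), so any rejection is a false positive.
 type1 = sum(significant([random.gauss(0.0, 1) for _ in range(n)])
             for _ in range(sims)) / sims
 # Type II: H0 is false (true effect of 0.3 sd), so a non-rejection is a miss.
 type2 = sum(not significant([random.gauss(0.3, 1) for _ in range(n)])
             for _ in range(sims)) / sims
 print(f"Type I error rate  ~ {type1:.2f} (near the nominal 0.05)")
 print(f"Type II error rate ~ {type2:.2f} (power ~ {1 - type2:.2f})")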