Chance News 103: Difference between revisions

From ChanceWiki
Revision as of 17:23, 4 February 2015

Quotations

"[T]he Law of Large Numbers works … not by balancing out what's already happened, but by diluting what's already happened with new data, until the past is so proportionally negligible that it can safely be forgotten." [p. 74]

"'I've been in a thousand arguments over this topic [hot hand],' [Amos Tversky] said. 'I've won them all, and I've convinced no one.'" [p. 127]

"The significance test is the detective, not the judge." [p. 161]

"Correlation is not transitive. …. Niacin is correlated with high HDL, and high HDL is correlated with low risk of heart attack, but that doesn't mean that niacin prevents heart attacks." [p. 342]

Jordan Ellenberg, in How Not To Be Wrong, 2014

Submitted by Margaret Cibes

Note. In fact, regarding the last quote above, if A is positively correlated with B and B is positively correlated with C, it is possible that A is negatively correlated with C. See Is the property of being positively correlated transitive? (The American Statistician, Vol. 55, No. 4, November, 2001). Thanks to Paul Alper for this link.
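A small numerical illustration of the note above may help. The sketch below uses a made-up five-point data set (the values of `a` and `c` are assumptions chosen for illustration, not data from the cited article) in which B is defined as A + C. A and C are each positively correlated with B, yet negatively correlated with each other.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical data: A and C are weakly negatively correlated.
a = [1, 2, 3, 4, 5]
c = [3, 4, 1, 5, 2]
# B is the sum of A and C, so it shares variation with both.
b = [x + y for x, y in zip(a, c)]

r_ab = pearson(a, b)
r_bc = pearson(b, c)
r_ac = pearson(a, c)

print(f"corr(A, B) = {r_ab:+.2f}")
print(f"corr(B, C) = {r_bc:+.2f}")
print(f"corr(A, C) = {r_ac:+.2f}")
```

Because B is built from both A and C, it correlates positively with each of them even though A and C move slightly in opposite directions.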


“Best, Smith, and Stubbs (2001)[1] found a positive relationship between perceived scientific hardness of psychology journals and the proportion of area devoted to graphs. It is interesting that Smith et al. (2002)[2] found an inverse relationship between area devoted to tables and perceived scientific hardness.”

Lane & Sandor, in “Designing Better Graphs by Including Distributional Information ....”, Psychological Methods, 2009

Submitted by Margaret Cibes

Forsooth

"And second, take a closer look at the NFL's ball rules:

The ball shall be made up of an inflated (12 1/2 to 13 1/2 pounds) urethane bladder enclosed in a pebble grained, leather case...

"Notice how there's no mention of pounds per square inch? It's just pounds. Taken literally, the rules say that the bladder inside the football should weigh between 12.5 and 13.5 lbs. ...According to the NFL's own rules, football is meant be played with something approaching the weight of the average bowling ball..."

in: Deflategate: Can science tell us if the Patriots cheated?, Christian Science Monitor, 21 January 2015

Submitted by Bill Peterson

Cancer and luck

Cancer’s random assault
By Denise Grady, New York Times, 5 January 2015

The article concerns a recent research paper, Variation in cancer risk among tissues can be explained by the number of stem cell divisions (Science, 2 January 2015). From the abstract:

Here, we show that the lifetime risk of cancers of many different types is strongly correlated (0.81) with the total number of divisions of the normal self-renewing cells maintaining that tissue’s homeostasis. These results suggest that only a third of the variation in cancer risk among tissues is attributable to environmental factors or inherited predispositions.

News coverage has created controversy by summarizing the findings in more colloquial terms, similar to this from the NYT article:

Random mutations may account for two-thirds of the risk of getting many types of cancer, leaving the usual suspects — heredity and environmental factors — to account for only one-third, say the authors, Cristian Tomasetti and Dr. Bert Vogelstein, of Johns Hopkins University School of Medicine.

Of course, saying that two-thirds of the variation among cancer types is "explained" by the rate of cell division is not the same thing as saying that two-thirds of the risk of a particular cancer can be accounted for by chance, or that two-thirds of all cancer cases are attributable to bad luck. But versions of these latter interpretations have appeared in various responses to the article. For example, one letter to the NYT commented, "If their conclusion is correct, that two-thirds of many cancer types are caused by random mutations, then we have a long road ahead." Or consider this headline from Forbes: Most cancers may simply be due to bad luck.

The resulting confusion is addressed in

Bad luck and cancer: A science reporter’s reflections on a controversial story
by Jennifer Couzin-Frankel, Science Insider, 13 January 2015

This article presents the following data graphic of the relationship

Sn-cancer.png

We now see where the two-thirds comes from: if the correlation coefficient <math>r = 0.81</math>, as noted in the abstract above, then <math>R^2 = 0.81^2 \approx 0.66</math>.
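As a quick check of the arithmetic, the following sketch squares the correlation reported in the abstract. The variable names are ours. The result is the share of variation in risk among tissue types that is associated with the number of stem cell divisions, which is not the same as the share of cancer cases due to chance.

```python
r = 0.81  # correlation coefficient reported in the abstract

# Squaring r gives the proportion of variation "explained".
r_squared = r ** 2
print(round(r_squared, 2))    # 0.66, i.e. about two-thirds

# The remainder is the "only a third" mentioned in the abstract.
unexplained = 1 - r_squared
print(round(unexplained, 2))  # 0.34
```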

In response to the controversy, Drs. Tomasetti and Vogelstein (the study's authors) offered some clarifying remarks in an addendum to the original Johns Hopkins news release. In particular, they construct the following extended analogy with driving a car: the road conditions correspond to environmental factors; the condition of your car corresponds to hereditary factors; the length of the trip corresponds to the number of cell divisions; and the risk of having an accident corresponds to the risk of getting cancer. It makes sense that for any combination of car and road conditions, your risk of an accident increases with the length of the trip. Nevertheless, this does not suggest that you should routinely neglect to service your vehicle or to intelligently plan your routes.

Discussion

1. The original headline of the news release was "Bad Luck of Random Mutations Plays Predominant Role in Cancer, Study Shows." Do you think this could have contributed to the misinterpretations? Can you suggest another wording?

2. Consider the same questions for the NYT headline, "Cancer's random assault."

Submitted by Bill Peterson

How not to describe a CI

Jeff Witmer sent the following example to the Isolated Statisticians e-mail list, with the subject line "Bayesians at NOAA?" It comes from an NOAA page explaining how to understand uncertainty in climate reports. The context was the recent announcement that 2014 was the warmest year on record (see, for example, 2014 breaks heat record, challenging global warming skeptics, New York Times, 16 January 2015).

The plus/minus numbers, which are presented in the data tables of the monthly and annual Global State of the Climate reports, indicate the range of uncertainty (or "range") of the reported global temperature anomaly. For example, a reported global value of +0.69°C ±0.09°C indicates that the most likely value is 0.69°C warmer than the long-term average, but, conservatively, one can be confident that it falls somewhere between 0.60°C and 0.78°C above the long-term average. More technically, it is 95% likely that the value falls within this range. The chance of the actual value being at or beyond the range on the warm side is 2.5% (one in forty chance). Likewise, the chance of the actual value being at or beyond the cool end of the range is 2.5% (one in forty chance).
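The subject line points to the issue: in the usual frequentist reading, "95%" describes how often the interval-building procedure captures the true value over many repetitions, not the probability that the true value lies in one particular reported interval. The sketch below simulates that long-run coverage. It assumes a textbook known-sigma normal interval with hypothetical values for `sigma`, `n`, and `trials`; it is not NOAA's actual method.

```python
import random
from math import sqrt

def coverage(true_mean=0.69, sigma=0.5, n=30, trials=2000, seed=1):
    """Fraction of simulated 95% intervals that contain the fixed true mean.

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    half_width = 1.96 * sigma / sqrt(n)  # known-sigma normal interval
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        # The true mean is fixed; the interval is what varies between trials.
        if sample_mean - half_width <= true_mean <= sample_mean + half_width:
            hits += 1
    return hits / trials

rate = coverage()
print(f"Share of intervals covering the true mean: {rate:.3f}")
```

In each trial the true mean stays put and the interval moves; the share of intervals that cover it should come out near 0.95.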

On a related note, the article Playing dumb on climate change (by Naomi Oreskes, New York Times, 5 January 2015) gives a parallel misinterpretation of a p-value.

Typically, scientists apply a 95 percent confidence limit, meaning that they will accept a causal claim only if they can show that the odds of the relationship’s occurring by chance are no more than one in 20. But it also means that if there’s more than even a scant 5 percent possibility that an event occurred by chance, scientists will reject the causal claim.

None of this is to dispute the scientific evidence for climate change; we are simply documenting the persistent confusion in news reports that attempt to describe statistical confidence and/or significance. For an excellent discussion of this, see Michael Lavine's post at STATS.org, Climate change, statistical significance, and science (26 January 2015).

Deflate-gate statistics

Naomi Neff sent a link to the following, noting that a "stat-spat" sounded interesting.

Deflate-gate triggers stat spat as analysts attempt to solve why Patriots don't fumble
by Eric Adelson, Yahoo! Sports, 28 January 2015

For anyone who hasn't heard the details, the controversy over whether the New England Patriots broke the rules by deliberately underinflating footballs now has its own Wikipedia page. Indeed, there have been seemingly daily updates in major papers and evening television news. While some observers have lamented the disproportionate attention this story has received, it has at least been refreshing to see an exposition of the ideal gas law in the news!

The "stat spat" stems from a blogger's analysis claiming that New England has had an exceptionally low fumble rate in recent years, with the implication that the team has been cheating all along, deflating their footballs to make them easier to hold on to:

Stats show the New England Patriots became nearly fumble-proof after 2006 rule change proposed by Tom Brady
by Warren Sharp, Sharp Football Analysis, 26 January 2015

Sharp's analysis received wide coverage, including articles in Slate and the Wall Street Journal. However, as data analysis experts began to take a closer look at the analysis, they found numerous flaws. Neil Payne's post Your guide To Deflate-gate/Ballghazi-related statistical analyses at FiveThirtyEight.com (28 January 2015) includes links to a number of these.

Submitted by Bill Peterson

All handsome men may not be jerks

"Why Are Handsome Men Such Jerks?"
by Jordan Ellenberg, Slate, June 3, 2014

Ellenberg opens with a question about why the online likeability ratings of books drop after the books are awarded literary prizes, i.e., higher prestige leads to lower popularity. Then he makes an analogy to a person's dating experiences, where one might observe that "the handsome ones tend not to be nice, and the nice ones tend not to be handsome." According to Ellenberg, these are examples of Berkson's fallacy.

Joseph Berkson (1899-1982) headed the Mayo Clinic's statistical group in the mid-1900s. The fallacy refers to a situation in which two independent events become negatively dependent when one only considers outcomes where at least one of them occurs.[3]

In the dating example, he shows how evaluating the relationship between handsome and nice in one's individual optimum dating pool gives a false impression of the relationship between handsome and nice in the entire potential dating pool. And he comes back to his original question about the relationship between a novel's popularity and its quality:

Why are popular novels so terrible? It’s not because the masses don’t appreciate quality. It’s because the novels you read are the ones [that only satisfy your individual popular-and/or-good criterion]. …. If you force yourself to read unpopular novels chosen essentially at random … you find that most of them, just like the popular ones, are pretty bad. And I imagine if you dated men chosen completely at random from OkCupid, you’d find that the less attractive men were just as jerky as the chiseled hunks.
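The selection effect can be checked exactly in a toy model. The sketch below assumes that "handsome" and "nice" are independent traits, each held by half the population (both probabilities are hypothetical), and that the dating pool keeps only people with at least one of the two traits. The function name and parameters are ours, not Ellenberg's.

```python
from fractions import Fraction
from itertools import product

def nice_rates_in_pool(p_handsome, p_nice):
    """Return (P(nice | handsome), P(nice | not handsome)) within the pool
    of people who are handsome or nice, assuming independent traits."""
    pool = {}
    for handsome, nice in product([True, False], repeat=2):
        weight = (p_handsome if handsome else 1 - p_handsome) * \
                 (p_nice if nice else 1 - p_nice)
        # Selection step: drop people who are neither handsome nor nice.
        if handsome or nice:
            pool[(handsome, nice)] = weight

    def p_nice_given(handsome):
        total = sum(w for (h, _), w in pool.items() if h == handsome)
        nice = sum(w for (h, n), w in pool.items() if h == handsome and n)
        return nice / total

    return p_nice_given(True), p_nice_given(False)

given_handsome, given_plain = nice_rates_in_pool(Fraction(1, 2), Fraction(1, 2))
print(given_handsome, given_plain)  # 1/2 1
```

In the whole population, niceness is equally common among handsome and plain people. Inside the selected pool, every plain person is nice by construction, while only half of the handsome ones are, so the two traits appear negatively related.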

Other examples/discussion of Berkson's fallacy can be found in:

1. Ellenberg's 2014 book How Not To Be Wrong, where he describes an interesting medical example
2. "Berkson's paradox", Wikipedia
3. Snoep's "Commentary: A structural approach to Berkson's fallacy and a guide to a history of opinions about it" in The International Journal of Epidemiology, February 28, 2014
4. Mainland's short piece, "Berkson's fallacy in case-control studies", in the British Medical Journal, February 2, 1980

Submitted by Margaret Cibes