Chance News 104: Difference between revisions

From ChanceWiki
Revision as of 20:24, 2 April 2015

Quotations

"Regression to the mean is so powerful that once-in-a-generation talent basically never sires once-in-a-generation talent. It explains why Michael Jordan’s sons were middling college basketball players and Jakob Dylan wrote two good songs....

"The Bush family’s dominance would be the basketball equivalent of Michael Jordan being the father of LeBron James and Kevin Durant — and of Michael Jordan’s father being Walt Frazier....In other words, it is virtually impossible, statistically speaking, that Bushes are consistently the most talented people to lead our country. Same for Chelsea Clinton or any other member of a political dynasty thought to be possible presidential timber."

-- Seth Stephens-Davidowitz, in Just how nepotistic are we?, New York Times, 21 March 2015

Submitted by Bill Peterson


“Keeping an open mind is a virtue – but, as the space engineer James Oberg once said, not so open that your brains fall out.”

“You can often see error bars in public opinion polls .... Imagine a society in which every speech in the Congressional Record, every television commercial, every sermon had an accompanying error bar or its equivalent.”

-- Carl Sagan in The Demon-Haunted World, 1996

Submitted by Margaret Cibes


"In the context of most observational studies, worrying about whether p < 0.05 or > 0.05 is like worrying about whether you made your bed when your house is burning."

-- Donald Berry, quoted by Gary Schwitzer at Health News Review

Submitted by Paul Alper


“Why are governments so eager to protect their citizens against dread risks, from cows to swine, and so hesitant to protect the very same people against the risk of financial disaster from investment banking?”

-- Gerd Gigerenzer in Risk Savvy: How to Make Good Decisions, 2014

Submitted by Margaret Cibes

Forsooth

From a Vancouver demographer commenting, tongue-in-cheek, on the result of making Canada’s census long form voluntary in 2010:
“Because of the move to the voluntary NHS, Canada is a richer, whiter, more educated country now.”
Note that the response rate dropped from 98.5 percent in 2006 to 68.6 in 2011.

“The Tragedy of Canada’s Census”, The Wall Street Journal, February 26, 2015

"The percentage of students scoring at/above Proficient in 3rd grade math increased .... Curiale posted the highest gain ..., improving from 27.0 percent to 51.9 percent, an increase of 24.9% percent. …. In 6th grade, the percentage of students scoring at/above Goal ... increased from 28.0 percent to 39.4 percent, a gain of 11.4 percent."

"2013 CAPT Results Show Increases and CMT Results Show Decreases"
CT State Department of Education CSDE News, August 13, 2013
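The slip in the item above is the confusion of percentage points with percent change. As a minimal sketch (the helper name `change` is ours, not from the press release), the two measures can be computed side by side from the quoted figures:

```python
def change(before, after):
    """Return (percentage-point change, relative percent change)."""
    points = after - before
    relative = 100.0 * points / before
    return points, relative

# Figures quoted in the CSDE release above.
cases = [
    ("3rd grade math, Curiale", 27.0, 51.9),
    ("6th grade", 28.0, 39.4),
]

results = {}
for label, before, after in cases:
    points, relative = change(before, after)
    results[label] = (points, relative)
    print(f"{label}: {points:.1f} percentage points, {relative:.1f} percent relative increase")
```

On these figures, the gains of 24.9 and 11.4 are percentage points; the corresponding relative increases are roughly 92 percent and 41 percent.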

“A few years ago I performed surgery to correct a displaced abomasums ... in a dairy cow .... Ben, the owner of the farm, asked how likely the cow was to have problems after the surgery. Trying to put it in terms that he could relate to I said, ‘If we did this procedure on 100 cows, I expect about 10 to 15 would not completely recover within a few weeks of surgery.’ He paused a moment and said, ‘Well that’s good because I only have 35 cows.’’”

Gerd Gigerenzer in Risk Savvy: How to Make Good Decisions, 2014

".... President Dwight Eisenhower express[ed] astonishment and alarm on discovering that fully half of all Americans have below average intelligence ....”

Carl Sagan in The Demon-Haunted World, 1996

“Texas GOP Representative Pete Sessions recently claimed on house floor that the cost of each [Obamacare] enrollee was costing the U.S. treasury $5 million. He came up with that estimate by taking a $108 billion estimate cost and dividing by 12 million new enrollees. The only problem with that? 108 billion divided by 12 million equals about 9,000. So he was only off by about $4,991,000.”

“GOP Rep claims Obamacare costing $5 million per enrollee”, Daily Kos, March 25, 2015
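The division in the item above is easy to check. A minimal sketch, using only the two figures quoted there (the variable names are ours):

```python
total_cost = 108_000_000_000   # $108 billion estimated cost
enrollees = 12_000_000         # 12 million new enrollees
claimed_per_enrollee = 5_000_000

actual_per_enrollee = total_cost // enrollees
overstatement = claimed_per_enrollee - actual_per_enrollee

print(f"Cost per enrollee: ${actual_per_enrollee:,}")
print(f"Claim overstated by: ${overstatement:,}")
```

This reproduces the $9,000 per enrollee and the $4,991,000 discrepancy cited in the quotation.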

Submitted by Margaret Cibes


“I just bought jumbo rolls of toilet paper--big bargain. It says on label: 12 mega rolls equals 48 regular rolls. On the other side of the label it says: use four times less.”

Personal correspondence, March 21, 2015

Submitted by Margaret Cibes at the suggestion of Howard Mayer

Transitivity, Correlation and Causation

Theorem 1 of the article cited by Paul Alper in the previous issue, "Is the Property of Being Positively Correlated Transitive?" (The American Statistician, Vol. 55, No. 4, November, 2001), depends on the existence of unobserved independent random variables U, V, and W that cause the correlations between X=U+V, Y=W+V, and Z=W-U to be non-transitive. An interesting question is whether this relates back to the difference between causation and correlation.

The answer turns out to be no: we can get the same sort of result even in the presence of causative relationships between X, Y and Z. Here’s an example:

  • X is N(0,1);
  • Y = X + U, where U is N(0,1) and independent of X;
  • Z = Y - 1.5*X.

The correlation coefficients between X and Y and between Y and Z are both positive, but the correlation coefficient between X and Z is negative.

Stan Lipovetsky’s follow-up letter (The American Statistician, 56:4, 341-342, 2002) hints at this but does not include an actual example.
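The signs can be verified exactly, without simulation, because each variable is a linear combination of independent N(0,1) components. The following is a minimal sketch; the helper `corr` and the coefficient-tuple representation are our own illustration, not part of the cited article. The second half applies the same helper to the Theorem 1 construction under the added assumption that U, V, and W all have unit variance.

```python
import math

def corr(a, b):
    """Correlation of two linear combinations of independent N(0,1) components.

    a and b are tuples of coefficients over the same components, so the
    covariance is the dot product and each variance is a sum of squares.
    """
    cov = sum(x * y for x, y in zip(a, b))
    var_a = sum(x * x for x in a)
    var_b = sum(y * y for y in b)
    return cov / math.sqrt(var_a * var_b)

# Example above: components are (X, U).
X = (1.0, 0.0)
Y = (1.0, 1.0)    # Y = X + U
Z = (-0.5, 1.0)   # Z = Y - 1.5*X = U - 0.5*X

r_xy, r_yz, r_xz = corr(X, Y), corr(Y, Z), corr(X, Z)
print(f"corr(X,Y) = {r_xy:.3f}")   # 0.707
print(f"corr(Y,Z) = {r_yz:.3f}")   # 0.316
print(f"corr(X,Z) = {r_xz:.3f}")   # -0.447

# Theorem 1 construction: components are (U, V, W), assumed unit variance.
X1 = (1.0, 1.0, 0.0)    # X = U + V
Y1 = (0.0, 1.0, 1.0)    # Y = W + V
Z1 = (-1.0, 0.0, 1.0)   # Z = W - U
t_xy, t_yz, t_xz = corr(X1, Y1), corr(Y1, Z1), corr(X1, Z1)
print(f"Theorem 1: {t_xy:.2f}, {t_yz:.2f}, {t_xz:.2f}")   # 0.50, 0.50, -0.50
```

In the example, Z = U - 0.5X, so Z shares the U component with Y (hence the positive correlation) while carrying a negative multiple of X (hence the negative one).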

Submitted by Emil M Friedman

Baseball, medicine and politics

Thanks to John Allen Paulos for sending the following link:

Who's Counting: Non-transitivity in baseball, medicine, gambling and politics
by John Allen Paulos, ABCNews.com, 5 December 2010

This installment of John's "Who's Counting" column describes several real-world illustrations of non-transitivity in correlation.

Among these is an analysis from the aforementioned American Statistician article. Looking at the 2000 batting data from the New York Yankees, it was found that the number of triples hit by a player correlated positively with the number of base hits he had, which in turn correlated positively with the number of home runs he hit; however, the number of triples a player hit correlated negatively with the number of home runs he hit. As John explains, good hitters get base hits of all kinds, so it is not surprising that home runs and triples are positively correlated with total hits. But triples tend to be the result of speed, while home runs require power, and powerfully built sluggers tend not to be fast runners.

See the column for further discussion, including an example of non-transitive dice, the potential for non-transitive preferences in three-way elections, and the potential pitfalls resulting from the large number of correlations in medical data.

IQ and breast-feeding

Apropos of this last comment about medical data (though not transitivity per se), we received the following link from Douglas Rogers, with the comment "so many variables..."

Breastfeeding raises IQ… and some worrying questions
by Dean Burnett, Guardian, 18 March 2015

A long-term study found that the length of time babies were breastfed was positively associated with both IQ and financial success in later life. This article discusses the potential for confounding with such variables as the parents' income and education, the mother's age and health, and the baby's weight at birth. The researchers took great care to consider alternative explanations for the observed effects, but conceded that they "could not completely rule out the possibility mothers who breastfed helped their babies’ development in other ways. 'Some people say it is not the effect of breastfeeding but it is the mothers who breastfeed who are different in their motivation or their ability to stimulate the kids,' Horta [lead author on the study] told the Guardian."

TED Talk: Mathematics of Love

“The Mathematics of Love”, by Hannah Fry, April 2014
(17 min video, transcript provided)

Fry, an aerodynamicist, discusses three topics related to mating, based on recent statistical studies:

Topic #1: How to win at online dating (presenting oneself on social media in order to be popular)
Topic #2: How to pick the perfect partner (timing one’s choice)
Topic #3: How to avoid divorce (analogy to nations headed for war)

One study she refers to is “Why I Don’t Have a Girlfriend”, by economist Peter Backus, who uses the Drake equation to estimate the number of potential girlfriends for him.

Submitted by Margaret Cibes

Health-care advice diverges

From “Personal genetic testing service launches in the UK,” Significance, February 2015:

A study, published in Genetics in Medicine in 2013..., compared 23andMe’s analysis of risk factors to that of two similar services .... It found differences in the way each company scored risks for specific people and diseases, including examples where different companies placed the same people in entirely opposite risk categories.

From The Wall Street Journal, March 23, 2015:

Headline #1: “Are Low-Salt Diets Necessary (or Healthy) for Most People?”
Op-ed responses from doctors:

Yes: Less Salt Reduces the Risk of Heart Disease
No: A Low salt Diet is Neither Safe Nor Feasible

Headline #2: “Should All Adults Take a Daily Aspirin?”
Op-ed responses from doctors:

Yes: The Evidence Is Clear It Reduces Deaths From Cancer
No: The Risks Are Large, and Increase as a Person Ages

Headline #3: “Is a Paleo Diet Healthy?”
Op-ed responses from doctors:

Yes: It Helps Control Weight, and Lowers Risks of Cancer
No: You Lose Too Much Pleasure – For Dubious Benefits

Submitted by Margaret Cibes