Chance News 104

From ChanceWiki

Revision as of 15:23, 24 March 2015

Quotations

Forsooth

Transitivity, Correlation and Causation

Theorem 1 of the article cited by Paul Alper in the previous issue, "Is the Property of Being Positively Correlated Transitive?" (The American Statistician, Vol. 55, No. 4, November, 2001), depends on the existence of non-observed independent random variables U, V, and W which cause the correlations between X=U+V, Y=W+V, and Z=W-U to be non-transitive. An interesting question is whether this relates back to the difference between causation and correlation.
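The sign pattern in this construction can be checked directly. The following is a short worked derivation, assuming only that U, V, and W are independent with finite, nonzero variances (the distributions themselves are not specified above):

```latex
\begin{align*}
\operatorname{Cov}(X,Y) &= \operatorname{Cov}(U+V,\; W+V) = \operatorname{Var}(V) > 0,\\
\operatorname{Cov}(Y,Z) &= \operatorname{Cov}(W+V,\; W-U) = \operatorname{Var}(W) > 0,\\
\operatorname{Cov}(X,Z) &= \operatorname{Cov}(U+V,\; W-U) = -\operatorname{Var}(U) < 0.
\end{align*}
```

Because a correlation coefficient has the same sign as the corresponding covariance, X and Y are positively correlated, Y and Z are positively correlated, and X and Z are negatively correlated.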

The answer turns out to be no: we can get the same sort of result even in the presence of causative relationships between X, Y and Z. Here’s an example:

  • X is N(0,1);
  • Y = X + U, where U is N(0,1) and independent of X;
  • Z = Y - 1.5*X.

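Substituting the definition of Y gives Z = U - 0.5*X, so Z moves with Y (through U) but against X. The correlation coefficients between X and Y and between Y and Z are therefore both positive, while the correlation coefficient between X and Z is negative.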
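A quick simulation can illustrate the example. The sketch below is not part of the original submission; the function names `pearson` and `simulate`, the sample size, and the seed are all illustrative choices. It uses only the Python standard library and compares the sample correlations with the exact values implied by the definitions of X, Y and Z.

```python
import math
import random


def pearson(a, b):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


def simulate(n=50_000, seed=0):
    """Draw n samples of (X, Y, Z) and return (r_XY, r_YZ, r_XZ)."""
    rng = random.Random(seed)                     # fixed seed: an arbitrary choice
    x = [rng.gauss(0, 1) for _ in range(n)]       # X is N(0,1)
    u = [rng.gauss(0, 1) for _ in range(n)]       # U is N(0,1), independent of X
    y = [xi + ui for xi, ui in zip(x, u)]         # Y = X + U
    z = [yi - 1.5 * xi for xi, yi in zip(x, y)]   # Z = Y - 1.5*X
    return pearson(x, y), pearson(y, z), pearson(x, z)


# Exact correlations from Var(X)=1, Var(Y)=2, Var(Z)=1.25,
# Cov(X,Y)=1, Cov(Y,Z)=0.5, Cov(X,Z)=-0.5.
EXACT = (1 / math.sqrt(2), 0.5 / math.sqrt(2.5), -0.5 / math.sqrt(1.25))

if __name__ == "__main__":
    for label, r, rho in zip(("X,Y", "Y,Z", "X,Z"), simulate(), EXACT):
        print(f"corr({label}): sample {r:+.3f}, exact {rho:+.3f}")
```

The exact values are roughly 0.71, 0.32 and -0.45, so the sample estimates should land close to those figures for a sample of this size.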

Stan Lipovetsky’s follow-up letter (The American Statistician, 56:4, 341-342, 2002) hints at this but does not include an actual example.

Submitted by Emil M Friedman

Followup

Thanks to John Allen Paulos for sending the following link:

Who's Counting: Non-transitivity in baseball, medicine, gambling and politics
by John Allen Paulos, ABCNews.com, 5 December 2010

This installment of John's "Who's Counting" column describes several real-world illustrations of non-transitivity in correlation.

Among these is an analysis from the aforementioned American Statistician article. Looking at the 2000 batting data from the New York Yankees, the authors found that the number of triples hit by a player correlated positively with the number of base hits he had, which in turn correlated positively with the number of home runs he hit; however, the number of triples a player hit correlated negatively with the number of home runs he hit. As John explains, good hitters get base hits of all kinds, so it is not surprising that home runs and triples are positively correlated with total hits. But triples tend to be the result of speed, while home runs require power, so the powerful physique typical of home run hitters makes them less likely to get triples.

See the column for further discussion, including an example of non-transitive dice, the potential for non-transitive preferences in three-way elections, and the potential pitfalls resulting from the large number of correlations in medical data.
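For readers unfamiliar with non-transitive dice, here is a minimal sketch that enumerates all outcomes for one well-known set of three dice. The face values below are a standard textbook example and are an assumption here, not necessarily the dice discussed in the column; the names `DICE` and `win_probability` are likewise illustrative.

```python
from fractions import Fraction
from itertools import product

# A commonly cited non-transitive set (assumed for illustration only).
DICE = {
    "A": (2, 2, 4, 4, 9, 9),
    "B": (1, 1, 6, 6, 8, 8),
    "C": (3, 3, 5, 5, 7, 7),
}


def win_probability(first, second):
    """Exact probability that a roll of `first` exceeds a roll of `second`."""
    wins = sum(1 for a, b in product(first, second) if a > b)
    return Fraction(wins, len(first) * len(second))


if __name__ == "__main__":
    for winner, loser in (("A", "B"), ("B", "C"), ("C", "A")):
        p = win_probability(DICE[winner], DICE[loser])
        print(f"P({winner} beats {loser}) = {p}")
```

Each die beats the next one in the cycle with probability 5/9, so "beats" is not a transitive relation among these dice.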

Item 2