Sandbox: Difference between revisions

From ChanceWiki

Revision as of 01:28, 3 June 2014

Politics and porn (What's the matter with Kansas?)

Distrust your data
by Jacob Harris, Source (opennews.org) 22 May 2014

Harris identifies 6 ways to make mistakes in reporting data:

  • Sloppy proxies
  • Dichotomizing
  • Correlation does not equal causation
  • Ecological inference
  • Geocoding
  • Data naivete

His principal example is a story that was widely circulated via social media, featuring the following scatterplot:

Porn politics.png

Kansas is a clear outlier. Harris credits a reader of Andrew Sullivan's blog for the following explanation:

What happened here was that a large percentage of IP addresses could not be resolved to an address any more specific than “USA.” When that address was geocoded, it returned a point in the centroid of the continental United States, which placed it in the state of—you guessed it—Kansas!
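
The mechanism in that explanation can be sketched in a few lines of Python. This is a toy illustration, not the pipeline used in the original story: the geocoder, the coarse state bounding boxes, the approximate coordinates, and the sample locations are all assumptions made up for the example. The only point is that every unresolved "USA" record silently lands on one default point, and that point falls inside Kansas.

```python
from collections import Counter

# Assumed default point for a bare "USA" lookup: roughly the geographic
# center of the contiguous United States (approximate coordinates).
US_DEFAULT_POINT = (39.8, -98.6)

# Hypothetical, very coarse bounding boxes: (min_lat, max_lat, min_lon, max_lon).
STATE_BOXES = {
    "Kansas": (37.0, 40.0, -102.05, -94.6),
    "Colorado": (37.0, 41.0, -109.05, -102.05),
}

def geocode(location):
    """Toy geocoder: return (lat, lon), falling back to the default point."""
    known = {"Denver, CO": (39.74, -104.99), "Wichita, KS": (37.69, -97.34)}
    return known.get(location, US_DEFAULT_POINT)

def state_of(point):
    """Return the first state whose bounding box contains the point."""
    lat, lon = point
    for state, (lat0, lat1, lon0, lon1) in STATE_BOXES.items():
        if lat0 <= lat <= lat1 and lon0 <= lon <= lon1:
            return state
    return None

def count_by_state(locations, drop_unresolved=False):
    """Count records per state, optionally skipping bare "USA" records."""
    counts = Counter()
    for loc in locations:
        if drop_unresolved and loc == "USA":
            continue
        counts[state_of(geocode(loc))] += 1
    return counts

locations = ["Denver, CO", "Wichita, KS", "USA", "USA", "USA"]
print(count_by_state(locations))                        # Kansas absorbs the "USA" records
print(count_by_state(locations, drop_unresolved=True))  # unresolved records excluded
```

With the unresolved records kept, Kansas is credited with four of the five records; once they are dropped, it has one. Counting how many records sit on exactly the same coordinates is a cheap way to spot this kind of default value.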

The "ecological fallacy" here resembles the one associated with Durkheim (see Chance News 92 for more discussion). He noted that the more Protestant the Prussian province, the higher the suicide rate, but it turns out that the suicides were actually committed by Catholics, not Protestants. The possible analogy here: in Democratic states it is the Republicans who are frequenting pornography web sites.
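
A small numeric sketch shows how such a pattern can arise. The figures below are invented purely for illustration and are not Durkheim's data: three hypothetical provinces of 1,000 residents each, in which every suicide is committed by a Catholic, yet the province-level rate still rises with the Protestant share.

```python
# Invented numbers, for illustration only: each province has 1,000 residents.
POPULATION = 1000

provinces = [
    # (protestant_share, catholic_suicides, protestant_suicides)
    (0.2, 8, 0),
    (0.5, 10, 0),
    (0.8, 12, 0),
]

def province_rates(rows):
    """Aggregate suicide rate per province, in the order given."""
    return [(cath + prot) / POPULATION for _, cath, prot in rows]

def group_totals(rows):
    """Total suicides by group across all provinces: (Catholic, Protestant)."""
    catholic = sum(cath for _, cath, _ in rows)
    protestant = sum(prot for _, _, prot in rows)
    return catholic, protestant

rates = province_rates(provinces)
print(rates)                    # province rate rises with Protestant share
print(group_totals(provinces))  # yet no suicide is committed by a Protestant
```

The aggregate data alone cannot distinguish this scenario from one in which Protestants account for the suicides, which is why group-level correlations do not license conclusions about individuals.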

Submitted by Paul Alper