Data Management & Organization

  • The NYPD lab uses interactive, online graphs to better understand patterns in stop and arrest data for the New York Police Department. These data were originally collected by New York Police Department officers and record information gathered as a result of stop, question, and frisk (SQF) encounters during 2006. These data were used in a study carried out, under contract to the New York City Police Foundation, by the RAND Corporation's Center on Quality Policing. The release of the study, "Analysis of Racial Disparities in the New York Police Department's Stop, Question, and Frisk Practices" (RAND Document TR-534-NYCPF, 2007), generated interest in making the data available for secondary analysis. This data collection contains information on the officer's reasons for initiating a stop, whether the stop led to a summons or arrest, demographic information for the person stopped, and the suspected criminal behavior.

  • The Military Spending lab uses interactive, online graphs to better understand total military spending for each country. We see the limitations of traditional histograms and also consider the importance of using appropriate scales when comparing countries. The emphasis of this lab is on understanding the impact of appropriate data transformations and data visualizations.

    App:  http://shiny.grinnell.edu/Military_Spending_Basic/

    Handout:  http://web.grinnell.edu/individuals/kuipers/stat2labs/Handouts/MilSpendB...

  • The Journal of Statistics Education provides a collection of Java applets and Excel spreadsheets (and the articles associated with them) from as early as 1998 on this webpage.

  • StatCrunch is a web-based package that does a complete range of statistical calculations. Formerly known as WebStat, it provides statistical calculation functions that would be done in most introductory statistics courses, including, but not limited to, creating histograms, pie charts, and boxplots; calculating summary statistics and confidence intervals; and performing hypothesis tests. It allows data to be entered in a spreadsheet style data window or opened from a file. StatCrunch does require a subscription for students and professionals ($13 for 6 months and $23 for 12 months).

    StatCrunchThis allows you to pull data sets contained on many web pages in various forms directly into StatCrunch for analysis.

  • CODAP provides an easy-to-use web-based data analysis platform, geared toward middle and high school students, and aimed at teachers and curriculum developers. CODAP can be incorporated across the curriculum to help students summarize, visualize and interpret data, advancing their skills to use data as evidence to support a claim.

  • "There are a lot of small data problems that occur in big data. They don't disappear because you've got lots of stuff. They get worse." is a quote by British biostatistician David J. Spiegelhalter (1953 - ). The quote may be found in a March 28, 2014 article in the Financial Times written by Tim Harford entitled "Big data: are we making a big mistake?"

  • This is a chapter on data wrangling excerpted from a book on data science. The book is “Modern Data Science with R,” and the authors are Benjamin J. Baumer, Daniel T. Kaplan, and Nicholas J. Horton. It contains the R code needed to do basic things with data such as sorting, arranging, and summarizing data.

  • This site is a lesson on using SQL. It starts with a simple SELECT query. The user must type in the correct command to select certain columns from a database. Once the user has completed the first lesson, then he or she may continue to more complicated lessons.

  • Big data analysis is explained in this online course that introduces the user to the tools Hadoop and MapReduce. These tools allow for the parallel computing necessary to analyze large amounts of data.

  • Shiny is a web application framework for R, in which you can write and publish web apps without knowing HTML, JavaScript, etc. You create two .R files: one that controls the user interface, and one that controls what the app does. The site contains examples of Shiny apps, a tutorial on how to get started, and information on how to have your apps hosted, if you don't have a server.

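
As a rough illustration of the kind of simple SELECT query the SQL lesson listed above starts with, the sketch below selects two columns from a small table. It is not taken from that lesson: the `movies` table, its columns, its rows, and the `titles_and_years` helper are all hypothetical, and Python's built-in `sqlite3` module with an in-memory database is assumed only so the example runs without any setup.

```python
import sqlite3

# Hypothetical table and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT, year INTEGER)"
)
conn.executemany(
    "INSERT INTO movies (title, year) VALUES (?, ?)",
    [("First Film", 1995), ("Second Film", 2001), ("Third Film", 2010)],
)


def titles_and_years(connection):
    """Return only the title and year columns, ordered by year."""
    cursor = connection.execute(
        "SELECT title, year FROM movies ORDER BY year"
    )
    return cursor.fetchall()


rows = titles_and_years(conn)
for title, year in rows:
    print(title, year)
```

Naming the columns explicitly, rather than writing `SELECT *`, is the point of the exercise: the `id` column exists in the table but is left out of the result.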
