
Data Management & Organization

  • This page presents a series of tutorials and interdisciplinary case studies that can be used in a variety of blended as well as brick-and-mortar courses. The materials can be used in introductory data science courses as well as in more advanced data science or statistics courses. These materials assume that students have basic prior knowledge of R or RStudio.

  • The Global Terrorism Database (GTD) contains information about more than 140,000 terrorist incidents occurring between 1970 and 2014. The data in the GTD are compiled from multiple news sources (LaFree, Dugan, & Miller, 2015). In this activity, we will study the extent to which chemical, biological, radiological, and nuclear (CBRN) weapons have been used so far. We analyze whether or not their past use fits with our perceptions. Have CBRN weapons been used successfully in the past? Which weapons have historically been more dangerous (more fatalities, injuries) in the hands of terrorists? What are the implications of past usage of CBRN weapons compared to other weapons in determining our priorities in counter-terrorism policies?

  • The NYPD lab uses interactive, online graphs to better understand patterns in stop and arrest data for the New York Police Department. These data were originally collected by New York Police Department officers and record information gathered as a result of stop, question, and frisk (SQF) encounters during 2006. These data were used in a study carried out, under contract to the New York City Police Foundation, by the Rand Corporation's Center on Quality Policing. The release of the study, "Analysis of Racial Disparities in the New York Police Department's Stop, Question, and Frisk Practices" (Rand Document TR-534-NYCPF, 2007), generated interest in making the data available for secondary analysis. This data collection contains information on the officer's reasons for initiating a stop, whether the stop led to a summons or arrest, demographic information for the person stopped, and the suspected criminal behavior.

  • The Military Spending lab uses interactive, online graphs to better understand total military spending for each country. We see the limitations of traditional histograms and also consider the importance of using appropriate scales when comparing countries. The emphasis of this lab is on understanding the impact of appropriate data transformations and data visualizations.

    App:  http://shiny.grinnell.edu/Military_Spending_Basic/

    Handout:  http://web.grinnell.edu/individuals/kuipers/stat2labs/Handouts/MilSpendB...

  • On this webpage, the Journal of Statistics Education provides a collection of Java applets and Excel spreadsheets (and the articles associated with them) from as early as 1998.

  • StatCrunch is a web-based package that does a complete range of statistical calculations. Formerly known as WebStat, it provides the statistical calculations covered in most introductory statistics courses, including, but not limited to, creating histograms, pie charts, and boxplots; calculating summary statistics and confidence intervals; and performing hypothesis tests. It allows data to be entered in a spreadsheet-style data window or opened from a file. StatCrunch does require a subscription for students and professionals ($13 for 6 months and $23 for 12 months).

    StatCrunchThis allows you to pull data sets contained on many web pages in various forms directly into StatCrunch for analysis.

  • CODAP provides an easy-to-use web-based data analysis platform, geared toward middle and high school students, and aimed at teachers and curriculum developers. CODAP can be incorporated across the curriculum to help students summarize, visualize and interpret data, advancing their skills to use data as evidence to support a claim.

  • "There are a lot of small data problems that occur in big data. They don't disappear because you've got lots of stuff. They get worse." This quote is by British biostatistician David J. Spiegelhalter (1953 - ). It may be found in a March 28, 2014 article in the Financial Times written by Tim Harford entitled "Big data: are we making a big mistake?"

  • This is a chapter on data wrangling excerpted from a book on data science. The book is “Modern Data Science with R,” and the authors are Benjamin J. Baumer, Daniel T. Kaplan, and Nicholas J. Horton. It contains the R code needed to do basic things with data such as sorting, arranging, and summarizing data.

  • This site is a lesson on using SQL. It starts with a simple SELECT query. The user must type in the correct command to select certain columns from a database. Once the first lesson is complete, the user may continue to more complicated lessons.
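
As a rough illustration of the kind of query the SQL lesson begins with, the sketch below runs a simple SELECT that returns only certain columns. It uses Python's built-in sqlite3 module; the `movies` table, its columns, and its rows are hypothetical placeholders and are not taken from the lesson itself.

```python
import sqlite3

# Hypothetical in-memory table; all names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (title TEXT, director TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO movies VALUES (?, ?, ?)",
    [("Film A", "Director X", 1995), ("Film B", "Director Y", 2003)],
)

# A simple SELECT that returns two of the three columns.
rows = conn.execute("SELECT title, year FROM movies ORDER BY year").fetchall()
print(rows)
conn.close()
```

Listing column names after SELECT, rather than using `*`, is what restricts the result to those columns.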

