
Data Presentation

  • R is a language and environment for statistical computing and graphics. It is a GNU project which is similar to the S language and environment which was developed at Bell Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and colleagues. R can be considered as a different implementation of S. There are some important differences, but much code written for S runs unaltered under R.

    R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, …) and graphical techniques, and is highly extensible. The S language is often the vehicle of choice for research in statistical methodology, and R provides an Open Source route to participation in that activity.

  • Which is more robust against outliers: mean or median?  This app demonstrates the (in)stability of these descriptive statistics as the value of an outlier and the number of data points change.

  • This compendium facilitates the creation of good graphs by presenting a set of concrete examples, ranging from the trivial to the advanced. The graphs can all be reproduced and adjusted by copy-pasting code into the R console. Almost every example in this compendium is driven by the same philosophy: A good graph is a simple graph, in the Einsteinian sense that a graph should be made as simple as possible, but not simpler.  A note for R fans: the majority of our plots have been created in base R, but you will encounter some examples in ggplot.

     

  • Find the best linear fit for a given set of data points and residuals (or let this app show you how it is done).

  • This resource is designed to provide new users to R, RStudio, and R Markdown with the introductory steps needed to begin their own reproducible research. Many screenshots and screencasts (with no audio) will be included, but if further clarification is needed on these or any other aspect of the book, please create a GitHub issue here or email me with a reference to the error/area where more guidance is necessary.  It is recommended that you have R version 3.3.0 or later, RStudio Desktop version 1.0 or higher, and rmarkdown R package version 1.0 or higher. 

  • These handouts and links give a foundational understanding of how to set up and use R.

  • This page presents a series of tutorials and interdisciplinary case studies that can be used in a variety of blended as well as brick-and-mortar courses. The materials can be used in introductory-level data science courses as well as more advanced data science or statistics courses. These materials assume that students have basic prior knowledge of R or RStudio.

  • The goal of this text is to provide a broad set of topics and methods that will give students a solid foundation in understanding how to make decisions with data. This text presents workbook-style, project-based material that emphasizes real world applications and conceptual understanding. Each chapter contains:

    • An introductory case study focusing on a particular statistical method in order to encourage students to experience data analysis as it is actually practiced.
    • A guided research project that walks students through the entire process of data analysis, reinforcing statistical thinking and conceptual understanding.
    • Optional extended activities that provide more in-depth coverage in diverse contexts and theoretical backgrounds. These sections are particularly useful for more advanced courses that discuss the material in more detail. Some Advanced Lab sections that require a stronger background in mathematics are clearly marked throughout the text.
    • Data sets from multiple disciplines and software instructions for Minitab and R.

    The text is highly adaptable in that the various chapters/parts can be taken out of order or even skipped to customize the course to your audience. Depending on the level of in-class active learning, group work, and discussion that you prefer in your course, some of this work might occur during class time and some outside of class. 

  • The Global Terrorism Database (GTD) contains information about more than 140,000 terrorist incidents occurring between 1970 and 2014. The data in the GTD are gathered from multiple news sources (LaFree, Dugan, & Miller, 2015). In this activity, we will study the extent to which chemical, biological, radiological, and nuclear (CBRN) weapons have been used so far. We analyze whether or not their past use fits with our perceptions. Have CBRN weapons been used successfully in the past? Which weapons have historically been more dangerous (more fatalities, injuries) in the hands of terrorists? What are the implications of past usage of CBRN weapons compared to other weapons in determining our priorities in counter-terrorism policies?

  • The NYPD lab uses interactive, online graphs to better understand patterns in stop and arrest data for the New York Police Department. These data were originally collected by New York Police Department officers and record information gathered as a result of stop, question, and frisk (SQF) encounters during 2006. These data were used in a study carried out, under contract to the New York City Police Foundation, by the Rand Corporation's Center on Quality Policing. The release of the study, "Analysis of Racial Disparities in the New York Police Department's Stop, Question, and Frisk Practices" (Rand Document TR-534-NYCPF, 2007), generated interest in making the data available for secondary analysis. This data collection contains information on the officer's reasons for initiating a stop, whether the stop led to a summons or arrest, demographic information for the person stopped, and the suspected criminal behavior.

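The mean-versus-median app listed above asks which statistic is more robust against outliers. The idea can be sketched in a few lines of Python, assuming a small made-up sample and a single extra point whose value we vary. The function name `summarize_with_outlier` and the numbers are illustrative assumptions, not taken from the app itself.

```python
from statistics import mean, median


def summarize_with_outlier(values, outlier):
    """Return (mean, median) of the values after appending one outlier."""
    data = list(values) + [outlier]
    return mean(data), median(data)


# Hypothetical sample; any small set of similar values would do.
base = [4, 5, 5, 6, 7]

# Push the extra point further and further from the rest of the data.
for outlier in (8, 50, 500):
    m, med = summarize_with_outlier(base, outlier)
    print(f"outlier={outlier:>4}  mean={m:8.2f}  median={med:5.2f}")
```

In this sketch the mean grows with the outlier while the median does not move, which is the instability the app is meant to demonstrate. Changing the length of `base` shows how the number of data points dampens the effect on the mean.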
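The linear-fit app listed above is about finding the best line through a set of points and inspecting the residuals. A minimal sketch of an ordinary least-squares fit follows, assuming "best" means minimizing the sum of squared vertical residuals (the app's exact criterion is not stated on this page). The helper names `fit_line` and `residuals` and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of squared deviations in x, and cross-deviations of x and y.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def residuals(xs, ys, intercept, slope):
    """Vertical distances from each observed y to the fitted line."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Made-up data lying roughly on the line y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(xs, ys)
res = residuals(xs, ys, intercept, slope)
print(f"intercept={intercept:.3f}  slope={slope:.3f}")
print("residuals:", [round(r, 3) for r in res])
```

For a least-squares line with an intercept, the residuals sum to zero up to rounding error, which is a quick sanity check on any fit produced this way.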
