W06: Visualization and Wrangling of Complex Datasets as an Entree to Statistics

With Danny Kaplan (Macalester College)


Visualization and modeling of today's data generally require preliminary work to clean, reshape, and condense data and to bring together data from multiple sources. This workshop will present the techniques taught in Macalester College's no-prerequisite short course, Data Computing Fundamentals. It will be very much a hands-on workshop in which you will learn to use visualization software such as ggplot2 and data wrangling software such as dplyr. You'll also see how to introduce these topics to students. A reference for the material is "Data Computing" (2015) by Danny Kaplan. You'll be given a printed copy at the workshop.

Prerequisites for this workshop:

  • working knowledge of R and experience with RStudio. In particular, you should know what a "data frame" is and how to use functions.
  • comfort with editing dot-R and dot-Rmd files in RStudio
  • experience making simple statistical graphics in R
  • a computer on which you have installed the most recent versions of RStudio and R. You will also need to install several R packages from CRAN before the workshop begins.


What is not required:

  • You don't need to know data wrangling at any level.
  • You don't need experience with *programming* in R. As you'll see, we'll have no use for loops, if statements, or even for writing new functions.