4E: Teaching authentic data science without pre-requisites


With Matthew Beckman (Penn State University) and Daniel Kaplan (Macalester College)


Abstract

What can our students accomplish in terms of authentic data science with no prerequisite statistics, mathematics, or computing? This session will report on our experiences teaching variants of such a course at Penn State University and Macalester College. With a bit of early scaffolding, such as interactive R commands and working code examples, students were rapidly able to engage in authentic data science. By the end of a one-credit, semester-long course, students are capable of data wrangling and visualization with modern R packages such as dplyr, tidyr, lubridate, and ggplot2, and have working literacy in skills such as regular expression matching, basic machine learning, and accessing data by various means (e.g., GitHub, HTML tables on the web). Underpinning this work is a consistent emphasis on reproducible research using RMarkdown throughout the course. Participants attending the breakout session will see examples of student work from these courses and will work through guided activities that illustrate tasks their own students can accomplish as early as the first term. The session will also connect this approach to its benefits for subsequent coursework in a statistics curriculum. Participants should bring a laptop computer with a current web browser.