2H: Implementing reproducible research in statistics courses using Git/GitHub

Adam J Sullivan (Brown University) & Matthew Beckman (Pennsylvania State University)


Reproducible research is an emerging concept that needs to be modeled early in a student’s coursework. Source control tools (e.g., Git/GitHub) are an important part of a reproducible workflow. Many resources exist for learning Git/GitHub, but most are written for people with more technical experience. How can we onboard students efficiently? How can we help students avoid common problems?

This breakout session intends to (1) introduce the audience to source control with Git/GitHub, (2) make a case for source control as a substantive learning objective in teaching/training statisticians and analysts, and (3) share resources and engage participants in implementation details to begin adding Git/GitHub as a meaningful part of a statistics course.

We will share two models for implementation suitable for an array of technical and non-technical students, with no prior experience needed. We will discuss engagement opportunities wherein students reproduce the work of another researcher and then extend that work using a reproducible workflow. We will also discuss viable solutions to an important tension: providing students real experience with tools designed to facilitate collaboration and code sharing while preserving an environment that protects academic integrity.

The session includes hands-on experience with methods to introduce GitHub to students, to streamline the link between Git and RStudio, and to distribute assignments with starter code to students using GitHub Classroom.