P3-02: Building early data science tools for a diverse audience

By Kayla Frisoli, Gordon Weinberg, & Rebecca Nugent (Carnegie Mellon University)


"As the data analytics needs of industry continue to grow and evolve, universities must respond by integrating new tools and technologies into their courses: tools and technologies that better provide students with a marketable skill set. We discuss our initial incorporation of modern data science tools (R Markdown, GitHub, HTML) into Statistics & Data Science Methods, the second course in Carnegie Mellon Statistics & Data Science's introductory freshman sequence. While most students in the class will study quantitative disciplines, this student demographic remains heterogeneous in both their intended majors and their computational background. Our intent is to teach the full data analysis workflow in a structured way that is accessible to anyone and devotes minimal cognitive load to learning coding syntax. Transitioning from Minitab (a traditional point-and-click statistics package), we built a series of R Markdown templates with structured code and guided practice that required minimal editing. Students, in a controlled environment, slowly pick up the technical tools necessary to build a complete data analysis report.

In Spring 2018, we have 170 students, each with 10 labs, 10 homework assignments and 2 student-driven data analysis projects. These projects simultaneously allow students to practice implementing data analysis methods and our software transition team to gauge how the students are using and hopefully mastering this new approach. Here we present summary information about class performance (both academic and coding/technology use) and lessons learned about course design and technology transition. We also include student feedback garnered from surveys administered during and at the end of the course."