By Daisy Philtron (The Pennsylvania State University)
Abstract
A second course in statistics is what students progress to after they complete an introductory statistics course. In practice, students arrive from a wide variety of introductory courses: AP Statistics in high school, a simulation-based inference introductory statistics course, or a traditionally taught probability-based first course in statistics. Additionally, by the time students enter a second course they may or may not have gained experience coding in R and/or Python. An emerging challenge in designing an effective second course is how to bring students from these diverse backgrounds to a common starting point, so they are ready to learn effectively.
In this work we describe the ‘stations’ approach developed at Penn State over the course of three semesters. The class environment consists of 15 to 40 students from a variety of backgrounds and majors. We expect all students to have access to a laptop or computer that can run R and RStudio during class. Throughout the first week of class, students work through three separate stations to ensure that they all a) are ready to use R Markdown for coding and writing reports and b) can perform the exploratory data analysis and inference typically taught in an introductory course.
The three stations are physical locations created from re-assembled desks with printed instructions and manipulatives for in-person learning, or break-out rooms with PDF instructions for virtual learning. Based on descriptions of the necessary skills, students choose to begin at either Station 1 or Station 2. Station 1 is for students with no experience in R; they get step-by-step guidance to install R and RStudio on their machines. All others start at Station 2, where they are introduced to a motivating dataset about the US. The dataset contains census information such as high school and college graduation rates, the percentage of residents who report eating enough fruits or vegetables, and the percentages of residents who report drinking or smoking. There is also information from recent presidential elections, midterm election turnout, and, most recently, COVID-19 data. Students work together to develop original research questions and use R for the first time to produce exploratory data analysis (EDA). Finally, at Station 3, students perform hypothesis tests in R to answer their research questions.
Their work is evaluated through their first homework assignment, for which they must produce a polished report in R Markdown that includes a discussion of their research question, a thorough EDA, and an in-context interpretation of the results.