Some thoughts and experiences using simulation/randomization based methods in introductory statistics courses and in the undergraduate statistics curriculum

Andrew Schaffner, Cal Poly, San Luis Obispo

I’m a skeptic. As a mid-career, classically trained statistician, for many years I held tight to the teaching methods used when I was a student: lecture presentations and mathematical arguments to support instruction. For non-calculus-based courses I would rely heavily on analogies to bridge concepts (e.g., Behar, et al. Twenty five analogies). Yet even with analogies, students’ performance on exams and conversations in my office hours often fell short of demonstrating real understanding. I’m waking up. In part because of my work as a co-author with Jeff Witmer, or perhaps because my across-the-hall neighbor is Beth Chance, I’ve finally begun to embrace randomization and simulation methods for classroom instruction.

For me, the barrier to using simulation methods in introductory courses has always been the disconnect between software that is useful for teaching statistics and software that is useful for doing statistics. In many of our service courses, students are expected by the end of the course to be able to use statistical software (e.g., JMP or Minitab) to conduct a plethora of simple analyses germane to their disciplines, so doing is important. In a 10-week quarter, how is it possible to both foster deep understanding and provide technical skills? The collection of Rossman/Chance Applets, among others, is a move in the right direction to bridge this gap. The newer generation of web-based tools for exploring statistical concepts certainly doesn’t look like software for doing statistics, but these tools are actually quite powerful and produce output that can now more easily be compared to output from traditional doing software like Minitab or JMP. For example, the more recent versions of the Rossman/Chance two-sample t-test applet have options to graphically display the randomization distribution for both the difference of sample means and the value of the t-statistic, making it very easy to connect the output of the learning applet to the doing software. More importantly, the applets directly support the intuitive notions that we are trying to help our students recognize. The technology barrier is crumbling.
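To make the idea of a randomization distribution for both statistics concrete, here is a minimal Python sketch of a two-sample randomization test. It is not the applet's code: the function names, the made-up data, and the choice of an unpooled (Welch) t-statistic are all assumptions for illustration. It reshuffles the pooled responses between the two groups and records how often the shuffled difference in means, and the shuffled t-statistic, is at least as extreme as the observed one.

```python
import random
import statistics


def welch_t(a, b):
    """Unpooled two-sample t-statistic (an assumed choice of statistic)."""
    se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / se


def randomization_test(group_a, group_b, reps=2000, seed=1):
    """Approximate two-sided p-values by reshuffling group labels.

    Returns a dict with the observed difference in means, the observed
    t-statistic, and the proportion of shuffles at least as extreme as each.
    """
    rng = random.Random(seed)          # fixed seed so the sketch is repeatable
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    obs_diff = statistics.mean(group_a) - statistics.mean(group_b)
    obs_t = welch_t(group_a, group_b)

    extreme_diff = 0
    extreme_t = 0
    for _ in range(reps):
        rng.shuffle(pooled)            # random reassignment of responses to groups
        a, b = pooled[:n_a], pooled[n_a:]
        if abs(statistics.mean(a) - statistics.mean(b)) >= abs(obs_diff):
            extreme_diff += 1
        if abs(welch_t(a, b)) >= abs(obs_t):
            extreme_t += 1

    return {
        "obs_diff": obs_diff,
        "obs_t": obs_t,
        "p_diff": extreme_diff / reps,
        "p_t": extreme_t / reps,
    }


# Made-up data, for illustration only.
treatment = [12.1, 14.3, 13.8, 15.2, 14.9, 13.5]
control = [10.2, 11.1, 9.8, 10.9, 11.5, 10.4]
print(randomization_test(treatment, control))
```

The two approximate p-values can then be set beside the t-test output from a package such as Minitab or JMP, which is the kind of comparison described above.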

My early attempts to bring even these quality tools into the classroom via activities and demonstrations, however, made it hard for me to progress through the very long list of prescribed service curriculum. As a result, in my most recent attempts I have adopted a flipped mode of instruction for a few key modules, including the introduction to sampling distributions and the two-sample randomization test (quantitative response). Before class, students watch short online videos in which I present the background of a problem and introduce the applet. Then, still at home before class, students follow a guided activity using the applets and write about their findings. In class on the following day we connect the randomization results to the more traditional approach (e.g., two-sample t-test) and demonstrate how traditional statistical software is used to carry out the analysis. While this isn’t a full adoption of a randomization-based curriculum, students have commented on how helpful the modules were, and I’ve been able to mostly keep the pace of the course because of the flipped delivery.

The above, however, is really a digression from the main topic of this thread: implications for the undergraduate statistics curriculum. When working with our majors, we have the luxury of not having to rush them through a laundry list of methods in their first course. Instead, we can take the time to develop foundational understanding with a more in-depth randomization curriculum. The curriculum can be well supported by existing web-based tools such as the Applets described above or others (e.g., StatKey by Lock et al. or ). Alternatively, one can even include the development of basic computing skills as students write code (e.g., R or Python) to implement randomization methods. At Cal Poly we use a mix of these approaches in our first two major core courses: STAT 301 and 302. Simulation and randomization methods are used for one- and two-sample inference for means and proportions, as well as for regression and one- and two-way ANOVA. Subsequent courses offer more depth in these topics, provide mathematical underpinnings, and demonstrate the use of statistical software for fitting more complex models.
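As an illustration of the kind of code a student might write, here is a minimal Python sketch of a simulation-based test for a single proportion. It is not taken from the STAT 301 or 302 materials: the function name, the example counts, and the convention for "at least as extreme" (distance from the expected count under the null) are assumptions chosen to keep the example short.

```python
import random


def simulate_one_proportion(successes, n, null_p=0.5, reps=10000, seed=1):
    """Approximate a two-sided p-value for a single proportion by simulation.

    Each repetition generates n Bernoulli(null_p) trials and counts the
    successes. A repetition is "at least as extreme" when its count is at
    least as far from the expected count n * null_p as the observed count.
    """
    rng = random.Random(seed)                  # fixed seed for repeatability
    expected = n * null_p
    observed_distance = abs(successes - expected)

    extreme = 0
    for _ in range(reps):
        count = sum(rng.random() < null_p for _ in range(n))
        if abs(count - expected) >= observed_distance:
            extreme += 1
    return extreme / reps


# Hypothetical example: 14 successes in 20 trials, null value 0.5.
print(simulate_one_proportion(successes=14, n=20))
```

The distance-from-expected rule is natural when the null value is 0.5 and the null distribution is symmetric. For other null values it is only one of several two-sided conventions, which is itself a useful point for class discussion.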

Ideally, we might consider reforming our service courses to fully adopt randomization methods by convincing our colleagues in other disciplines that understanding trumps the ability to force some data through a black-box model. But until then, we can start with our own major. The availability of quality free introductory tools and the growing need for greater computing skills in our next generation of statistics graduates make this the ideal time to embrace this trend.

Simulation-based statistical inference

A blog about teaching introductory statistics with simulation-based inference
