Overview

Teachers of introductory statistics are increasingly using simulation-based methods to help students learn concepts and methods of statistical inference. Two quick anecdotes:
• The theme of the 2011 U.S. Conference on Teaching Statistics was “The Next BIG Thing,” and the consensus emerging from the conference was that the BIG thing is teaching introductory statistics with simulation-based methods.
• The recently conducted International Conference on Teaching Statistics in Flagstaff (July 2014) included a large number of sessions on this topic, featuring presenters from around the world, often with standing-room-only crowds of attendees.

Let me start by trying to clarify what I mean by the term “simulation-based inference.” I’ll start with a relatively simple example involving a binomial process, for which you want to draw an inference about the underlying probability of success. I once spun my tennis racquet 100 times, wanting to test whether the spun racquet was equally likely to land with the label facing up or down. I found that my racquet landed with the label up on 46 of those 100 spins. How to proceed? I could use the binomial distribution to calculate the relevant p-value, or I could use a normal approximation to determine a z-score and approximate p-value. But the simulation-based approach would say to simulate a large number of repetitions of spinning a racquet 100 times, assuming a 0.5 probability of landing with the label up, and then to see how often the simulated results turn out at least as far from 50 as my observed 46. That proportion serves as an approximate p-value.
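For readers who like to see the idea in code, here is a minimal sketch of the racquet simulation in Python. The function names, the 10,000 repetitions, and the seed are my own illustrative choices, not part of any particular course or software tool; the only numbers taken from the example are the 100 spins, the 46 label-up results, and the hypothesized 0.5 probability.

```python
import random


def simulate_spins(n_spins=100, p_up=0.5, n_reps=10_000, seed=2014):
    """Simulate n_reps sets of n_spins spins; return the label-up count for each set."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) so results are repeatable
    return [
        sum(rng.random() < p_up for _ in range(n_spins))
        for _ in range(n_reps)
    ]


def two_sided_p_value(counts, observed, n_spins=100, p_up=0.5):
    """Proportion of simulated counts at least as far from the expected count as observed."""
    expected = n_spins * p_up
    observed_distance = abs(observed - expected)
    extreme = sum(abs(c - expected) >= observed_distance for c in counts)
    return extreme / len(counts)


# Null model: the racquet is equally likely to land label up or label down.
counts = simulate_spins()
p_value = two_sided_p_value(counts, observed=46)
print(f"Approximate two-sided p-value: {p_value:.3f}")
```

With this many repetitions the simulated proportion should land near the exact binomial two-sided p-value of about 0.48, so 46 label-up results in 100 spins would not be at all surprising for a fair racquet.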

Examples of simulation-based inference in slightly more complicated settings include bootstrapping to illustrate sampling variability and determine confidence intervals, and simulating a randomization test to compare responses between two groups.
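Both of those settings can be sketched in a few lines of Python as well. The data below are made up purely for illustration, and the function names, repetition counts, and the choice of a percentile bootstrap interval for a mean are my assumptions; other statistics and other interval methods are certainly possible.

```python
import random

# Made-up responses for two groups, for illustration only (not real data).
group_a = [12, 15, 14, 10, 18, 16, 13, 17]
group_b = [9, 11, 14, 8, 12, 10, 13, 7]


def average(values):
    return sum(values) / len(values)


def randomization_p_value(a, b, n_reps=5000, seed=1):
    """Two-sided p-value for a difference in means, found by reshuffling group labels."""
    rng = random.Random(seed)
    observed = abs(average(a) - average(b))
    pooled = list(a) + list(b)
    count = 0
    for _ in range(n_reps):
        rng.shuffle(pooled)  # reassign responses to the two groups at random
        diff = abs(average(pooled[:len(a)]) - average(pooled[len(a):]))
        if diff >= observed - 1e-12:  # small tolerance guards against rounding error
            count += 1
    return count / n_reps


def bootstrap_percentile_interval(sample, n_reps=5000, level=0.95, seed=1):
    """Percentile bootstrap interval for the mean of a single sample."""
    rng = random.Random(seed)
    means = sorted(
        average(rng.choices(sample, k=len(sample)))  # resample with replacement
        for _ in range(n_reps)
    )
    lower_index = int(((1 - level) / 2) * n_reps)
    upper_index = int((1 - (1 - level) / 2) * n_reps) - 1
    return means[lower_index], means[upper_index]


p_value = randomization_p_value(group_a, group_b)
lower, upper = bootstrap_percentile_interval(group_a)
print(f"Randomization p-value: {p_value:.4f}")
print(f"95% bootstrap interval for the group A mean: ({lower:.2f}, {upper:.2f})")
```

The randomization test asks how often shuffled group labels produce a difference at least as large as the one observed, while the bootstrap resamples from the sample itself to show how much the sample mean would vary.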

One thing I do NOT mean by this term is the use of simulation only to illustrate a statistical concept such as a sampling distribution. That’s a great use of simulation as a teaching tool, but stopping there without proceeding to draw an inference based on the simulation results does not qualify as what I would consider to be simulation-based inference.

Where does this movement toward teaching introductory statistics with simulation-based inference come from? Many cite George Cobb as the driving force behind this initiative. George gave an inspiring after-dinner talk at the first USCOTS in 2005 in which he challenged the audience of statistics educators to redesign introductory courses to focus on the logic of inference. George subsequently wrote an article, with the delightful and provocative title “The Introductory Statistics Course: A Ptolemaic Curriculum,” that appeared in the inaugural issue of Technology Innovations in Statistics Education.

Of course, the ideas behind these methods of statistical inference go back at least as far as Fisher, and they have been presented in classic textbooks such as Statistics for Experimenters by Box, Hunter, and Hunter. For the Stat 101 audience, Bob Wardrop wrote an entire textbook around simulation- and randomization-based inference in the early 1990s. But one big factor behind the current craze is high-speed computing: its ease, availability, and low cost allow for conducting simulations quickly and efficiently. Moreover, many people have developed highly visual, interactive, appealing software tools that students can use to explore and apply these methods.

I have been using these methods in my own classes for more than a decade. My friend and colleague Beth Chance and I developed an introductory course for mathematically inclined students, Investigating Statistical Concepts, Applications, and Methods. Our interest in adapting these materials for the Stat 101 audience led us to join the team led by Nathan Tintle, then at Hope College and now at Dordt College. Nathan is the Principal Investigator for a newly funded NSF project, and this blog is one component of that project’s effort to spread the word about simulation-based inference.

Why use a blog to spread this word? Our hope is that the blog format allows for a lighter, more personal, and more entertaining style than typical academic writing. We also plan to keep these blog posts short and to the point. Our biggest hope is that these blog posts will be thought-provoking. We encourage readers to respond by posting some of their reactions in the comments sections.

What kinds of issues will be addressed in our blog posts? Any question associated with teaching introductory statistical inference from a simulation-based perspective is fair game. My experience is that this enterprise lends itself to lots of questions and viewpoints. To give just a few examples:
• How many different kinds of resampling/randomization/sampling-based methods should be taught, and in what circumstances?
• Does order of topics matter? What are some good ordering choices, and why?
• What about software? Should one use a commercially available package or specially designed applets and apps?
• Should normal-based inference methods continue to be taught? If so, how and when?
• What are some common student misconceptions, and how can instructors help students to avoid them, or better yet learn from them?
• How can student learning be assessed well with this approach?
• How can I best integrate just a few of these ideas into my existing course? Where do I start?
• How might I convince colleagues to give this a try?
On many of these questions my colleagues and I do not agree unanimously on the answers. I’m hoping that blog posts can provide an informative glimpse into some of the disagreements and debates that we’ve had on these issues.

Who will contribute these blog posts? Many leaders in this effort have agreed to participate in Nathan’s grant project, and we hope to convince them to contribute some blog posts. These include the Lock family, the CATALST group, the Tabor/Franklin team, the Woodard/West duo, the Winona State University team, and the UCLA group. We expect lots of other folks teaching with these methods to contribute also, and I’ve already mentioned that we hope readers will provide lots of comments. Please let us know what questions you’d like us to address, as we try to provide thought-provoking reading that will benefit your teaching and your students’ learning.

Simulation-based statistical inference

A blog about teaching introductory statistics with simulation-based inference
