**Robin Lock, St. Lawrence University**

Around 1998, Allan Rossman and Beth Chance asked me to help out with a new edition of their popular Workshop Statistics book that would be adapted to use a new software package called Fathom that was being developed by Bill Finzer, then at KCP Technologies.

Although I had primarily used these features in Fathom to do fairly traditional demonstrations of sampling distributions, I added a new example (near the end of the semester in 2004) where students found a p-value using a randomization distribution to test a hypothesis about a correlation between baseball ballpark capacity and attendance. The scramble feature in Fathom (permuting the values of one variable) made it easy to generate samples that would obey a null hypothesis of no association, and collecting the correlation coefficients for many such samples was a quick way to generate and display the randomization distribution. Just find the proportion of scrambles with correlation coefficients at least as extreme as the one in the original data and we had a p-value. This came at a point late in the semester when students had already had a lot of experience with traditional, formula- and distribution-based tests, including a *t*-test for the correlation. But I could detect light bulbs going on with students thinking, “Oh, that’s what he means by seeing what would happen if the null hypothesis is true!”
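For readers without Fathom, the scramble idea can be sketched in a few lines of Python. This is only an illustrative sketch, not Fathom's implementation: the function names `pearson_r` and `scramble_test` are made up for this example, and the capacity and attendance numbers are invented placeholders, not the ballpark data used in class.

```python
import random
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def scramble_test(x, y, n_scrambles=5000, seed=0):
    """Two-sided randomization test for a correlation.

    Permuting y breaks any link to x, so each scramble is a sample
    that obeys the null hypothesis of no association.
    """
    rng = random.Random(seed)
    observed = pearson_r(x, y)
    y_scrambled = list(y)
    extreme = 0
    for _ in range(n_scrambles):
        rng.shuffle(y_scrambled)
        # Count scrambles at least as extreme as the original data.
        if abs(pearson_r(x, y_scrambled)) >= abs(observed):
            extreme += 1
    return observed, extreme / n_scrambles

# Invented placeholder data (thousands of seats / thousands of fans).
capacity = [38, 41, 43, 45, 48, 50, 52, 56]
attendance = [25, 31, 28, 36, 33, 41, 38, 45]

r, p_value = scramble_test(capacity, attendance)
print(f"observed r = {r:.3f}, randomization p-value = {p_value:.4f}")
```

The p-value here is simply the fraction of scrambled correlations that land at least as far from zero as the observed one, which is the counting step students did by eye on the randomization dotplot.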

It took several semesters of doing this activity before it dawned on me that perhaps it would be better to give students this strong intuition about a p-value *at the start* of teaching about hypothesis tests, rather than at the end! This process was helped along by George Cobb’s famous USCOTS 2005 banquet talk and later 2007 TISE paper, where (as only George can do) he managed to compare introducing inference via formula-based traditional distributions (rather than simulation-based methods) to continuing to believe that the sun revolves around the earth. So, as a start on hypothesis testing, I added an early Fathom activity to test a claim about the mean weight of pumpkins in a farmer’s field by taking a sample, shifting it to match the null mean, and then drawing lots of random samples (with replacement) from the shifted data to form a sampling distribution.
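The shift-then-resample procedure in the pumpkin activity might look like the following in Python. Again this is a rough sketch under stated assumptions: `shifted_bootstrap_test` is a hypothetical name, the weights and the null mean of 10 are invented for illustration, and the test is written as two-sided, which the original activity may or may not have been.

```python
import random
from statistics import mean

def shifted_bootstrap_test(sample, null_mean, n_resamples=5000, seed=0):
    """Two-sided test for a mean using a shifted bootstrap distribution."""
    rng = random.Random(seed)
    observed = mean(sample)
    # Shift every value so the sample mean matches the null mean,
    # keeping the shape and spread of the original sample.
    shift = null_mean - observed
    shifted = [v + shift for v in sample]
    distance = abs(observed - null_mean)
    extreme = 0
    for _ in range(n_resamples):
        # Resample with replacement, same size as the original sample.
        resample = rng.choices(shifted, k=len(shifted))
        if abs(mean(resample) - null_mean) >= distance:
            extreme += 1
    return observed, extreme / n_resamples

# Invented placeholder pumpkin weights (pounds); claimed mean is 10.
weights = [8.2, 9.1, 7.5, 10.4, 8.8, 9.6, 7.9, 11.0, 8.4, 9.3]

sample_mean, p_value = shifted_bootstrap_test(weights, null_mean=10.0)
print(f"sample mean = {sample_mean:.2f}, p-value = {p_value:.4f}")
```

Shifting first is what makes the resampled means behave as if the null hypothesis were true; without the shift, the resamples would center on the observed mean instead.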

Having dipped my toe in the water and found it to be pretty warm, I took the plunge in 2010 to make my course more Cobb-compliant (and less Ptolemaic) by using simulation-based methods (bootstrap confidence intervals and randomization tests) for the entire introduction to inference, before moving on to the more traditional distribution-based formulas. I was helped immensely in this endeavor by collaborations with my wife, Patti (who has been teaching intro statistics at St. Lawrence for many years), and our three statistician children, Kari, Eric, and Dennis, as we embarked on a project to develop materials to support this approach. We got lots of positive and constructive feedback from the students who patiently worked from a couple of “just in time” draft text chapters and new Fathom activities.

While we were happy with how things went and found that many parts of the course still worked fine without revision, we recognized that many of the simulation-based class activities and exercises were very dependent on the technology features found in Fathom. It didn’t seem feasible to expect other instructors to abandon their existing statistics technology, deal with a lot of programming, or incur additional expense by adding on Fathom in order to use these methods. This prompted us to develop StatKey as a set of freely available web apps that we designed specifically to facilitate teaching inference from a simulation-based perspective with easy, student-friendly tools.

With several years of experience now, I find that much of my course is not hugely different from what it was before making this switch. Initial material on issues of data collection/production and summarizing data numerically and graphically is basically the same. Using simulation-based methods allows students to see a wider variety of inference applications earlier in the course. For example, last Friday (October 3rd, Day 17 in a semester with 42 hour-long class meetings), my students worked through examples to test for a difference in proportions (“Does lithium work better than a placebo in treating cocaine addiction?”), a difference in means (“Are beer drinkers more attractive to mosquitoes than water drinkers?”), and a correlation (“Do football teams with more malevolent uniforms tend to get more penalty yards?”). At this point in the semester they have done confidence intervals and hypothesis tests for most of the common parameters (mean, proportion, difference in means, difference in proportions, correlation, slope, and even a CI for a standard deviation) and won’t see a normal distribution until next week.

I still cover the traditional methods, but find they go much more quickly now that students aren’t trying to learn the basic ideas about confidence intervals and tests at the same time they are grappling with formulas. Now they can focus just on “What’s the formula for the standard error in this situation?” and “What conditions are needed for the bootstrap/randomization distribution to be approximated by the theoretical reference distribution?” Thus students finish the course seeing all the topics they saw in the “old” days, with some additional skills from bootstrapping/randomization and, most importantly, a better general understanding of the core statistical ideas.
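One way to see the link between the two approaches is to compare a bootstrap standard error with the familiar formula for a sample mean, s/√n. The snippet below is a small sketch of that comparison; the function names and the data are invented for illustration and are not taken from the course materials.

```python
import random
from math import sqrt
from statistics import mean, stdev

def bootstrap_se(sample, n_resamples=5000, seed=0):
    """Standard deviation of the means of many bootstrap resamples."""
    rng = random.Random(seed)
    boot_means = [
        mean(rng.choices(sample, k=len(sample)))
        for _ in range(n_resamples)
    ]
    return stdev(boot_means)

def formula_se(sample):
    """Traditional standard error of the mean: s / sqrt(n)."""
    return stdev(sample) / sqrt(len(sample))

# Invented placeholder data.
weights = [8.2, 9.1, 7.5, 10.4, 8.8, 9.6, 7.9, 11.0, 8.4, 9.3]

print(f"bootstrap SE = {bootstrap_se(weights):.3f}")
print(f"formula SE   = {formula_se(weights):.3f}")
```

The two values should come out close but not identical; with a sample this small the bootstrap version tends to run slightly lower, because resampling treats the sample as the whole population.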

Overall, as often happens with curricular innovation, incorporating simulation-based methods in my courses has been an evolutionary process – keeping changes that worked well and revising some that didn’t. I was fortunate to have contact with people like Allan, Beth, and George (as well as my family!) who had lots of good ideas along the way. I would hope that the process is easier now for faculty who want to move in this direction, as we have more experience from different groups developing materials to support this approach.