Development of the CATALST course

The CATALST group – University of Minnesota


Inspired by George Cobb’s plenary address at the first USCOTS in 2005, we began to explore ways to turn his ideas into an actual curriculum. We decided to build the course around models and modeling and, funded by an NSF grant, developed the CATALST curriculum.


Our goal was to develop a course that focused on randomization methods and random sampling, taking away the traditional focus on the two-sample t-test. The CATALST course went through many iterations and had input from a team of great collaborators, including courageous graduate students who taught early versions of this radically different course. Our guiding principle was to teach students to really cook, rather than follow recipes. The cooking method uses randomization and repeated sampling methods to make statistical inferences. Even though there were many challenges, we feel that we developed a course that engages students and stimulates them to think, build and test models, and understand the core ideas of statistical inference.


Almost all of the appeal in teaching simulation-based methods is in the potential for student learning. The visual nature of many of the software packages used to implement these methods seems to really help students understand ideas related to repeated sampling and sampling variability, which are often difficult, theoretical ideas for many students. These methods also highlight aspects of the inferential process that are generally opaque in a more conventional course. For example, because students must build a model before they can simulate from it, it is more transparent that the p-value (or anything else we compute from the simulation) is based on a particular model (e.g., the null hypothesis). Similarly, model validity is at the forefront. Students in simulation-based courses seem quicker to recognize not only that the model they built is an incomplete snapshot of reality, but also how it might be flawed, an essential skill for anyone wanting to make decisions based on data. Lastly, students in simulation-based courses have the advantage of being able to “re-run” their simulations and get different results. This is invaluable for learning statistics. Ideas of uncertainty are prevalent and cannot be avoided in these courses. The question “How much variation is there in the results if I run the simulation over and over?” arises naturally in simulation-based courses and is the crux of what teaching statistics is all about.
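To make the "build a model, then simulate from it" idea concrete, here is a minimal sketch of a randomization test in Python. It is not taken from the CATALST materials: the function name, the seed parameter, and the two sets of scores are all made up for illustration. The null model here assumes that group labels are arbitrary, so shuffling the pooled values and re-splitting them shows what differences could arise by chance alone.

```python
import random
from statistics import mean


def randomization_p_value(group_a, group_b, reps=5000, seed=1):
    """Estimate a one-sided p-value for mean(group_a) - mean(group_b).

    Null model (an assumption of this sketch): the group labels are
    arbitrary, so any re-split of the pooled values is equally likely.
    """
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    as_extreme = 0
    for _ in range(reps):
        rng.shuffle(pooled)  # simulate one re-assignment under the null model
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if diff >= observed:
            as_extreme += 1
    return observed, as_extreme / reps


# Hypothetical scores, invented purely for illustration.
treatment = [24, 27, 31, 29, 35, 30, 28, 33]
control = [22, 25, 21, 27, 26, 23, 28, 24]

observed, p_value = randomization_p_value(treatment, control)
print(f"observed difference = {observed:.3f}, simulated p-value = {p_value:.4f}")
```

Changing the `seed` argument re-runs the simulation and gives a slightly different p-value, which is one way to see the run-to-run variation described above.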


Pedagogical Implications: Simulation-based methods have improved how we evaluate students’ understanding of the material. Instead of testing whether they can plug and chug with an equation (math skills) and interpret the result (memorization), we can write assessment questions that target students’ understanding of the concept.

Emphasis on Conceptual Understanding, Not Computation: Students seem to have a better understanding of sampling distributions and their importance in statistics. The simulated results are more tangible and intuitive than the theoretical normal curves we used in traditional courses. There are fewer formulas and emphasis is placed on concepts.

This is key for students who are math phobic. They can still grasp the concepts in statistics without getting lost in the formulas. At the same time, students can make sense of formulas after experiencing computer simulations. For example, when students see two bootstrap distributions (e.g., one with n = 20 and the other with n = 100), they notice that the n = 20 distribution has more variability. When they see the formula SE = s/sqrt(n), they can remember that as n increases, the SE decreases, not just because of the formula but because they have seen it in the simulated distributions. Students seem to have a better understanding of the different purposes of random sampling and random assignment. Students can visualize the sampling variability by looking at a bootstrap distribution, rather than simply calculating the standard error. As the Locks have pointed out, students can see that even though formulas can take different forms for different types of situations, the overall conceptual framework is the same. They see the “big picture” of inference rather than seeing lots of disconnected statistical procedures.
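The n = 20 versus n = 100 comparison can be sketched in a few lines of Python. This is only an illustrative sketch, not the software used in the course: the samples are drawn from a hypothetical normal population (mean 50, standard deviation 10), and the function name and seeds are assumptions made for the example. It compares the spread of each bootstrap distribution of the mean with the formula SE = s/sqrt(n).

```python
import random
from statistics import fmean, stdev


def bootstrap_se(sample, reps=2000, seed=1):
    """Standard deviation of bootstrap means: resample with replacement,
    keeping the resample size equal to the original sample size."""
    rng = random.Random(seed)
    n = len(sample)
    boot_means = [fmean(rng.choices(sample, k=n)) for _ in range(reps)]
    return stdev(boot_means)


# Hypothetical data: two samples from the same made-up population.
data_rng = random.Random(42)
small = [data_rng.gauss(50, 10) for _ in range(20)]
large = [data_rng.gauss(50, 10) for _ in range(100)]

for label, sample in (("n = 20", small), ("n = 100", large)):
    se_boot = bootstrap_se(sample)
    se_formula = stdev(sample) / len(sample) ** 0.5
    print(f"{label}: bootstrap SE = {se_boot:.2f}, s/sqrt(n) = {se_formula:.2f}")
```

The bootstrap distribution for the smaller sample should come out visibly wider, and each bootstrap SE should land close to the value the formula gives for the same sample.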

Some Topics are Understood More Easily: Difficult phenomena (e.g., stopping criteria, “two of a kind” type problems) can be simulated and explored without heavy theoretical background or computational formulas.


And this can be done within the first couple of weeks of the course. Teaching with simulation generates conversations about the assumptions that go into the model (e.g., what did we have to assume in order to construct it?). We never had those conversations when teaching the traditional curriculum. The ideas of hypothesis testing and statistical inference can be understood, not just memorized. For example, rather than memorizing “if the p-value is less than .05, reject the null hypothesis,” students can understand why we reject the null when the p-value is low. Finally, the conditions and assumptions required for simulation-based inference are less stringent than those for traditional methods, which makes the methods more generalizable.
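As one illustration of a stopping-criterion problem of the kind mentioned above, the sketch below simulates rolling a fair die until every face has appeared at least once and records how many rolls that takes. The specific problem, the function names, and the seed are hypothetical choices for this example, not an activity from the course. No formula is needed: the stopping rule is written directly into the loop.

```python
import random
from statistics import fmean


def rolls_until_all_faces(rng, faces=6):
    """Roll a fair die until every face has been seen; return the roll count."""
    seen = set()
    rolls = 0
    while len(seen) < faces:  # the stopping criterion
        seen.add(rng.randint(1, faces))
        rolls += 1
    return rolls


def simulate(reps=5000, seed=1):
    """Repeat the experiment many times and collect the roll counts."""
    rng = random.Random(seed)
    return [rolls_until_all_faces(rng) for _ in range(reps)]


results = simulate()
print(f"average rolls needed: {fmean(results):.2f}")
print(f"fewest: {min(results)}, most: {max(results)}")
```

For this particular problem the theoretical expected value is 6(1 + 1/2 + ... + 1/6) = 14.7 rolls, so the simulated average can be checked against it, but students can explore the whole distribution of outcomes without ever deriving that result.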

Disadvantages of the Randomization Curriculum: There are some accessibility issues. Because the randomization curriculum is highly visual, students with visual impairments find the course very challenging (I have a blind student this semester!). Students in our simulation-only class do not gain familiarity with traditional methods of statistics (e.g., t-tests or z-tests). And access to technology can be an issue, for example when many students do not have laptops.

A Preference for Simulation Methods: In our experience, students tend to choose randomization when they have the choice between the two methods! Even when given the option to use either traditional methods or simulation methods, our students, especially those who are typically math phobic, prefer to visualize the distribution to obtain their results rather than rely on a formula.
