Chris Malone – Winona State University
A student’s success in a course is often determined by their desire and motivation to learn. Unfortunately, desire and motivation are often lacking in an introductory statistics course. I have learned some tricks over my years of teaching to enhance motivation: leverage students’ existing knowledge whenever possible, and require them to repeatedly consider the phrase “What would happen if … .”[pullquote]Modern technologies and the recent advances in the use of simulation-based methods in teaching introductory statistics have allowed students to easily consider a variety of “What would happen if …” scenarios. [/pullquote]
Consider the difficult topic of getting students to conceptualize the sampling distribution of a sample proportion. Modern technologies and the recent advances in the use of simulation-based methods in teaching introductory statistics have allowed students to easily consider a variety of “What would happen if …” scenarios. For example, students can quickly obtain a simulated sampling distribution under Ho: p = 0.50 and compare it against a simulated distribution under Ho: p = 0.90 to understand that the variation in these distributions depends on p. To be honest, I’ve come to accept the fact that very few of my introductory students will ever appreciate this as much as I do. Instead, their concern tends to center on getting their p-value below 0.05 so they can achieve the desirable outcome of rejecting Ho. But in my opinion, these “What would happen if …” investigations do indeed invoke curiosity and in turn motivate students to learn statistics.
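One way to run this kind of “What would happen if …” comparison is a quick simulation. The sketch below (in Python, with made-up sample size and repetition counts, not from the original post) draws many sample proportions under p = 0.50 and under p = 0.90 and compares their spread:

```python
import random
import statistics

def simulate_phats(p, n, reps, seed=1):
    """Simulate `reps` sample proportions, each from a sample of size n
    where every observation is a success with probability p."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(reps)]

n, reps = 100, 5000
phats_50 = simulate_phats(0.50, n, reps)
phats_90 = simulate_phats(0.90, n, reps)

# The spread depends on p: sd is about sqrt(p(1-p)/n), largest at p = 0.5
print(statistics.stdev(phats_50))  # near 0.05
print(statistics.stdev(phats_90))  # near 0.03
```

Plotting the two sets of simulated proportions side by side makes the difference in variability immediately visible to students.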
I should not advocate for the use of these “What would happen if …” investigations without communicating some of the adverse consequences. Consider again the investigation of the sampling distribution of a sample proportion under Ho. I find that students tend to gravitate toward the more natural situation of creating the sampling distribution using p̂ instead of generating the distribution under Ho. Obviously, in a testing situation the sampling distribution using p̂ is not considered, as we evaluate evidence against the null hypothesis. However, generating the sampling distribution using p̂ is completely valid in other situations, such as obtaining an estimate of the margin of error. Generating a sampling distribution using p̂ (or under any highly specified model) can be thought of as a parametric bootstrap approach, which differs from the typical non-parametric approach of taking several repeated samples with replacement from your data.
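The parametric/non-parametric distinction can be made concrete with a short simulation. This is a minimal sketch, not from the original post, using a hypothetical sample of measurements and a fitted normal distribution as the “highly specified model”; both approaches approximate the sampling variability of the sample mean:

```python
import random
import statistics

data = [2.1, 3.5, 2.8, 4.0, 3.3, 2.6, 3.9, 3.1]  # hypothetical sample
xbar, s = statistics.mean(data), statistics.stdev(data)
rng = random.Random(2)
reps, n = 2000, len(data)

# Non-parametric bootstrap: resample the observed data with replacement
np_means = [statistics.mean(rng.choices(data, k=n)) for _ in range(reps)]

# Parametric bootstrap: draw samples from a fitted model, here Normal(xbar, s)
par_means = [statistics.mean(rng.gauss(xbar, s) for _ in range(n))
             for _ in range(reps)]

# Both bootstrap standard errors approximate the standard error of the mean
print(statistics.stdev(np_means))
print(statistics.stdev(par_means))
```

For a 0/1 variable the fitted model is just “success with probability p̂,” so the two approaches coincide; for other statistics and models they can differ.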
Statistics is a discipline in a constant state of change. It is no wonder that instructors often feel overwhelmed when considering how best to teach statistics. Instructors should carefully consider their motives in using the bootstrap or other simulation-based approaches in teaching introductory statistics. Our motives and goals in using the bootstrap are often not matched by the students. [pullquote]Instructors should carefully consider their motives in using the bootstrap or other simulation-based approaches in teaching introductory statistics. Our motives and goals in using the bootstrap are often not matched by the students.[/pullquote]From our perspective, the purpose of generating a sampling distribution under Ho (to obtain a p-value) is very different from the purpose of generating a sampling distribution using p̂ (to obtain an estimate of the margin of error). From a student’s perspective, the “What would happen if …” approach to teaching, along with the bootstrap, provides a way to freely explore distributions without some pre-determined motive. The bootstrap does not make teaching easier; students still need to understand that the p-value is computed under Ho and that the margin of error is computed when the sampling distribution is obtained using the observed statistic. The bootstrap does, however, require less in the way of formulas and theoretical assumptions, which often needlessly complicate the teaching of introductory statistics.
I will be using the Lock5 book to teach at Cornell College for a couple of blocks this spring. I will probably have a more informed opinion after that. I am terribly excited about the pedagogical advantages of some of the simulations I’ve seen, but am a bit worried about how to present the logic of bootstrapping. This might just be a case of old dog, new tricks, but I am not sure I have an adequate understanding of the theory of bootstrapping.
Any recommendations for sources? I do bookstores and Amazon really well…
The Lock book explains bootstrapping clearly. You have a sample and that’s it. You can think of making a bootstrap distribution because you cannot make a sampling distribution without either knowing the entire population (in which case none of this would be necessary) or taking many, many samples of size n from the population. Think of a bootstrap distribution as a pseudo sampling distribution.
A sampling distribution shows us the distribution of our sample statistic. Then we look at it and see where our particular sample falls relative to all the possible samples, and infer.
A bootstrap distribution gives us the same info but is made only from our sample. It’s kind of cool. I am in my second semester with Lock5.
I think you’ll like it.
For an excellent account of current insights about the bootstrap, I recommend the recent article by Tim Hesterberg, “What Teachers Should Know about the Bootstrap: Resampling in the Undergraduate Statistics Curriculum.” (See http://arxiv.org/abs/1411.5279) It is a fascinating read!
I find it very difficult to teach interval estimation without bootstrapping in a course that emphasizes randomization/simulation-based approaches to hypothesis testing. I build the case for why these computational approaches are great for hypothesis testing, but then when I get to estimation I don’t have anything similar and have to rely on the same theory-based methods I’ve been trying to avoid.
Bootstrapping takes care of this problem and has many of the same advantages of intuitive understanding that randomization methods have for hypothesis testing, although it is a little more complicated. My general approach is that we want to simulate what would happen to our statistic if we took many samples from the actual population – what would that distribution of sample statistics look like? Well, we can’t do it because we only have the single sample we collected. So just as we use the sample statistic to estimate the population parameter, we use the sample itself as an estimate of the population. It’s our best estimate for what the population looks like. So we blow it up to infinite size and sample from it repeatedly.
The nice thing about this is that once they get the concept, the process is the same for any statistic. So it has a universal logic in the same way that the scrambling methodology does for multiple-variable hypothesis testing. The other nice thing about it is that it is a fundamentally important tool for many types of modern research. Bootstrapping lets me give due emphasis to estimation in a randomization-based curriculum.
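That universal logic fits in a few lines of code: one resampling routine, with any statistic plugged in. A minimal sketch (hypothetical data, not from the original comment):

```python
import random
import statistics

def bootstrap(data, stat, reps=3000, seed=3):
    """Bootstrap distribution of any statistic: resample with
    replacement from the sample, recompute the statistic each time."""
    rng = random.Random(seed)
    n = len(data)
    return [stat(rng.choices(data, k=n)) for _ in range(reps)]

data = [12, 15, 9, 22, 17, 14, 19, 11, 16, 13]  # hypothetical sample
boot_means = bootstrap(data, statistics.mean)
boot_medians = bootstrap(data, statistics.median)  # same logic, new statistic

print(statistics.mean(boot_means))  # centered near the sample mean of 14.8
```

Swapping in a correlation or a standard deviation changes only the `stat` argument, which is exactly the point about a single process covering many statistics.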
Chris, could you elaborate on the following statement a little more?
For p̂, they are the same thing, no? What do you have in mind for other highly specified models?
I agree with Scott’s points about some of the benefits of teaching interval estimation via bootstrapping. I think it’s easy to do early in a course: after students have done some descriptive statistics (mean, std. dev., proportion, correlation, etc.), a natural question to ask is “How accurate is that estimate?” Bootstrapping offers a way to address this question via simulation that requires little additional machinery (or assumptions about the form of the underlying distribution of the population).
Bootstrap sampling with replacement from the original sample is equivalent to simulating samples from a “population” that is just many copies of that original sample. In most cases that’s the best we can do using the data from a single sample if we aren’t making additional assumptions about the structure of the population. As long as students have appropriate, easy-to-use technology to create and summarize the bootstrap samples, the process is not hard to follow and, as Scott mentioned, the same process applies to lots of parameters/statistics.
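The “many copies of the original sample” idea is exactly resampling with replacement, and the accuracy question falls right out of the resulting bootstrap distribution. A minimal sketch (a hypothetical sample of 62 successes in 100 trials, not from the original comment):

```python
import random
import statistics

sample = [1] * 62 + [0] * 38  # hypothetical: 62 successes in n = 100
rng = random.Random(4)
reps, n = 5000, len(sample)

# Resample with replacement and recompute the sample proportion each time
boot_phats = sorted(sum(rng.choices(sample, k=n)) / n for _ in range(reps))

# Percentile interval: middle 95% of the bootstrap distribution
lo, hi = boot_phats[int(0.025 * reps)], boot_phats[int(0.975 * reps)]
se = statistics.stdev(boot_phats)

print(lo, hi)    # roughly 0.52 to 0.71
print(2 * se)    # a simple margin of error, near 0.10
```

Either summary, the percentile interval or twice the bootstrap standard error, answers “How accurate is that estimate?” without any distributional formulas.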
The following analogy may be helpful for understanding what’s going on with bootstrapping:
(A visual version may be easier to follow – check starting around slide 19 of the powerpoint at http://www.lock5stat.com/ppt/ICOTS2014RLock.pptx)
Think of a tree standing alone in a field. The trunk of the tree is the parameter we are trying to locate. The tree drops a lot of seeds (or acorns, or apples, …) that scatter around, most fairly close to the trunk but some farther away. Think of each seed as a sample statistic and this distribution of seeds as the standard sampling distribution of statistics for many samples drawn from the population (the original tree). In practice, we usually can’t see the results for lots of samples; we see just one of those many seeds (just our original sample). The key question is “How far away might the trunk of its tree (which we can no longer see) be?” What can we do with a single seed? Grow a new tree! Then let that tree (which we assume is genetically similar to the original tree) drop lots of seeds and see how far they tend to go away from its trunk (the statistic for the original sample). This new distribution of seeds represents the bootstrap distribution, and we are assuming that its variability is similar to the variability in the distribution of seeds around the original tree.