Hello SBI listserv participants and SBI blog readers,
Hope you are enjoying your Saturday morning!
First, thank you for your discussions on and contributions to the listserv - it is great to hear about all the things that statistics teachers are doing in their classes!
Second, several new articles have recently been posted on the Simulation-based Inference blog (https://www.causeweb.org/sbi/):
1) We have two new posts on "How to use real data" by Kevin Ross and Nathan Tintle.
2) Erin Blankenship, Karen McGaughey, and Kathryn Dobeck have written about their experiences and what they thought was "The hardest thing about getting started with simulation-based curricula."
3) For readers interested in "How to implement simulation-based methods in high school classrooms/AP Statistics classes" - we have articles from Bob Peterson, Catherine Case, and Josh Tabor, all AP Statistics teachers, writing about their experiences.
On behalf of the ISI team, I'd like to thank all our blog contributors for writing these pieces for us.
I hope you enjoy reading these articles, and others posted on the blog, as much as I do!
Have a nice weekend!
- Soma
-----------------------
Soma Roy
Associate Professor
Statistics
California Polytechnic State University
San Luis Obispo CA 93407
Phone no.: (805)-756-5250
"… for whenever you learn something new, the whole world becomes that much richer." - Norton Juster, The Phantom Tollbooth
I believe that Scott Rifkin has it exactly right with the bootstrap. I have used the approach that he described with my students in a first college course in Statistical Science. In the last week of the course just completed, my students worked in teams of three using resampling and bootstrapping. The focus, of course, is on understanding the variability of an estimate derived from a sample.
I believe that, at a basic level, students can understand and appreciate the bootstrap. Of course there are subtleties, but then that's true in most real statistical problems! For those who want to learn more about the subtleties and the performance of various procedures that grow out of bootstrapping, see the excellent recent paper, written for teachers, by Tim Hesterberg. I learned a lot from it, and my students verified some of the points that Tim makes for bootstrap confidence intervals.
John Emerson
Middlebury, VT
Date: Sat, 16 May 2015 09:56:55 -0700
From: Scott Rifkin <sarifkin(a)ucsd.edu>
To: sbi(a)causeweb.org
Subject: [SBI] How to estimate parameters: To bootstrap or not?
Message-ID: <555776D7.7020209(a)ucsd.edu>
Content-Type: text/plain; charset=utf-8; format=flowed
My approach to the conceptual hurdle of using a single sample to mimic the population runs something along the following lines:
First get them comfortable with the idea that a statistic measured on the sample (usually we talk about mean) is our best estimate of the corresponding population parameter. The goal is to make the link between sample properties and population properties.
Then present the problem:
Problem: This sample statistic can move around depending on the sample.
How variable is it? How big is this sample variability? Since the population parameter is fixed, we want to get an idea of how close our 'best estimate' is.
Ideal solution: keep taking samples from the population and get a collection of statistics. Then we'd know because we'd have a sampling distribution (a distribution of sample statistics).
Problem: This is impractical. Usually we can only afford to take one sample.
Question: is there a way we can mimic/simulate this ideal solution?
Problem: We don't know what the population actually looks like.
But we do know something about it. In fact, everything we know about it is encapsulated in the sample. The sample is our best estimate of the population in a particular way: we expect the population to be much bigger in size, but the frequencies of each value in the sample are representative (in a statistical way) of the frequencies of each value in the actual population. And we already have it in hand.
So instead of doing the impractical - continually using scarce resources to sample from the population and calculate our statistic on each sample
- we use what we already have and sample from our best estimate of the population - our sample itself - and calculate our statistic from each bootstrap sample.
For my students, the key is when they understand that the sample itself plays the role of an estimate of the population. And that we use bootstrapping to study the variability (not the location) of our statistic of interest.
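The steps above can be made concrete with a short simulation. Here is a minimal Python sketch of the bootstrap as described; the data values are made up purely for illustration:

```python
import random
import statistics

random.seed(1)

# A single observed sample (made-up data, standing in for the one
# sample we can afford to collect).
sample = [4.2, 5.1, 3.8, 6.0, 5.5, 4.9, 5.2, 4.4, 5.8, 4.7]

# Resample from the sample itself (with replacement) many times,
# recording the statistic of interest each time.
boot_means = [
    statistics.mean(random.choices(sample, k=len(sample)))
    for _ in range(10_000)
]

# The spread of the bootstrap statistics estimates the variability
# (not the location) of the sample mean.
boot_se = statistics.stdev(boot_means)
print(statistics.mean(sample), round(boot_se, 3))
```

Note that the center of the bootstrap distribution is simply the original sample mean; it is the standard deviation of the bootstrap means that carries the useful information.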
- Scott Rifkin
------------
EBE, Division of Biological Sciences
UCSD
>
> Hello All,
>
> As a person who spends most of her summer working with high school
> teachers on stats and probability content and creating lesson plans,
> which are used in the next school year, I've followed this discussion
> eagerly.
>
> High school teachers are relatively easily convinced that a large
> enough, random sample is usually representative of the population.
> Convincing teachers that one of these samples could be used to mimic
> the entire population and then be utilized to generate more random
> samples is quite a different thing. I am convinced of the
> bootstrapping process, but to leap there immediately with teachers
> versus the more cumbersome routes discussed in this chain of responses
> might cause serious distress.
>
> Are there resources to help educate high school teachers (and myself
> further) in regard to bootstrapping? Research and experience shows
> that teachers will either omit or superficially enact content that
> they feel is beyond their current knowledge base.
>
> Simulation, in general, has been daunting for high school teachers.
> Of 23 we worked with last summer, only 25% took the plunge with
> re-randomization. However, the ones that did thoroughly enjoyed the
> experience, as did their students.
>
> Best,
>
> Maryann
>
> ----------------------------------------
>
> Maryann E. Huey
>
> Mathematics and Computer Science
>
> Drake University
>
> 515/271-2839
>
------------------------------
Message: 2
Date: Sat, 16 May 2015 19:27:17 -0400
From: Daren Starnes <dstarnes(a)lawrenceville.org>
To: Simulation-Based Inference <sbi(a)causeweb.org>
Subject: Re: [SBI] How to estimate parameters: To bootstrap or not?
Message-ID:
<CAMo0yhrLr_9y6JKxO6s9cv9jAM-T-pGVH0Kw3LmOT5c9bihHkg(a)mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Hi, Maryann. From my own work with high school teachers, I have found that the best entry point for simulation-based inference is to introduce them to two cases that are pretty accessible:
1. Using simulation to test a claim about a population proportion based on a random sample from that population. Just simulate many, many samples of that size under the assumption that the claim is true and record the value of the sample proportion for each one in a dotplot. Then look where the observed result falls in the simulated sampling distribution, and ask whether the sample result is sufficiently surprising (far out in the tails of the distribution) to provide convincing evidence against the claim.
Ideally, we'd have learners do this with a spinner or some other physical device first before proceeding to technology, which would necessitate using a fairly small sample size for practical reasons.
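Case 1 can be sketched in a few lines of code once learners are ready to move from a physical device to technology. The claimed proportion, sample size, and observed result below are hypothetical, chosen only to illustrate the logic:

```python
import random

random.seed(1)

claimed_p = 0.5   # the claimed population proportion (hypothetical)
n = 50            # sample size (hypothetical)
observed = 0.64   # observed sample proportion, e.g. 32 successes in 50

# Simulate many, many samples of size n under the assumption that the
# claim is true, recording the sample proportion each time.
sim_props = []
for _ in range(10_000):
    successes = sum(random.random() < claimed_p for _ in range(n))
    sim_props.append(successes / n)

# How often does chance alone produce a result at least as extreme as
# the one observed?  (One-sided version, for simplicity.)
p_value = sum(p >= observed for p in sim_props) / len(sim_props)
print(p_value)
```

A dotplot of `sim_props` with the observed value marked makes the "is this surprising?" question visual for students.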
2. Using simulation to determine whether the difference between two proportions is statistically significant in a randomized experiment.
Assume that there is no difference in the effects of the two treatments on the subjects in the study (null hypothesis). Simulate re-doing the random assignment of subjects to treatments many, many times, keeping each subject's response (success or failure) the same as it was in the original experiment. Each time, record the difference in proportions of successes for the two groups on a dotplot. Then look where the observed result falls in the simulated randomization distribution, and ask whether the observed difference in proportions is sufficiently surprising (far out in the tails of the distribution) to provide convincing evidence against the null hypothesis. Ideally, we'd have learners do this by shuffling and dealing cards or some other physical device first before proceeding to technology, which would necessitate using fairly small group sizes for practical reasons.
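Case 2 follows the same pattern, with shuffling standing in for re-doing the random assignment. The group sizes and success counts below are hypothetical:

```python
import random

random.seed(1)

# Hypothetical experiment: treatment group has 10/20 successes,
# control group has 4/20 successes.
responses = [1] * 10 + [0] * 10 + [1] * 4 + [0] * 16  # each subject's fixed outcome
n_treatment = 20

observed_diff = 10 / 20 - 4 / 20  # 0.30

# Re-do the random assignment many, many times, keeping each subject's
# response the same, and record the difference in success proportions.
diffs = []
for _ in range(10_000):
    random.shuffle(responses)           # the "card shuffle" step
    treat = responses[:n_treatment]
    ctrl = responses[n_treatment:]
    diffs.append(sum(treat) / len(treat) - sum(ctrl) / len(ctrl))

# One-sided p-value: how often does random assignment alone produce a
# difference at least as large as the observed one?
p_value = sum(d >= observed_diff for d in diffs) / len(diffs)
print(p_value)
```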
There are great resources available from several members of this list that could be used as the basis for these two distinct activities that would introduce teachers to the different scope of inference for random sampling and randomized experiments.
Daren Starnes
Beth, Robin, et al.
While the method of finding plausible values (i.e. a confidence interval)
for a single proportion by testing many different null hypothesis values is
inefficient, I personally find it valuable, at least as a starting point,
because (a) It reinforces the idea of how to do tests of significance and
(b) It reinforces the language of 'null is plausible' vs. the common
student mistake of 'null is true' when the p-value is large. While I don't
spend a lot of time with this approach (as others have mentioned, it is
time-consuming from the students' perspective and limited in terms of cases
where it is applicable), it seems to act as a nice 'bridge' to other techniques,
like estimating the SE from the simulated null distribution and taking 2*SE
as a rough 95% CI, and/or theory-based approaches, without needing to
introduce another type of simulation (e.g., bootstrap).
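The plausible-values approach described above can be sketched in Python. The observed proportion and sample size are hypothetical, and the grid/repetition counts are kept small for classroom speed:

```python
import random

random.seed(1)

n = 50           # sample size (hypothetical)
observed = 0.64  # observed sample proportion (hypothetical)

def two_sided_p_value(null_p, reps=1000):
    """Simulate the null distribution of the sample proportion for a
    given null value and return a two-sided p-value for the observed
    result."""
    count = 0
    for _ in range(reps):
        p_hat = sum(random.random() < null_p for _ in range(n)) / n
        if abs(p_hat - null_p) >= abs(observed - null_p):
            count += 1
    return count / reps

# Test a grid of null values; the ones NOT rejected at the 5% level
# form the interval of plausible values for the parameter.
grid = [round(0.01 * k, 2) for k in range(1, 100)]
plausible = [p0 for p0 in grid if two_sided_p_value(p0) > 0.05]
print(min(plausible), max(plausible))
```

Estimating the SE from just one of these simulated null distributions and taking estimate +/- 2*SE gives roughly the same interval with far less computation, which is the bridge mentioned above.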
Nathan
On Tue, May 5, 2015 at 6:01 PM, Beth Chance <bchance(a)calpoly.edu> wrote:
> Hi,
>
>
>
> I of course have to argue with Robin :-) But not on all points
>
>
>
> In the one sample, quantitative variable case, instead of bootstrapping, I
> have students sample from a made-up population. So this is still a bit ad
> hoc, but I think it helps them better see the sampling-from-population
> connection we are emphasizing at this point in the course.
>
>
>
> With proportions, I agree that you have to decide whether you want to use
> a hypothesized value or the sample proportion to estimate the SD. In my
> class I want students to think about both methods, partly to see it often
> doesn’t make a difference, especially with a large sample size, which I
> assume is what the CCSSM will focus on. And of course, “traditional”
> methods make the same “arbitrary” decision – use hypothesized if you have
> one, use sample if you don’t. We do have students try lots of different
> null values the first time we are creating a confidence interval of
> plausible values (and that’s when we introduce the idea of level of
> significance), but we have added a feature to the technology to make this a
> little more efficient once they get the idea (think slider).
>
>
>
> My hope, though I don’t have a lot of data and what I do have isn’t
> “great,” is that students will be better able to focus on the interval
> being for the *parameter* rather than the common misconception that it’s an
> interval for sample proportions that I worry bootstrapping might
> reinforce. Basically I want students to think about a confidence interval
> as estimate +- 2SD, which they seem to get pretty easily, and then we can
> worry about the details of how to estimate the (right) SD in different
> cases/use technology. This is what we carry over to other statistics later
> in the course. I think CCSSM is focused on having students understand
> sampling variability and the idea of margin-of-error, of the proportion
> being close to the parameter and that the “plus or minus part” depends on
> sample size. Lots of good ways to get those ideas across. Like them, I've
> been starting with proportions; I guess they thought the mean would be too
> tough, as they didn't want to have students get into bootstrapping.
>
>
>
> There is more discussion on exactly this issue on the SBI blog:
> https://www.causeweb.org/sbi/?cat=14
>
>
>
> Beth
>
>
>
>
>
> *From:* sbi-bounces(a)causeweb.org [mailto:sbi-bounces@causeweb.org] *On
> Behalf Of *Robin Lock
> *Sent:* Tuesday, May 05, 2015 1:17 PM
> *To:* Simulation-Based Inference
> *Subject:* Re: [SBI] How to estimate parameters: To bootstrap or not?
>
>
>
> Daren -
> I'm a firm believer in using the bootstrap as the way to get a margin
> of error via simulation. The ideal way would be to form a sampling
> distribution but, in the real world, taking 1000's of new samples from the
> actual population is not a feasible way to assess the accuracy of the one
> estimate you have from an original sample!
>
> I think that the logic of hypothesis testing is already tricky for
> students to get a handle on, to have intervals depend on inverting that
> logic seems even trickier. The example you provided in the CCSS
> Progressions document is even more confusing. Here's what I gather is its
> "logic":
>
> Start with a sample of size 50 with a sample proportion of 0.40. You
> want to estimate its "margin of error".
> 1. Suppose the population proportion is really p=0.50. Simulate a
> sampling distribution using that p.
> 2. Observe that the sample phat=0.40 is not far in the tail of that
> distribution, so 0.50 is a "plausible" value for the population
> proportion. - Not bad so far.
> 3. Estimate the standard error by finding the std. dev. of the sample
> proportions in the distribution generated around p=0.50 (SE=0.07).
> 4. Use 0.4 +/- 2*(0.07) = 0.4 +/- 0.14 = 0.26 to 0.54 to get a CI of
> similar "plausible values."
>
> Of course the SE for p=0.5 and p=0.4 are not a lot different, but I don't
> see the logic of picking some random "other" proportion, when you can do
> the simulation just as well around p=0.40 (which is what the bootstrap would
> do in the first place!). I wonder what advice the document would give
> for finding a CI when phat=0.12?
>
> I think it is possible to do an interval more coherently by doing lots of
> tests for lots of null parameters and seeing which would be rejected for
> the sample data, but
> (a) That sort of guess/check process is not very efficient.
> (b) I'd like to downplay the hard 5% reject Ho decision, and would
> rather have a test p-value be interpreted as "strength of evidence"
> (c) Creating the simulations to test lots of nulls is more problematic
> (especially via simulation) for other parameter situations like a
> difference in proportions, difference in means, or correlation.
>
> The bootstrap procedure is pretty straightforward: take lots of samples
> (with replacement) from the original sample, calculate the statistics of
> interest, estimate SE as the std. dev. of all those bootstrap statistics.
> A rough margin of error as 2*SE is easy to find and the same process works
> for lots of different parameters.
>
> Robin
>
> On 5/3/2015 4:03 PM, Daren Starnes wrote:
>
> Happy May, everyone. There is an interesting thread on the AP Statistics
> Teacher Community about two distinct views on estimating parameters via
> simulation. This came up because the Common Core State Standards includes
> this Statistics and Probability standard
>
>
>
> S-IC.4 Use data from a sample survey to estimate a population mean or
> proportion; develop a margin of error through the use of simulation models
> for random sampling.
>
>
>
> View #1: Use bootstrapping.
>
>
>
> View #2: Determine whether "nearby" values of the parameter are plausible
> by simulating a "null distribution" with that parameter value and seeing if
> the observed statistic is a believable outcome from such a null
> distribution. Keep doing this for other nearby values until you have an
> interval of plausible values for the parameter.
>
>
>
> The attached CCSS Progression document seems to suggest View #2, at least
> as far as estimating a proportion is concerned. There is no discussion of
> how to estimate the margin of error for a mean in this way (I wonder why!).
>
>
>
> This seems like an issue that this experienced group of SBI folks would
> already have grappled with--both philosophically and pedagogically. So I
> thought I would ask what the prevailing wisdom is.
>
>
>
> Daren Starnes
>
>
>
>
>
> _______________________________________________
>
> SBI mailing list
>
> SBI(a)causeweb.org
>
> https://www.causeweb.org/mailman/listinfo/sbi
>
>
>
> --
>
> Robin Lock
>
> Burry Professor of Statistics
>
> St. Lawrence University
>
>
--
Nathan Tintle, Ph.D.
Associate Professor of Statistics and Dept. Chair
Director for Research and Scholarship
Dordt College
Sioux Center, IA 51250
nathan.tintle(a)dordt.edu
Phone: (712) 722-6264
Office: SB1612