Also - bootstrap tilting generalizes to other situations, including difference
in proportions, difference in means, or correlation.
Tim Hesterberg
(resampling, water bottle rockets, computers to Costa Rica, hot shower =
2650 light bulbs, ...)
What Teachers Should Know about the Bootstrap: Resampling in the
Undergraduate Statistics Curriculum
On Sat, May 9, 2015 at 12:00 AM, Tim Hesterberg <timhesterberg(a)gmail.com>
wrote:
1. <<Which bootstrap>>
2. <<Which method of simulating a null distribution>>
Bootstrap tilting corresponds to reweighting the observed data to satisfy
a null hypothesis, then sampling from the weighted distribution. You can
use this to obtain confidence intervals by inverting the hypothesis test.
There is a computationally efficient way to do this, in which you generate
bootstrap samples only once, then use importance-sampling reweighting to do
the test for different values of theta - you then numerically solve for the
value of theta that gives a one-sided P-value of 2.5%. This procedure is
second-order accurate (one-sided errors O(1/n)). In contrast, a bootstrap
percentile interval, or reverse bootstrap percentile interval, is only
first-order accurate (one-sided errors O(1/sqrt(n))). I think the method has
good pedagogical potential in a Math Stats course, but is too complicated
for Intro Stats.
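Here is a minimal, numpy-only sketch of the tilting-plus-importance-sampling idea for a one-sample mean. The data are made up, and the function names (`bisect`, `tilt_weights`, `one_sided_p`), the exponential form of the tilting, and all numeric choices are my own illustrative assumptions, not something specified in the thread:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(10, 2, size=40)        # made-up data, not from the thread
n, xbar = x.size, x.mean()
B = 5000

# Generate ordinary bootstrap resamples ONCE, keeping the resampling counts
# so importance-sampling weights can be computed later for any tilted null.
idx = rng.integers(0, n, size=(B, n))
boot_means = x[idx].mean(axis=1)
counts = np.array([np.bincount(row, minlength=n) for row in idx])

def bisect(f, lo, hi, tol=1e-10):
    # Simple bisection root-finder (f(lo) and f(hi) must differ in sign).
    flo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (f(mid) > 0) == (flo > 0):
            lo, flo = mid, f(mid)
        else:
            hi = mid
    return 0.5 * (lo + hi)

def tilt_weights(theta):
    # Exponential tilting: w_i proportional to exp(t*x_i), with t chosen so
    # the weighted mean of the observed data equals the null value theta.
    def gap(t):
        w = np.exp(t * (x - xbar))
        return (w / w.sum()) @ x - theta
    t = bisect(gap, -5.0, 5.0)
    w = np.exp(t * (x - xbar))
    return w / w.sum()

def one_sided_p(theta):
    # Likelihood ratio of each stored resample under the tilted weights,
    # relative to uniform resampling: prod_i (n * w_i) ** count_i.
    w = tilt_weights(theta)
    lr = np.exp(counts @ np.log(n * w))
    # Estimated P(bootstrap mean >= observed mean) under the tilted null.
    return (lr * (boot_means >= xbar)).sum() / lr.sum()

# Lower confidence limit: numerically solve for the theta whose one-sided
# P-value is 2.5% (the upper limit would use <= and the other tail).
sem = x.std(ddof=1) / np.sqrt(n)
lower = bisect(lambda th: one_sided_p(th) - 0.025, xbar - 4 * sem, xbar)
```

The point of storing `counts` is the efficiency Tim describes: the B resamples are drawn once, and every candidate theta reuses them through the likelihood ratios instead of triggering a fresh simulation.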
Tim Hesterberg
http://www.timhesterberg.net
http://arxiv.org/abs/1411.5279
On Tue, May 5, 2015 at 4:01 PM, Beth Chance <bchance(a)calpoly.edu> wrote:
Hi,
I of course have to argue with Robin :) But not on all points.
In the one sample, quantitative variable case, instead of bootstrapping,
I have students sample from a made-up population. So this is still a bit
ad hoc, but I think it helps them better see the sampling-from-a-population
connection we are emphasizing at this point in the course.
With proportions, I agree that you have to decide whether you want to use
a hypothesized value or the sample proportion to estimate the SD. In my
class I want students to think about both methods, partly to see it often
doesn’t make a difference, especially with a large sample size, which I
assume is what the CCSSM will focus on. And of course, “traditional”
methods make the same “arbitrary” decision – use hypothesized if you have
one, use sample if you don’t. We do have students try lots of different
null values the first time we are creating a confidence interval of
plausible values (and that’s when we introduce the idea of level of
significance), but we have added a feature to the technology to make this a
little more efficient once they get the idea (think slider).
My hope, though I don’t have a lot of data and what I do have isn’t
“great,” is that students will be better able to focus on the interval
being for the *parameter* rather than the common misconception that it's an
interval for sample proportions, a misconception I worry bootstrapping might
reinforce. Basically I want students to think about a confidence interval
as estimate +- 2SD, which they seem to get pretty easily, and then we can
worry about the details of how to estimate the (right) SD in different
cases/use technology. This is what we carry over to other statistics later
in the course. I think CCSSM is focused on having students understand
sampling variability and the idea of margin-of-error, of the proportion
being close to the parameter and that the “plus or minus part” depends on
sample size. Lots of good ways to get those ideas across. Like them, I've
been starting with proportions - I guess they thought the mean would be too
tough, as they didn't want to have students get into bootstrapping.
There is more discussion on exactly this issue on the SBI blog:
https://www.causeweb.org/sbi/?cat=14
Beth
*From:* sbi-bounces(a)causeweb.org [mailto:sbi-bounces@causeweb.org] *On
Behalf Of *Robin Lock
*Sent:* Tuesday, May 05, 2015 1:17 PM
*To:* Simulation-Based Inference
*Subject:* Re: [SBI] How to estimate parameters: To bootstrap or not?
Daren -
I'm a firm believer in using the bootstrap as the way to get a margin
of error via simulation. The ideal way would be to form a sampling
distribution, but in the real world, taking thousands of new samples from the
actual population is not a feasible way to assess the accuracy of the one
estimate you have from an original sample!
I think that the logic of hypothesis testing is already tricky for
students to get a handle on; having intervals depend on inverting that
logic seems even trickier. The example you provided in the CCSS
Progressions document is even more confusing. Here's what I gather is its
"logic":
Start with a sample of size 50 with a sample proportion of 0.40. You
want to estimate its "margin of error".
1. Suppose the population proportion is really p=0.50. Simulate a
sampling distribution using that p.
2. Observe that the sample phat=0.40 is not far in the tail of that
distribution, so 0.50 is a "plausible" value for the population
proportion. - Not bad so far.
3. Estimate the standard error by finding the std. dev. of the sample
proportions in the distribution generated around p=0.50 (SE=0.07).
4. Use 0.4 +/- 2*(0.07) = 0.4 +/- 0.14 = 0.26 to 0.54 to get a CI of
similar "plausible values."
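The four steps above can be sketched in a few lines. The numbers are the thread's n=50, phat=0.40, p=0.50 example; the simulation size and everything else are my own assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
n, phat = 50, 0.40

# Step 1: simulate a sampling distribution of the sample proportion under p = 0.50.
sims = rng.binomial(n, 0.50, size=10_000) / n

# Step 2 (checking that phat = 0.40 is not far out in the tail of `sims`)
# is a visual judgment, omitted here.

# Step 3: the SE estimate is the std. dev. of the simulated proportions.
se = sims.std()                       # close to sqrt(0.5*0.5/50), about 0.07

# Step 4: center the "plausible values" interval at the observed phat.
ci = (phat - 2 * se, phat + 2 * se)   # roughly 0.26 to 0.54, as in the email
```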
Of course the SEs for p=0.5 and p=0.4 are not a lot different, but I don't
see the logic of picking some random "other" proportion when you can do
the simulation just as well around p=0.40 (which is what the bootstrap would
do in the first place!). I wonder what advice the document would give
for finding a CI when phat=0.12?
I think it is possible to do an interval more coherently by doing lots of
tests for lots of null parameters and seeing which would be rejected for
the sample data, but
(a) That sort of guess/check process is not very efficient.
(b) I'd like to downplay the hard 5% reject Ho decision, and would
rather have a test p-value be interpreted as "strength of evidence".
(c) Creating the simulations to test lots of nulls is more problematic
(especially via simulation) for other parameter situations like a
difference in proportions, difference in means, or correlation.
The bootstrap procedure is pretty straightforward: take lots of samples
(with replacement) from the original sample, calculate the statistics of
interest, estimate SE as the std. dev. of all those bootstrap statistics.
A rough margin of error as 2*SE is easy to find and the same process works
for lots of different parameters.
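That straightforward recipe, sketched for a one-sample mean on made-up data (the sample, seed, and number of resamples are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
sample = rng.normal(10, 2, size=40)   # made-up original sample

# Take lots of resamples, with replacement, from the original sample and
# compute the statistic of interest (here, the mean) for each one.
boot_stats = np.array([rng.choice(sample, size=sample.size, replace=True).mean()
                       for _ in range(5000)])

# Estimate SE as the std. dev. of the bootstrap statistics; rough ME = 2*SE.
se = boot_stats.std()
margin_of_error = 2 * se              # compare with 2*sample.std()/sqrt(40)
```

Swapping `.mean()` for a median, a correlation, or a difference of group means is the only change needed for other parameters, which is the uniformity Robin is pointing at.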
Robin
On 5/3/2015 4:03 PM, Daren Starnes wrote:
Happy May, everyone. There is an interesting thread on the AP Statistics
Teacher Community about two distinct views on estimating parameters via
simulation. This came up because the Common Core State Standards includes
this Statistics and Probability standard
S-IC.4 Use data from a sample survey to estimate a population mean or
proportion; develop a margin of error through the use of simulation models
for random sampling.
View #1: Use bootstrapping.
View #2: Determine whether "nearby" values of the parameter are plausible
by simulating a "null distribution" with that parameter value and seeing if
the observed statistic is a believable outcome from such a null
distribution. Keep doing this for other nearby values until you have an
interval of plausible values for the parameter.
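A sketch of View #2 for a single proportion, using the thread's n=50, phat=0.40 example. The grid spacing, replication count, 5% cutoff, and the two-sided "distance from the null" measure are my assumptions, not something the standard specifies:

```python
import numpy as np

rng = np.random.default_rng(2)
n, phat, reps = 50, 0.40, 5000

def two_sided_p(p0):
    # Simulate the null distribution of the sample proportion under p = p0,
    # then count how often it lands at least as far from p0 as phat did.
    sims = rng.binomial(n, p0, size=reps) / n
    return np.mean(np.abs(sims - p0) >= abs(phat - p0))

# Guess-and-check over nearby null values: keep every one not rejected at 5%.
grid = np.arange(0.01, 1.00, 0.01)
plausible = [p0 for p0 in grid if two_sided_p(p0) > 0.05]
interval = (min(plausible), max(plausible))
```

Each grid point requires its own simulated null distribution, which is the inefficiency (and, for two-sample parameters, the design difficulty) raised elsewhere in the thread.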
The attached CCSS Progression document seems to suggest View #2, at least
as far as estimating a proportion is concerned. There is no discussion of
how to estimate the margin of error for a mean in this way (I wonder why!).
This seems like an issue that this experienced group of SBI folks would
already have grappled with--both philosophically and pedagogically. So I
thought I would ask what the prevailing wisdom is.
Daren Starnes
_______________________________________________
SBI mailing list
SBI(a)causeweb.org
https://www.causeweb.org/mailman/listinfo/sbi
--
Robin Lock
Burry Professor of Statistics
St. Lawrence University