Estimation Principles

• Song: 99 Bottles of Beer

A song for discussing how a confidence interval for a population parameter will be biased if the sample is biased (e.g. starting with a random sample of n=100 but then having individuals drop out one at a time for a non-ignorable reason). The song was written in March 2019 by Lawrence Lesser, The University of Texas at El Paso, and Dennis Pearl, Penn State University, using the mid-20th century recursive folk song "99 Bottles of Beer." The idea for the song came from an article by Donald Byrd of Indiana University in the September 2010 issue of Math Horizons, where he suggested using the song for various learning objectives in mathematics education.
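The dropout effect the song describes can be illustrated with a short simulation. The sketch below is not part of the song resource. It assumes a hypothetical normal population (mean 50, standard deviation 10), a hypothetical dropout rule in which the individual with the largest remaining value leaves at each step, and a normal-approximation 95% interval.

```python
import random
import statistics

TRUE_MEAN = 50.0   # hypothetical population mean (assumption for illustration)
TRUE_SD = 10.0     # hypothetical population standard deviation (assumption)

def normal_ci(values, z=1.96):
    """Approximate 95% confidence interval for the mean (normal approximation)."""
    mean = statistics.mean(values)
    half_width = z * statistics.stdev(values) / len(values) ** 0.5
    return mean - half_width, mean + half_width

def dropout_intervals(n=100, n_drop=50, seed=2019):
    """Start with a random sample of size n, then drop the largest value n_drop times.

    Dropping the largest remaining value is a hypothetical non-ignorable
    mechanism: whether someone leaves depends on the value being measured.
    Returns a list of (remaining sample size, lower, upper) tuples.
    """
    rng = random.Random(seed)
    sample = sorted(rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(n))
    results = []
    for _ in range(n_drop + 1):
        lower, upper = normal_ci(sample)
        results.append((len(sample), lower, upper))
        sample.pop()  # the largest remaining value drops out
    return results

if __name__ == "__main__":
    # Show every tenth step, from n = 100 down to n = 50.
    for size, lower, upper in dropout_intervals()[::10]:
        covers = lower <= TRUE_MEAN <= upper
        print(f"n={size:3d}  CI=({lower:5.1f}, {upper:5.1f})  covers true mean: {covers}")
```

As dropouts accumulate, the interval gets narrower in this sketch while drifting away from the true mean, so a smaller sample is not the only problem.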

• Cartoon: Fortune Teller

A cartoon that can be used in a discussion of prediction – and the difference between the accuracy of a single prediction and quantifying the level of accuracy for a prediction method. The cartoon was used in the May 2019 CAUSE cartoon caption contest and the winning caption was written by Mickey Dunlap from the University of Georgia. The cartoon was drawn by British cartoonist John Landers (www.landers.co.uk) based on an idea by Dennis Pearl from Penn State University. A co-winning caption in the May 2019 contest was “I see you come from a long line of statisticians,” written by Douglas VanDerwerkenz from the U.S. Naval Academy. Doug's clever pun can be related to the multiple testing problem by talking about how a fortune teller will get some predictions right if they make a long line of them.

• Cartoon: Confidence Interval

A cartoon suitable for use in teaching about confidence intervals and the quality of estimates made by a model. The cartoon is number 2311 (May, 2020) from the webcomic series at xkcd.com created by Randall Munroe. Free to use in the classroom and on course web sites under a Creative Commons attribution-non-commercial 2.5 license.

• Cartoon: Error Bars

A cartoon suitable for use in teaching about the variability in estimates (including estimates of the variability of estimates). The cartoon is number 2110 (February, 2019) from the webcomic series at xkcd.com created by Randall Munroe. Free to use in the classroom and on course web sites under a Creative Commons attribution-non-commercial 2.5 license.

• Food: Capture-Recapture with Goldfish

Summary: This article describes the capture-recapture method of estimating the size of a population of fish in a pond and illustrates it with both a “hands-on” classroom activity using Pepperidge Farm Goldfish™ crackers and a computer simulation that investigates two different estimators of the population size. The activity was described in R. W. Johnson, “How many fish are in the pond?,” Teaching Statistics, 18 (1) (1996), 2-5.

https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1467-9639.1996.tb00882.x

Specifics: To illustrate the capture-recapture method in the classroom, two different varieties of Pepperidge Farm Goldfish™ crackers are used. The instructor places all of the Goldfish from a full bag of the original variety in a bowl to correspond to the initial state of the pond (the instructor should have previously counted the true number from the bag, which turned out to be 323 in the paper’s example). Students then capture c = 50 of these fish and replace them with 50 Goldfish of a flavored variety of a different color. In the paper’s example, after mixing the contents of the bowl, t = 6 ‘tagged’ fish (fish of the flavored variety) were found in a recaptured sample of size r = 41, giving the estimate cr/t ≈ 341. This estimate uses the maximum likelihood (ML) method. To examine the behavior of the MLE, the capture-recapture ML method is repeated 1000 times using a computer simulation. The distribution of the results is heavily skewed to the right, and the MLE is quite biased (in fact, since there is positive probability that t = 0, the MLE has an infinite expectation). The simulation is then redone using Seber’s bias-corrected estimate = [(c+1)(r+1)/(t+1)] - 1. After the instructor reveals the true population size, students see that the average of the 1000 new simulations with the bias-corrected version is indeed closer to the truth (and also that the new estimate has less variability).
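A simulation along these lines can be sketched in a few lines of Python. This is a minimal illustration, not the paper’s own program: the function names, the seed, and the choice to report the ML average over finite estimates only are assumptions made here, while the values 323, 50, 41 and 6 come from the example above.

```python
import random
import statistics

def recapture_tagged_count(pop_size, c, r, rng):
    """Draw r fish without replacement from a pond holding c tagged fish; return t."""
    pond = [1] * c + [0] * (pop_size - c)   # 1 = tagged, 0 = untagged
    return sum(rng.sample(pond, r))

def ml_estimate(c, r, t):
    """Maximum likelihood estimate cr/t (infinite when no tagged fish are recaptured)."""
    return float("inf") if t == 0 else c * r / t

def bias_corrected_estimate(c, r, t):
    """Bias-corrected estimate [(c+1)(r+1)/(t+1)] - 1, finite even when t = 0."""
    return (c + 1) * (r + 1) / (t + 1) - 1

def simulate(pop_size=323, c=50, r=41, reps=1000, seed=0):
    """Repeat the recapture step reps times; return both lists of estimates."""
    rng = random.Random(seed)
    counts = [recapture_tagged_count(pop_size, c, r, rng) for _ in range(reps)]
    ml = [ml_estimate(c, r, t) for t in counts]
    corrected = [bias_corrected_estimate(c, r, t) for t in counts]
    return ml, corrected

if __name__ == "__main__":
    # Classroom example from the text: c = 50, r = 41, t = 6.
    print("ML estimate:", round(ml_estimate(50, 41, 6), 1))
    print("Bias-corrected estimate:", bias_corrected_estimate(50, 41, 6))

    ml, corrected = simulate()
    finite_ml = [x for x in ml if x != float("inf")]
    print("Runs with t = 0:", len(ml) - len(finite_ml))
    print("Mean / SD of finite ML estimates:",
          round(statistics.mean(finite_ml), 1), round(statistics.stdev(finite_ml), 1))
    print("Mean / SD of bias-corrected estimates:",
          round(statistics.mean(corrected), 1), round(statistics.stdev(corrected), 1))
```

Any run with t = 0 makes the ML average infinite, which is why the sketch reports the ML summary over the finite estimates only.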

(Resource photo illustration by Barbara Cohen, 2020; this summary compiled by Bibek Aryal)

• Food: The Probability of a Kiss

Summary: Through generating, collecting, displaying, and analyzing data, students are given the opportunity to explore a variety of descriptive statistical techniques and develop an understanding of the distinction between theoretical, subjective, and empirical (or experimental) probabilities. These concepts are developed with activities using Hershey’s Kisses™ and may be extended to introduce the sampling distribution of a sample proportion. The activities are described in M. Richardson and S. Haller (2002), “What is the Probability of a Kiss? (It's Not What You Think),” Journal of Statistics Education, 10(3), https://www.tandfonline.com/doi/full/10.1080/10691898.2002.11910683

Specifics: The main activity uses Hershey’s Kisses to explore the concept of probability. Three specific sub-activities are performed:

1. Students explore the empirical probability that a plain Hershey’s Kiss will land on its flat base when spilled from a cup.
2. Students make predictions about the probability of an almond Hershey’s Kiss landing on its base when spilled from a cup, after having experimented with the plain Kisses.
3. Students explore the properties of the distribution of a sample proportion to see whether the percentages of base landings have a specified distribution and whether they think that the number of Kisses tossed affects the shape or the mean and standard deviation of this distribution.
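The third sub-activity can be previewed with a simulation before any candy is spilled. The sketch below is only an illustration: the base-landing probability of 0.35 is a made-up value (the real one is what students estimate empirically), and the function name, cup counts and seed are likewise assumptions.

```python
import random
import statistics

ASSUMED_P = 0.35  # hypothetical probability of a base landing (not from the article)

def sample_proportions(p, n_kisses, n_cups, seed=0):
    """Simulate n_cups spills of n_kisses Kisses; return the proportion of base landings per spill."""
    rng = random.Random(seed)
    return [
        sum(rng.random() < p for _ in range(n_kisses)) / n_kisses
        for _ in range(n_cups)
    ]

if __name__ == "__main__":
    for n in (10, 40, 160):
        props = sample_proportions(ASSUMED_P, n_kisses=n, n_cups=2000)
        theory_sd = (ASSUMED_P * (1 - ASSUMED_P) / n) ** 0.5
        print(f"n={n:3d}  mean={statistics.mean(props):.3f}  "
              f"sd={statistics.stdev(props):.3f}  sqrt(p(1-p)/n)={theory_sd:.3f}")
```

In this sketch the mean of the proportions stays near the assumed probability for every n, while the standard deviation shrinks roughly like sqrt(p(1-p)/n) as more Kisses are tossed.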

(Resource photo illustration by Barbara Cohen, 2020; this summary compiled by Bibek Aryal)

• Food: Capture-Recapture with Split Peas

Summary: A classroom activity using dried split peas exploring the reliability of a basic capture-mark-recapture method of population estimation is described using great whale conservation as a motivating example. The activity was described in C. du Feu, “Having a whale of a time,” Teaching Statistics, 31 (3) (2009), 66-71.

Specifics: The hands-on activity uses dried split peas, which allow much larger populations and offer two advantages. First, the split-pea populations are too large for any sensible student to contemplate counting the full population. Second, unlike Smarties™ or M&M’s™, dried split peas will not suffer loss through eating (so there is a fixed population size to be estimated). Beforehand, soak some split peas in colored food dye or simply buy both green peas and yellow peas. Students add exactly 50 of the differently colored peas to each population of unmarked split peas. We now have hundreds, if not thousands, of members in each population, of which 50 were ‘captured’ and marked in the first sampling event. The sampling can be done using a teaspoon of about 5 mL capacity, which gives samples of about 50 individuals. The numbers of marked and unmarked split peas in the spoon are counted and a population estimate is made. The peas are replaced, the population is mixed (stirring or shaking with the lid on) and the next sample is taken. This is repeated as long as required. Once there are sufficient estimates, the sampling can be drawn to a close and discussion of the estimates can take place.
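The repeated teaspoon sampling can be mimicked on a computer to show how much the individual estimates vary. The sketch below rests on assumptions made here for illustration: a population of 550 unmarked peas (in class this number is unknown), spoonfuls of between 40 and 60 peas, and the simple estimate (number marked in population) × (spoon size) / (number marked in spoon).

```python
import random
import statistics

def spoon_estimates(n_unmarked=550, n_marked=50, n_samples=20, seed=31):
    """Simulate repeated teaspoon samples, returning the peas after each one.

    n_unmarked is a hypothetical value chosen for illustration. Spoon sizes
    vary around 50 peas, as a real teaspoon would.
    Returns (estimates, number of spoonfuls containing no marked peas).
    """
    rng = random.Random(seed)
    population = [True] * n_marked + [False] * n_unmarked  # True = marked pea
    estimates, empty = [], 0
    for _ in range(n_samples):
        spoon_size = rng.randint(40, 60)
        spoon = rng.sample(population, spoon_size)  # no pea appears twice in one spoonful
        marked_in_spoon = sum(spoon)
        if marked_in_spoon == 0:
            empty += 1  # no estimate is possible from this spoonful
        else:
            estimates.append(n_marked * spoon_size / marked_in_spoon)
    return estimates, empty

if __name__ == "__main__":
    estimates, empty = spoon_estimates()
    print("Spoonfuls with no marked peas:", empty)
    if estimates:
        print("Smallest estimate:", round(min(estimates)))
        print("Median estimate:", round(statistics.median(estimates)))
        print("Largest estimate:", round(max(estimates)))
    print("True population size in this sketch:", 550 + 50)
```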

Supplementary materials include expository material on the motivating example and student worksheets.

(Resource photo illustration by Barbara Cohen, 2020; this summary compiled by Bibek Aryal)

• Food: Capture-Recapture with Smarties

A hands-on activity using the capture-recapture method to estimate the number of Smarties™ candy pieces in a population and to study the variability in individual estimates compared to an estimate based on the mean of many estimates. The activity was described in B. Dudley, "A practical study of the capture/recapture method of estimating population size," Teaching Statistics, 5 (3) (1983), 66-70.

Summary: A hands-on activity to study the variability of the capture/recapture technique for estimating population sizes, demonstrated using a population of Smarties candy as an example.

Specifics: The capture/recapture technique is used to estimate the size of a population of mobile animals using the formula:
a/d = c/b, so that d = ab/c, where
a = number marked and released into the population,
b = size of the second catch,
c = the number recaptured in the second catch,
d = the size of the population as a whole.
The contents of a box of Smarties are poured into a saucer and all the red sweets are counted (= a). All the sweets are then poured into a paper bag and shaken thoroughly. Using an egg cup, and without looking into the bag, a second sample (= b) is scooped out and the number of red ones recaptured is recorded (= c). This exercise is repeated ten times and the mean of the estimates is calculated. Finally, the Smarties in the model population are counted and the true number is compared with the estimates derived from the sampling. Students learn about the variability of individual estimates, which is quite large (remember that the expected value of the estimate here is actually infinite, since an observation of zero tagged items results in an infinite estimate).
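The formula and the ten repetitions can be tried out with a small script. This is a sketch under made-up numbers: a box of 40 Smarties with 8 red ones and an egg-cup scoop of 12 are assumptions for illustration, not figures from the article.

```python
import random

def estimate_population(a, b, c):
    """Solve a/d = c/b for d; the estimate is infinite when nothing is recaptured (c = 0)."""
    return float("inf") if c == 0 else a * b / c

def ten_scoops(n_total=40, n_red=8, scoop=12, repeats=10, seed=5):
    """Repeat the egg-cup scoop and return one population estimate per scoop."""
    rng = random.Random(seed)
    bag = ["red"] * n_red + ["other"] * (n_total - n_red)
    estimates = []
    for _ in range(repeats):
        second_catch = rng.sample(bag, scoop)   # b sweets scooped without looking
        c = second_catch.count("red")           # red ones recaptured
        estimates.append(estimate_population(n_red, scoop, c))
    return estimates

if __name__ == "__main__":
    estimates = ten_scoops()
    print("Individual estimates:", [round(e, 1) for e in estimates])
    # A single scoop with no red sweets makes this mean infinite.
    print("Mean of the ten estimates:", sum(estimates) / len(estimates))
    print("True size in this sketch:", 40)
```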

(Resource photo illustration by Barbara Cohen, 2020; this summary compiled by Bibek Aryal)

• Food: Capture-Recapture with M&M’s

A hands-on activity using the capture-recapture method to estimate the number of M&M’s™ in a population. The activity was described in G. D. Bisbee and D. M. Conway, “Studying proportions using the capture-recapture method,” Mathematics Teacher, 92 (3) (1999), 215-218.

Summary: Scientists use the capture-recapture method as a tool to estimate population size. Animals are captured, tagged, and then released back into the population. Later, a sample is captured and a proportion used to estimate population size.

Specifics: Let us say that we sample a beetle population of unknown size. We capture and mark ten of those beetles with a spot of India ink, then return them to the population and give them time to mix in with the population. We then recapture another sample consisting of eight beetles, one of which was previously marked. We substitute the numbers into the foregoing proportion to estimate the population size, getting 1/8 = 10/(Pop size). Solving for the Pop size gives us an estimated population of eighty beetles. Students are, predictably, less than enthusiastic about having to handle the creepy-crawly critters, so this activity instead uses a population of M&M’s of unknown size. Each team of two to four students receives some M&M’s in a paper cup, which is covered on top with crumpled paper towels. The students “tag” the M&M’s from a random sample and then, after mixing them back in, sample again to estimate the number in the cup (they can later check how far off their estimates were and compare with other teams).
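The beetle arithmetic can be checked with a few lines of Python. The function name below is a hypothetical helper introduced for illustration. It solves the proportion (marked in sample)/(sample size) = (marked in population)/(Pop size) exactly.

```python
from fractions import Fraction

def population_from_proportion(marked_total, sample_size, marked_in_sample):
    """Solve marked_in_sample / sample_size = marked_total / population for population."""
    if marked_in_sample == 0:
        raise ValueError("no marked individuals recaptured; the proportion cannot be solved")
    return Fraction(marked_total * sample_size, marked_in_sample)

# Beetle example from the text: 10 marked, second sample of 8, 1 of them marked.
print(population_from_proportion(10, 8, 1))  # prints 80
```

Teams can plug in their own M&M’s counts in place of the beetle numbers and compare the result with the actual count in the cup.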

(Resource photo illustration by Barbara Cohen, 2020; this summary compiled by Bibek Aryal)

• Cartoon: Brokerage

A cartoon that provides a nice avenue for facilitating discussions of the importance of modeling in making forecasts. The cartoon was used in the December, 2017 CAUSE cartoon caption contest and the winning caption was submitted by Larry Lesser from The University of Texas at El Paso. The cartoon was drawn by British cartoonist John Landers (www.landers.co.uk) based on an idea by Dennis Pearl from Penn State University.