Richardson, Curtiss, and Gabrosek (2002)

Student's version

Rectangularity

Mary Richardson, Phyllis Curtiss, John Gabrosek, and Diann Reischman
Department of Statistics
Grand Valley State University
1 Campus Drive
Allendale, MI 49401-9403

Statistics Teaching and Resource Library, September 1, 2002

© 2002 by Mary Richardson, Phyllis Curtiss, John Gabrosek, and Diann Reischman, all rights reserved. This text may be freely shared among individuals, but it may not be republished in any medium without express written consent from the authors and advance notification of the editor.

This article describes an interactive activity illustrating sampling distributions for means, properties of confidence intervals, properties of hypothesis testing, confidence intervals for means, and hypothesis tests for means. Students generate and analyze data and through simulation explore these concepts. The activity is completed in three parts. The three parts of the activity can be used in sequence or they can be used individually as “stand alone” activities. This allows the educator flexibility in utilizing the activity. Part I illustrates the sampling distribution of the sample mean. Part II illustrates confidence intervals for the population mean. Part III illustrates hypothesis tests for the population mean. This activity is appropriate for use in an introductory college or high school AP statistics course.

Key words: sampling distribution of a sample mean, confidence interval for a mean, hypothesis test on a mean, simulation

Objective

After completing the Rectangularity activity, students will understand:

How to construct and use the sampling distribution for the sample mean

How to construct and interpret a confidence interval for a mean

How to perform a hypothesis test on a mean

How to interpret the level of significance of a hypothesis test (type I error rate)

How to interpret the p-value of a hypothesis test

How to interpret the type II error rate of a hypothesis test

How to interpret the power of a hypothesis test

The relationship between type I and type II error rates and power

Materials and equipment

Each student needs a random number table or a calculator that generates random numbers, four sticky notes (for Parts II and III), and a copy of the activity (which includes statistical guides containing relevant notation, formulas, and definitions). Included in the student’s version of the activity is a sheet with a population of 100 rectangles having different areas. Each square counts as one unit towards a rectangle’s area.

Time involved

The activity is completed in three parts. The estimated completion time for each part is one class period (approximately one hour). The three parts of the activity can be used in sequence or they can be used individually as “stand alone” activities. Part I illustrates the sampling distribution of the sample mean and involves calculations that should be completed using either a computer software package or a graphing calculator. Part II illustrates confidence intervals for the population mean. Part III illustrates hypothesis tests for the population mean.

Activity description - Part 1: sampling distribution of the sample mean

To begin, the teacher draws a histogram of the population distribution of areas on the whiteboard. The population distribution of areas is skewed to the right (positively skewed).

Ten groups of two or three students are formed and the following tasks are assigned to each group.

Select two different random samples of n = 5 rectangles (with replacement)

Select two different random samples of n = 15 rectangles (with replacement)

Select two different random samples of n = 25 rectangles (with replacement)

Students calculate the average area of the rectangles for each sample drawn, which reinforces the idea that the sample mean is a random variable.

To complete the data collection sheet, group results are combined to obtain 20 sample means for sample sizes n = 5, 15, and 25.

After data collection, students answer a series of questions based on the means and standard deviations of the sample means for the different sample sizes. Students discover properties of the distribution of a sample mean; namely, (i) the distribution of sample mean values is centered at the population mean, (ii) the distribution of sample mean values approaches a normal distribution as the sample size increases, (iii) the distribution of sample mean values has less variability than the original population, and (iv) the variability of sample mean values decreases as n increases.
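The Part I procedure can also be sketched as a short simulation. The following Python code is only an illustration under stated assumptions: the population is a hypothetical right-skewed set of 100 areas (not the rectangle sheet used in the activity), the names AREA_COUNTS and sample_means are invented for this sketch, and 2000 samples per sample size are drawn instead of the 20 collected in class so that the pattern is easier to see.

```python
import random
import statistics

# Hypothetical right-skewed population of 100 rectangle areas (area -> count).
# This is NOT the activity sheet's data; it only mimics a right-skewed shape.
AREA_COUNTS = {1: 16, 2: 14, 3: 12, 4: 10, 5: 8, 6: 8, 8: 8,
               10: 7, 12: 6, 15: 5, 16: 3, 18: 2, 20: 1}
population = [area for area, count in AREA_COUNTS.items() for _ in range(count)]

def sample_means(n, reps, rng):
    """Draw `reps` samples of size n with replacement and return their means."""
    return [statistics.mean(rng.choices(population, k=n)) for _ in range(reps)]

rng = random.Random(2002)  # fixed seed so the run is repeatable
pop_mean = statistics.mean(population)
pop_sd = statistics.pstdev(population)
print(f"population: mean = {pop_mean:.2f}, sd = {pop_sd:.2f}")

results = {}
for n in (5, 15, 25):
    means = sample_means(n, reps=2000, rng=rng)
    results[n] = (statistics.mean(means), statistics.stdev(means))
    print(f"n = {n:2d}: mean of sample means = {results[n][0]:.2f}, "
          f"sd of sample means = {results[n][1]:.2f}, "
          f"population sd / sqrt(n) = {pop_sd / n ** 0.5:.2f}")
```

For each sample size, the mean of the simulated sample means should sit close to the population mean, and the standard deviation of the sample means should shrink as n grows, roughly following the population standard deviation divided by the square root of n.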

Activity description - Part 2: confidence interval for the population mean

Each student selects a simple random sample of 25 rectangles (with replacement). Note that the population of rectangle areas does not have a normal distribution, but the t confidence interval procedure may be applied in this case since the sampling distribution of the sample mean x̄ is approximately normal for samples of size 25. First, each student uses her sample to construct an 80% confidence interval for the population mean rectangle area. Each student writes her result on a sticky note and gives it to the instructor. Each student’s confidence interval is sketched horizontally on an overhead transparency leaving one blank horizontal line between intervals. The resulting overhead transparency displays all of the confidence intervals constructed by the students in the class.

Students see the results of drawing repeated samples from the same population and calculating 80% confidence intervals. Some of the confidence intervals will contain the population mean (6.26) and some will not. After graphing the class confidence intervals, their meaning is discussed. We stress that if we claim that we are 80% confident that a mean lies within the endpoints of a confidence interval, we are saying that the endpoints of the confidence interval were calculated by a method that gives correct results in 80% of all possible random samples. We are not saying that there is an 80% chance that a calculated interval contains the population mean. Students are asked to write a statement explaining how an 80% level of confidence should be interpreted.

Students are then asked to construct a 99% confidence interval for the population mean rectangle area. As above, the class confidence intervals are graphed and the results are discussed. We stress how to properly interpret a 99% confidence level and ask students to write a statement explaining how a 99% level of confidence should be interpreted. Students are asked to write a statement explaining how increasing the confidence level from 80% to 99% changed the width of their confidence intervals.
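The repeated-sampling interpretation of the confidence level can be checked with a simulation along the following lines. This is a minimal sketch, not the activity itself: the population is a hypothetical right-skewed set of 100 areas (not the activity sheet), the helper names t_interval and coverage are invented here, and the t critical values for 24 degrees of freedom are taken from a standard t table.

```python
import random
import statistics

# Hypothetical right-skewed population of 100 rectangle areas (area -> count).
# This is NOT the activity sheet's data.
AREA_COUNTS = {1: 16, 2: 14, 3: 12, 4: 10, 5: 8, 6: 8, 8: 8,
               10: 7, 12: 6, 15: 5, 16: 3, 18: 2, 20: 1}
population = [area for area, count in AREA_COUNTS.items() for _ in range(count)]
MU = statistics.mean(population)  # known here only because we built the population

# Two-sided t critical values for df = 24, read from a standard t table.
T_CRIT = {0.80: 1.318, 0.99: 2.797}

def t_interval(sample, level):
    """Return (lower, upper) endpoints of a t confidence interval for the mean."""
    n = len(sample)
    xbar = statistics.mean(sample)
    margin = T_CRIT[level] * statistics.stdev(sample) / n ** 0.5
    return xbar - margin, xbar + margin

def coverage(level, reps, rng):
    """Fraction of `reps` intervals (samples of size 25) that contain MU."""
    hits = 0
    for _ in range(reps):
        lower, upper = t_interval(rng.choices(population, k=25), level)
        hits += lower <= MU <= upper
    return hits / reps

rng = random.Random(2002)  # fixed seed so the run is repeatable
cov80 = coverage(0.80, 5000, rng)
cov99 = coverage(0.99, 5000, rng)
print(f"80% intervals containing the population mean: {cov80:.3f}")
print(f"99% intervals containing the population mean: {cov99:.3f}")
```

The simulated fractions play the role of the class transparency: they estimate how often the method captures the population mean. Because the population is skewed and n is only 25, the fractions may fall somewhat short of the nominal levels. On any one sample, the 99% interval is wider than the 80% interval by the ratio of the critical values, 2.797 / 1.318.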

Activity description - Part 3: hypothesis test on the population mean

Each student selects a simple random sample of 25 rectangles (with replacement) or uses the simple random sample selected for Part II. Note that the population of rectangle areas does not have a normal distribution, but the t test may be applied in this case since the sampling distribution of the sample mean x̄ is approximately normal for samples of size 25.

In question 1, students use their sample data to perform two hypothesis tests of H_o: μ = 9 versus H_a: μ < 9 with different levels of significance. Each student’s data is a different simulated sample. Since the true population mean rectangle area is μ = 6.26, the null hypothesis H_o: μ = 9 is false. Since H_o is false, performing these tests provides an opportunity to use simulation to illustrate properties of p-values, type II errors, and power.

The first test of H_o: μ = 9 versus H_a: μ < 9 is performed using level of significance α = .05. The instructor draws stems for a stem-and-leaf plot on the whiteboard. Each student writes her calculated p-value on a sticky note and places it on the stem-and-leaf plot.

Assuming a class size of 30 students, the plot will contain 30 calculated p-values. The p-values are calculated under the assumption that H_o: μ = 9 is true (when, in fact, μ = 6.26), so the p-values will tend to be small. We discuss with students that small p-values contradict H_o. Some students will not obtain small p-values. On the stem-and-leaf plot, a cut-off value is marked at α = .05. Each p-value falling at or below this cut-off represents a rejection of H_o (a correct decision). Each p-value falling above this cut-off represents a failure to reject H_o (a type II error). Since 30 samples are taken, and 30 tests are performed, students see that some samples result in a correct decision and other samples result in an incorrect decision (type II error). Students are asked to calculate the fraction of incorrect decisions to obtain a simulated value for β, the probability of a type II error, and a simulated value for the power = 1 - β. An explanation is then given of how to interpret a type II error rate (and power) in terms of repeatedly performing the procedure of selecting a sample, then using the data to test a hypothesis about a population parameter, when the null hypothesis is false.

The second test is performed using α = .20. Each student’s p-value is the same as for the first test; however, the type I error rate is increased to 20%. On the stem-and-leaf plot of p-values, a new cut-off is marked at α = .20. Each p-value falling at or below this cut-off represents a rejection of H_o (a correct decision). Each p-value falling above this cut-off represents a non-rejection of H_o (a type II error). Students are asked to calculate the fraction of non-rejections of H_o out of the 30 tests to obtain a simulated value for β and a simulated value for the power. In examining the class results, students note that an increase in the type I error rate results in a decrease in the type II error rate and thus an increase in the simulated power.
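The trade-off between the type I error rate and the type II error rate can be illustrated with a simulation of many lower-tailed tests of a false null hypothesis. The sketch below rests on several assumptions: the population is a hypothetical right-skewed set of 100 areas with mean 5.93 (not the activity sheet, whose mean is 6.26), the helper t_statistic is invented for this example, and decisions are made by comparing the t statistic with one-sided critical values for 24 degrees of freedom from a standard t table, which is equivalent to comparing the p-value with the level of significance.

```python
import random
import statistics

# Hypothetical right-skewed population of 100 rectangle areas (area -> count).
# This is NOT the activity sheet's data; its mean is 5.93, so a null value of 9 is false.
AREA_COUNTS = {1: 16, 2: 14, 3: 12, 4: 10, 5: 8, 6: 8, 8: 8,
               10: 7, 12: 6, 15: 5, 16: 3, 18: 2, 20: 1}
population = [area for area, count in AREA_COUNTS.items() for _ in range(count)]

MU_NULL = 9  # value claimed by the null hypothesis, as in question 1

# One-sided t critical values for df = 24, read from a standard t table.
T_CRIT = {0.05: 1.711, 0.20: 0.857}

def t_statistic(sample, mu0):
    """One-sample t statistic for testing that the population mean equals mu0."""
    n = len(sample)
    return (statistics.mean(sample) - mu0) / (statistics.stdev(sample) / n ** 0.5)

rng = random.Random(2002)  # fixed seed so the run is repeatable
# Each simulated "student" draws one sample of 25 and reuses it at both levels.
t_values = [t_statistic(rng.choices(population, k=25), MU_NULL)
            for _ in range(5000)]

power = {}
for alpha, crit in T_CRIT.items():
    # Lower-tailed test: reject when t <= -crit, i.e. when the p-value <= alpha.
    rejections = sum(t <= -crit for t in t_values)
    power[alpha] = rejections / len(t_values)
    print(f"alpha = {alpha:.2f}: simulated power = {power[alpha]:.3f}, "
          f"simulated type II error rate = {1 - power[alpha]:.3f}")
```

Because the same samples are judged against both cut-offs, every rejection at the .05 level is also a rejection at the .20 level, so the simulated power at .20 can never be smaller than at .05. This mirrors what students see when the second cut-off is marked on the stem-and-leaf plot.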

In question 2, students use their sample data to perform two hypothesis tests of H_o: μ = 6.26 versus H_a: μ ≠ 6.26 with different levels of significance. Under the assumption that μ = 6.26, performing these tests provides an opportunity to illustrate properties of p-values and type I error.

The first test of H_o: μ = 6.26 versus H_a: μ ≠ 6.26 is performed using α = .05. The second test is performed using α = .20. As before, a stem-and-leaf plot of the class p-values is constructed.

The p-values are calculated under the assumption that H_o: μ = 6.26 is true, so the p-values will tend to be large. We discuss with students that large p-values do not contradict H_o. Some students will not obtain large p-values. On the stem-and-leaf plot, a cut-off value is marked at α. Each p-value falling at or below this cut-off represents a rejection of H_o (a type I error). Each p-value falling above this cut-off represents a failure to reject H_o (a correct decision). Since 30 samples are taken, and 30 tests are performed, students can see that some samples result in a correct decision and other samples result in an incorrect decision (type I error). For α = .05 and α = .20, students are asked to calculate the fraction of rejections of H_o out of the 30 tests to obtain a simulated value for α. An explanation is then given of how to interpret a type I error rate in terms of repeatedly selecting a sample, then using the data to test a hypothesis about a population parameter, when the null hypothesis is true.
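The interpretation of the type I error rate can be illustrated the same way, by repeatedly testing a null hypothesis that is true. The following sketch makes several assumptions: the population is a hypothetical right-skewed set of 100 areas (not the activity sheet), the null value is set to that hypothetical population's own mean so that the null hypothesis is true, the helper t_statistic is invented for this example, and the two-sided critical values for 24 degrees of freedom come from a standard t table.

```python
import random
import statistics

# Hypothetical right-skewed population of 100 rectangle areas (area -> count).
# This is NOT the activity sheet's data.
AREA_COUNTS = {1: 16, 2: 14, 3: 12, 4: 10, 5: 8, 6: 8, 8: 8,
               10: 7, 12: 6, 15: 5, 16: 3, 18: 2, 20: 1}
population = [area for area, count in AREA_COUNTS.items() for _ in range(count)]
MU = statistics.mean(population)  # the null value equals the true mean, so H_o is true

# Two-sided t critical values for df = 24, read from a standard t table.
T_CRIT = {0.05: 2.064, 0.20: 1.318}

def t_statistic(sample, mu0):
    """One-sample t statistic for testing that the population mean equals mu0."""
    n = len(sample)
    return (statistics.mean(sample) - mu0) / (statistics.stdev(sample) / n ** 0.5)

rng = random.Random(2002)  # fixed seed so the run is repeatable
t_values = [t_statistic(rng.choices(population, k=25), MU) for _ in range(5000)]

type1_rate = {}
for alpha, crit in T_CRIT.items():
    # Two-sided test: reject when |t| >= crit, i.e. when the p-value <= alpha.
    rejections = sum(abs(t) >= crit for t in t_values)
    type1_rate[alpha] = rejections / len(t_values)
    print(f"alpha = {alpha:.2f}: simulated type I error rate = {type1_rate[alpha]:.3f}")
```

Each simulated rate should land near its nominal level of significance. Small departures are expected because the population is skewed and the t procedure is only approximate for samples of size 25.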

Teacher notes

Students work with a population of 100 rectangles, drawing repeated simple random samples (with replacement). Prior to completing Part I, students should be familiar with descriptive statistics and probability distributions. Prior to completing Part II, students should be familiar with the basic mechanics of how to construct confidence intervals. Prior to completing Part III, students should be familiar with the basic mechanics of how to perform hypothesis tests, including the calculation of test statistics and p-values.

In this activity, we sample with replacement to preserve the independence of the sample observations. When sampling with replacement, it is possible for the same rectangle to be sampled more than once. If sampled rectangles are not replaced in the population, then each time a rectangle is withdrawn the probability of selection for the remaining rectangles will increase. In practice, we often either sample with replacement or we sample from a population that is so large that the withdrawal of successive items changes selection probabilities negligibly.

In this activity, we used the same data set to perform two different hypothesis tests at two different levels of significance. The instructor should emphasize that the level of significance, null hypothesis, and alternative hypothesis should be determined prior to data collection. We use the same data for multiple hypothesis tests to save time. Technically, we should have collected four separate data sets, one for each of the four tests conducted.

In addition, the instructor should stress to students that in reality one would not know the true value of the population mean μ. If the parameter value were known, then there would be no point in utilizing sample data to draw an inference about the parameter. The instructor should stress that we assume that we know the parameter so that we can investigate the properties of hypothesis testing under different situations.

Assessment

For Part I: Students should write about the effect of sampling variability on the center, spread, and shape of the sampling distribution of the sample mean. Students should write about the effect of sample size on the shape and spread of the distribution of the sample mean.

The following questions can be used to assess student understanding or as challenge problems for students who complete the activity early.

1. What happens to the shape of the sampling distribution of the sample means for this non-normal population as the sample size increases?

2. How do you think the shape, mean, and standard deviation of the distribution of the sample means for samples of size 100 would compare to the shape, mean, and standard deviation for the samples of size 25 that the class took?

3. Widgets produced by a machine are known to have a mean diameter of 12 mm with a standard deviation of 0.31 mm. Suppose that we take a random sample of 90 widgets and measure each widget’s diameter. We calculate the mean diameter of the 90 widgets. We repeat this process every day for 365 days so that we have 365 daily means.

(a) What would we expect the mean of the 365 daily means to be?
(b) What would we expect the standard deviation of the 365 daily means to be?
(c) What would we expect the shape of the histogram of the 365 daily means to be? Why?
(d) Assuming that the machine continues to perform as it has in the past, what is the probability that for the next day the mean diameter of the 90 sampled widgets will be between 11.95 mm and 12.05 mm?
(e) Why is simply looking at the mean diameter not enough to say that the machine is producing widgets with diameters close to the desired 12 mm?

For Part II: Students should be able to explain how to interpret a confidence interval. Additionally, students should be able to describe the relationship between the confidence level and the width of a confidence interval.

The following questions can be used to assess student understanding or as challenge problems for students who complete the activity early.

For all of these questions, assume that the samples are large enough so that the sampling distribution of the sample mean is approximately normal.

1. Suppose a simple random sample (SRS) of 20 rectangles has sample mean, x̄ = 7.3, and sample standard deviation, s = 6.1. Based on the sample, we wish to estimate the value of the population mean, μ.

(a) What is the point estimate for μ?
(b) What is the standard deviation of the point estimate?
(c) The mean of the sample will not be exactly equal to the mean of the population, thus there is error associated with the point estimate. With 95% confidence, what is the maximum error associated with the point estimate? (That is, what is the largest possible difference between x̄ and μ?) This value is often called the margin of error.
(d) The margin of error in part (c) consists of how many estimated standard deviations of x̄?

2. Suppose the sample mean, x̄, from a SRS of 40 rectangles is used to estimate μ.

(a) How would you expect the standard deviation of the sampling distribution of the sample mean of 40 rectangles to compare to the standard deviation of the sampling distribution of the sample mean of 20 rectangles? Explain.
(b) How would you expect the 95% margin of error for the estimate of μ for the 40 rectangles to compare to the 95% margin of error for the 20 rectangles in the previous problem? Explain.
(c) Do you think using a sample mean from a sample of size 40 will give a more precise estimate of μ than the sample mean from a sample of size 20? Explain.

3. In the activity, you selected a SRS of 25 rectangles and constructed an 80% confidence interval. Suppose you had selected a SRS of 40 rectangles and constructed an 80% confidence interval. How would you expect the confidence interval constructed from 40 rectangles to compare to the confidence interval constructed from 25 rectangles? Explain.

4. For a large population, a 90% confidence interval for μ is found to be 23.5 to 28.9. Why is the following statement incorrect? “There is a 90% chance that μ is between 23.5 and 28.9.”

5. Suppose you select a SRS of size 30 from a large population and find a 95% confidence interval for μ to be 17.30 to 23.47. Your friend selects a separate SRS of size 30 from the population and finds a 95% confidence interval for μ to be 18.64 to 24.81. Which confidence interval is better? Explain.

For Part III: Students should be able to explain type I error and type II error in a specific problem. Additionally, students should be able to describe the relationship between type I and type II error rates and power.

The following questions can be used to assess student understanding or as challenge problems for students who complete the activity early.

1. A company is trying to decide whether to buy a new Widget machine that costs $1 million. It is decided it will be worth buying the machine if there is overwhelming evidence that the mean number of defective Widgets will decrease from the current rate of 200 per day.

(a) State the null and alternative hypotheses needed to test if the machine should be purchased.
(b) Describe a type I error in the context of this problem.
(c) Describe a type II error in the context of this problem.
(d) Argue that a type I error is a more serious error in this problem.
(e) For this situation, should the company run the test at the 1%, 5%, or 10% significance level? Explain.

2. Explain the fallacy in reasoning in the following statement. “I wanted to reduce the chance of committing an error, so I reduced the type I error rate to .001.”

3. A doctor claims that his patients wait an average of 10 minutes in his waiting room. A disgruntled patient claims it is really higher. For a random sample of patients, the sample mean is 10.8 with a standard deviation of 2.1.

(a) If the sample consisted of 25 patients, perform the appropriate hypothesis test using a 1% level of significance (α = .01).
(b) If the sample consisted of 50 patients, perform the appropriate hypothesis test using a 1% level of significance (α = .01).
(c) Give an intuitive justification for why changing the sample size may result in changing the conclusion about a null hypothesis.
(d) In general, what is the relationship between the sample size and the absolute value of the test statistic?
(e) In general, what is the relationship between the sample size and the p-value? (To answer this question, refer to the t-curve.)
(f) What do you think is the overall relationship between the sample size, the type II error rate, and the power, when H_o is false?
