
Regression on the Rebound

Trent D. Buskirk* and Linda J. Young**

*Department of Mathematics and Statistics
Lincoln, NE 68588-0323

**Department of Biometry
Lincoln, NE 68583-0712

Statistics Teaching and Resource Library, August 29, 2001

© 2001 by Trent D. Buskirk and Linda J. Young, all rights reserved. This text may be freely shared among individuals, but it may not be republished in any medium without express written consent from the author and advance notification of the editor.

This activity is an advanced version of the “Keep your eyes on the ball” activity by Bereska et al. (1999). Students should gain experience with differentiating between independent and dependent variables, using linear regression to describe the relationship between these variables, and drawing inferences about the parameters of the population regression line. Each group of students collects data on the rebound heights of a ball dropped multiple times from each of several different heights. By plotting the data, students quickly recognize the linear relationship. After obtaining the least squares estimate of the population regression line, students can construct confidence intervals or test hypotheses on the parameters. Predictions of rebound height can be made for new values of the drop height as well. Data from different groups can be used to test for equality of the intercepts and slopes. By focusing on a particular drop height and multiple types of balls, one can also introduce the concept of analysis of variance.

Key words: Linear regression, independent variable, dependent variable, analysis of variance

Materials

Each group of 3-5 students needs a measuring device (preferably a tape measure), a ball that will rebound when dropped, and graph paper. A pool of balls, such as super balls, tennis balls, racquetballs, basketballs, and soccer balls, should be available. It is better to have more balls available than groups as students always like to have a choice! Optional materials are chalk, post-it notes, and a measuring stick.

Time

A class period of 50 or 75 minutes is sufficient for collecting the data, plotting the data, and estimating the regression line. Additional class periods could be used to complete further analyses, such as setting confidence intervals on the parameters or testing equality of the regression lines obtained by different groups.

Objective

The objective of this activity is to estimate the population regression line relating the rebound height of a ball to the height from which it is dropped and to draw inferences using the fitted regression line. A variation of this activity allows students to use the analysis of variance to determine whether there is a difference in the mean rebound height of different balls dropped from a common height.

Description of Activity

This advanced version of the “Keep your eyes on the ball” activity by Bereska et al. (1999) offers students an opportunity to explore the relationship between a ball’s rebound height and the height from which it is initially dropped. By setting their own drop heights and by collecting their own data, groups will gain experience with independent and dependent variables. Students will also use linear regression to draw inferences and to make predictions based on their fitted lines. For this activity, rebound height is defined to be the highest level of ascent that the ball makes after its impact with the floor.

To collect the regression data, each group should drop its selected ball five times from each of ten heights. These numbers can be varied according to course time constraints. Each group uses a single ball, and students should determine the ten drop heights for the ball that their group has selected. During a 50-minute class period, for instance, students may drop a basketball five times at each of ten heights. Actual student data are included after the prototype activity in the Example Student Output section.

To better understand the nature of the relationship between the drop and rebound heights, students should first plot their data. On this plot, the students should be able to see that a line is the best descriptor of this relationship. Students should also be able to identify outliers on this plot. Once outliers have been identified, the group should investigate the nature of these observations. Sometimes these outliers turn out to be the first observations recorded for a particular drop height and may simply reflect the inexperience of the rebound height recorder. Students are then asked to use their data to fit a linear regression line and to use it to predict the rebound height of a ball dropped from a height for which no data were collected. Students are encouraged to select their own heights and should avoid extrapolation. Students are also asked to interpret the regression slope and intercept within the context of this activity as well as to comment on the scope of inference for their regression line.
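The fitting and prediction steps can be sketched in a few lines of Python. This is only an illustration, not part of the original activity: the function names `fit_line` and `predict_rebound` are hypothetical, and the guard against extrapolation is one possible way to enforce the advice above.

```python
def fit_line(drop, rebound):
    """Least squares slope and intercept for rebound = intercept + slope * drop."""
    n = len(drop)
    mean_x = sum(drop) / n
    mean_y = sum(rebound) / n
    sxx = sum((x - mean_x) ** 2 for x in drop)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(drop, rebound))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_rebound(new_drop, drop, rebound):
    """Predict a rebound height, refusing to extrapolate.

    Raises ValueError when new_drop lies outside the observed drop heights.
    """
    if not min(drop) <= new_drop <= max(drop):
        raise ValueError("drop height is outside the observed range")
    slope, intercept = fit_line(drop, rebound)
    return intercept + slope * new_drop
```

A group would pass its own lists of drop heights and rebound heights, one entry per drop, and then call `predict_rebound` with an unobserved height inside the range it used.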

Assessment

Below is a sample exam question to test an understanding of the basic concepts associated with linear regression:

POSSIBLE EXAM QUESTION: OFFICE-TEMPS Inc. wants to screen applicants for basic typing skills using a timed test. Applicants are required to type as many words (in the order in which they appear on a uniform list) as possible in the prescribed time. The allowable times range from 10 to 90 seconds. Data collected from all applicants interviewing last week are listed below:

Time (sec)    10     10     10     20     20     20     60     60     60     90     90     90
# of words    18.5   19     17.75  29     29.5   32     75     60.5   53.25  80.5   100    93.25

1. Identify the independent and dependent variables in this study.
2. Assuming that the assumptions of linear regression hold, fit a regression line to the data. Interpret the estimated slope and intercept in the context of this study.
3. Is the regression intercept significantly different from zero? Justify your answer.
4. Compute a 95% prediction interval for the number of words typed in 40 seconds and interpret it in the context of this study.
5. Compute a 90% confidence interval for the mean number of words that can be typed in 40 seconds and interpret it in the context of this study.
6. Clearly explain why the intervals in questions 4 and 5 are NOT the same in the context of the problem.
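
For instructors who want to check answers, the computations behind questions 2 through 5 can be sketched as follows. The data are those listed in the question. The critical values are read from a t table with 10 degrees of freedom, and the variable names are illustrative choices rather than anything prescribed by the activity.

```python
import math

# Data from the exam question: time allowed (seconds) and number of words typed.
time_sec = [10, 10, 10, 20, 20, 20, 60, 60, 60, 90, 90, 90]
words = [18.5, 19, 17.75, 29, 29.5, 32, 75, 60.5, 53.25, 80.5, 100, 93.25]

n = len(time_sec)
mean_x = sum(time_sec) / n
mean_y = sum(words) / n
sxx = sum((x - mean_x) ** 2 for x in time_sec)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(time_sec, words))

# Question 2: least squares estimates.
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Residual standard error with n - 2 degrees of freedom.
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(time_sec, words))
s = math.sqrt(sse / (n - 2))

# Question 3: t statistic for H0: intercept = 0.
se_intercept = s * math.sqrt(1 / n + mean_x ** 2 / sxx)
t_intercept = intercept / se_intercept

# Critical values for 10 degrees of freedom, taken from a t table.
T_975 = 2.228  # used for two-sided 95% intervals and tests
T_950 = 1.812  # used for two-sided 90% intervals

x0 = 40
fit_x0 = intercept + slope * x0
h0 = 1 / n + (x0 - mean_x) ** 2 / sxx

# Question 4: 95% prediction interval for one new applicant at 40 seconds.
half_pi = T_975 * s * math.sqrt(1 + h0)
prediction_interval = (fit_x0 - half_pi, fit_x0 + half_pi)

# Question 5: 90% confidence interval for the mean response at 40 seconds.
half_ci = T_950 * s * math.sqrt(h0)
confidence_interval = (fit_x0 - half_ci, fit_x0 + half_ci)

print(f"slope={slope:.4f}, intercept={intercept:.4f}, t={t_intercept:.2f}")
print("95% prediction interval:", prediction_interval)
print("90% confidence interval:", confidence_interval)
```

The only difference between the two half-widths, apart from the confidence level, is the extra 1 under the square root for the prediction interval, which accounts for the variability of a single new observation around the mean.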

Teacher notes

Students often confuse dependent and independent variables and have difficulty grasping the concept of a population regression line that is being estimated by fitting a linear regression line. In addition, it is often difficult to find data that allow a careful consideration of the assumptions underlying regression. This activity was designed to permit the students to look at the underlying assumptions of regression and to estimate the population regression line. Clearly, taking a little more data will lead to changes in the estimated population regression line even though the population line remains unchanged. In addition, the differences in a confidence interval on the mean rebound height at a given drop height and a prediction interval for a new observation at a given drop height become more real to the students.

This activity will work best if students are arranged into groups consisting of 2 to 4 members. It will be difficult to complete the data collection if students work alone. A group of size three is optimal in that it allows one student to drop the ball, a second to observe the rebound height, and a third to record the data. If the groups are larger than three, additional observers on the rebound height can be helpful.

The most challenging part of the data collection is accurately recording the rebound heights. The rebound-height observer(s) must be eye level with the rebound height to record it accurately. Students should practice dropping the ball and recording the rebound heights. Some students will force the ball downward resulting in anomalous rebound heights. Other students will learn that they are better rebound recorders than they are droppers. Practice time should be allocated so that groups can assign duties, determine the range of drop heights to be used, and practice dropping the ball and recording its rebound height.

An additional concept that may be further discussed within the context of this experiment and its subsequent analysis is the idea of outlying or influential observations. Sometimes outliers are observed. This could cause the students to question the assumption of normality. Often students can identify reasons for the outlier. For example, “It was the first drop.”

To evaluate the assumption of equality of variances for the rebound heights at varying levels of the drop height, students can use the 5 rebound heights at each drop height to plot the sample standard deviation versus the drop height. Although five observations provide limited insight, this may help identify patterns in measurement error or groups with potential outliers. In addition to checking the homoscedasticity assumption, students can also use normal probability plots or residual plots to check violations of the normality assumption or to identify outliers.
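
The standard deviation check described above can be sketched as follows. The function name `sd_by_drop_height` and the measurements in `example` are hypothetical; the numbers are made up purely to show the call and are not student data.

```python
import statistics


def sd_by_drop_height(data):
    """Map each drop height to the sample SD of its rebound heights.

    `data` maps a drop height to the list of rebound heights recorded there.
    The result is ordered by drop height.
    """
    return {height: statistics.stdev(rebounds)
            for height, rebounds in sorted(data.items())}


# Hypothetical measurements in inches, made up for illustration only.
example = {
    20: [14.0, 14.5, 15.0, 14.5, 14.0],
    40: [28.0, 29.5, 29.0, 28.5, 30.0],
    60: [42.0, 44.5, 43.0, 45.0, 41.5],
}

# Crude text plot: one star per tenth of an inch of standard deviation.
for height, sd in sd_by_drop_height(example).items():
    print(f"{height:>4} in  SD={sd:5.2f}  " + "*" * round(sd * 10))
```

Bars that grow steadily with drop height would suggest that the equal-variance assumption deserves a closer look.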

The drop height should be a good predictor of rebound height, so a discussion of high R² values may be appropriate, as well as a discussion of the cloud-like pattern that one would expect to see in the plot of residuals versus the independent variable. Groups can compare their regression lines with those of other groups for particular balls of interest.
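
A minimal sketch of how R² and the residuals might be computed from a group's data is given here. The function name `r_squared_and_residuals` is an illustrative choice, and plotting the residuals against drop height is left to whatever software the class uses.

```python
def r_squared_and_residuals(x, y):
    """Fit y = intercept + slope * x by least squares.

    Returns (R squared, list of residuals in the order of the data).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sse = sum(e ** 2 for e in residuals)        # unexplained variation
    sst = sum((yi - mean_y) ** 2 for yi in y)   # total variation
    return 1 - sse / sst, residuals
```

Residuals that scatter evenly around zero across the drop heights correspond to the cloud-like pattern mentioned above.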

Students should generally conclude that inference can be drawn only for the ball that was dropped, for the particular surface on which it was dropped, and within the range of drop heights used to construct the line. Because the true relationship between drop height and rebound height is quadratic, the intercept is usually significantly different from zero. Thus, the problems associated with extrapolation are clear when interpreting the estimated intercept. This also serves as a clear example of a model that is useful for the range of observed data but is not the true underlying model. It could be instructive to ask students to predict, based on their fitted regression line, the rebound height for a ball that is dropped from well above any observed drop height, say 200 inches from the ground. Students should also realize that inference ought to be restricted to the person doing the dropping or observing the rebound height unless this responsibility was rotated within the group.

Depending on the level of the course, subsequent class periods could be used to test for equality of the regression lines from two balls of the same type, or two balls of different types.
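
One common way to test whether two groups share the same regression line is an extra-sum-of-squares F test that compares a single pooled line against separate lines for each group. The sketch below assumes equal error variances in the two groups. The function names are hypothetical, and the original activity does not specify which test to use.

```python
def sse_of_line(x, y):
    """Residual sum of squares from the least squares line through (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


def equal_lines_f_statistic(x1, y1, x2, y2):
    """F statistic for H0: both groups have the same intercept and slope.

    Returns (F, numerator df, denominator df).
    """
    # Full model: a separate line for each group (4 parameters).
    sse_full = sse_of_line(x1, y1) + sse_of_line(x2, y2)
    df_full = len(x1) + len(x2) - 4
    # Reduced model: one line fitted to the pooled data (2 parameters).
    sse_reduced = sse_of_line(list(x1) + list(x2), list(y1) + list(y2))
    f = ((sse_reduced - sse_full) / 2) / (sse_full / df_full)
    return f, 2, df_full
```

The resulting statistic would be compared with an F critical value for the returned degrees of freedom, taken from a table or statistical software.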

An extension of this activity is to use the data for a given drop height to test for differences in the mean rebound heights of different kinds of balls. If the data are kept from the regression activity and an effort is made to have at least one common height for all groups, it should not be necessary to collect more data.
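
The analysis of variance extension can be sketched with a small function that computes the one-way F statistic from the rebound heights of each ball at the common drop height. The function name `one_way_anova` is an illustrative choice, and the sketch assumes each ball contributes at least two observations.

```python
def one_way_anova(groups):
    """One-way ANOVA F statistic.

    `groups` is a list of lists, one list of rebound heights per ball type.
    Returns (F, between-groups df, within-groups df).
    """
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within
```

A large F relative to the tabled critical value for the returned degrees of freedom would indicate that the mean rebound heights differ among the balls.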

Acknowledgements

This is Journal Paper No. 13310 of the Nebraska Agricultural Research Division, University of Nebraska at Lincoln. Research was supported in part by University of Nebraska Agricultural Experiment Station Project NEB-23-001.

References

Bereska, C., Bolster, C. H., Bolster, L. C., and Scheaffer, R. (1999). EQL Investigation 15: Keep your eyes on the ball. In Exploring Statistics in the Elementary Grades. White Plains, New York: Dale Seymour Publications.

Editor's note: Before 11-6-01, the "student's version" of an activity was called the "prototype".

 © 2000-2002 STAR Library