Significance Testing Principles

Cartoon: Lefty's Gym

This cartoon caption can be used to discuss the difference between one- and two-sided tests (and why the gym in the cartoon might choose the former). The cartoon was used in the June 2025 CAUSE cartoon caption contest and the winning caption was written by Steve Wang from Swarthmore College. The cartoon was drawn by British cartoonist John Landers (www.landers.co.uk) based on an idea by Dennis Pearl from Penn State University.

Song: Best Fit Lover

The lyrics for this song were written and the music was created and performed in 2021 by undergraduate student Jonathan F. Spencer from Miami University in Ohio. The song took second place in the song/video category of the 2021 A-mu-sing Contest. The lyric is designed to stimulate discussion of testing in an unusual situation with a “research question” creating a point alternative, the singers’ “best fit lover,” being compared with a null of “someone else.”

Video: I Knew You Were Trouble

The lyrics and direction for this video were by high school student Jordyn Gross, with acting by the students in Mr. Schlaegel's 2018 AP Statistics course at Burlington Township High School. The video uses the music from Taylor Swift's 2012 hit song of the same name. The video earned fifth place in the song/video category of the 2019 A-mu-sing Contest and is designed to discuss the meaning of Type I and Type II errors in hypothesis testing situations.

Cartoon: Data For Sale

This cartoon was created by Austin Boyd from University of Tennessee and took first place in the cartoon category of the 2019 A-mu-sing Contest. The cartoon provides a humorous way to facilitate conversation about the multiple comparisons caveat (that the chance of getting at least one significant result grows with the number of things being tested) and the large sample caveat (that with a larger sample size, even small effect sizes become more likely to produce small p-values).
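
As a discussion aid for the multiple comparisons caveat, here is a minimal Python sketch of how the chance of at least one false positive grows with the number of tests. It assumes the tests are independent and that every null hypothesis is true; the function name `familywise_error_rate` and the 0.05 default level are illustrative choices, not part of the resource.

```python
def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive among `num_tests` tests.

    Assumes independent tests, each run at level `alpha`, with every
    null hypothesis true (an idealized classroom setting).
    """
    if num_tests < 0:
        raise ValueError("num_tests must be non-negative")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    # P(no false positives) = (1 - alpha) ** num_tests, so take the complement.
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    for m in (1, 5, 20, 100):
        print(f"{m:>3} tests: {familywise_error_rate(m):.3f}")
```

Under these assumptions, a single test at the 0.05 level has a 5% chance of a false positive, while 20 such tests have roughly a 64% chance of producing at least one.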

Poem: I am the Null

A poem reflecting on Type I errors and the use of the null hypothesis in testing by Micah Wascher, a high school student at North Carolina School of Science and Mathematics. The poem won an honorable mention in the 2025 A-mu-sing Contest.

Video: P-value's More than Alpha

"P-value's More than Alpha" is a music video by David Yew, an undergraduate student at Singapore Management University, that reviews introductory normal theory testing. The music is a fun parody of Billy Joel's 1989 hit song "We Didn't Start the Fire" and took second place in the 2025 A-mu-sing Contest. David also credits his statistics instructor, Rosie Ching, for providing feedback.

Cartoon: A Statistician's Relationship

A cartoon for explaining that hypothesis testing typically includes a null hypothesis that nothing is going on except random chance, and that p-values are calculated under that assumption. The cartoon was created by Joy Reeves from the Rachel Carson Council of Duke University and took first place in the cartoon/joke category of the 2025 A-mu-sing competition.

Cartoon: Brass Ring

A cartoon that can be used to discuss the multiple testing issue and the concept of p-hacking. The cartoon was used in the June 2021 CAUSE cartoon caption contest and the winning caption was written by Jim Alloway from EMSQ Associates. The cartoon was drawn by British cartoonist John Landers (www.landers.co.uk) based on an idea by Dennis Pearl from Penn State University.

Cartoon: Fruit Stand

A cartoon that can be used to discuss the importance of using a paired analysis to reduce the variability in the response for a heterogeneous population. The cartoon was used in the February 2021 CAUSE cartoon caption contest and the winning caption was written by Jeremy Case from Taylor University. The cartoon was drawn by British cartoonist John Landers (www.landers.co.uk) based on an idea by Dennis Pearl from Penn State University.
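
To illustrate why pairing helps with a heterogeneous population, the following Python sketch compares the standard error of the mean difference under a paired analysis with the standard error from treating the same numbers as two independent samples. The data are made up for illustration (four very different units, each measured twice), and the function names are hypothetical.

```python
from math import sqrt
from statistics import stdev, variance


def paired_standard_error(first, second):
    """Standard error of the mean difference, using within-pair differences."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [b - a for a, b in zip(first, second)]
    return stdev(diffs) / sqrt(len(diffs))


def unpaired_standard_error(first, second):
    """Standard error of the difference in means, ignoring the pairing."""
    return sqrt(variance(first) / len(first) + variance(second) / len(second))


if __name__ == "__main__":
    # Made-up measurements: the units differ a lot from one another,
    # but each unit changes only slightly between the two conditions.
    condition_a = [10, 20, 30, 40]
    condition_b = [11, 22, 32, 43]
    print("paired SE:  ", paired_standard_error(condition_a, condition_b))
    print("unpaired SE:", unpaired_standard_error(condition_a, condition_b))
```

In this made-up example the unit-to-unit spread dominates the unpaired standard error, while the paired version depends only on how much the within-unit differences vary.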

Poem: Type Two

A poem about Type II errors in diagnostic testing using a diabetes test context. The poem was written by Lawrence Lesser from The University of Texas at El Paso and received an honorable mention in the non-song category of the 2023 A-mu-sing Competition. The author also provided the following outline for a lesson plan:

Some sample questions (one per stanza) students can explore or discuss as a practical application of statistics to a prevalent disease that likely affects (or will) a friend or relative of almost everyone.

First stanza: Look up history of diabetes prevalence to explore questions such as: Is “1 in 10” roughly accurate for the United States and how does that compare to other countries? Was the 2003 lowering of the threshold for a prediabetes diagnosis based on updated medical understanding of the disease or more of a policy decision to give an “earlier warning”?

Second stanza: How does a hypothesis testing framework apply to an oral glucose tolerance test (OGTT)? It’s warned that a false positive is possible if the patient did not eat at least 150g of carbohydrates for each of the 3 days before the test. (This is likely what happened to the poet, whose diagnosis was overturned just 2 months later by an endocrinologist.)

Third stanza: Given that the null hypothesis usually means no effect, no difference, or nothing special, explain whether it seems consistent that a normality test such as Anderson-Darling takes normality as the null. When might it make sense for a doctor to view having a particular disease as the null hypothesis, and what would the Type I and Type II errors be in that case?

Fourth stanza: Explain how having only a few individual values each day from a blood glucose meter (BGM) risks missing dangerously high variability of glucose (students can Google how high variability can be a risk factor for hypoglycemia and diabetes complications). Discuss how output from a Continuous Glucose Monitor (CGM) that records values every 5 minutes can be used to check, for example, that the coefficient of variation is sufficiently low (e.g., < 36%) and that “time in range” (e.g., 70-180 or 70-140 mg/dL) is sufficiently high. Example output is on page S86 of https://diabetesjournals.org/care/issue/45/Supplement_1.

Fifth stanza: Have students look up current FDA guidelines on how accurate over-the-counter BGM readings need to be (e.g., https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7753858/) and have them connect this to margin of error, confidence intervals, etc.

Sixth stanza: Find online the diabetes “plate method,” which uses a circular plate (9” in diameter) for a meal: half of the plate has non-starchy vegetables, a quarter has lean protein, and a quarter has carbohydrate foods such as whole grains. How do this breakdown and total quantity compare to a pie chart of a typical meal that you (or typical college undergraduates) eat?
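
For the fourth-stanza question, students could compute the two summaries mentioned there from a list of glucose readings. The Python sketch below is one possible approach, not part of the author's outline: it assumes the coefficient of variation is the sample standard deviation divided by the mean (as a percentage), uses the 70-180 mg/dL range as a default, and runs on made-up readings.

```python
from statistics import mean, stdev


def coefficient_of_variation(readings):
    """Coefficient of variation as a percentage (sample SD / mean * 100)."""
    return 100.0 * stdev(readings) / mean(readings)


def time_in_range(readings, low=70, high=180):
    """Percentage of readings between `low` and `high` mg/dL, inclusive."""
    in_range = sum(1 for value in readings if low <= value <= high)
    return 100.0 * in_range / len(readings)


if __name__ == "__main__":
    # Made-up glucose readings in mg/dL, for illustration only.
    readings = [95, 110, 150, 185, 130, 88, 72, 160]
    print(f"CV: {coefficient_of_variation(readings):.1f}%")
    print(f"Time in range (70-180): {time_in_range(readings):.1f}%")
    print(f"Time in range (70-140): {time_in_range(readings, 70, 140):.1f}%")
```

Students can then compare the results against the thresholds in the question (for example, a coefficient of variation below 36%) and discuss how the summaries change when only a few readings per day are available.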
