# Testing

• ### You're My Null

Lyrics & Music © 2014 Greg Crowther

If every gal I’ve met is a hypothesis
Who might turn out to be the mate I’ve sought,
And if I were to yield
To the conventions of my field,
Then you would be referred to as H-naught...

CHORUS:
You’re my null, after all; you’re a theory in the making,
So robust to every test -- my hypothesis of choice.
You account for the past; you’re a vision of the future;
And when I feel confused, you’re my signal through the noise.

My life has been an uncontrolled experiment --
A source of all too many scattered plots.
But you provide a line
With an R-squared of point-nine;
Yes, you and you alone connect my dots...

CHORUS

One can never prove a null
In a finite length of time;
Each finding simply strengthens it or not.
But if I see trends emerge --
And let’s just say I do --
I’d be crazy not to publish what I’ve got...

CHORUS [twice]

• ### Simulation!

Lyrics © 2018 Lawrence M. Lesser and Dennis K. Pearl
may sing to the tune of Kool and the Gang's "Celebration!"

Yahoo!
Simulation
Yahoo!
This is our simulation

Sim-u-late new groups, again!
(let's simulate)
(then) rep-li-cate the test, again!
(let's rep-li-cate)

We're comparing group means right here,
A distribution of samples makes it clear.
So find the frac-tion of values further out:
That's going to the p-value, no doubt!

Come on now, sim-u-la-tion
Let’s all simulate how groups are assigned
Sim-u-la-tion
We will investigate how null does align

Do groups differ from each other?
It's up to you, and what you measure
It is time to simulate, again!

Yahoo!
It’s a simulation
Yahoo!

Sim-u-late new groups, again!
(it’s a simulation)
(then) rep-li-cate the test, again!
(let's rep-li-cate)
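
The procedure the song describes (reshuffle who is in which group, recompute the difference in means, and report the fraction of results at least as far out as the observed one) can be sketched as a permutation test. This is only a minimal sketch in Python, assuming a two-sided test; the function name `permutation_p_value` and the data are made up for illustration.

```python
import random

def permutation_p_value(group_a, group_b, reps=10_000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n_a = len(group_a)
    observed = sum(group_a) / n_a - sum(group_b) / len(group_b)
    pooled = list(group_a) + list(group_b)
    n_b = len(pooled) - n_a
    extreme = 0
    for _ in range(reps):
        rng.shuffle(pooled)  # simulate a new random assignment to groups
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        # count shuffles at least as far out as the observed difference
        # (tiny tolerance guards against floating-point ties)
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
    return extreme / reps  # estimated p-value

# Made-up illustration data (not from the song)
treatment = [12.1, 14.3, 13.8, 15.0, 14.6, 13.9]
control = [11.0, 12.2, 11.8, 12.9, 11.5, 12.4]
p = permutation_p_value(treatment, control)
print(f"estimated p-value: {p:.4f}")
```

More replications give a steadier estimate of the p-value, which is the point of "rep-li-cate the test, again."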

• ### Hypothesis on Trial

Lyrics © 2015 by Larry Lesser, music by Larry Lesser and Dominic Dousa

H-O is called the null,
It's the status quo, conservative and dull.
The null is that you're innocent when you're on trial,
That's just our judicial style.

So evidence is gathered, the data summarized:
They deliberate on which verdict is wise.
"Fail to reject the null" is a verdict to acquit.
"Rejection of the null" is a verdict to convict.

If you "fail to reject" the null could still be false;
A "not guilty" verdict doesn't prove you have no faults.
A Type One error is to falsely convict.
A Type Two error is to falsely acquit.

If the trial has high power, the odds are good.
It will convict when it should.
Courtroom analogy helps us out
With hypothesis tests, beyond a reasonable doubt!
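
The link between the two error types and power can be made concrete with a small calculation. The sketch below assumes a one-sided z-test with known standard deviation, which the song does not specify; the function name `z_test_power` and all numbers are hypothetical.

```python
from statistics import NormalDist

def z_test_power(effect, sigma, n, alpha=0.05):
    """Power of a one-sided z-test of H0: mu = 0 against Ha: mu > 0."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)  # "convict" when z exceeds this
    shift = effect * n ** 0.5 / sigma        # mean of z when the true effect is `effect`
    return 1 - std_normal.cdf(z_crit - shift)

alpha = 0.05                                  # chance of a Type One error (false conviction)
power = z_test_power(effect=0.5, sigma=1.0, n=25)
type_two_rate = 1 - power                     # chance of a Type Two error (false acquittal)
print(f"power = {power:.3f}, Type Two error rate = {type_two_rate:.3f}")
```

When the true effect is zero, the same function returns alpha: the only way to "convict" an innocent null is a Type One error.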

• ### Everything's Unusual

Lyrics © 2015 by Larry Lesser, music by Dominic Dousa

When it comes to bundles of joy,
Fifty one percent of births are boys.
That's so close to half
You might just laugh.

But with all the births every year,
It's statistically significant I fear!

CHORUS [2X]:

Said it before, I'll say it again:
Everything's significant for big enough n!

What was their price of their gas.
The calculation went:
The mean was up by 1 cent!

[REPEAT CHORUS 2X:]

Two million students took the SAT:
It's one measure of quality.
They said this year's average score

[REPEAT CHORUS 2X]

When you do a hypothesis test,
The statistic formula is expressed.
And if n is bigger that you'll see
then that means a smaller p.

[REPEAT CHORUS 2X]

Statistical significance
Only goes so far,
Find an effect size

[REPEAT CHORUS 2X]

Three out of four dentists ain't a big deal,
But three thousand of four thousand makes that same fraction
Statistically real!
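
The dentist and birth examples can be checked numerically. This is a rough sketch using a two-sided z-test for a proportion; the null value of 0.5 is an assumption (the song does not state one), the birth counts are made up to match 51 percent, the function name `proportion_z_test` is hypothetical, and the normal approximation is poor for n = 4, so the small-sample result is only indicative.

```python
from statistics import NormalDist

def proportion_z_test(successes, n, p0):
    """Two-sided z-test of H0: true proportion equals p0 (normal approximation)."""
    p_hat = successes / n
    se = (p0 * (1 - p0) / n) ** 0.5           # standard error under H0
    z = (p_hat - p0) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Same fraction (75%), very different sample sizes
z_small, p_small = proportion_z_test(3, 4, 0.5)
z_big, p_big = proportion_z_test(3000, 4000, 0.5)

# Same 51% of boys, in 100 births versus 1,000,000 births (made-up counts)
z_few, p_few = proportion_z_test(51, 100, 0.5)
z_many, p_many = proportion_z_test(510_000, 1_000_000, 0.5)

print(f"3 of 4:       z = {z_small:.2f}, p = {p_small:.3f}")
print(f"3000 of 4000: z = {z_big:.2f}, p = {p_big:.3g}")
print(f"51 of 100:    z = {z_few:.2f}, p = {p_few:.3f}")
print(f"51% of 1e6:   z = {z_many:.2f}, p = {p_many:.3g}")
```

The effect size (the gap between the observed fraction and 0.5) is identical within each pair; only n changes, and with it the p-value.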

• ### Chi-Squared Dance

Lyrics © 2015 Dominic Dousa and Lawrence M. Lesser, Music by Dominic Dousa

In a two-way table, blood types are the rows
And birth months are the columns – that's just how we chose.
The chi-squared test can tell us the answer to this question:
Do the categories interact? Is there a connection?

REPEAT 2X:
Compare expected counts to values we observed:
We will answer YES if a large gap occurred.

So birth months is in columns and blood types that's in rows.
The two are independent under our H-O.
Now we have 12 columns and also have 4 rows
That we have 33 degrees of freedom that we must know.

REPEAT 2X:
The chi-squared distribution has a long right tail:
That's the tail we focus on to see if H-O fails!

• ### ANOVA

Music and Lyrics © 2016 Monty Harper

Suppose you study 12 different herds of elephants
Can the differences in their average weights
be explained by random error?
Or do any herds have weights that are truly different?
It's a difficult conundrum

CHORUS:
ANOVA, ANOVA
ANOVA, ANOVA
ANOVA, ANOVA
Analysis of Variance

The variance of the average weight
between the different herds
Is matched against the variance of
weights within each set
The ratio of "between" against
If the average weights are all equal, how
unlikely is the "between" / "within" ratio you get?

[REPEAT CHORUS]

If the variance between your herds
Dominates that within
You can reject the null hypothesis
And let the real work begin
At least one herd of your elephants features
weights of a different mean
But ANOVA gives no details -
It's a one trick multiple mean comparing machine.

[REPEAT CHORUS]
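
The "between" over "within" ratio in the song is the F ratio of one-way ANOVA. Here is a minimal sketch with three tiny made-up herds (the song has 12); the function name `one_way_f_ratio` and the weights are hypothetical.

```python
def one_way_f_ratio(groups):
    """One-way ANOVA F ratio: between-group mean square over within-group mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # variation of individual values around their own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

# Made-up weights for three herds
herds = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_ratio = one_way_f_ratio(herds)
print(f"F = {f_ratio:.2f}")
```

A large ratio says only that at least one mean differs; finding which herd it is takes a follow-up comparison, as the last verse notes.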

• ### A Fitting Conclusion

Lyrics and Music © 2015 Lawrence M. Lesser

0.007 is the p that we get.
That's smaller than the alpha we set.
So when it comes to the null we reviewed,
Reject the null we conclude!

• ### Flight Simulators

Since a flight simulator uses simulations for learning decision-making in flying, how about using one to teach simulation-based inference?
That could make a great pilot program!

Dennis Pearl and Larry Lesser

• ### Take Me Out to the Brew'ry

Lyric © 2017 Lawrence M. Lesser
may sing to the tune of "Take Me Out to the Ball Game" (Jack Norworth and Albert Von Tilzer)

Take me out to the brew'ry,
biggest one in the world:
Guinness used data to lead the pack--
boost the taste and keep costs on track!
But with few, few samples for testing,
mean's error was so unexplained:
then came William Gosset's result
under Student's name!

• ### Type II Error

may sing to the tune of "Barbara Ann" by the Beach Boys

Type II Error
Type II Error
Type II Error
Cause me to fail to reject
The false null hypothesis
Type II Error
Type II Error

No Evidence
No Evidence
No Evidence
Cause me to fail to reject
The false null hypothesis
Type II Error
Type II Error