
Testing

  • Since a flight simulator uses simulations for learning decision-making in flying, how about using one to teach simulation-based inference?  That could make a great pilot program!

    Larry Lesser

  • Lyrics © 2015 by Larry Lesser, music by Larry Lesser and Dominic Dousa

    H-O is called the null,
    It's the status quo, conservative and dull.
    The null is that you're innocent when you're on trial,
    That's just our judicial style.

    So evidence is gathered, the data summarized:
    They deliberate on which verdict is wise.
    "Fail to reject the null" is a verdict to acquit.
    "Rejection of the null" is a verdict to convict.

    If you "fail to reject" the null could still be false;
    A "not guilty" verdict doesn't prove you have no faults.
    A Type One error is to falsely convict.
    A Type Two error is to falsely acquit.

    If the trial has high power, the odds are good.
    It will convict when it should.
    Courtroom analogy helps us out
    With hypothesis tests, beyond a reasonable doubt!
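
The courtroom analogy in the song above maps onto a two-by-two table of outcomes. Here is a minimal Python sketch of that table; the function name `classify_verdict` and its return labels are illustrative assumptions, not part of the lyrics.

```python
def classify_verdict(null_is_true: bool, reject_null: bool) -> str:
    """Label a test outcome using the courtroom analogy.

    null_is_true: the defendant is actually innocent (H0 holds).
    reject_null:  the jury convicts (the test rejects H0).
    """
    if null_is_true and reject_null:
        return "Type I error"      # falsely convict
    if not null_is_true and not reject_null:
        return "Type II error"     # falsely acquit
    return "correct decision"      # convict the guilty or acquit the innocent


# Print all four combinations of truth and verdict.
for truth in (True, False):
    for reject in (True, False):
        outcome = classify_verdict(truth, reject)
        print(f"H0 true={truth!s:5} reject={reject!s:5} -> {outcome}")
```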

  • Lyrics © 2015 by Larry Lesser, music by Dominic Dousa

    When it comes to bundles of joy,
    Fifty one percent of births are boys.
    That's so close to half
    You might just laugh.

    But with all the births every year,
    It's statistically significant I fear!

    CHORUS [2X]: 

    Said it before, I'll say it again:
    Everything's significant for big enough n!

    A thousand stations were asked,
    What was the price of their gas.
    The calculation went:
    The mean was up by 1 cent!

    [REPEAT CHORUS 2X:] 

    Two million students took the SAT:
    It's one measure of quality.
    They said this year's average score
    Had declined by four.

    [REPEAT CHORUS 2X] 

    When you do a hypothesis test,
    The statistic formula is expressed.
    And if n is bigger, then you'll see
    that that means a smaller p.

    [REPEAT CHORUS 2X] 

    Statistical significance
    Only goes so far,
    Find an effect size
    Before you show your charts.

    [REPEAT CHORUS 2X] 

    Three out of four dentists ain't a big deal,
    But three thousand of four thousand makes that same fraction
    Statistically real!
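
The song's point can be checked numerically with a one-proportion z-test of 51% boys against a null of 50%. Here is a minimal Python sketch under the normal approximation; the sample sizes and the function name `one_proportion_z_test` are assumptions chosen for illustration, not real birth counts.

```python
import math


def one_proportion_z_test(p_hat: float, p0: float, n: int) -> tuple[float, float]:
    """Return (z, two-sided p-value) for H0: proportion == p0.

    Uses the normal approximation, which is only reasonable for large n.
    """
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    # Two-sided tail area of a standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Same observed proportion, hypothetical sample sizes.
for n in (100, 10_000, 1_000_000):
    z, p = one_proportion_z_test(0.51, 0.50, n)
    print(f"n={n:>9,}  z={z:6.2f}  p={p:.3g}")
```

The observed gap of one percentage point never changes; only the standard error shrinks as n grows. That is why the song asks for an effect size alongside the p-value.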

  • Lyrics © 2015 Dominic Dousa and Lawrence M. Lesser, Music by Dominic Dousa

    In a two-way table, blood types are the rows
    And birth months are the columns – that's just how we chose.
    The chi-squared test can tell us the answer to this question:
    Do the categories interact? Is there a connection?

    REPEAT 2X: 
    Compare expected counts to values we observed:
    We will answer YES if a large gap occurred.

    So birth months is in columns
    and blood types that's in rows.
    The two are independent under our H-O.
    Now we have 12 columns and also have 4 rows
    That means 33 degrees of freedom that we must know.

    REPEAT 2X: 
    The chi-squared distribution has a long right tail:
    That's the tail we focus on to see if H-O fails!
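
The degrees-of-freedom count and the "expected versus observed" comparison in the song can be sketched in a few lines of Python. The toy table below is hypothetical (not blood-type data), and the function names are assumptions made for this example.

```python
def degrees_of_freedom(n_rows: int, n_cols: int) -> int:
    """Degrees of freedom for a chi-squared test of independence."""
    return (n_rows - 1) * (n_cols - 1)


def chi_squared_statistic(observed: list[list[float]]) -> float:
    """Sum of (observed - expected)^2 / expected over every cell.

    Expected counts assume the rows and columns are independent:
    expected = row total * column total / grand total.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (obs - expected) ** 2 / expected
    return statistic


# 4 blood types (rows) by 12 birth months (columns), as in the song.
print(degrees_of_freedom(4, 12))  # 33

# Hypothetical 2x2 counts, only to show the calculation.
toy_table = [[20, 10], [10, 20]]
print(chi_squared_statistic(toy_table))
```

A large statistic falls in the long right tail of the chi-squared distribution, which is the evidence against independence that the chorus describes.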
  • Music and Lyrics © 2016 Monty Harper

    Suppose you study 12 different herds of elephants
    Can the differences in their average weights
    be explained by random error?
    Or do any herds have weights that are truly different?
    It's a difficult conundrum
    Lucky for you there's a test that will help you compare!

    CHORUS:
    ANOVA, ANOVA
    ANOVA, ANOVA
    ANOVA, ANOVA
    Analysis of Variance

    The variance of the average weight
    between the different herds
    Is matched against the variance of
    weights within each set
    The ratio of "between" against
    "within" gives your statistic
    If the average weights are all equal, how
    unlikely is the "between" / "within" ratio you get?

    [REPEAT CHORUS] 

    If the variance between your herds
    Dominates that within
    You can reject the null hypothesis
    And let the real work begin
    At least one herd of your elephants features
    weights of a different mean
    But ANOVA gives no details -
    It's a one trick multiple mean comparing machine.

    [REPEAT CHORUS] 
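
The "between" over "within" ratio from the song is the one-way ANOVA F statistic. Here is a minimal Python sketch of it; the herd weights are made-up numbers and the function name `anova_f_statistic` is an assumption for illustration.

```python
from statistics import mean


def anova_f_statistic(groups: list[list[float]]) -> float:
    """One-way ANOVA F: mean square between groups / mean square within."""
    k = len(groups)                          # number of groups (herds)
    n_total = sum(len(g) for g in groups)    # total number of observations
    grand_mean = mean(x for g in groups for x in g)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each observation around its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Hypothetical weights (in tonnes) for three small herds.
herds = [[4.0, 5.0, 6.0], [5.0, 6.0, 7.0], [8.0, 9.0, 10.0]]
print(anova_f_statistic(herds))
```

A large F only says that at least one mean differs. As the last verse notes, finding which herd differs takes follow-up comparisons.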
  • Lyrics and Music © 2015 Lawrence M. Lesser

    0.007 is the p that we get.
    That's smaller than the alpha we set.
    So when it comes to the null we reviewed,
    Reject the null we conclude!
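
The decision rule in this verse fits in one line of Python. The song does not say which alpha was set, so the 0.05 below is an assumption, as is the function name `decide`.

```python
def decide(p_value: float, alpha: float) -> str:
    """Reject the null when the p-value is smaller than alpha."""
    return "reject the null" if p_value < alpha else "fail to reject the null"


# p = 0.007 comes from the song; alpha = 0.05 is an assumed threshold.
print(decide(0.007, alpha=0.05))
```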

  • Since a flight simulator uses simulations for learning decision-making in flying, how about using one to teach simulation-based inference?
    That could make a great pilot program!

     

    Dennis Pearl

  • Lyric © 2017 Lawrence M. Lesser
    may sing to the tune of "Take Me Out to the Ball Game" (Jack Norworth and Albert Von Tilzer)

    Take me out to the brew'ry,
    biggest one in the world:
    Guinness used data to lead the pack--
    boost the taste and keep costs on track!
    But with few, few samples for testing,
    mean's error was so unexplained:
    then came William Gosset's result
    under Student's name!

     

  • Lyrics © Mary McLellan
    may sing to the tune of "Barbara Ann" by Beach Boys

    Type II Error
    Type II Error
    Type II Error
    Cause me to fail to reject
    The false null hypothesis
    Type II Error
    Type II Error

    No Evidence
    No Evidence
    No Evidence
    Cause me to fail to reject
    The false null hypothesis
    Type II Error
    Type II Error

  • Lyrics © Mary McLellan
    may sing to the tune of Queen's "We Will Rock You"

    If the null is really true and we get a sample
    That causes us to reject the null,
    That’s a Type I Error

    Type I, Type I Error
    Type I, Type I Error

    Repeat
