Data Management

  • Lyrics & Music © 2017 by Greg Crowther

    When should the data be kept as is?
    Try to think about a real-life case.
    If you toss a point just to reach significance,
    That exclusion would be off-base.
    When can the data be clipped and trimmed?
    Try to picture a scenario.
    If you toss a point because the measurement was done wrong,
    That exclusion just might be the way to go.

    Throw that bad point out!
    Should we throw that bad point out?
    The IRB is not a group to flout.
    Should we throw that bad point out?

    When should a study be redesigned?
    Note the principles we must apply.
    If subjects don't provide their informed consent,
    The risk to the subjects may be deemed too high.

    Throw that study out!
    Should we throw that study out?
    The IRB is not a group to flout.
    So should we throw that study out?

  • There are a lot of small data problems that occur in big data.  They don't disappear because you've got lots of stuff.  They get worse.

    David J. Spiegelhalter (1953 - )

  • Always expect to find at least one error when you proofread your own statistics. If you don't, you are probably making the same mistake twice.

    Cheryl Russell

  • The individual source of the statistics may easily be the weakest link. Harold Cox tells a story of his life as a young man in India. He quoted some statistics to a Judge, an Englishman, and a very good fellow. His friend said, "Cox, when you are a bit older, you will not quote Indian statistics with that assurance. The Government are very keen on amassing statistics ... they collect them, add them, raise them to the nth power, take the cube root and prepare wonderful diagrams. But what you must never forget is that every one of these figures comes in the first place from the chowty dar [village watchman], who just puts down what he damn pleases."

    Sir Josiah Charles Stamp (1880 - 1941)