# Chance News 6

September 16 2005 to September 30 2005

## Statistics could have spotted mass murderer

NewScientist.com news service, John Pickrell, 06 September 2005.
Plotting death, Emma Young, New Scientist, 09 February 2001.

In 2000 Harold Shipman, a UK family doctor, was convicted of murdering 15 patients. Statistical quality-control methods might have allowed authorities to catch Shipman much sooner, according to a new study. David Spiegelhalter of the UK Medical Research Council's Biostatistics Unit, Cambridge, UK, has applied an industrial quality-control technique to data on the deaths of Shipman's patients to see whether his heinous crimes could have been detected any earlier. "The method was first used in 1943 to ensure consistent quality in explosive shells and other wartime munitions production lines", says Spiegelhalter. The same statistics are used today in a wide range of industries, but have never before been applied to healthcare performance.

These methods on their own are not enough but cross-referencing other factors such as time of death or location might have set alarm bells ringing – many of Shipman's victims were murdered at around 3pm, during his afternoon rounds, and anomalous numbers died while in Shipman’s presence. "Maths has many crime-fighting applications," comments applied mathematician Chris Budd at the University of Bath, UK. "Statistics can be a useful approach for detecting anomalies in many contexts," he says.

The control-chart graphical method was developed by US physicist Walter Shewhart for use in the manufacturing industry. Scores are not ranked into a league table. Instead, the number of adverse outcomes is plotted against the total number of cases on a graph. A line is drawn through the mean, and all scores within three standard deviations (in practice, most of the scores) are considered to be down to inherent variation in the system. Any scores outside the 'control limits' suggest a special cause. Tom Marshall, of the University of Birmingham, told New Scientist:

This tells you where the real problems in a system are. In a league table, someone has to be at the top and someone has to be at the bottom, but that doesn't necessarily mean any kind of intervention should be taken.

However, Spiegelhalter believes his method is more powerful as it allows data from multiple years to be assessed.
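The control-limit rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: adverse outcomes are treated as binomial, so the standard deviation of a unit's rate is taken to be sqrt(p(1-p)/n); the practice names and counts are invented; and the function names `control_limits` and `flag_special_causes` are hypothetical. It is not the analysis that Spiegelhalter or Marshall actually carried out.

```python
import math

def control_limits(p_bar, n_cases, sigmas=3.0):
    """Lower and upper control limits for a rate observed over n_cases cases.

    Assumes binomial variation around the overall mean rate p_bar.
    """
    sd = math.sqrt(p_bar * (1.0 - p_bar) / n_cases)
    return max(0.0, p_bar - sigmas * sd), min(1.0, p_bar + sigmas * sd)

def flag_special_causes(units, sigmas=3.0):
    """Return the names of units whose adverse-outcome rate is outside the limits.

    units maps a name to a pair (adverse_outcomes, total_cases).
    """
    total_events = sum(events for events, _ in units.values())
    total_cases = sum(cases for _, cases in units.values())
    p_bar = total_events / total_cases  # the line drawn through the mean
    flagged = []
    for name, (events, cases) in units.items():
        low, high = control_limits(p_bar, cases, sigmas)
        rate = events / cases
        if rate < low or rate > high:
            flagged.append(name)
    return flagged

# Invented data: deaths and patients for four hypothetical practices.
practices = {
    "practice_A": (20, 1000),
    "practice_B": (25, 1000),
    "practice_C": (18, 1000),
    "practice_D": (60, 1000),
}
flagged = flag_special_causes(practices)
print(flagged)
```

With these made-up numbers only `practice_D` falls outside the three-standard-deviation limits. The other three differ from one another, but by no more than inherent variation would explain, which is the point of the comparison with a league table.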

## A remarkable story about the German Lotto

A Miracle in the German Lotto
Strange But True, Richmond.com
April 26, 2005
Bill Sones and Rich Sones
(reprinted with permission of Bill and Rich Sones)

Q. In the German Lotto on June 21, 1995, the numbers 15-25-27-30-42-48 were drawn out of 49. Bizarrely, it was later discovered that these same 6 had been drawn on Dec. 20, 1986, something that hadn't happened before in the 3,016 drawings of the Lotto. With nearly 14 million combos possible, how incredible was this really? Was someone rigging the picks?

A. Pardon the daunting math, but to figure the likelihood of getting a match in 3,016 drawings, first you need to figure the likelihood of getting NO matches, says Henk Tijms in his new book "Understanding Probability." Start by multiplying 13,983,816 x (13,983,816 - 1) x (13,983,816 -2)... x (13,983,816 - 3,015). There are 3,016 factors here.

Then divide by 13,983,816 to the 3016 power! Once your overworked computer cools off, you'll see an answer of 0.7224. But that's the likelihood of no matches; instead subtract 1 - 0.7224 to find the likelihood of at least one match, which equals 0.2776. This means there was a better than 1 in 4 chance that this would happen in 3,016 drawings!

What fools people is that the chance of matching a PARTICULAR 6-number-sequence is vanishingly small, but not the chance of matching SOME 6-number-sequence along the way. Says Tijms: This is basically another version of the classic birthday paradox, where all it takes is 23 people in a room for there to be a 50-50 chance of at least two of them having the same birthday. "In the lottery situation, it's analogous to there being 3,016 people in the room and 13,983,816 possible birthdays." So no rigging necessary, but just someone's incredible recordkeeping to track thousands of picks and spot the repeat.

### Discussion

The article says that the math is "daunting." Is it really so daunting? Where did the authors get the calculation that divides 13,983,816 x (13,983,816 - 1) x (13,983,816 -2)... x (13,983,816 - 3,015) by 13,983,816 to the 3016 power? Is there a better way to arrange the calculation so that your computer won't be so overworked?
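One possible arrangement, offered as a sketch rather than as the method Tijms used: the ratio can be written as a product of 3,016 factors of the form (1 - k/13,983,816), and summing the logarithms of those factors avoids ever forming the enormous numerator or denominator. The function name `prob_at_least_one_repeat` is hypothetical.

```python
import math

N_COMBOS = math.comb(49, 6)  # number of ways to choose 6 numbers from 49
N_DRAWS = 3016

def prob_at_least_one_repeat(n_outcomes, n_draws):
    """Probability that some outcome repeats among n_draws equally likely draws."""
    # log P(no repeat) = sum over k of log(1 - k / n_outcomes), k = 0 .. n_draws - 1
    log_no_repeat = sum(math.log1p(-k / n_outcomes) for k in range(n_draws))
    # 1 - exp(x), computed without losing precision when x is near zero
    return -math.expm1(log_no_repeat)

p_repeat = prob_at_least_one_repeat(N_COMBOS, N_DRAWS)
print(round(p_repeat, 4))

# The same function handles the classic birthday problem.
p_birthday = prob_at_least_one_repeat(365, 23)
print(round(p_birthday, 4))
```

The first result should agree with the 0.2776 quoted in the article, and the second is the familiar figure of just over one half for 23 people.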

## Self Experimentation

Does the Truth Lie Within?, The New York Times, September 11, 2005, Stephen J. Dubner and Steven D. Levitt.

Another article in the New York Times by the authors of Freakonomics discusses self experimentation. The authors highlight the work of Seth Roberts:

Seth Roberts is a 52-year-old psychology professor at the University of California at Berkeley. If you knew Roberts 25 years ago, you might remember him as a man with problems. He had acne, and most days he woke up too early, which left him exhausted. He wasn't depressed, but he wasn't always in the best of moods. Most troubling to Roberts, he was overweight: at 5-foot-11, he weighed 200 pounds.

When you encounter Seth Roberts today, he is a clear-skinned, well-rested, entirely affable man who weighs about 160 pounds and looks 10 years younger than his age. How did this happen?

The authors go on to say that all of Seth Roberts's ills were cured by self experimentation. He tried various interventions and evaluated the results.

It took him more than 10 years of experimenting, but he found that his morning insomnia could be cured if, on the previous day, he got lots of morning light, skipped breakfast and spent at least eight hours standing.

Losing weight depended on a theory of weight regulation known as the set-point theory, which says that your weight is determined by what you needed to survive during Stone Age times because

when food is scarcer, you become less hungry; and you get hungrier when there's a lot of food around.

This encouraged us to store food as fat when it was plentiful, to help us survive when food became scarce. It worked well long ago, but today food is plentiful all year round, so the signal to stop storing fat never gets turned off. Seth Roberts experimented with flavors that might trick the set-point system into thinking that food was actually scarce. A bland, unflavorful diet might work, he theorized, but no one wants to be stuck eating like that. He found that a few tablespoons of unflavored oil worked, as did several ounces of sugar water.

The results were astounding. Roberts lost 40 pounds and never gained it back. He could eat pretty much whenever and whatever he wanted, but he was far less hungry than he had ever been. Friends and colleagues tried his diet, usually with similar results.

These self experiments are a special case of "N of 1 trials." The N of 1 trial is simply a crossover trial with a single patient. A good example of this type of trial appears in the British Medical Journal: Jeffrey Mahon, Andreas Laupacis, Allan Donner, Thomas Wood. Randomised study of n of 1 trials versus standard practice. BMJ 1996; 312: 1069-1074 (27 April). Full free text.

Numerous references and additional details are at the Freakonomics website.

There are several amusing examples of self experimentation at this website, but not mentioned is one of my personal favorites, the 1994 Ig Nobel prize winner in Entomology:

Robert A. Lopez of Westport, NY, valiant veterinarian and friend of all creatures great and small, for his series of experiments in obtaining ear mites from cats, inserting them into his own ear, and carefully observing and analyzing the results. [Published as "Of Mites and Man," The Journal of the American Veterinary Medical Association, vol. 203, no. 5, Sept. 1, 1993, pp. 606-7.]

### Discussion

One potential source of bias in self experimentation is the inability to adequately blind the patient. What other sources of bias are there in this type of study?

Most research requires the informed consent of the research subject. Is informed consent an issue in self experimentation?

Submitted by Steve Simon

## Text Stats - squeezing the life out of poetry and prose

Amazon's Vital Statistics Show How Books Stack Up, Linton Weeks, Washington Post, August 30, 2005.

Amazon.com has a new capability, called Search Inside, which allows users to search through the entire text of a book online. This functionality produces some elementary statistics such as the number of letters, words and sentences. The Fun Stats section tells you how many words you are getting per dollar and per ounce with each book. So Leo Tolstoy's War & Peace offers 51,707 words per dollar, while Obliviously On He Sails: The Bush Administration in Rhyme by Calvin Trillin delivers only 1,106 words per dollar. When The Washington Post confronted Trillin with this evidence, he replied,

"I don't mind being compared to Tolstoy literarily, but when it comes to Fun Stats it's a little humiliating."

Search Inside also tries to quantify the readability and complexity of a book and to graphically display the most common words, called a concordance, by setting the font size to be proportional to the number of times that word occurs in the book, a very simple yet surprisingly effective technique. (Incidentally, the same graphical tool is used by flickr.com to display its most popular images, based on the tags that have been added to them, a new way to organise information that may eventually replace folders (click on the pdf download button and see Death to folders! on page 18).) A book's readability is quantified using three statistics: the Fog, Flesch and Flesch-Kincaid indices. Search Inside also graphically ranks your chosen book's stats relative to other books, but in a rather uninspiring manner. Complexity is quantified as the percentage of words with three or more syllables, the average number of syllables per word and the average number of words per sentence.
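For readers who want to see how such indices are built, here is a minimal sketch using the formulas as they are commonly published for the Gunning Fog index, the Flesch Reading Ease score and the Flesch-Kincaid grade level. It is an assumption that Amazon uses these exact forms; the article does not say. The functions take pre-computed counts because reliable syllable counting is a separate problem, and the sample counts are invented.

```python
def fog_index(words, sentences, complex_words):
    """Gunning Fog index; complex_words are words of three or more syllables."""
    return 0.4 * (words / sentences + 100.0 * complex_words / words)

def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease; higher scores indicate easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid grade level; roughly a US school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

# Invented counts for a short passage: 100 words, 5 sentences,
# 150 syllables, 10 words of three or more syllables.
fog = fog_index(words=100, sentences=5, complex_words=10)
ease = flesch_reading_ease(words=100, sentences=5, syllables=150)
grade = flesch_kincaid_grade(words=100, sentences=5, syllables=150)
print(round(fog, 2), round(ease, 2), round(grade, 2))
```

All three indices depend only on words per sentence and on a syllable-based measure of word length, the same ingredients the site reports under complexity.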

The Washington Post quantifies its favourite books:

Ulysses by James Joyce (9 on the Fog Index) is more complicated than William Faulkner's The Sound and the Fury (5.7 on the Fog Index). Yes, Charlotte Bronte provides more words per ounce (13,959) in Shirley than her sister Emily (10,444) in Wuthering Heights. And, yes, Ernest Hemingway used fewer complex words (5 percent) in his short stories than F. Scott Fitzgerald (9 percent).

But they are distinctly unimpressed by the cold, harsh statistics:

But in its pure form, Text Stats is a triumph of trivialization. By squeezing all the life and loveliness out of poetry and prose, the computer succeeds in numbing with numbers. It's the total disassembling of truth, beauty and the mysterious meaning of words. Except for the Concordance feature, which arranges the 100 most used words in the book into a kind of refrigerator-magnet poetry game.

### Discussion

Can you think of alternative ways to quantify readability and complexity?

What alternative graphics might be suitable to display the statistics?

Try searching for your favourite author's books to see if his/her stats are improving over time.

Submitted by John Gavin.

## Amazon Uses Chance to Target Customers

I received this email from Amazon.com:

Amazon.com has new recommendations for you based on 17 items you purchased or told us you own.

We recommend LaTeX Companion, The (2nd Edition) (Addison-Wesley Series on Tools and Techniques for Computer T)

List Price: $59.99 Price: $45.86
You Save: $14.13 (24%)

Because you purchased or rated: The LaTeX Graphics Companion: Illustrating Documents with TeX and Postscript(R) you are 9,290 times more likely to purchase this item than other customers.

We hope you like these recommendations and would love to hear your feedback on our recommendations algorithm. Email us at cb-arr-lift@amazon.com.

Submitted by Bob Hayden.

## Warned, but Worse Off

According to an article in the New York Times (August 22, 2005), people whose lung cancers are detected by CT scans have a five-year survival rate of 80 percent, whereas those whose cancers are detected by conventional means have a five-year survival rate of only 15 percent. Is this a good reason for someone at risk for this disease to get a CT scan?

The answer is that no one knows. It may be that CT scans don't actually result in people living any longer than they would otherwise, and the epidemiology is at this point not understood well enough.

There are two issues. First, CT scans are capable of detecting a cancer at an earlier stage (since they can detect smaller tumors), but it's not clear that earlier detection and treatment of these cancers actually adds anything to the lifespan. For example, suppose a tumor is detected by CT at a stage when the patient will survive for six years regardless of treatment, whereas conventional methods would have detected it only three years later. This patient is counted among those who survived five years after detection by CT scan, but a similar patient whose cancer was detected conventionally survives only three years after detection and so does not count toward the five-year survival rate for that group.

The second issue involves the fact that CT scans may detect a large number of small tumors that do not progress. We know that smokers are 15 times more likely to die from lung cancer than nonsmokers, yet in a Japanese study, CT scans detected tumors at similar rates in both smokers and nonsmokers. Evidently, most of the tumors found in the nonsmokers don't progress to become life-threatening.
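Both issues can be illustrated with a toy calculation. The numbers below are invented for illustration, apart from the three-year head start taken from the example above; the helper name `five_year_survival` is hypothetical. Every patient dies on the same date whichever way the cancer is found, so any difference in the survival rates comes from the bookkeeping alone.

```python
import math

def five_year_survival(cases):
    """Fraction of cases still alive five years after their own detection date.

    cases is a list of (year_detected, year_of_death) pairs.
    """
    alive = sum(1 for detected, died in cases if died - detected >= 5)
    return alive / len(cases)

# Invented cohort: year of death, counted from the moment the tumor
# first becomes visible on a CT scan. Treatment is assumed to change nothing.
death_years = [4, 6, 6, 7, 9]
LEAD_TIME = 3  # conventional detection happens three years after CT detection

ct_cases = [(0, died) for died in death_years]
conventional_cases = [(LEAD_TIME, died) for died in death_years]

ct_rate = five_year_survival(ct_cases)
conventional_rate = five_year_survival(conventional_cases)

# Second issue: CT also finds tumors that never progress. Model them as
# cases whose cancer death never occurs (year of death = infinity).
non_progressing = [(0, math.inf)] * 5
ct_rate_with_extra = five_year_survival(ct_cases + non_progressing)

print(ct_rate, conventional_rate, ct_rate_with_extra)
```

In this toy cohort the five-year survival rate is 80 percent for CT detection and 20 percent for conventional detection, and it rises to 90 percent once the non-progressing tumors are included, even though no one lives a day longer.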

The consequences may be serious. If CT scans don't really result in much prolongation of life, the result of using them may actually be a net negative for most patients, who might undergo unnecessary biopsies and other followup treatments.

### Discussion

Often it is not enough to calculate only probabilities, for decisions must also be made, and decisions do not depend only on probabilities. In a medical context, patients need to know their risk of having a disease, given the results of the tests they have had; but even if the risk is very low, the patient also needs to know what the consequences of followup diagnostic tests or treatment are likely to be. How do you think a patient should go about making such a decision, based on both of these factors?