Chance News 6

September 16 2005 to September 30 2005

Statistics could have spotted mass murderer

NewScientist.com news service, John Pickrell, 06 September 2005.
Plotting death Emma Young, New Scientist, 09 February 2001.

In 2000 Harold Shipman, a UK family doctor, was convicted of murdering 15 patients. Statistical quality-control methods might have allowed authorities to catch Shipman much sooner, according to a new study. David Spiegelhalter of the UK Medical Research Council's Biostatistics Unit, Cambridge, UK, has applied an industrial quality-control technique to data on the death of Shipman's patients to see if his heinous crimes could have been detected any earlier. "The method was first used in 1943 to ensure consistent quality in explosive shells and other wartime munitions production lines", says Spiegelhalter. The same statistics are used today in a wide range of industries, but have never before been applied to healthcare performance.

These methods on their own are not enough but cross-referencing other factors such as time of death or location might have set alarm bells ringing – many of Shipman's victims were murdered at around 3pm, during his afternoon rounds, and anomalous numbers died while in Shipman’s presence. "Maths has many crime-fighting applications," comments applied mathematician Chris Budd at the University of Bath, UK. "Statistics can be a useful approach for detecting anomalies in many contexts," he says.

The control-chart graphical method was developed by US physicist Walter Shewhart for use in the manufacturing industry. Scores are not ranked into a league table. Instead, the number of adverse outcomes is plotted against the total number of cases on a graph. A line is drawn through the mean, and all scores within three standard deviations (in practice, most of the scores) are considered to be down to inherent variation in the system. Any scores outside the 'control limits' suggest a special cause. Tom Marshall, of the University of Birmingham, told New Scientist:

This tells you where the real problems in a system are. In a league table, someone has to be at the top and someone has to be at the bottom, but that doesn't necessarily mean any kind of intervention should be taken.

However, Spiegelhalter believes his method is more powerful as it allows data from multiple years to be assessed.
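The control-limit idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the practice names and counts are made up, the function names are hypothetical, and the limits use a simple binomial standard deviation around the pooled mean rate. It is not Spiegelhalter's multi-year method, which the article does not describe in detail.

```python
import math

def control_limits(total_events, total_cases, n_cases, k=3.0):
    """Return (mean rate, lower limit, upper limit) for a unit with n_cases.

    Assumes adverse outcomes are binomial, so the standard deviation of a
    unit's observed rate is sqrt(p * (1 - p) / n_cases).
    """
    p_bar = total_events / total_cases
    sd = math.sqrt(p_bar * (1 - p_bar) / n_cases)
    return p_bar, max(0.0, p_bar - k * sd), min(1.0, p_bar + k * sd)

def flag_special_causes(units, k=3.0):
    """units maps a name to (adverse_outcomes, cases); return names outside the limits."""
    total_events = sum(events for events, _ in units.values())
    total_cases = sum(cases for _, cases in units.values())
    flagged = []
    for name, (events, cases) in units.items():
        _, lower, upper = control_limits(total_events, total_cases, cases, k)
        rate = events / cases
        if rate < lower or rate > upper:
            flagged.append(name)
    return flagged

# Hypothetical data: (adverse outcomes, total cases) for five practices.
practices = {
    "A": (20, 1000),
    "B": (25, 1000),
    "C": (18, 1000),
    "D": (22, 1000),
    "E": (60, 1000),
}
print(flag_special_causes(practices))
```

Unlike a league table, which would always put some practice at the bottom, this flags only the units whose rates fall outside the limits. With the made-up counts above, that is practice E alone.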

A remarkable story about the German Lotto

A Miracle in the German Lotto
Strange But True, Richmond.com
April 26, 2005
Bill Sones and Rich Sones
(reprinted with permission of Bill and Rich Sones)

Q. In the German Lotto on June 21, 1995, the numbers 15-25-27-30-42-48 were drawn out of 49. Bizarrely, it was later discovered that these same 6 had been drawn on Dec. 20, 1986, something that hadn't happened before in the 3,016 drawings of the Lotto. With nearly 14 million combos possible, how incredible was this really? Was someone rigging the picks?

A. Pardon the daunting math, but to figure the likelihood of getting a match in 3,016 drawings, first you need to figure the likelihood of getting NO matches, says Henk Tijms in his new book "Understanding Probability." Start by multiplying 13,983,816 x (13,983,816 - 1) x (13,983,816 -2)... x (13,983,816 - 3,015). There are 3,016 factors here.

Then divide by 13,983,816 to the 3016 power! Once your overworked computer cools off, you'll see an answer of 0.7224. But that's the likelihood of no matches; instead subtract 1 - 0.7224 to find the likelihood of at least one match, which equals 0.2776. This means there was a better than 1 in 4 chance that this would happen in 3,016 drawings!

What fools people is that the chance of matching a PARTICULAR 6-number-sequence is vanishingly small, but not the chance of matching SOME 6-number-sequence along the way. Says Tijms: This is basically another version of the classic birthday paradox, where all it takes is 23 people in a room for there to be a 50-50 chance of at least two of them having the same birthday. "In the lottery situation, it's analogous to there being 3,016 people in the room and 13,983,816 possible birthdays." So no rigging necessary, but just someone's incredible recordkeeping to track thousands of picks and spot the repeat.
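The calculation in the answer above need not overwork a computer. A minimal sketch in Python, assuming each drawing is equally likely to be any of the possible combinations: write the probability of no match as a product of 3,016 factors of the form (1 - i/13,983,816) and add their logarithms rather than multiplying huge numbers. The function name is hypothetical.

```python
import math

def prob_no_repeat(n_draws, n_outcomes):
    """Probability that n_draws independent, equally likely draws are all different.

    Sums log(1 - i / n_outcomes) for i = 0 .. n_draws - 1, which avoids the
    enormous numerator and denominator of the direct formula.
    """
    log_p = sum(math.log1p(-i / n_outcomes) for i in range(n_draws))
    return math.exp(log_p)

n_outcomes = math.comb(49, 6)          # number of ways to choose 6 numbers from 49
p_none = prob_no_repeat(3016, n_outcomes)
p_some = 1 - p_none

print(n_outcomes)                      # 13983816
print(round(p_none, 4), round(p_some, 4))
```

The rounded results agree with the 0.7224 and 0.2776 quoted in the article.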

Discussion

The article says that the math is "daunting." Is it really so daunting? Where did the authors get the calculation that divides 13,983,816 x (13,983,816 - 1) x (13,983,816 -2)... x (13,983,816 - 3,015) by 13,983,816 to the 3016 power? Is there a better way to arrange the calculation so that your computer won't be so overworked?

Self Experimentation

Does the Truth Lie Within? The New York Times, September 11, 2005, Stephen J. Dubner and Steven D. Levitt

Another article in the New York Times by the authors of Freakonomics discusses self experimentation. The authors highlight the work of Seth Roberts:

Seth Roberts is a 52-year-old psychology professor at the University of California at Berkeley. If you knew Roberts 25 years ago, you might remember him as a man with problems. He had acne, and most days he woke up too early, which left him exhausted. He wasn't depressed, but he wasn't always in the best of moods. Most troubling to Roberts, he was overweight: at 5-foot-11, he weighed 200 pounds.

When you encounter Seth Roberts today, he is a clear-skinned, well-rested, entirely affable man who weighs about 160 pounds and looks 10 years younger than his age. How did this happen?

The authors go on to say that all of Seth Roberts's ills were cured by self experimentation. He tried various interventions and evaluated the results.

It took him more than 10 years of experimenting, but he found that his morning insomnia could be cured if, on the previous day, he got lots of morning light, skipped breakfast and spent at least eight hours standing.

Losing weight depended on a theory of weight regulation known as the set-point theory, which says that your weight is determined by what you needed to survive during the Stone Age because

when food is scarcer, you become less hungry; and you get hungrier when there's a lot of food around.

This encouraged us to store food as fat when it was plentiful, to help us survive when food became scarce. It worked well long ago, but today food is plentiful all year round, so the signal to stop storing fat never gets turned off. Seth Roberts experimented with flavors that might trick the set-point system into thinking that food was actually scarce. A bland, unflavorful diet might work, he theorized, but no one wants to be stuck eating like that. He found that a few tablespoons of unflavored oil worked, as did several ounces of sugar water.

The results were astounding. Roberts lost 40 pounds and never gained it back. He could eat pretty much whenever and whatever he wanted, but he was far less hungry than he had ever been. Friends and colleagues tried his diet, usually with similar results.

These self experiments are a special case of "N of 1 trials." A good example of this type of trial appears in the British Medical Journal: Jeffrey Mahon, Andreas Laupacis, Allan Donner, Thomas Wood. Randomised study of n of 1 trials versus standard practice. BMJ 1996; 312: 1069-1074 (27 April; full free text available). The N of 1 trial is simply a crossover trial with a single patient.

Numerous references and additional details are at the Freakonomics website

There are several amusing examples of self experimentation at this website, but not mentioned is one of my personal favorites, the 1994 Ig Nobel prize winner in Entomology:

Robert A. Lopez of Westport, NY, valiant veterinarian and friend of all creatures great and small, for his series of experiments in obtaining ear mites from cats, inserting them into his own ear, and carefully observing and analyzing the results. [Published as "Of Mites and Man," The Journal of the American Veterinary Medical Association, vol. 203, no. 5, Sept. 1, 1993, pp. 606-7.]

Discussion

One potential source of bias in self experimentation is the inability to adequately blind the patient. What other sources of bias are there in this type of study?

Most research requires the informed consent of the research subject. Is informed consent an issue in self experimentation?

Submitted by Steve Simon

Text Stats - squeezing the life out of poetry and prose

Amazon's Vital Statistics Show How Books Stack Up, Linton Weeks, Washington Post, August 30, 2005.

Amazon.com has a new capability, called Search Inside, which allows users to search through the entire text of a book online. This functionality produces some elementary statistics such as the number of letters, words and sentences. The Fun Stats section tells you how many words you are getting per dollar and per ounce with each book. So Leo Tolstoy's War & Peace offers 51,707 words per dollar, while Obliviously On He Sails: The Bush Administration in Rhyme by Calvin Trillin delivers only 1,106 words per dollar. When The Washington Post confronted Trillin with this evidence, he replied,

"I don't mind being compared to Tolstoy literarily, but when it comes to Fun Stats it's a little humiliating."

Search Inside also tries to quantify the readability and complexity of a book and to graphically display the most common words, called a concordance, by setting the font size to be proportional to the number of times that word occurs in the book, a very simple yet surprisingly effective technique. (Incidentally, the same graphical tool is used by flickr.com to display its most popular images, based on the tags that have been added to them, a new way to organise information that may eventually replace folders (click on the pdf download button and see Death to folders! on page 18).) A book's readability is quantified using three statistics, the Fog, Flesch and Flesch-Kincaid indices, and the book's stats are ranked graphically relative to other books, but in a rather uninspiring manner. Complexity is quantified as the percentage of words with three or more syllables, the average number of syllables per word and the average number of words per sentence.
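For readers who want to see what goes into the three readability indices, here is a minimal sketch in Python using the standard published formulas. It is an assumption that Amazon computes them in exactly this way. The word, sentence and syllable counts are made up for illustration, and the function names are hypothetical.

```python
def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease: higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid Grade Level: approximate US school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

def gunning_fog(words, sentences, complex_words):
    """Gunning Fog index; complex_words are words of three or more syllables."""
    return 0.4 * ((words / sentences) + 100 * (complex_words / words))

# Hypothetical counts for a short passage.
words, sentences, syllables, complex_words = 1000, 50, 1500, 100

print(round(flesch_reading_ease(words, sentences, syllables), 3))   # 59.635
print(round(flesch_kincaid_grade(words, sentences, syllables), 2))  # 9.91
print(round(gunning_fog(words, sentences, complex_words), 1))       # 12.0
```

All three indices depend only on words per sentence and on a syllable-based measure of word length, the same ingredients as the complexity statistics.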

The Washington Post quantifies its favourite books:

Ulysses by James Joyce (9 on the Fog Index) is more complicated than William Faulkner's The Sound and the Fury (5.7 on the Fog Index). Yes, Charlotte Bronte provides more words per ounce (13,959) in Shirley than her sister Emily (10,444) in Wuthering Heights. And, yes, Ernest Hemingway used fewer complex words (5 percent) in his short stories than F. Scott Fitzgerald (9 percent).

But they are distinctly unimpressed by the cold, harsh statistics:

But in its pure form, Text Stats is a triumph of trivialization. By squeezing all the life and loveliness out of poetry and prose, the computer succeeds in numbing with numbers. It's the total disassembling of truth, beauty and the mysterious meaning of words. Except for the Concordance feature, which arranges the 100 most used words in the book into a kind of refrigerator-magnet poetry game.

Discussion

Can you think of alternative ways to quantify readability and complexity?

What alternative graphics might be suitable to display the statistics?

Try searching for your favourite author's books to see if his/her stats are improving over time.

Submitted by John Gavin.

Amazon Uses Chance to Target Customers

I received this email from Amazon.com:

Amazon.com has new recommendations for you based on 17 items you purchased
or told us you own.

We recommend LaTeX Companion, The (2nd Edition) (Addison-Wesley Series on Tools and Techniques for Computer T)

List Price : $59.99
Price : $45.86
You Save : $14.13 (24%)

Because you purchased or rated: The LaTeX Graphics Companion: Illustrating Documents with TeX and Postscript(R) you are 9,290 times more likely to purchase this item than other customers.

We hope you like these recommendations and would love to hear your feedback
on our recommendations algorithm. Email us at cb-arr-lift@amazon.com.
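The address cb-arr-lift hints that the "9,290 times more likely" figure may be a lift statistic. As a rough sketch, and purely as an assumption about what Amazon computes, lift can be taken to be the purchase rate of item B among buyers of item A divided by the purchase rate of item B among all customers. The counts below are invented and the function name is hypothetical.

```python
def lift(n_both, n_a, n_b, n_total):
    """Lift of item B given item A: P(B | A) / P(B).

    n_both  - customers who bought both A and B
    n_a     - customers who bought A
    n_b     - customers who bought B
    n_total - all customers
    """
    p_b_given_a = n_both / n_a
    p_b = n_b / n_total
    return p_b_given_a / p_b

# Hypothetical counts: B is a niche book bought by 500 of 1,000,000 customers,
# but by 100 of the 200 customers who bought A.
print(lift(n_both=100, n_a=200, n_b=500, n_total=1_000_000))
```

With rare items, lift values in the thousands arise easily, because the overall purchase rate in the denominator is tiny.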

Submitted by Bob Hayden.

Warned, but Worse Off

According to an article in the New York Times (August 22, 2005), people whose lung cancers are detected by CT scans have a five-year survival rate of 80 percent, whereas those whose cancers are detected by conventional means have a five-year survival rate of only 15 percent. Is this a good reason for someone at risk for this disease to get a CT scan?

The answer is that no one knows. It may be that CT scans don't actually result in people living any longer than they would otherwise, and the epidemiology is at this point not understood well enough.

There are two issues: First, CT scans are capable of detecting a cancer in an earlier stage (since they can detect smaller tumors), but it's not clear that earlier detection and treatment of these cancers actually adds anything to the lifespan. For example, if a tumor was detected at a stage when the patient would have survived for six years, regardless of treatment, but would only have been detected conventionally three years later, then the result would be that this patient would be included in the statistics of those that survived five years after detection by CT scan, but a similar patient whose cancer was detected conventionally would not be included in the five-year survival rate for those patients.

The second issue involves the fact that CT scans may detect a large number of small tumors that do not progress. We know that smokers are 15 times more likely to die from lung cancer than nonsmokers, yet in a Japanese study, CT scans detected tumors at similar rates in both smokers and nonsmokers. Evidently, most of the tumors found in the nonsmokers don't progress to become life-threatening.

The consequences may be serious. If CT scans don't really result in much prolongation of life, the result of using them may actually be a net negative for most patients, who might undergo unnecessary biopsies and other followup treatments.
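The first issue, often called lead-time bias, can be illustrated with a small Python sketch. The cohort is invented, and the key assumption is that earlier detection moves the date of diagnosis three years earlier without changing the date of death at all. The function name is hypothetical.

```python
def five_year_survival(years_to_death, lead_time_years=0):
    """Fraction of patients alive five years after detection.

    years_to_death  - years from conventional detection to death, per patient
    lead_time_years - how much earlier the cancer is detected; the date of
                      death is assumed to be unchanged
    """
    survivors = sum(1 for t in years_to_death if t + lead_time_years >= 5)
    return survivors / len(years_to_death)

# Hypothetical cohort: years each patient lives after conventional detection.
cohort = [1, 2, 2, 3, 3, 3, 4, 6, 8, 10]

conventional = five_year_survival(cohort)                  # detected conventionally
ct_scan = five_year_survival(cohort, lead_time_years=3)    # detected 3 years earlier

print(conventional, ct_scan)  # 0.3 0.9
```

Nobody in this made-up cohort lives a day longer, yet the five-year survival rate rises from 30 percent to 90 percent simply because the clock starts earlier.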

Discussion

Often it is not enough to calculate only probabilities, for decisions must also be made, and decisions do not depend only on probabilities. In a medical context, patients need to know their risk of having a disease, given the results of the tests they have had; but even if the risk is very low, the patient also needs to know what the consequences of followup diagnostic tests or treatment are likely to be. How do you think a patient should go about making such a decision, based on both of these factors?

Strong criticism of statisticians is a way of life in Britain

It's a great job - 99 per cent of the time, The Observer, July 24, 2005.

This article is about the recent retirement of Len Cook, the 56-year-old director of the Office for National Statistics (ONS) in the UK, after five years in the job. A New Zealander, Cook had spent eight years in charge of the government statistical service in New Zealand prior to accepting the corresponding job in the UK.

The article offers an impressive list of the controversial statistical issues that he had to grapple with during his tenure, such as the impact of his £20 billion ($30bn) revision to the official overseas trade statistics caused by the discovery of so-called 'carousel fraud' on mobile phones and the reclassification of public spending on road maintenance.

The article comments,

with policymakers relying on statistics to bolster their decisions, and the markets betting on which way the figures will move, Cook has inevitably found himself under harsh scrutiny.

He sees part of his job as challenging his own number-crunchers:

My job is to take what is received wisdom, dust it up, and say: are we really sure this is what we should be doing?

Cook has come to the conclusion that

strong criticism of statisticians is part of the way of life in Britain.

And Lord Moser, head of statistics in the late 1960s and early 1970s, has reassured him that the nature of criticism of the statistical service hasn't changed, although the media and politicians have become more vociferous over the years. (Incidentally, in 1965 Moser applied for a job at the Central Statistical Office but was rejected before being appointed its head in 1967, according to his Wikipedia entry.)

Cook believes greater independence from government would help the ONS:

Statutory independence is a sensible step. But it would be very unwise for anyone - even the Statistics Commission - to believe that it's enough. The point is that it's hard working in the world of the British media, hard for lawyers; hard for doctors. Maybe that's just part of how life is.

Cook has found that, if you're mentioned in the British press, the rest of the world knows too. He comments:

The fact is that criticism of statistics is a way of criticising the government.

Given the importance of accurate statistics to government, business and the financial markets, Cook believes firmly that when the ONS has doubts about a particular series, these should not be revealed until they have been resolved internally.

It would not help the Monetary Policy Committee any to know that something is uncertain about a particular statistic, and they've got no idea what that uncertainty is.

But the article finishes on an upbeat note:

I can't imagine anything I could have done that would be professionally more satisfying.
For a statistician this is the world stage - the top civil service job in statistics. Maybe I spent two years getting the confidence to do it. But I've had the one job everyone wants to have success in. Everybody wants my job to work well.

The gender gap in salaries

Exploiting the gender gap
New York Times, 5 September, 2005, A21
Warren Farrell

Farrell is the author of Why Men Earn More: The Startling Truth Behind the Pay Gap -- and What Women Can Do About It (AMACOM, 2004)

This article was published for Labor Day, and it opens by citing a demoralizing, often-heard statistic: women still earn only 76 cents for each dollar paid to their male counterparts in the workplace. Farrell maintains that such comparisons ignore important lurking variables. He claims to have identified twenty-five tradeoffs involving job vs. lifestyle choices, all of which men tend to resolve in favor of higher pay, while women tend to seek better quality of life.

Here are some of the factors discussed in the article. Men more readily accept jobs with longer hours, and Farrell reports that people who work 44 hours per week earn twice as much as people who work 34 hours per week. Similarly, he finds that men are more willing to relocate or travel, to work in higher risk environments, and to enter technical fields where jobs may involve less personal interaction. Each of these choices is associated with higher pay.

Even head-to-head comparisons of men and women working in the “same job” can be tricky. Farrell observes, for example, that Bureau of Labor Statistics data consider all medical doctors together. But men opt more often for surgery or other higher paid specialties, while women more often choose general practice.

As indicated by the subtitle of his book, however, Farrell intends to provide some positive news for women. He claims that in settings where women and men match on his 25 variables, the women actually earn more than men. He also identifies a number of specific fields where women do better. One of these is statistics(!), where he reports that women enjoy a 35 percent advantage in earnings.

Discussion

A story in the Chronicle of Higher Education this summer (The womanly art of negotiation, 22 July, 2005, Catherine Conrad) observed that women are less likely than men to negotiate on salary for their first jobs, and that this initial disadvantage tends to persist throughout their careers. Can this be reconciled with Farrell’s analysis, or do you think he is missing something here?

Submitted by Bill Peterson, based on a posting from Joy Jordan to the Isolated Statisticians e-mail list.
