Chance News 20
- 1 Quotations
- 2 Forsooth
- 3 A car talk puzzle
- 4 A clumsy attempt at anonymization
- 5 Mean vs. Median
- 6 A Reader's Guide to Polls
- 7 Risk perceptions
- 8 Exit poll inventor dies aged 71
- 9 Estimating the diversity of dinosaurs
- 10 Man-made factors fuel hurricanes
Quotations
Like dreams, statistics are a form of wish fulfillment. - Jean Baudrillard
According to an article in the WSJ by Dr. Jerome Groopman of the Harvard Medical School criticizing alternative medicine: on the wall of the office of Dr. Stephen Straus who directs NCCAM, (formerly the Office of Alternative Medicine which is within the National Institutes of Health) there exists the following framed quotation, "The plural of anecdote is not evidence." This useful and insightful aphorism appears in various versions as can be seen by this website here.
Forsooth
"People who live longer have a greater chance of developing cancer in old age." Heard on the "Today" news programme on BBC Radio 4 and reported to the MEDSTATS discussion group by Ted Harding.
The next two Forsooths are from the September RSS News.
The number of motorists willing to pay to travel on Britain's roads is falling, a survey out today reveals. More than one in four drivers were willing to pay to use city centre roads in 2002, but that figure fell to just 36 per cent in 2005, a study for the RAC said.
16 March 2006
At present, Labour has a majority of 64, which means it holds 32 more seats than the other parties combined.
Times Online, 20 March 2006
A car talk puzzle
Week of 08-21-07
The bullet holes were all over the place on the R.A.F. planes -- in the wings and the fuselage, and seemingly distributed randomly on the undersides. So, where did the R.A.F. mathematician recommend extra armor, to save future missions?
A clumsy attempt at anonymization
A Face is Exposed for AOL Searcher No. 4417749. Michael Barbaro and Tom Zeller, Jr. The New York Times (August 9, 2006).
Statisticians frequently deal with confidentiality issues when deciding what type of data and what amount of detail should be withheld to protect sensitive information about individual patients or institutions. It's not an easy task and there are some subtle traps. And sometimes there are not so subtle traps.
At the request of some researchers, America Online (AOL) released data on 20 million web searches performed by 650 thousand AOL users over a three-month span. They released the data not just to those researchers, but to the general public. AOL quickly realized that this was a bad idea and removed the database, but it had already been copied to many locations. It is unlikely that they will ever be able to persuade the website owners at all the other locations to take the files offline.
The data was anonymized by replacing the user name with a random number. This is important, because some of the search terms are for rather sensitive items. Examples of things that people searched on are
- "can you adopt after a suicide attempt" or
- "how to tell your family you're a victim of incest."
But replacing a name by a number did not come even close to anonymizing all of the records. The problem is that people will do web searches about things that reveal hints about themselves. Actual searches listed in the data base included things like geographic locations:
- "gynecology oncologists in new york city,"
- "orange county california jails inmate information,"
- "employment needed- louisville ky," or
- "salem probate court decisions,"
or places where the searchers shopped or banked or got health care,
- "gerards restaurant in dc,"
- "st. margaret's hospital washington d.c.,"
- "l&n federal credit union," or
- "mustang sally gentlemans club,"
or products that the searchers owned,
- "cheap rims for a ford focus," or
- "how to change brake pads on scion xb,"
or their hobbies,
- "knitting stitches," or
- "texas hold'em poker on line seminars."
It gets even more revealing when people do web searches on their relatives or even themselves.
These individual searches are, according to one report, like individual pieces in a mosaic. Put enough of them together and you can get a really clear picture of who the searcher is. Can you actually identify people from their web searches? The answer is yes.
According to the New York Times report, one user, with the ID number 4417749, searched for
- "landscapers in Lilburn, Ga," and
- "homes sold in shadow lake subdivision gwinnett county georgia,"
as well as the names of several people, all of whose last names were Arnold. It didn't take long for the New York Times to track down a 62-year-old widow named Thelma Arnold.
Ms. Arnold, who agreed to discuss her searches with a reporter, said she was shocked to hear that AOL had saved and published three months’ worth of them. “My goodness, it’s my whole personal life,” she said. "I had no idea somebody was looking over my shoulder."
This is an important lesson that statisticians have been aware of for some time. An individual piece of information by itself may not compromise someone's privacy, but will do so when it is combined with other pieces of information. Knowing that someone lives in a small town still preserves anonymity, but when that small town name appears in a database of all pediatric heart transplant cases, you have a problem.
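The way separate clues combine can be illustrated with a minimal sketch. The population below is an invented toy data set, not the AOL data; the attribute values and the helper `candidates` are hypothetical and chosen only to show how each added clue shrinks the set of people who match.

```python
from itertools import product

# Hypothetical toy population (an assumption for illustration, not AOL data):
# one person for every combination of town, surname and age band.
towns = ["Lilburn", "Atlanta", "Macon", "Athens"]
surnames = ["Arnold", "Smith", "Jones", "Lee", "Brown"]
age_bands = ["20s", "40s", "60s"]
population = [
    {"town": t, "surname": s, "age": a}
    for t, s, a in product(towns, surnames, age_bands)
]

def candidates(people, **clues):
    """Return the people consistent with every clue given so far."""
    return [p for p in people if all(p[k] == v for k, v in clues.items())]

# Each clue alone leaves many candidates; together they single out one person.
print(len(candidates(population, town="Lilburn")))
print(len(candidates(population, town="Lilburn", surname="Arnold")))
print(len(candidates(population, town="Lilburn", surname="Arnold", age="60s")))
```

In this toy setting no single attribute identifies anyone, yet three attributes together leave exactly one match. Real populations are far less uniform, so the numbers here say nothing about how many clues are needed in practice.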
1. List some of the other things that people might search on that would potentially reveal their identities.
2. Could this data set be cleaned up to the point where it could be truly thought to be anonymized?
3. Why would a researcher be interested in what people search for on the Internet? What sort of information would be useful for someone in Marketing?
Submitted by Steve Simon
Mean vs. Median
Who's Counting: It's Mean to Ignore the Median
ABCNews.com, 6 August 2006
John Allen Paulos
This latest installment of "Who's Counting" focuses on the distinction between the mean and median. Paulos begins with the familiar example of housing prices, and goes on to discuss the implications for interpreting newly released data on the performance of the US economy for 2004. Republicans point out that the economy grew at a rate of 4.2%, and complain that they are not getting enough credit for the good news. Democrats counter that real median income is falling and poverty is rising. How can both be true? Just as a few expensive houses in a neighborhood can pull the mean substantially above the median, gains by a wealthy few at the top of the income ladder can pull up the mean, even if most people are not benefiting.
To show that this is happening, Paulos cites work on income distribution by economists Thomas Piketty and Emmanuel Saez. According to their calculations, the richest one percent, whose incomes exceed $315,000, gained on average nearly 17% over the year in question. However, the good news did not extend very far down the income distribution. Looking at the top five percent of all incomes, the average gain is described as "minimal." This means that the gains were concentrated near the very top. In fact, even among the top one percent, Piketty and Saez found that half of income gains went to the top tenth of the group.
Paulos points out that the pattern of the income distribution can be described mathematically in terms of so-called "power laws," which apply to a variety of observed phenomena, including Internet surfing and investing. A general description of power laws from Wikipedia can be found here.
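The mean-versus-median effect described above can be reproduced with a few lines of arithmetic. The incomes below are invented for illustration and are not taken from the article or from the economists' data; the assumed changes (a 17% gain for the single top earner, a 1% loss for everyone else) are likewise hypothetical.

```python
from statistics import mean, median

# Hypothetical household incomes (illustrative only, not data from the article).
before = [20_000, 25_000, 30_000, 35_000, 40_000,
          45_000, 50_000, 60_000, 80_000, 1_000_000]

# Assumption: the top earner gains 17%, every other household loses 1%.
top = max(before)
after = [x * 1.17 if x == top else x * 0.99 for x in before]

mean_growth = mean(after) / mean(before) - 1
median_growth = median(after) / median(before) - 1

print(f"mean income growth:   {mean_growth:+.1%}")
print(f"median income growth: {median_growth:+.1%}")
```

Under these assumptions the mean rises even though nine of the ten households are worse off, which is the sense in which both parties' claims can be true at once.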
Submitted by Bill Peterson
A Reader's Guide to Polls
Precisely False vs. Approximately Right: A Reader's Guide To Polls
The New York Times, August 27, 2006, The Public Editor
Jack Rosenthal, a former New York Times senior editor filling in as the guest "Public Editor", is concerned that the media often report the outcomes of a poll without explaining how the poll should be interpreted and without alerting readers when there are serious problems with the way the poll was carried out. He provides the following example:
Last March, the American Medical Association reported an alarming rate of binge drinking and unprotected sex among college women during spring break. The report was based on a survey of "a random sample" of 644 women and supplied a scientific-sounding "margin of error of +/– 4.00 percent." Television, columnists and comedians embraced the racy report. The New York Times did not publish the story, but did include some of the data in a chart.
The sample, it turned out, was not random. It included only women who volunteered to answer questions — and only a quarter of them had actually ever taken a spring break trip. They hardly constituted a reliable cross section, and there is no way to calculate a margin of sampling error for such a "sample."
For more information about this AMA survey, Rosenthal refers readers to a polling blog Mystery Pollster maintained by Mark Blumenthal, a pollster for the Democratic Party. Here we read:
Cliff Zukin, the current president of the American Association for Public Opinion Research (AAPOR), saw the survey results printed in the Times, and wondered about how the survey had been conducted. He contacted the AMA and was referred to the methodology section of their online release. He saw the following description (which has since been scrubbed):
The American Medical Association commissioned the survey. Fako & Associates, Inc., of Lemont, Illinois, a national public opinion research firm, conducted the survey online. A nationwide random sample of 644 women age 17 - 35 who currently attend college, graduated from college or attended, but did not graduate from college within the United States were surveyed. The survey has a margin of error of +/- 4.00 percent at the 95 percent level of confidence [emphasis added].
Zukin then contacted Janet Williams at the AMA asking for more details on how the study was carried out. She responded:
The poll was conducted in the industry standard for internet polls -- this was not academic research -- it was a public opinion poll that is standard for policy development and used by politicians and nonprofits.
Zukin replied:
I'm very troubled by this methodology. As an opt-in non-probability sample, it lacks scientific validity in that your respondents are not generalizable to the population you purport to make inferences about. As such the report of the findings may be seriously misleading. I do not accept the distinction you make between academic research and a "public opinion" survey.
In her reply Williams said:
As far as the methodology, it is the standard in the industry and does generalize for the population. Apparently I need to reiterate that this is not an academic study and will [not ?] be published in any peer reviewed journal; this is a standard media advocacy tool that is regularly used by the American Lung Association, American Heart Association, American Cancer Society and others.
Rosenthal gives another example:
Another example surfaced last week in The Wall Street Journal. It examined a “landmark survey,” conducted for liquor retailers, claiming to show that “millions of kids” buy alcohol online. A random sample? The pollster paid the teenage respondents and included only Internet users.
This survey is critiqued in Carl Bialik's "Numbers Guy" column in the Wall Street Journal Online, August 18, 2006.
Rosenthal remarks:
Such misrepresentations help explain why The Times recently issued a seven-page paper on polling standards for editors and reporters. "Keeping poorly done survey research out of the paper is just as important as getting good survey research into the paper," the document said.
Rosenthal says "readers, too, need to know something about polls--at least enough to sniff out good polls from bad" and so he provides a brief reader's guide. This includes understanding margin of error and being aware of problems in the way the questions are asked such as: use of double negatives, the order of the questions, the effect of strength of feeling about an issue etc.
The Mystery Pollster remarks that the Times document on polling standards is apparently not publicly available, whereas ABC has made its standards public in its report ABC News' Polling Methodology and Standards, and suggests that the Times should also make its standards for editors and reporters public.
(1) The first item in the Reader's guide is to beware of too much precision. The following example is given:
A recent Zogby Interactive poll, for instance, showed that the candidates for the Senate in Missouri were separated by 3.8 percentage points. Yet the stated margin of sampling error meant the difference between the candidates could be seven points. The survey would have to interview unimaginably many thousands for that zero point eight to be useful.
Why should we beware of too much precision?
(2) The second item deals with sampling error. We read:
The Times and other media accompany poll reports with a box explaining how the random sample was selected and stating the sampling error. Error is actually a misnomer. What this figure actually describes is a range of approximation.
For a typical election sample of 1,000, the error rate is plus or minus three percentage points for each candidate, meaning that a 50-50 race could actually differ by 53 to 47.
Do you agree that "error" in "sampling error" is a misnomer? Do you see anything wrong with the second sentence?
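As background for this question, the usual textbook margin of error for a single proportion can be computed directly. This is a minimal sketch that assumes a simple random sample and a 95 percent confidence level (z = 1.96); neither assumption is stated in the quoted passage, and the function name `margin_of_error` is ours.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Half-width of the normal-approximation confidence interval for a
    proportion p estimated from a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

moe = margin_of_error(0.5, 1000)
print(f"margin for one candidate's share: {100 * moe:.1f} points")

# In a two-candidate race the lead is p - (1 - p) = 2p - 1,
# so its margin is twice the margin for a single share.
print(f"margin for the lead:              {100 * 2 * moe:.1f} points")
```

The second figure is worth keeping in mind when a report applies the stated margin to each candidate separately and then talks about the gap between them.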
(3) Rosenthal says:
There’s also a formula for calculating the error in comparing one survey with another. For instance, last May, a Times/CBS News survey found that 31 percent of the public approved of President Bush’s performance; in the survey published last Wednesday, the number was 36 percent. Is that a real change? Yes. After adjustment for comparative error, the approval rating has gained by at least one point.
What was the sample size?
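One possible way to approach this question is sketched below. It rests on several assumptions that the quoted passage does not confirm: that both surveys were simple random samples of the same size, that the confidence level is 95 percent (z = 1.96), and that "gained by at least one point" means the 5-point rise minus a 4-point margin of error on the difference. The function names are ours.

```python
import math

def moe_difference(p1, n1, p2, n2, z=1.96):
    """Margin of error for the difference between two independent proportions."""
    return z * math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

def equal_n_for_margin(p1, p2, margin, z=1.96):
    """Common sample size n (same in both surveys) giving the stated margin."""
    return z**2 * (p1 * (1 - p1) + p2 * (1 - p2)) / margin**2

# Approval of 31 percent in May and 36 percent later; assumed margin of 4 points.
n = equal_n_for_margin(0.31, 0.36, 0.04)
print(f"implied sample size per survey: about {n:.0f}")

# Consistency check: plugging n back in should recover the 4-point margin.
print(f"margin at that n: {100 * moe_difference(0.31, n, 0.36, n):.2f} points")
```

If the two surveys had different sizes, or if the Times used a different adjustment, the implied sample size would differ.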
Submitted by Laurie Snell
Risk perceptions
One Million Ways to Die, Ryan Singel, Wired.com, 11 Sep 2006.
This on-line article compares official mortality data with the number of Americans who have been killed inside the United States by terrorism since 1995. It highlights that many threats are far more likely to kill an American than any terrorist -- at least, statistically speaking. For example, it claims that your appendix is more likely to kill you than al-Qaida is.
The rankings are:
SEVERE -- Driving off the road: 254,419; Falling: 146,542; Accidental poisoning: 140,327
HIGH -- Dying from work: 59,730; Walking down the street: 52,000; Accidentally drowning: 38,302
ELEVATED -- Killed by the flu: 19,415; Dying from a hernia: 16,742
GUARDED -- Accidental firing of a gun: 8,536; Electrocution: 5,171
LOW -- Being shot by law enforcement: 3,949; Terrorism: 3,147; Carbon monoxide in products: 1,554
- The rankings are based on the number of mortalities in each category throughout the 11-year period spanning 1995 through 2005 (extrapolated from best available data). What issues might arise from extrapolation of data? Is the past data a good guide to future exposures for all of these risks?
- Are the underlying populations from which the data are compiled really comparable? If you think the exposures to risk vary by threat, what adjustments might be made to standardise the data?
- Why do you think the risk from certain threats is perceived to be greater or less than the statistics suggest?
- If these point estimates included some estimates of variation, such as a full probability distribution, what differences might you expect to see between such distributions? Do you think that that extra information might influence your perception of risk, or even how you might define risk in the first place?
- National Highway and Safety Agency (.pdf)
- National Vital Statistics Reports, Vol. 50, No. 15 (09/16/2002) (.pdf)
- US Consumer Product Safety Commission
- the Insurance Information Institute.
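As a starting point for the questions above, the raw 11-year counts can be turned into annual averages and into ratios relative to terrorism. This is a minimal sketch using a subset of the figures listed in the article; it makes no adjustment for exposure or for the size of the population at risk, which is exactly the kind of standardisation the questions ask about.

```python
# Death counts for 1995 through 2005 as listed in the article (a subset).
deaths = {
    "Driving off the road": 254_419,
    "Falling": 146_542,
    "Accidental poisoning": 140_327,
    "Killed by the flu": 19_415,
    "Accidental firing of a gun": 8_536,
    "Being shot by law enforcement": 3_949,
    "Terrorism": 3_147,
}
YEARS = 11  # the period 1995-2005 inclusive

def annual_average(total, years=YEARS):
    """Average number of deaths per year over the period."""
    return total / years

def ratio_to(cause, baseline="Terrorism"):
    """How many times more deaths a cause produced than the baseline cause."""
    return deaths[cause] / deaths[baseline]

for cause, total in sorted(deaths.items(), key=lambda kv: -kv[1]):
    print(f"{cause:32s}{annual_average(total):10.0f} per year"
          f"{ratio_to(cause):8.1f} x terrorism")
```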
Submitted by John Gavin.
Exit poll inventor dies aged 71
Warren Mitofsky, considered by many to be the "father of exit polling", changed the way the media covers elections by pioneering the use of exit polls.
An exit poll is a poll of voters taken immediately after they have exited the polling stations. Unlike an opinion poll, which asks who the voter plans to vote for or some similar formulation, an exit poll asks who the voter actually voted for. (From Wikipedia).
CBS News said
Today, the methods behind the exit polls that give voice to America’s voters, and the mathematical models that help estimate election results, are in large part the result of his ingenuity and creativity. As Dan Rather once told the nation, as a heated election night’s results poured in, "I believe in God, Country, and Warren Mitofsky."
Mitofsky’s demand for the highest standards in those methods was legendary. Murray Edelman, Mitofsky’s colleague at CBS News from 1967-1992, said
people in the field knew Warren for his creativity, his dedication, and his passion … and they have the scars to prove it.
Mitofsky always sought to build outstanding teams of researchers. In his address to NYAAPOR in 2002, he emphasized that survey research was "an eclectic field" demanding many kinds of expertise, and that in turn demanded that many diverse experts be involved.
No one person I know possess all the various skills at a high enough level necessary to conduct a survey. It takes a team of people to encompass all the areas.
He also played a key role in developing the sample survey technique known as random digit dialing (RDD). RDD means a computer keeps picking numbers at random until it finds a valid one.
Such computer assisted telephone interviewing (CATI) techniques are widely used for surveys. Their advantages over face-to-face interviewing are timeliness and cost-reduction to achieve the same sample size and geographical coverage. Two common sampling procedures are random sampling from the telephone directory and RDD sampling. RDD sampling offers better coverage of households than telephone-book sampling and can be generated quickly.
For example, almost all telephone numbers in the US have ten digits: a three-digit area code, a three-digit central office code and a four-digit suffix. For each central office included in the sample, random four-digit numbers between 0001 and 9999 yield the required random telephone numbers. This includes both listed and unlisted numbers. But unlisted households tend to cluster bimodally, in high- and low-income areas, and they are also more prevalent in metropolitan areas. Also, RDD tends to exclude small geographical areas, as a selected telephone exchange may contain several small geographical areas, e.g. small towns.
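The number-generation step described above can be sketched in a few lines. This is only an illustration: the area codes and central office codes are hypothetical placeholders, the function name `rdd_sample` is ours, and the sketch does not model the follow-up step of dialing to find out whether a generated number is actually in service.

```python
import random

def rdd_sample(exchanges, k, seed=None):
    """Generate k distinct telephone numbers by random digit dialing.

    exchanges: list of (area_code, central_office_code) pairs in the sample.
    The four-digit suffix is drawn at random, so listed and unlisted
    numbers are equally likely to be generated.
    """
    rng = random.Random(seed)
    numbers = set()
    while len(numbers) < k:
        area, office = rng.choice(exchanges)
        suffix = rng.randint(1, 9999)  # 0001 through 9999, as in the text
        numbers.add(f"{area}-{office}-{suffix:04d}")
    return sorted(numbers)

# Hypothetical exchanges, chosen only for illustration.
sample = rdd_sample([("555", "010"), ("555", "011")], k=5, seed=20)
for number in sample:
    print(number)
```

Sampling from a telephone directory would replace the random suffix with a draw from the listed numbers only, which is where the coverage difference between the two procedures comes from.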
- Exit polls have historically been used as a check against and rough indicator of the degree of fraud in an election. What are the potential flaws with such samples?
- In the US, exit polls can be reported before the polls have closed. Why might this matter? For example, in the 2000 U.S. Presidential election, it was alleged that media organizations released exit poll results for Florida before the polls closed in the Florida panhandle. Could such exit polls influence the outcome that they are trying to predict and, if so, would it be a positive or negative feedback effect?
- Do you think that such sampling techniques could ever be so bad that they should be banned completely, as in New Zealand, or in the UK, where publication of exit polls before the polls close is a criminal offence?
- Why do you think looking up numbers in a telephone book might result in a biased sample?
- On the other hand, assuming that people with ex-directory numbers will be unhappy to be cold-called regarding some survey, how do you think this attitude might influence the RDD sample?
- Do you think RDD is an invasion of privacy? How might you justify it against this charge?
Mitofsky, 'father of exit polling,' dies at 72, CNN 09.03.06.
Estimating the diversity of dinosaurs
Proceedings of the National Academy of Sciences,
Published online before print September 5, 2006
Steve C. Wang and Peter Dodson
Fossil hunters told: Dig deeper
Philadelphia Inquirer, September 5, 2006
This study was widely reported in the media. Steve Wang is a statistician at Swarthmore College and Peter Dodson is a paleontologist at the University of Pennsylvania.
A few definitions may help in reading the authors' description of their results. Genera (plural of genus): groups of closely related species. Nonavian: excluding birds. Fossiliferous: containing fossils. Rock outcrop: the part of a rock formation that appears above the surface of the surrounding land. In their paper the authors describe their results as follows:
Despite current interest in estimating the diversity of fossil and extant groups, little effort has been devoted to estimating the diversity of dinosaurs. Here we estimate the diversity of nonavian dinosaurs at 1,850 genera, including those that remain to be discovered. With 527 genera currently described, at least 71% of dinosaur genera thus remain unknown. Although known diversity declined in the last stage of the Cretaceous, estimated diversity was steady, suggesting that dinosaurs as a whole were not in decline in the 10 million years before their ultimate extinction. We also show that known diversity is biased by the availability of .. Finally, by using a logistic model, we predict that 75% of discoverable genera will be known within 60-100 years and 90% within 100-140 years. Because of nonrandom factors affecting the process of fossil discovery (which preclude the possibility of computing realistic confidence bounds), our estimate of diversity is likely to be a lower bound.
Man-made factors fuel hurricanes
Man-made factors fuel hurricanes, study finds.
Boston Globe, September 12, 2006, A1
A study in the Proceedings of the National Academy of Sciences reports an 84 percent chance that human activities are responsible for most of the recent heating in the Atlantic and Pacific ocean regions where hurricanes form. Overall, oceans have warmed approximately 1 degree Fahrenheit over the last century, a change which the study says cannot be attributed to natural cycles. That claim is based on extensive computer simulations that try to model climate systems under different scenarios, including volcanoes, solar fluctuations and human effects on the atmosphere. No combination of natural factors was able to reproduce the observed warming.
It is well known that warm water contributes to hurricane intensity, so the study helps bolster the case of scientists who warned that average hurricane intensity has been increasing as a result of global warming. Others caution, however, that the evidence is not yet clear. Among the objections cited in the article are concerns about underestimation of the strength of earlier storms, and questions about whether the observed warming is sufficient to explain the strength of recent storms.
Since Hurricane Katrina last year, there has been a great deal of public debate about possible human influences on hurricane intensity. The August 2006 issue of the Bulletin of the American Meteorological Society (BAMS) has an excellent review of the matter, entitled "Mixing Politics and Science in Testing the Hypothesis That Greenhouse Warming Is Causing a Global Increase in Hurricane Intensity". The authors analyze in considerable detail the structure of the arguments put forward by skeptics of climate change, taking care to distinguish valid criticisms from logical fallacies. After debunking the logical fallacies, they outline the kinds of scientific investigations that could be used to rationally settle the open questions. A sidebar in the article includes a taxonomy of logical fallacies (such as ad hominem fallacy, begging the question, etc.). For example, "Statistical special pleading occurs when the interpretation of the relevant statistic is 'massaged' by looking for ways to reclassify or requantify data from one portion of results, but not applying the same scrutiny to other categories". The article cites further online discussion from Wikipedia. This would make wonderful background reading in a CHANCE course (even if global warming was not on the course agenda)!
Here is one example from the BAMS article. One argument advanced by the skeptics held that the reported doubling of the annual number of major hurricanes (Category 4 and 5) between 1970 and 2004 goes away if Category 3 storms are included along with 4 and 5. There is a simple graphic in the BAMS article that shows why this argument fails. As explained by the authors (p. 1028):
Figure 1a shows the global trends for each hurricane category, and Fig. 1b shows the global trends for 3+4+5 and 4+5 hurricanes. The comparison in Fig. 1b indicates that an inability to discriminate between category-3, -4, and -5 hurricanes introduces a maximum uncertainty of ±30% to WHCC's finding of a 100% increase in the proportion of category-4+5 hurricanes. Hence, the null hypothesis must be rejected unless we cannot distinguish category-1 from category-4 storms.
Submitted by Bill Peterson