# Chance News 23

## Quotations

Light a Lucky and you’ll never miss sweets that make you fat.
Constance Talmadge, Charming Motion Picture Star, 1930.

When I talk to people about statistics, I find that they usually are quite willing to criticize dubious statistics--as long as the numbers come from people with whom they disagree.
Joel Best, More Damned Lies and Statistics, page XI

## Forsooth

This forsooth is from the Jan 2007 RSS News.

Carl Griffiths' feet have grown to a massive size 18 - double the average for adult men in Britain.
The Times

6 October 2006

The "From the President" column of the March 2007 issue of Consumer Reports (page 5) discusses how CR uses statistics in its testing. The President states that in response to an article in the January issue about contamination in chickens, "The U. S. Department of Agriculture, whose job it is to keep our cacciatore clean, labeled our study "junk science," without even learning our methodology: 'There's virtually nothing or any conclusion that anyone could draw from 500 samples,' said a USDA spokesman."

Submitted by Jerry Grossman

## A Challenge

The mathematics department at Dartmouth has just moved to a new building, and the previous math building is being demolished. The students called the old building "Shower Towers," a name suggested by this picture of one wall of the building.

For at least 30 years we walked by this wall assuming that the tiles were randomly placed. One day, as we were walking by it, our colleague John Finn said, "I see they are not randomly placed." What did he see?

Submitted by Laurie Snell

## Statz 4 life

Statz 4 life, homies!, Da Statz Krew, Google video.

This is a hilarious 5-minute hip-hop video about an introductory statistics course for psychology at the University of Oregon last summer. Graduate student Chuck Tate enlisted the help of other psychology graduate students to get students to enjoy statistics as much as they enjoy hip-hop. From ANOVA to correlation, from Pearson to Fisher, the whole syllabus is mentioned. Let's hope Da Statz Krew enjoy their real stats courses as much as they seemed to enjoy making the video.

(Note: This video was previously mentioned briefly in Chance News 18.)

Submitted by John Gavin.

## Hot streaks rarely last

The Man Who Shook Up Vegas, by Sam Walker, The Wall Street Journal, January 5, 2007; Page W1.

Since last autumn, Las Vegas has had a problem each Thursday morning at precisely 10 a.m. Nevada time: casino sports-betting operations around the world were being simultaneously pounded by thousands of bettors wagering millions of dollars on the same few college football games. Odder still, most of these lock-step bets were turning out to be winners, costing the casinos a fortune. The global business of sports betting was being jolted every week by an obscure 41-year-old statistician from San Francisco, using the alias Dr. Bob.

The article explains the background

Gamblers wagering against a point spread must win more than half their bets (about 53%) to make a profit and must be closer to 55% to make a comfortable living. This is no small feat. Experts say there may be fewer than 100 people who can sustain these rates over time. Most of them belong to professional betting syndicates that hire teams of statisticians, wager millions every week and keep their operations secret.
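The quoted break-even rate follows from the standard pricing of point-spread bets, in which a bettor risks $11 to win $10 (an assumed convention; the article itself quotes only the resulting percentages). A quick check:

```python
from fractions import Fraction

# Standard point-spread bets are laid at -110 odds: risk $11 to win $10
# (an assumption; the article quotes only the resulting win rates).
risk, win = 11, 10

# Break-even win rate p solves: p * win - (1 - p) * risk = 0
break_even = Fraction(risk, risk + win)
print(f"break-even win rate: {float(break_even):.1%}")  # 52.4%

def edge(p):
    """Expected profit per $11 staked at win rate p."""
    return p * win - (1 - p) * risk

print(f"edge at 53%: ${edge(0.53):.2f} per $11 bet")
print(f"edge at 55%: ${edge(0.55):.2f} per $11 bet")
```

At 53% the edge per bet is only pennies on the dollar, which is why the article says a comfortable living requires something closer to 55%.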

Since 1999, Bob Stoll has recommended 658 bets on college football, or about 81 per season. Here are his results. (For comparison, when betting against a point spread in Las Vegas, bettors must win 52.4% of their wagers to make a profit.)

| YEAR | WIN-LOSS-TIE | WIN % |
|------|--------------|-------|
| 1999 | 49-31-1 | 61 |
| 2000 | 47-25-0 | 65 |
| 2001 | 35-28-0 | 56 |
| 2002 | 49-44-3 | 53 |
| 2003 | 46-55-2 | 46 |
| 2004 | 55-34-1 | 62 |
| 2005 | 51-21-2 | 71 |
| 2006 | 45-34-3 | 57 |
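A question worth asking of any hot streak: could this record be luck? A sketch that treats each decision (ties excluded) as a fair coin flip and computes the exact binomial tail probability:

```python
from math import comb

# Dr. Bob's 1999-2006 record, taken from the table above (ties excluded,
# as is standard for point-spread results).
wins   = [49, 47, 35, 49, 46, 55, 51, 45]
losses = [31, 25, 28, 44, 55, 34, 21, 34]
w = sum(wins)
n = sum(wins) + sum(losses)
print(f"{w} wins in {n} decisions: {w/n:.1%}")

# Exact one-sided binomial tail: probability of doing at least this well
# if every pick were a fair coin flip (p = 1/2).
p_value = sum(comb(n, k) for k in range(w, n + 1)) / 2**n
print(f"P(X >= {w} | p = 0.5) = {p_value:.2g}")
```

A sharper null hypothesis would use the 52.4% break-even rate rather than 50%, but the tail probability remains small either way: a record like this is very unlikely to be pure chance.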


The article claims that in the last three months Mr. Stoll has emerged as one of the world's most influential sports handicappers, and that when it comes to predicting the outcomes of college football games he is peerless.

What separates Mr. Stoll from other professionals, and makes him so frightening to bookmakers, is that he distributes his bets to the public, for a fee. All that pandemonium on Thursdays was no coincidence: that's the day Mr. Stoll sends an email to his subscribers telling them which college football teams to bet on the following weekend. This makes it very difficult for bookmakers to maintain a balanced book.

His website discusses the tools he uses to analyze football games: a mathematical model to project how many points each team was likely to score in a coming matchup. He makes unapologetic use of terms like variances, square roots, binomials and standard distributions. Much of his time is spent making tiny adjustments. If a team lost 12 yards on a running play, he checks the game summary to make sure it wasn't a botched punt. He compensates for the strength of every team's opponents. It takes him eight hours just to calculate a rating he invented to measure special teams. Trivial as this seems, Mr. Stoll says the extra work makes his predictions 4% better.

He does not follow the standard business model. He has no employees and he declines to advertise or swap links with other handicapping sites. In online essays, Dr. Bob says

I have a very realistic approach to handicapping and consider sports betting an investment rather than a gamble. In case you haven't figured it out by now, there is no such thing as a sure thing and I don't respect anyone who does. But, in the long run, if you follow my Best Bet advice and use a disciplined money management strategy you will win.

Bob Stoll's handicapping career began at Berkeley when he entered a $2 NFL pool and, after doing a few minutes of simple math, won $100. From then on, his statistics classes became excuses to feed football data through campus mainframes. After winning 63% of his bets over three years, he quit school to become a tout.

Hot streaks rarely last. One handicapper says

He (Bob Stoll) needs to enjoy this while it's going on right now.

In 2005, Mr. Stoll noticed that a few minutes after he sent his advice, the lines on those games would shift slightly. By the beginning of the 2006 college football season, within 30 seconds of the moment he pressed "send" on his Thursday picks, every major casino in the world would fall into line.

The bookmakers had clearly subscribed and were trying to change the lines before his clients could make bets. When a stock analyst moves the market with a recommendation, investors who get in early can make money regardless of its merits. "It's just the opposite in my business," Stoll says: when he makes picks, it is as if brokers and traders collude to drive down the price.

It's a story Mr. Stoll says he's heard thousands of times from clients who don't look at the long term.

Even good bets lose 40% of the time but some clients don't grasp that. They think I'm either hot or I'm cold.

As for what motivates him, Stoll says:

I'm not flashy by nature. I don't need three houses and a boat. I just like to handicap. For me, it's about problem solving.

### Questions

• How likely is it that his past performance table could have happened by chance?
• Dr. Bob advises clients to bet in a disciplined pattern that leaves less than a 1% chance of exhausting their bankrolls. Is this an acceptable performance statistic? What other information would you like to know about how much you might lose?
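The bankroll question can be explored with a simulation. The staking rule below (a constant fraction of the current bankroll on -110 wagers at a 55% win rate) is invented for illustration; the article does not describe Dr. Bob's actual money-management scheme:

```python
import random

def risk_of_ruin(fraction, p_win=0.55, n_bets=500, trials=2_000, seed=1):
    """Estimate the chance a constant-fraction bettor at -110 odds
    sees the bankroll fall below 5% of its starting value."""
    rng = random.Random(seed)
    ruined = 0
    for _ in range(trials):
        bankroll = 1.0
        for _ in range(n_bets):
            stake = fraction * bankroll        # bet a fixed fraction
            if rng.random() < p_win:
                bankroll += stake * 10 / 11    # win $10 per $11 risked
            else:
                bankroll -= stake
            if bankroll < 0.05:                # treat 5% of start as ruin
                ruined += 1
                break
    return ruined / trials

for f in (0.02, 0.05, 0.10):
    print(f"betting {f:.0%} per game: ruin probability ~ {risk_of_ruin(f):.3f}")
```

Even with a genuine edge, betting too large a fraction makes ruin a real possibility; the simulation makes the trade-off between growth and drawdown concrete.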

Submitted by John Gavin.

## Amazon's Statistically Improbable Phrases

About a year ago, Amazon.com, a popular site for the online purchase of books and other items, listed a group of phrases for certain books with the label, Statistically Improbable Phrases (SIP). These were phrases identified from the full text of a book that were common in that book relative to other books.

Amazon describes how it selects the SIPs in very vague terms on one of its help pages. I presume that it is vague because Amazon considers their approach to be a trade secret. The August 23, 2006 entry on S Anand's blog outlines how you might compute SIPs and offers an example using the Calvin and Hobbes comic strip.

One use of SIPs is clustering. You could measure the similarity between books based on the number of common SIPs and then cluster the data using that similarity matrix. Another approach to clustering that is used for RSS feeds is available here.
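Amazon's exact method is unpublished, but a crude stand-in is easy to sketch: count phrases (here, word bigrams) in a target text and rank those that are far more frequent there than in a background corpus. All names and the smoothing scheme below are my own invention, not Amazon's.

```python
from collections import Counter
import re

def bigrams(text):
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(zip(words, words[1:]))

def improbable_phrases(target, background, top=3):
    """Rank target bigrams by how over-represented they are vs. background.

    A toy stand-in for Amazon's (secret) SIP method: the score is a ratio
    of smoothed relative frequencies, so a phrase common in the target but
    rare in the background scores high.
    """
    t, b = bigrams(target), bigrams(background)
    t_total, b_total = sum(t.values()), sum(b.values())
    def score(phrase):
        p_t = (t[phrase] + 1) / (t_total + 1)
        p_b = (b[phrase] + 1) / (b_total + 1)
        return p_t / p_b
    return sorted(t, key=score, reverse=True)[:top]

background = "the cat sat on the mat and the dog sat on the rug " * 20
target = ("the null hypothesis was rejected because "
          "the null hypothesis had a small p value")
print(improbable_phrases(target, background))
```

Repeated phrases absent from the background ("null hypothesis") come out on top, which is the behaviour SIPs are after.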

### Questions

1. Find a well known statistics book on the Amazon web site that lists SIPs. Do these SIPs give you a good idea of the content of the book?

2. Would SIPs be valuable for a work of fiction?

3. Speculate on what book would have the highest number of SIPs.

Submitted by Steve Simon

## Googling for a diagnosis

Googling for a diagnosis - use of Google as a diagnostic aid: internet based study, Hangwi Tang and Jennifer Hwee Kwoon Ng, BMJ 2006; 333: 1143-1145.

An article published in BMJ argues that Google searches can sometimes aid in developing an appropriate diagnosis of disease. The researchers selected a convenience sample of diagnostic cases presented in the New England Journal of Medicine in 2005. They extracted three to five search terms from these case studies, using "statistically improbable phrases" (see above) whenever possible. They then reviewed roughly the top thirty links suggested by Google (never more than the top fifty) and extracted a diagnosis from the pages. The diagnoses were correct in 15 out of 26 cases (58%, 95% CI 38% to 77%).
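The quoted interval is easy to check. The paper does not say which interval method was used, so the plain normal-approximation (Wald) interval below is an assumption, but it comes close to the published figures:

```python
from math import sqrt

# Check the quoted result: 15 correct diagnoses out of 26 cases.
correct, n = 15, 26
p_hat = correct / n
se = sqrt(p_hat * (1 - p_hat) / n)  # standard error of a proportion

# Wald 95% interval (an assumed method; the paper does not specify).
lo, hi = p_hat - 1.96 * se, p_hat + 1.96 * se
print(f"{p_hat:.1%} (95% CI {lo:.1%} to {hi:.1%})")
```

This prints roughly 57.7% (38.7% to 76.7%), consistent with the published 58% (38% to 77%).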

The authors admit that the success of a Google diagnosis depends on what you are looking for.

We suspect that using Google to search for a diagnosis is likely to be more effective for conditions with unique symptoms and signs that can easily be used as search terms.

and also note that

Searches are less likely to be successful in complex diseases with non-specific symptoms or common diseases with rare presentations.

The BMJ offers "Rapid Responses," a system that allows interested readers to offer their own comments on any article published. The Rapid Responses to this article include a number of criticisms as well as some suggestions for improvement.

### Questions

1. Is a 58% rate of correct diagnoses good?

2. The authors used blinding: they were unaware of the correct diagnosis during the search phase. Comment on whether this blinding is needed and whether it is effective.

3. The authors acknowledge the importance of skill in extracting information from the pages that Google identifies. There is also skill in selecting the "statistically improbable phrases" used as search terms. How would you redesign this experiment so that the skill of the authors did not influence the results?

Submitted by Steve Simon

## What can you do with 100 words?

Parrot's oratory stuns scientists, Alex Kirby, BBC News, January 26, 2004.

An article about N'Kisi, a parrot with a vocabulary of 950 words, makes a rather dubious statistical claim.

About 100 words are needed for half of all reading in English, so if N'kisi could read he would be able to cope with a wide range of material.

There is a story about Dr. Seuss writing his famous book "The Cat in the Hat" using a limited vocabulary list and coming in at 220 unique words. His publisher wagered $50 that he could not write a book using only 50 words. Dr. Seuss did indeed accomplish this with "Green Eggs and Ham", which uses exactly 50 words. See the Snopes.com entry on Green Eggs and Ham for details.

So if 100 words are needed for half of all reading, then the book with a median level of complexity is bracketed below and above by "Green Eggs and Ham" and "The Cat in the Hat".

Another interpretation is that the 100 most common words represent 50% of the words used in a typical book. You can find a list of these words on the web, and if you remove every word except those 100, the text becomes rather difficult to read. Here is an example of a paragraph taken from a previous Chance News:

When a ? ? for a ? ? ?, he or she ? an ? ? with the ?. The ? may ?, but ? if he does not, others will. That is ? the ? will ? ? ?. But if the ? is ? ?, then the ? ? and ? are for ?.

A separate critique of the claims about N'Kisi, published at the Skeptic's Dictionary web page, comments on the problems with confirmation bias.

### Questions

1. How would you interpret the phrase "100 words are needed for half of all reading"? How would you verify the accuracy of this statement?

Submitted by Steve Simon

## Read before you cite

Read before you cite, Mikhail Simkin and Vwani Roychowdhury, Significance, Dec. 2006, Vol. 3, issue 4.

This is a popular account of work the authors carried out under the title "Copied citations create renowned papers". This article was suggested by Norton Starr, who was enchanted by the authors' story, which might be called "What determines great generals?":

During the Manhattan Project (the making of the nuclear bomb), Fermi asked Gen. Groves, the head of the project, what is the definition of a "great" general. Groves replied that any general who had won five battles in a row might safely be called great.
Fermi then asked how many generals are great. Groves said about three out of every hundred. Fermi conjectured that, considering that opposing forces in most battles are roughly equal in strength, the chance of winning one battle is 1/2, and so the chance of winning five battles in a row is (1/2)^5 = 1/32. "So you are right, General, about three out of every hundred. Mathematical probability, not genius."

The authors give as reference Deming's book "Out of the Crisis" (1986). But Deming says that a student sent him the story, and seems to suggest that it can be found in "The Face of Battle" by John Keegan. We could not find it there. It is in Carl Sagan's "The Demon-Haunted World", but without a reference, so we don't know if this is a true story.

Now, just as generals might be great by chance, so might great scientists. The authors comment that "a commonly accepted measure of 'greatness' for scientists is the number of citations to their papers." Most of us would admit that we often do not read everything we cite, and that we occasionally make mistakes in our citations: the date is wrong, the volume is wrong, we misspell the author's name, etc. These errors get propagated when others copy our citations.

To get an idea of how often this happens, the authors chose a renowned paper that had 4300 citations and found that 196 of these citations contained misprints, of which only 45 were distinct. The most popular misprint in a page number appeared 78 times.

The authors develop a model to measure the effect of citation copying on the distribution of the number of citations. This model uses a "random-citing scientist" who, when writing an article, picks m random articles, cites them, and also copies each of their references with probability p. So m and p are parameters. They say that good agreement between this model and actual citation data is achieved with m = 3 and p = 1/4.
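The random-citing model is simple enough to simulate. Here is a sketch (my own implementation, not the authors' code) that grows a citation network with m = 3 and p = 1/4 and shows how unevenly citations pile up:

```python
import random

def random_citing(n_papers=5_000, m=3, p=0.25, seed=42):
    """Grow a citation network with the random-citing scientist: each new
    paper cites m random earlier papers, and also copies each of their
    references with probability p."""
    rng = random.Random(seed)
    refs = [[] for _ in range(n_papers)]   # refs[i] = papers cited by paper i
    cites = [0] * n_papers                 # cites[j] = citations received
    for i in range(m, n_papers):
        cited = set()
        for j in rng.sample(range(i), m):  # pick m random earlier papers
            cited.add(j)
            for k in refs[j]:              # copy each reference w.p. p
                if rng.random() < p:
                    cited.add(k)
        refs[i] = list(cited)
        for j in cited:
            cites[j] += 1
    return cites

cites = sorted(random_citing(), reverse=True)
print("most-cited paper:", cites[0])
print("median citations:", cites[len(cites) // 2])
```

The most-cited papers end up with far more citations than the median paper, reproducing the heavy-tailed citation counts the authors describe, without any paper being intrinsically "better" than another.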
They illustrate this with the following figure:

http://www.dartmouth.edu/~chance/forwiki/citations.jpg

Submitted by Laurie Snell

## Do Oscar Winners Live Longer?

If you put "Oscar winners live longer" in Google you will get over 7,000 hits. Here is a hit from the January 23, 2007 issue of Health and Aging:

Oscar winners live longer: Reported by Susan Aldridge, PhD, medical journalist

It is Oscar season again and, if you're a film fan, you'll be following proceedings with interest. But did you know there is a health benefit to winning an Oscar? Doctors at Harvard Medical School say that a study of actors and actresses shows that winners live, on average, for four years more than losers. And winning directors live longer than non-winners. Source: Harvard Health Letter, March 2006

The assertion that Oscar winners live longer was based on an article by Donald Redelmeier and Sheldon Singh, "Survival in Academy Award-winning actors and actresses", Annals of Internal Medicine, 15 May 2001, Vol. 134, No. 10, p. 955-962.

This is the kind of study the news loves to report, and medical journals enjoy the publicity. Another such claim, in the news as this is written, is that the outcome of the Superbowl game determines whether the stock market will go up or down this year. Unlike the Oscar winners story, the author of this claim admits that it is all a joke; see Chance News 13.04.

A recent paper by James Hanley, Marie-Pierre Sylvestre and Ella Huszti, "Do Oscar winners live longer than less successful peers? A reanalysis of the evidence", Annals of Internal Medicine, 5 September 2006, Vol. 145, No. 5, p. 361-363, claims that the Redelmeier-Singh paper was flawed and that their reanalysis of the data did not support the claim that Oscar winners live longer.

For their study, Redelmeier and Singh identified all actors and actresses ever nominated for an Academy Award in a leading or a supporting role up to the time of the study (n = 762). Among these there were 235 Oscar winners.
For each nominee, another cast member of the same sex who was in the same film and was born in the same era was identified (n = 887) and used as a control. They used the Kaplan-Meier method to provide a life table for the Oscar winners and for the control group. A life table estimates, for each x, the probability of living longer than x years. You can see how this is done in Chance News 10.06.

Redelmeier and Singh obtained the following life tables for the two groups:

http://www.dartmouth.edu/~chance/forwiki/oscar.jpg

The area under each curve is an estimate of the life expectancy for the corresponding group. Using a test called the "log-rank test", they conclude that the overall difference in life expectancy was 3.9 years (79.7 vs. 75.8 years; P = .003).

While the life tables look like standard life tables, there is one big difference. Note that 100 percent of the Oscar winners live to be at least 30 years old. Of course this is not surprising, because they are known to be Oscar winners. Thus we know ahead of time that the Oscar winners will live longer than a traditional life table would predict. This gives them an advantage in estimating their lifetime. This is called a selection bias. Of course the controls also have an advantage, because we know that they were in a movie at about the same age as a nominee. But there is no reason to believe that these advantages are the same.

Here is a more obvious example of selection bias, discussed in Robert Abelson's book "Statistics as Principled Argument". As reported in Chance News 4.05: a study found that the average life expectancy of famous orchestral conductors was 73.4 years, significantly higher than the life expectancy for males, 68.5, at the time of the study. Jane Brody, in her New York Times health column, reported that this was thought to be due to arm exercise. J. D. Carroll gave an alternative suggestion, remarking that it was reasonable to assume that a famous orchestra conductor was at least 32 years old.
The life expectancy for a 32-year-old male was 72 years, making the 73.4 average not at all surprising.

To avoid the possibility of selection bias, Redelmeier and Singh did an analysis using time-dependent covariates, in which winners were counted as controls until the time they first won the Oscar. This resulted in a difference of 20% (CI, 0% to 35%). Since 0 is in the confidence interval, the difference is not significant. In a letter to the editor in response to the study by Hanley et al., Redelmeier and Singh report that they did the same analysis with one more year's data and obtained a result even more obviously not significant.

Sylvestre and colleagues analyzed the data by comparing the life expectancy of the winners, from the moment they win, with others alive at that age. In the McGill press release Hanley remarks, "The results are not as, shall we say, dramatic, but they're more accurate." We recommend reading this press release for more information about the study by Sylvestre and her colleagues.

When the Redelmeier and Singh paper came out, our colleague Peter Doyle recognized the problems with the paper and suggested a number of ways to demonstrate them. Peter described one of the simplest as follows:

The problem can be seen in a very colorful way in this case, because you can do a simulation to rewrite history, having the computer select at random new Oscar winners from among each year's nominees. Each time you rewrite history you compute a new p-value, and you discover that you get a value less than .05 more than 5 percent of the time. You can do the same thing with data that are much more easily simulated, but still, it's kind of cool to have the computer churning out new Oscar winners. Richard Burton would approve, because he generally comes out a winner!
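The selection-bias point (you cannot be counted as a "winner" unless you have already lived long enough to win) can be demonstrated with a toy simulation; the mortality model below is invented purely for illustration:

```python
import random

rng = random.Random(0)

def lifespan():
    # Toy mortality model (made up for illustration): a 10% chance of
    # dying young, otherwise a roughly normal adult lifespan.
    if rng.random() < 0.10:
        return rng.uniform(0, 40)
    return max(0.0, rng.gauss(72, 10))

lifespans = [lifespan() for _ in range(100_000)]
everyone = sum(lifespans) / len(lifespans)

# Condition on surviving to age 32, as anyone famous enough to be a
# "winner" (conductor, Oscar winner) already has.
survivors = [x for x in lifespans if x >= 32]
conditioned = sum(survivors) / len(survivors)

print(f"mean lifespan, everyone:         {everyone:.1f}")
print(f"mean lifespan, alive at age 32:  {conditioned:.1f}")
```

The conditioned group "lives longer" by several years, even though nothing about being in that group affects anyone's health, which is exactly Carroll's point about the conductors.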
Here is another way to see that Oscar winners do not live longer, which Peter described in the form of a game of points:

We decide to compare those who have won an Oscar (call them 'winners') with those who have merely been nominated (call them 'also-rans'). Our 'null hypothesis' is that having won an Oscar doesn't help your health. We create a contest by associating a point to the death of anyone who has ever been nominated for an Oscar. Points are bad: the winners get a point if the deceased was a winner; the also-rans get a point if the deceased was an also-ran. Suppose that the deceased died on the d'th day of life. Over the course of history, some number a of nominees will have made it to the d'th day of their lives and been a winner on that day; some number b of nominees will have made it to the d'th day of their lives and been an also-ran on that day. If our null hypothesis is correct, and having won an Oscar really doesn't help your health, then the probability that the winners get this point should be a/(a+b). So now we've got a game of points with a known probability of winning each point. If you carry out this analysis correctly, you will find that the winners win very nearly the expected number of points, leaving us no reason to suppose that winning an Oscar helps you live longer.

Despite the fact that in their paper Redelmeier and Singh said the data they used would be available on their website, it never was. Thus Peter, with the help of a student, Mark Mixer, had to construct his own data set, which is available here, included in a Mathematica program that Peter wrote. If you do not have Mathematica, you can read this program and data using the free [1]. For the paper by Hanley and his colleagues, Redelmeier and Singh did make their data available, but it still was not the original data, since it included the results of one more year of Oscar winners. This data is available here.
### Homework

Carry out one of Peter's methods using your favorite math program and either of the two data sets. My homework, assigned by Peter, was to do this using True Basic, which is the only program I know. Report your findings in the next Chance News.

Submitted by Laurie Snell

## To live longer, choose fame over fortune

Nobel's greatest prize, The Economist, 20 Jan 2007.

Winning a Nobel prize not only brings fame and fortune to the holder but also brings two extra years of life, according to Matthew Rablen and Andrew Oswald at the University of Warwick. The paper provides evidence that an increase in status, rather than wealth alone, raises a person's lifespan, based on data about the lives of about 520 Nobel Prize winners (135) and nominees (389).

This idea was first proposed by Michael Marmot, of University College London, when he studied a large cohort of British civil servants and found, against all expectations, that top civil servants were far healthier and less stressed than lower-ranked civil servants. Other studies have confirmed this result and support the assertion that better health is not a result of higher salary.

The Rablen-Oswald paper refines the approach by analysing people who are at the top of their profession, by virtue of being nominated for a Nobel, to measure the value of winning the prize relative to merely being nominated. The authors correct for various biases, such as grouping the data by country: in the empirical data, American winners live over two years longer, German winners just over a year longer and other European winners 0.7 years longer. The fitted model suggests a two-year difference overall.

What causes the increase in longevity is not clear, but it is not the cash that comes with a Nobel prize, as the inflation-adjusted purchasing power of the prize is not correlated with longevity. So status, rather than money, appears to be responsible for the effect, the authors claim.
Marmot and others have previously suggested that stress hormones may be a potential factor: those at the bottom of the pile are more stressed than those at the top, even though the latter have to make decisions with more wide-ranging impacts. Rablen and Oswald's paper goes further by suggesting a positive effect from having high status, rather than the absence of a negative effect, as unsuccessful nominees never know that they were being considered. In the case of Oscar winners (see the previous article), the winner may live longer, but the failed nominees know that they have failed to win.

### Questions

• This result is based on data from the first half of the 20th century only, due to the secrecy surrounding the nomination of potential prize winners. Are results based on historical data still applicable today? Speculate on what adjustments might be needed.
• The data is based on men only, to avoid differences in lifespan between the sexes. Do you think that the underlying idea can be extrapolated to women?
• If the idea that social status improves lifespan is truly correct, might the size of the effect be larger in a more normal population of people?
• Oddly, Oscar-winning actresses and actors live 3.6 years longer than those who are merely nominated, but Oscar-winning scriptwriters live 3.6 years less than other nominees. Why might this be?

### Further reading

Mortality and Immortality, Matthew D. Rablen and Andrew J. Oswald, University of Warwick, Jan 2007.

Submitted by John Gavin.

## Momentous modelling

Momentous modelling, Economics focus, The Economist, Feb 1st 2007.

This article highlights a growing trend in economics to focus on the uncertainty surrounding an economic forecast, rather than the forecast level itself.

Shocking is what economists do. They start with a model of the economy, administer a 'shock' to it - a sudden rise in the oil price, say - and work out what happens to output, prices, employment and so forth.
Such models consider changes in a model's mean or expected value: what happens if the oil price doubles? In contrast, economists have focussed much less on variation around the forecast, such as working out what will happen if the oil price is likely to range between, say, $20 and $100 rather than between $50 and $60. The Economist's tentative explanation is that the latter question requires more difficult maths.

In a recent paper, based on his PhD thesis, Stanford University's Nick Bloom claims such models are important if people's behaviour changes as a result of the world suddenly becoming a less (or more) certain place. Sudden big second-moment shocks, measured by the volatility of American share prices, are also fairly frequent: the terrorist attacks of September 11th 2001, the assassination of John Kennedy and the collapse of big companies such as Worldcom and Enron.

Bloom's model allows firms to choose how much to invest and how many workers to employ. The world in which they operate is uncertain because their revenues can vary. He shocks the model by suddenly increasing the variability of firms' revenues, based on data from shocks over the past 45 years. He does this by doubling the standard deviation of revenues, a common measure of variability, before it returns to its old level a few months later. The model predicts that firms wait and see what happens because the value of waiting increases. So expanding firms defer hiring new workers and failing firms tend to delay sacking employees in the hope of a turnaround in their circumstances. As a result, workers are no longer being shuffled from less productive to more productive firms, which is bad for the economy as a whole, a concern for policymakers.
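Bloom's actual model is a structural one, but the flavour of a second-moment shock can be sketched with a toy simulation: a revenue process whose standard deviation temporarily doubles, and a firm that "hires" only when recent swings look small. The hiring rule and all parameters below are invented for illustration:

```python
import random

def simulate(rng, months=24, shock_start=12, shock_len=6):
    """Revenue changes are normal with a slight upward drift; during the
    'second-moment shock' their standard deviation doubles. The firm hires
    in a month only when recent swings look small - a made-up
    wait-and-see rule."""
    hires, recent = [], []
    for t in range(months):
        sd = 4.0 if shock_start <= t < shock_start + shock_len else 2.0
        change = rng.gauss(0.5, sd)
        recent = (recent + [abs(change)])[-3:]   # last 3 months of swings
        hires.append(1 if sum(recent) / len(recent) < 3.0 else 0)
    return hires

rng = random.Random(7)
trials = [simulate(rng) for _ in range(2_000)]
before = sum(t[h] for t in trials for h in range(12)) / (12 * len(trials))
during = sum(t[h] for t in trials for h in range(12, 18)) / (6 * len(trials))
print(f"hiring rate before shock: {before:.2f}")
print(f"hiring rate during shock: {during:.2f}")
```

Hiring falls during the shock even though expected revenue growth is unchanged: only the second moment moved, which is the wait-and-see effect the article describes.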

So Bloom claims that for policymakers it is important to tell second-moment shocks, which seem not to last long, from the first-moment variety, where the effects endure for longer.

### Questions

• Is it plausible that people's perceptions might be more influenced by the uncertainty of a forecast rather than the forecast itself? Can you think of common examples where this is the case? For example, when you hear a weather forecast which attribute of the forecast do you tend to recall before venturing outside? If you are going on holiday for a week, does that change what you look for, from the weather forecast for your destination?
• Is standard deviation an appropriate measure of the shocks that are mentioned in the article? Would higher moments be more helpful?
• The data used to calibrate the model covers a period of 45 years. Is data from so long ago still relevant to today's economy? What adjustments might be applied to standardise the data across time?
• Do you think that the duration of shocks might be an influential factor to consider? How might this be measured and subsequently simulated? What other information would you like to have at your disposal?