# Difference between revisions of "Chance News 1"


## Latest revision as of 16:13, 7 July 2011

May 1 2005 to May 30 2005

We've heard that a million monkeys at a million keyboards could produce the complete works of Shakespeare. Now, thanks to the Internet, we know that is not true. ~ Robert Wilensky


## Forsooth

As the stakes increase, Prime-Number theory Moves Closer to Proof

Wall Street Journal, Science Journal, April 8, 2005, B1

Sharon Begley

Follow the points to find a Super Bowl champ

New York Times, 23 January, 2005, p 11

Aaron Schatz

The explanation rests in a mathematical formula created by the baseball analyst Bill James and introduced in the 1980 Baseball Abstract. James determined that the record of a baseball team could be approximated by taking the square of team runs scored and dividing it by the square of team runs scored plus the square of team runs allowed.

Because of its similarity to the geometric method for determining the sum of the angles in a right triangle, he called it the Pythagorean theorem.
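The James formula quoted above is easy to try out. The following is a minimal sketch: the function name and the season totals are hypothetical, chosen only for illustration, and the 162-game season length is an assumption not stated in the article.

```python
def pythagorean_expectation(runs_scored: float, runs_allowed: float) -> float:
    """Estimated winning fraction: RS^2 / (RS^2 + RA^2)."""
    rs2 = runs_scored ** 2
    ra2 = runs_allowed ** 2
    return rs2 / (rs2 + ra2)


# Hypothetical season totals, not taken from the article.
runs_scored, runs_allowed = 800, 700
win_fraction = pythagorean_expectation(runs_scored, runs_allowed)

print(f"estimated winning fraction: {win_fraction:.3f}")
# Assumes a 162-game season.
print(f"estimated wins in 162 games: {162 * win_fraction:.1f}")
```

A team that scores exactly as many runs as it allows gets an estimate of one half, as one would hope.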

DISCUSSION QUESTION:

How close is the Pythagorean theorem to the theorem that the sum of the angles in a triangle is 180 degrees?

P.S. Norton Star provided this picture observed by a student Tosin while walking in New York. Evidently New Yorkers are determined to not forget the quadratic formula!

## Case leads to fight on Jewish representation on juries

This item was suggested by Peter Kostelec.

Case stirs fight on Jews, juries and execution

The New York Times, March 16, 2005

Dean E. Murphy

John R. Quatman was a prosecutor for 26 years in Alameda County, California, and is now a lawyer in Montana.

In 1987 Quatman was the prosecutor when Fred Freeman was found guilty of murder and robbery at a bar in Berkeley. The jury recommended the death penalty and Freeman was put on San Quentin's Death Row. He is now seeking to appeal his conviction.

Quatman has provided a habeas corpus petition (a petition typically used to appeal state criminal convictions to the federal courts when the petitioner believes his constitutional rights were violated by state procedure) stating that at the 1987 trial the late Judge Stanley Golde, during the jury selection, advised Quatman that no Jew would vote to send a defendant to the gas chamber. In his petition, Quatman said that Golde helped him keep Jews off the jury. And Quatman recommended the death penalty for Freeman.

Quatman claimed that it was standard practice to exclude Jewish jurors in death penalty cases and that this practice extended to African-American women, though this was not a problem for the Freeman trial. In rulings going back to 1880 the United States Supreme Court has ruled that it is illegal to reject jurors on the basis of race, and the California Supreme Court in 1978 extended that prohibition to religion.

On Tuesday the California Supreme Court will investigate Quatman's sworn declaration. If they find Quatman's claims are credible Freeman will likely get a new trial.

Quatman's declaration is being used in the appeal of another Alameda inmate, Mark Schmeck. The Habeas Corpus Resource Center (HCRC) provides counsel to represent indigent (poor) men and women under sentence of death in California. The HCRC is representing Freeman in his appeal. Working with Schmeck's lawyers, the HCRC reviewed the jury selection in 25 capital trials in Alameda from 1984 to 1994.

The review found that 12 people who identified themselves as Jews were called to the jury box and the prosecution rejected all 12. They also found that of the 17 who had surnames judged to be Jewish names, the prosecution rejected 15. Overall they found that non-Jews were excluded at a rate of 49.97% and Jews and those with Jewish surnames were excluded at a rate of 93.10%.

Cliff Gardner, a lawyer for Mr. Schmeck, said that the statistics from the 10-year review of the capital trials spoke for themselves. Mathematician Phillip Farmer said that the probability of randomly striking 27 of 29 Jews is less than 1 in 1.6 million.
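The counts in the review can be checked directly, and one simple way to attach a probability to them is a binomial tail. The sketch below assumes each of the 29 jurors is struck independently with probability 0.4997 (the non-Jewish exclusion rate). That model is our assumption, not necessarily Farmer's method, and its result need not match his figure.

```python
from math import comb


def binomial_upper_tail(n: int, k: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Counts reported in the review.
jewish_called = 12 + 17   # self-identified Jews plus those with Jewish surnames
jewish_struck = 12 + 15
strike_rate_jewish = jewish_struck / jewish_called

# Assumption: strikes are independent, each with the non-Jewish exclusion rate.
baseline_rate = 0.4997
tail = binomial_upper_tail(jewish_called, jewish_struck, baseline_rate)

print(f"exclusion rate: {100 * strike_rate_jewish:.2f}%")
print(f"P(at least {jewish_struck} of {jewish_called} struck) = {tail:.3g}")
print(f"about 1 in {1 / tail:,.0f}")
```

Under these assumptions the probability comes out at roughly one in a million. A different model of how strikes are made would give a different answer.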

DISCUSSION QUESTIONS:

(1) How do you think Farmer got his 1 in 1.6 million probability?

(2) What problems are there in trying to estimate the probability that an event in the past occurred?

(3) After this was written, it was reported in the New York Times (April 6, 2005) that Judge Kevin Murphy, of Santa Clara County Superior Court, concluded that Quatman lied when he said an Alameda County judge encouraged him to exclude Jews from a jury in the trial of a man sentenced to death in 1987. The article says that Judge Murphy's opinion will be forwarded to the Supreme Court for a final ruling.

Evidently, Murphy's conclusion was reached on the basis of interviews with Quatman and others involved in the case. The statistical evidence seems not to have played a role in his decision. Do you think it should have?

## Vermont pays heavy war burden

The price they paid: By several measures, Vermont bears heavy war burden.

*Valley News*, January 30, 2005

Jodie Tillman.

The *Valley News* is the local paper covering a region in New Hampshire and Vermont that includes Dartmouth College. Their writers often consult Dartmouth faculty. For this article, the writer Jodie Tillman consulted Greg Leibon from the Dartmouth Mathematics Department.

Tillman obtained data to see if Vermont soldiers and Marines deployed to Afghanistan and Iraq are subject to greater risk than those from other states. She asked Greg to help her analyze the data. The article is available here. Links to her data are at the end of the article. Her data included both deaths per capita and deaths per deployment. We will use only the data related to deaths per deployment. This data set gives, for each state, the number of soldiers and Marines deployed to Afghanistan or Iraq from the beginning of the Iraq war in March 2003 to Oct. 31, 2004. This data, with the computations we use, is available here. We will discuss some of Greg's analysis but we encourage readers to also read his more complete analysis. His analysis can be found here.

From the data, we see that Vermont had 1,613 soldiers and Marines deployed in the period under consideration and had 9 casualties during this period, giving it the highest death rate of all the states. Let's consider how we might design a test to determine if the high death rate in Vermont is just bad luck. For this test the null hypothesis is that the casualties are independent and the probability that a particular soldier or Marine is killed is the same for all those deployed. With this null hypothesis, the number of casualties in a particular state has a binomial distribution B(n,p) with n the number deployed from the state and p the proportion of casualties among those deployed in all the states. Under this null hypothesis the probability that Vermont has 9 or more casualties is .0033, so a test at the .05 significance level rejects the hypothesis that Vermont's high death rate is just bad luck.

Greg calls this single-state test for Vermont the *naive* test, because it would be equally newsworthy if any other state had an apparently unusually high death rate. So we now consider a test to see if at least one of the 50 states has more casualties than could be explained by chance. For our first attempt we use the same null hypothesis and do a test for each state just the way we did for Vermont. Then we reject the null hypothesis if any of the individual states, tested as in our previous test for Vermont, would reject the null hypothesis.

But if we do that with a .05 significance level for each state, and the null hypothesis is true, the probability that we reject the null hypothesis is [math]1-(1-.05)^{50} = .92[/math], which makes this a ridiculous test. A more reasonable procedure is to choose a smaller significance level for each state, chosen so that the significance level for the overall test is .05. For this we need the significance level [math]\alpha[/math] for the individual states to satisfy the equation [math]1 - (1 - \alpha)^{50} = .05.[/math] Asking Mathematica to solve this we obtain [math]\alpha[/math] = .00102534. Thus we will choose the significance level for each state to be .001.
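The two numbers in this paragraph can be reproduced without Mathematica. The sketch below assumes, as the text does, that the 50 state tests are independent; the variable names are ours. This adjustment is commonly known as the Šidák correction.

```python
n_states = 50
overall_alpha = 0.05

# Chance of at least one false rejection if every state is tested at .05.
familywise_error = 1 - (1 - overall_alpha) ** n_states

# Solve 1 - (1 - alpha)^50 = .05 for the per-state level alpha.
per_state_alpha = 1 - (1 - overall_alpha) ** (1 / n_states)

print(f"P(at least one rejection at .05 each) = {familywise_error:.2f}")
print(f"per-state alpha for overall .05       = {per_state_alpha:.8f}")
```

The simpler Bonferroni choice, .05/50 = .001, is slightly smaller than the exact solution and agrees with the rounded value used in the text.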

We have seen that, under the null hypothesis, the probability that Vermont has 9 or more casualties is .0033. This is larger than .001, so Vermont by itself does not lead to rejecting the null hypothesis. Consider now Massachusetts, the state with the second highest death rate. Massachusetts had 7146 deployed and 28 casualties. Making the same kind of computation we did for Vermont, we find that, under the null hypothesis, the probability that Massachusetts has 28 or more casualties is .0002. This is less than our significance level .001, so for this more general test we can also reject the null hypothesis.
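The tail probabilities for Vermont and Massachusetts can be sketched as follows. The overall casualty proportion p is not stated in the text, so the sketch backs it out from the Florida figures quoted in the discussion questions (113.87 expected casualties among 62572 deployed). Treat that value of p as an assumption rather than the exact figure in Greg's data.

```python
from math import comb


def binomial_upper_tail(n: int, k: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), computed as 1 - P(X <= k - 1)."""
    lower = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))
    return 1 - lower


# Assumption: overall death rate inferred from Florida's expected count.
p = 113.87 / 62572
per_state_alpha = 0.001

vermont_tail = binomial_upper_tail(1613, 9, p)
massachusetts_tail = binomial_upper_tail(7146, 28, p)

for state, tail in [("Vermont", vermont_tail), ("Massachusetts", massachusetts_tail)]:
    verdict = "reject" if tail < per_state_alpha else "do not reject"
    print(f"{state}: P = {tail:.4f} -> {verdict} at {per_state_alpha}")
```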

Incidentally, one occasionally sees a medical study, for example a study to test if a new drug is more effective than placebo, that starts off with a single test and a 5% significance level, and along the way the researchers find other tests that can be used to test the effectiveness of the drug. They then report the drug to be effective if any of the individual tests reject the null hypothesis without changing the significance level. As we have seen, this can give them a much better chance of rejecting the null hypothesis (showing the drug is effective) when in fact this is not the case.

We might think we have shown that we cannot explain the death rates as the result of chance. But Greg also points out that the assumption in the null hypothesis that the casualties are independent is probably not a good assumption since, for example, there might be incidents where several soldiers are killed, all of whom are from the same National Guard unit and hence from the same state. In his Commentary Greg discusses models that can take this into account. This is an interesting discussion and we encourage our readers to read this in his [http//www.dartmouth.edu/~chance/ForWiki/GregComentary.pdf] Commentary.

Of course the *Valley News* article did not include any of this technical stuff. We read:

Gregory Leibon, a visiting professor in Dartmouth College's mathematics department who reviewed the *Valley News* findings, said the numbers of soldiers killed or injured is too small to draw broad conclusions, including whether Vermont soldiers are more likely to die. He noted that the addition or subtraction of a few deaths or injuries could change rankings.

"On statistical grounds, you could not reject the notion that it's not just bad luck," said Leibon.

DISCUSSION QUESTIONS:

(1) What does this last line really say? How do you think readers interpreted this statement? Do you think Greg was quoted correctly?

(2) Looking at the data we see that Florida had 62572 deployed in the period considered and only 54 casualties. The expected number of casualties under the null hypothesis is 113.87. We note that 54 casualties is 5.62 standard deviations below the expected value. Mathematica tells us that, under the null hypothesis, the probability of 54 or fewer casualties is [math]2.924099646 \times 10^{-10}[/math]. What do we make of that?

(3) A study carried out by Robert Cushing and reported by Bob Bishop in the Austin American Statesman, 12 October 2003, showed that the rural populations had a higher death rate per capita than those in urban populations. How might this be explained?
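The Florida figures in question (2) can be checked with a short sketch. It assumes the overall death rate p is the quoted expected count divided by the number deployed, so the mean is 113.87 by construction and only the standard deviation and the tail are really being computed. The log-space formula is used to keep the large binomial coefficients from overflowing.

```python
from math import exp, lgamma, log, log1p, sqrt


def binomial_pmf(n: int, k: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p), evaluated in log space."""
    log_pmf = (
        lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
        + k * log(p) + (n - k) * log1p(-p)
    )
    return exp(log_pmf)


n_florida, deaths_florida = 62572, 54
p = 113.87 / n_florida  # assumption: backed out from the quoted expected value

mean = n_florida * p
sd = sqrt(n_florida * p * (1 - p))
z = (deaths_florida - mean) / sd
lower_tail = sum(binomial_pmf(n_florida, k, p) for k in range(deaths_florida + 1))

print(f"expected = {mean:.2f}, sd = {sd:.2f}, z = {z:.2f}")
print(f"P(X <= {deaths_florida}) = {lower_tail:.3e}")
```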

## Seven statistical cliches used by baseball announcers

In a game of Statistics, Some Numbers Have Little Meaning

New York Times, April 3, 2005, Section 8, Pg 10

Alan Schwarz

The author writes: with statistics courtesy of Stats Inc., the following is a user's guide to the facts behind seven statistical cliches. We have included excerpts from his explanation and recommend reading his complete discussions.

(1) HAS A 75-6 RECORD WHEN LEADING AFTER EIGHT INNINGS

Teams leading after eight innings last year won about 95 percent of the time (translating to a 77-4 record in 81 games); that 75-6 record would be two full games worse than average. Even after seven innings, teams with leads typically win 90.1 percent of the time.

(2) HOLDS LEFTIES TO A .248 AVERAGE

Middle relievers have become ever more important in baseball, particularly left-handed specialists who jog in to face only one or two left-handed hitters. Last year, left-handed middle relievers held fellow lefties to a .249 collective average, 18 points lower than the major league-wide .267 average in all other situations. Someone yielding a .248 average sounds good but is merely doing his job.

(3) HAS HIT IN 9 OF HIS LAST 12 GAMES

Last year, each game's starting position players finished with at least one hit 67.1 percent of the time. So across any 12-game stretch, simple randomness will have almost half of them hitting safely in eight or nine games. More than half will wind up with hits in eight or more.

(4) HAS 31 SAVES IN 38 OPPORTUNITIES

Relievers who were considered closers converted saves 84.8 percent of the time last season -- 32 times for every 38 chances.

(5) HAS STOLEN 19 BASES IN 27 ATTEMPTS (70%)

Players batting first and second in their lineups, usually speedy table-setters, stole bases 73.7 percent of the time last season.

(6) LEADS N.L. ROOKIES WITH A .287 AVERAGE

Interesting, perhaps, but most people do not realize how few rookies play enough to be considered for this type of list. Last year, six rookies reached the standard cutoff of 502 plate appearances to qualify for the batting title.

(7) HITS .342 ON THE FIRST PITCH

The stat line many people use to make these claims reads "on 0-0 counts". What people do not realize is that on "0-0 counts" includes only at-bats that end on the first pitch; in other words, the hitter put the ball in play. Removing every time a hitter swings through a pitch or fouls it off will make anyone look good.
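Several of the baselines in the list above are simple arithmetic or binomial calculations. The sketch below checks items (1), (3) and (4). For item (3) it assumes a player's games are independent, each with the league-wide 67.1 percent chance of at least one hit, which is our simplification of "simple randomness".

```python
from math import comb


def binomial_pmf(n: int, k: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Item (3): games with at least one hit in a 12-game stretch.
hit_rate, games = 0.671, 12
p_eight_or_nine = binomial_pmf(games, 8, hit_rate) + binomial_pmf(games, 9, hit_rate)
p_eight_or_more = sum(binomial_pmf(games, k, hit_rate) for k in range(8, games + 1))

# Item (1): a 95 percent win rate over 81 games with an eighth-inning lead.
expected_wins = 0.95 * 81
# Item (4): an 84.8 percent conversion rate over 38 save chances.
expected_saves = 0.848 * 38

print(f"P(hits in 8 or 9 of 12 games)  = {p_eight_or_nine:.3f}")
print(f"P(hits in 8 or more of 12)     = {p_eight_or_more:.3f}")
print(f"expected wins in 81 games      = {expected_wins:.2f}")
print(f"expected saves in 38 chances   = {expected_saves:.2f}")
```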

## Numbed by the numbers, when they just don't add up

Numbed by the numbers, when they just don't add up

New York Times, 23 January 2005, The public editor

Daniel Okrent

The public editor column appears twice monthly. The present commentary focuses on "complaints...about innumeracy at The Times."

It is easy for journalists to uncritically accept numerical figures provided by an outside source. For example, in November 2004, a study by the New York City Comptroller's office asserted that New Yorkers spend more than 23 billion dollars annually on counterfeit goods. This translates to a nonsensical $8000 per household, but apparently no one at the Times tried this arithmetic before running the story. Many other examples are presented.
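The arithmetic takes one line. The column does not say how many households were used, so this sketch works backwards from the two quoted figures to the household count they imply; the variable names are ours.

```python
annual_spending = 23e9          # dollars per year, as asserted in the study
per_household = 8000            # dollars per household, as quoted

# Number of households implied by the two quoted figures.
implied_households = annual_spending / per_household

print(f"implied households: {implied_households:,.0f}")
print(f"spending per household per week: ${per_household / 52:,.0f}")
```

Seen as a weekly amount, the per-household figure makes the implausibility of the original claim easier to feel.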

See the discussion of this article for some other interesting examples.