Chance News 51: Difference between revisions

From ChanceWiki
Two bloggers[http://online.wsj.com/article/SB124648494429082661.html#articleTabs%3Dcomments] commented.<br>
 
*Ms. Silverman should have mentioned the fact that she picked up the story from the March-April 2009 edition of <i>American Scientist</i>, "A Cipher to Thomas Jefferson" [http://www.americanscientist.org/issues/feature/2009/2/a-cipher-to-thomas-jefferson].<br>
*If you'd like to read a fun story that involves a replacement code, frequency analysis, and buried treasure, see Poe's short story, "The Gold-Bug" [http://www.eapoe.org/works/tales/goldbga2.htm].<br>


==Joltin’ Joe==
Two bloggers [http://online.wsj.com/article/SB10001424052970204556804574261942466979118.html#articleTabs%3] commented:<br>


*Strogatz's simulation had Cobb out-hitting DiMaggio 300 out of 10000 times, or 3%. Dunno how long he played, but much longer than 3% of baseball. 10000 "seasons" is a sample 100 times greater than reality.


*….  “Don’t give me brilliant generals; give me lucky generals.” –Caesar.  ….  As a former baseball player, I know how hard it is to get a hit on those days when you're just not feeling it. I don't think coins have those days.<br>


===Discussion===


1.  In a toss of 100 coins, what is the probability of seeing a streak of <i>6 or more heads</i>?  Here [http://www.bumblebeagle.org/horsehide/hitstreaks.html] is a website with an applet calculator and an explanation of the reasoning behind the calculations.<br>


2.  Do you agree that the probability of seeing a streak of <i>6 or more heads or 6 or more tails</i> in 100 coin tosses is more than 75%? 
 
3.  See the website [http://wizardofodds.com/askthewizard/images/streaks.pdf], which contains graphs of the probabilities of a streak of <i>N heads or N tails</i> out of 10 to 1,000 coin tosses.  (Scroll down to see graphs.)  Note that the graph for 100 coin tosses shows that the probability of getting a streak of <i>6 heads or 6 tails</i> is about 80%. How does 80% compare to your calculations in your answer to #2 above? <br>


4.  Comment on blogger (a)’s response to the article.<br>
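
For readers who want to check the streak questions numerically, here is a minimal Python sketch. It is not the method used by either linked website; the function names <code>prob_head_run</code> and <code>prob_any_run</code> are made up for this illustration, and independent tosses are assumed. The first function tracks the length of the current run of heads toss by toss. The second assumes a fair coin and uses the fact that a streak of k identical faces in n tosses corresponds to a streak of k-1 "same as the previous toss" outcomes among the n-1 later tosses.

```python
def prob_head_run(n, k, p=0.5):
    """Probability of at least one run of k or more heads in n independent tosses.

    state[j] holds the probability that no run of k heads has occurred so far
    and the current trailing run of heads has length j (0 <= j <= k-1).
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    state = [1.0] + [0.0] * (k - 1)
    for _ in range(n):
        new = [0.0] * k
        new[0] = (1 - p) * sum(state)      # a tail resets the run
        for j in range(k - 1):
            new[j + 1] = p * state[j]      # a head extends the run
        # p * state[k-1] is the mass that completes a run of k; it leaves the table
        state = new
    return 1.0 - sum(state)


def prob_any_run(n, k):
    """Probability of a run of k or more heads OR k or more tails (fair coin only)."""
    if k < 2:
        raise ValueError("this sketch handles k >= 2 only")
    if n < k:
        return 0.0
    # Each toss after the first matches its predecessor with probability 1/2,
    # independently, so a run of k equal faces is a run of k-1 "matches".
    return prob_head_run(n - 1, k - 1, 0.5)


if __name__ == "__main__":
    print("6+ heads in 100 tosses:         ", round(prob_head_run(100, 6), 4))
    print("6+ heads or 6+ tails in 100:    ", round(prob_any_run(100, 6), 4))
```

The second printed value can be compared with the figure of about 80% read off the graph in question 3.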


==Confidence intervals as public policy==
PBS TV program, originally aired on August 14, 2007<br>


Note:  This PBS program may be two years old; however, it has provided good class discussions for the contributor.<br>


This round table program about the 2002 No Child Left Behind Act covered two important aspects of the national assessment of student progress mandated by this act:  (1) each state has the discretion to set its own passing percentages and must raise its bars annually; and (2) states may use a confidence interval to capture the passing percentage of a subgroup of a school system that meets a pre-set minimum size.  Discussants were Jim Lehrer (moderator), John Merrow (PBS Special Correspondent for Education), Margaret Spellings (U.S. Secretary of Education), Kevin Carey (Education Sector Policy Director), and Chester Finn (Fordham Institute President).<br>
Merrow compared the assessment-of-progress system to 100-meter hurdle events, in which “all the hurdles are the same height.”  He stated that 9 states set the early NCLB bars “very close to the ground,” in order to show more progress toward what Finn called the unrealistic national goal of 100 percent proficiency by 2014.<br>


Merrow noted that, unlike hurdle events, states are evaluated by how well traditionally underserved groups of students progress; however, if a subgroup does not meet a minimum size requirement, those results are not reported.  Finn estimated that about 2 million minority students are not counted because, as subgroups in various municipalities, their sizes do not warrant reporting results.  With the Department of Education’s approval, states can increase their subgroup size and avoid having to report a group’s progress.<br>


Merrow described how schools, unlike athletic event judges, may use confidence intervals to capture a passing percentage for a subgroup, and Carey claimed that some margins of error are as large as 30 points.<br>
JOHN MERROW: Nearly all states use confidence intervals. In Illinois, 509 schools were saved from failing because confidence intervals added up to 12 points to their scores. </blockquote>


Carey felt that percent-passing scores measured in this way give the public a false impression of the performance of their students.<br>


Spellings defended the Act as the beginning of educational accountability, flawed as it may be, and she foresaw revisions in the requirements.<br>  
   
   
Prior to this 2007 PBS program, in a 2005 article, “State gives schools extra leeway,” in the <i>Milwaukee Journal Sentinel</i> (June 15), reporter Jamaal Abdul-Alim quoted an Illinois education official: “We have to ensure that we are as accurate as we can be….  That’s the reason we’re using a 99% confidence interval as opposed to a 95% confidence interval.”  A mathematics professor at the University of Wisconsin-Milwaukee stated, “The charitable way to view this is to say they chose 99% to make sure that anybody who they said was bad, really, really is bad….  The uncharitable way to view this is to say they chose 99% so they would have to say as few people are bad as possible.”  A statistician at that university felt that the use of confidence intervals “appears to be reasonable, given the consequences of being flagged as a school failing to make progress.”<br>

Latest revision as of 16:10, 20 February 2012

Quotations

Passion is inversely proportional to the amount of real information available.

Gregory Benford, Timescape, 1980

Re remark about the “attitudes and prejudices of the famous philosophers” in Chance News 49 [1], a 1924 Virginia sterilization law (not repealed until 1976) was upheld by the Supreme Court in Buck v. Bell in 1927, with Justice Oliver Wendell Holmes Jr. writing the majority opinion.

This woman [Carrie Buck] got railroaded. And one of the giants of the Supreme Court was driving the train.

Paul Lombardo, quoted in "Terrible legacy of U.S. eugenics" [2]
USA TODAY, June 24, 2009



Much of the fascination of statistics lies embedded in our gut feeling--and never trust a gut feeling--that abstract measures summarizing large tables of data must express something more real and fundamental than the data themselves. (Much professional training in statistics involves a conscious effort to counteract this gut feeling.) The technique of correlation has been particularly subject to such misuse because it seems to provide a path for inferences about causality (and indeed it does, sometimes--but only sometimes).

Page 269 in Stephen Jay Gould's Mismeasure of Man, 2nd edition, 1996

Submitted by Paul Alper


For more precision in the definition of PoP, the probability of precipitation, from two atmospheric/oceanic scientists at the University of Wisconsin at Madison:

The technical definition most commonly used by meteorologists says that PoP is the confidence probability that at least 1/100th of an inch of liquid-equivalent precipitation will fall in a single spot.

Steven A. Ackerman/Jonathan Martin
Capital Times, Madison, WI, August 15, 2009

Forsooths

…. Let’s look at basketball …. The 1993 college basketball playoffs started with 64 teams. Of these, 15 were from schools with accredited library education programs.

That’s an amazing statistic by itself, when you consider that there are only slightly more than three times that many library education programs in the United States, and that some of these don’t compete athletically in Division I. However, those 15 schools also went on to win 28 of the 63 games played, while losing only 14. The reason that there were only 14 losses is that the championship school has a library education program. So does the runnerup. Indeed, what sportswriters call the Final Four included three schools with accredited library education programs.

…. Do I believe a single word of what I have just written? Of course not, although I have seen “research” studies … for which the hypotheses were no more credible.

Herbert S. White
"Is There a Correlation Between Library Education Programs and Athletic Success?"
Library Journal, August 1993



During The Daily Show on June 30, TV’s Jon Stewart gave out RIPPY (Rest-In-Peace) Awards [3] to television commentators for various aspects of their coverage of Michael Jackson’s death.

It’s the award for attempts at mind-blowing analysis, and the winner is Extra’s Carlos Diaz [who stated on June 25]:
People don’t realize the proximity of this whole thing. Farrah Fawcett passed away 5 hours, almost to the minute that Michael Jackson passed away 5 miles away. Ed McMahon passed away 48 hours previous [sic] at the same hospital that Michael Jackson passed away.

Credit utilization ratio

“Is Your Credit Too Good? Why lenders are punishing those who borrow too little and always pay on time”
by Cybele Weisser, TIME, June 22, 2009

[T]he formula for determining credit scores … looks at something called your “utilization ratio,” the total amount of credit you use vs. the amount you have available. If you have 25,000 dollars worth of available credit and you put 5,000 dollars on your cards every month, your utilization ratio is a healthy … 20%. But cut down that credit line to 10,000 dollars and suddenly your ratio jumps to 50%, making you look pretty overextended.
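
The arithmetic in the quotation can be reproduced in a few lines of Python. This is only a sketch of the ratio as the article describes it (amount used divided by amount available); the function name <code>utilization_ratio</code> is made up for this illustration, and actual credit-scoring formulas are not public in this detail.

```python
def utilization_ratio(balance, credit_limit):
    """Fraction of available credit in use, as described in the TIME article."""
    if credit_limit <= 0:
        raise ValueError("credit_limit must be positive")
    return balance / credit_limit


if __name__ == "__main__":
    # Same monthly charges, two different credit lines (figures from the article).
    for limit in (25_000, 10_000):
        print(f"limit {limit}: {utilization_ratio(5_000, limit):.0%}")
```

The spending is identical in both cases; only the denominator changes, which is why a reduced credit line alone can make a borrower look more extended.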

Student-loan repayment for congressional staffers

“Scrutiny Grows as U.S. Pays Staffers’ Student Loans”
by Elizabeth Williamson, The Wall Street Journal, June 25, 2009

The House and Senate will spend $18 million this year repaying staffers' student loans. Last year, ... House lawmakers nearly doubled what the government can pay for their staffers' college bills. The yearly maximum repayment is 10,000 dollars in fiscal 2009, which ends Sept. 30, up from 6,000 dollars in fiscal 2008, with a lifetime maximum of 60,000 dollars, the same as in the executive branch. The House appropriated 13 million dollars in 2009 for the program; as of last month, more than 2,200 House employees were getting the money.

http://s.wsj.net/public/resources/images/NA-AY547_EXPENS_NS_20090624180410.gif

Measuring excess risk

“EPA study: 2.2M live in areas where air poses cancer risk”
by Brad Heath and Blake Morrison, USA TODAY, June 24, 2009

This article gives a brief report about the National-Scale Air Toxics Assessment for 2002 [4], an EPA study of excess cancer risks from breathing 181 air toxics over an assumed lifetime of 70 years. The EPA updates information about air toxics emissions every three years, after which it conducts an analysis which is reviewed by the states, evaluated for accuracy, and released - apparently a long process.

According to the EPA, the study found 2 million people with an increased cancer risk of greater than 100 in 1 million.

According to the article, the study found air pollution to be a health threat “around major cities … although some of the counties where the air was even worse were in rural areas ….” The worst neighborhood was outside Los Angeles, where the estimated excess cancer risk was “more than 1,200 in 1 million, 34 times the national average.” The article provided no information about rural areas; however, the EPA provides a map [5] of most affected counties.

http://www.epa.gov/ttn/atw/nata2002/images/NATARisks100inaMil.jpg

Discussion

1. How might one measure cancer risk?

2. What does it mean to measure excess, or increased, cancer risk?

3. Why does the EPA measure excess risk over a lifetime? How do you think they identified people who had lived in a region over a lifetime? Would the fact that air pollution levels might change over a lifetime affect any aspect of the study?

4. Estimate the national average excess cancer risk. Is it higher or lower than the EPA’s ceiling of 100 in 1 million? Do you think it makes sense to refer to a national average of excess cancer risk?

5. Referring to the map, are you surprised about any of the locales with the highest excess cancer risk? If so, can you find any potential reason for high excess cancer risks in those locales?

Too many cable TV channels?

“Time to Screen Out Unloved Channels”
by Martin Peers, The Wall Street Journal, June 27, 2009
(Full text may only be available to subscribers.)

The author suggests that there are too many TV channels available and that this situation is driving subscriber costs up. He reports that “the average household tuned into only 16 channels of the 118 channels available.” He feels that charging fees in proportion to the sizes of viewing audiences would lower the cost of cable TV.
He says that there is currently an “absence of correlation between the size of the fees paid to individual cable channels and their audiences.” Among non-premium channels, Nickelodeon was the most-watched cable channel in 2008, but its fees were only the 10th highest. Nickelodeon, with about 1.7 million daily household viewers, also had an annual affiliate revenue of about $300 per household, while Discovery Kids, with only 20,000 daily household viewers, had an annual affiliate revenue of about $1,900 per household.

http://s.wsj.net/public/resources/images/OB-DY540_TVHERD_NS_20090626193119.gif


True or false?

One hundred sleuthing statisticians running 100 different tests are about 100 times more likely than a lone investigator to find something fishy.

Carl Bialik, "Rise and Flaw of Internet's Election-Fraud Hunters"
The Wall Street Journal, July 1, 2009
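
One way to examine the claim is to assume, purely for illustration, that the data are clean and that each of the tests is independent and has the same false-alarm probability, here called <code>alpha</code>. Neither assumption comes from the article, and the function name below is made up. Under those assumptions the expected number of false alarms grows 100-fold, but the probability of at least one false alarm does not.

```python
def prob_at_least_one_alarm(num_tests, alpha):
    """P(at least one false alarm) for independent tests, each with level alpha."""
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    for alpha in (0.05, 0.001):
        lone = prob_at_least_one_alarm(1, alpha)      # a lone investigator, one test
        crowd = prob_at_least_one_alarm(100, alpha)   # 100 different tests
        print(f"alpha={alpha}: lone={lone:.4f}, crowd={crowd:.4f}, "
              f"ratio={crowd / lone:.1f}, expected alarms ratio=100")
```

With a conventional 5% level the ratio of probabilities is far below 100, because a probability cannot exceed 1; the ratio approaches 100 only when <code>alpha</code> is very small.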


New lottery study

“Want False Hope With That Lottery Ticket?”
by Rick Green, The Hartford Courant, July 3, 2009

A taxpayer-funded study by Spectrum Gaming Group [6] is said to have found “no correlation between lottery sales and poverty.” The study claims that “because most successful lottery retailers were not located in higher poverty neighborhoods, there is no connection between income and ticket sales.”

The Spectrum study contradicts many other studies, including one at Cornell University, where investigators “found ’a strong and positive relationship’ between lottery ticket sales and poverty rates after examining data from 39 states over 10 years.”

The Spectrum study also contradicts a 2002 analysis done by the column’s author, Rick Green, and a colleague. They identified, by zip codes, the locales in which the highest concentrations of winners resided, not the locales in which the highest-selling retailers were located. Not surprisingly, these areas were in the poorest cities of Connecticut.

30% chance for rain?

For many, meaning of rain forecast is cloudy at best.
USA TODAY, June 24, 2000
Doyle Rice

This news article begins with:

When your local weather forecaster announces that there is a 30% chance of rain tomorrow, not everyone knows what that means. Some think it means 30% of an area will get rain. Others think it will rain for 30% of the day. In fact, of all the forecast terms used by meteorologists, this remains one of the most baffling to the public.

Some people don't understand that the forecaster simply means there's a 30% probability it will rain at some point during the day. Susan Joslyn, a senior lecturer in the psychology department at the University of Washington in Seattle, and colleagues have been studying such confusion.

The news article explains the results of this study. There have been many studies like this. The following is a study which is often referred to when the "30% chance of rain problem" is brought up.

Misinterpretations of Precipitation.
Bulletin of the American Meteorological Society, Vol. 61, No. 7,
July 1980, pp. 695-701.
Murphy, Lichtenstein, Fischhoff and Winkler

We reviewed this article in Chance News 3.08. In this review we wrote:

The authors wanted to see if there was a misunderstanding about the event being predicted, the meaning of probability, or both. To test the understanding of the event, subjects were asked if the event being predicted was "rain somewhere in the region", "rain at a particular point in the region", "rain 20% of the time", etc. Their answers led the authors to the conclusion that there is considerable misinterpretation of the meaning of the event. On the other hand, the subjects' answers to questions on the possible meaning of "20% chance" led them to conclude that the subjects did understand what the probability itself meant.

I also talked to a couple of meteorologists who stated that it is unlikely that the public could understand what a 20% chance of rain means. Harold Brooks provided the following statement:

According to the National Weather Service Operations Manual, the Probability of Precipitation (PoP) is the likelihood of occurrence (expressed as a percent) of a precipitation event at any given point in the forecast area. The time period to which the PoP applies must be clearly stated (or unambiguously inferred from the forecast wording) since, without this, a numerical PoP value is meaningless. That is, it is the average point probability within the forecast area and the same PoP is assigned to each point. It can be shown that the PoP is equal to the expected area coverage of the precipitation (Schaefer, J. T. and R. L. Livingston, 1990: Operational implications of the "Probability of Precipitation". Weather and Forecasting, 5, 354-356.).

An interesting study by Gerd Gigerenzer and friends can be found here

In it they write:

The weather forecast says that there is a “30% chance of rain,” and we think we understand what it means. This quantitative statement is assumed to be unambiguous and to convey more information than does a qualitative statement like “It might rain tomorrow.” Because the forecast is expressed as a single-event probability, however, it does not specify the class of events it refers to. Therefore, even numerical probabilities can be interpreted by members of the public in multiple, mutually contradictory ways. To find out whether the same statement about rain probability evokes various interpretations, we randomly surveyed pedestrians in five metropolises located in countries that have had different degrees of exposure to probabilistic forecasts––Amsterdam, Athens, Berlin, Milan, and New York. They were asked what a “30% chance of rain tomorrow” means both in a multiple-choice and a free-response format. Only in New York did a majority of them supply the standard meteorological interpretation, namely, that when the weather conditions are like today, in 3 out of 10 cases there will be (at least a trace of) rain the next day. In each of the European cities, this alternative was judged as the least appropriate. The preferred interpretation in Europe was that it will rain tomorrow “30% of the time,” followed by “in 30% of the area.” To improve risk communication with the public, experts need to specify the reference class, that is, the class of events to which a single-event probability refers.


Our first introduction to this problem apparently was at a two-week summer workshop entitled 'Geometry and the Imagination', taught by Peter Doyle, Mark Foskey, Joan Garfield, Linda Green, and Laurie Snell at the Geometry Center in Minneapolis, 20 June-1 July, 1991. Despite the name this was a Chance Course.

Here the participants were asked to read the materials on weather prediction and answer these questions:

(1) What do you make of all this?

(2) What does Marilyn mean when she says, `But rain doesn't obey the laws of chance; instead it obeys the laws of science'?

(3) If the PoP is 30% and it rains, was the forecaster correct?

(4) Suppose that Minneapolis gets precipitation 3 days out of 10 over the long haul. Why not report a PoP of 30% every single day?

(5) San Diego county is spread out over a large area, comprising the coastal strip and inland valleys, the mountains, and the deserts. Separate forecasts are given for each region. Suppose, however, that the weather bureau computes a single PoP for the whole area. On days on which this composite PoP is 20%, what is the probability that a randomly selected resident of San Diego county will get rained on?

(6) What do you think is the correct answer to the Reader reader's question?

(7) There are contests to reward the best predictor of the weather. If you were running such a contest, how would you decide the winner?

Readers might like to view two Chance video lectures about weather forecasting: (1) "How are Weather Predictions Determined by the National Weather Service?" [7], by Daniel Wilks, Cornell University; (2) "How are Local Weather Predictions Determined By Local Weather Forecasters?" [8], by Mark Breen, Fairbanks Museum.

Submitted by Laurie Snell

Need for evidence

“How to Cut Health-Care Costs: Less Care, More Data”
by Michael Grunwald, TIME, June 29, 2009

According to the author, President Obama has identified two major obstacles to more efficient health care delivery, the first of which is the current “fee-for-service” system in which hospitals and doctors are rewarded financially for ordering more tests and carrying out more procedures.

The other big barrier is information: evidence-based medicine is hard to practice without evidence. …. So the things we know are dwarfed by the things we don’t know. …. [The] Mayo [Clinic] … has an institutional obsession with evidence-based medicine, using electronic records for in-house effectiveness research, constantly monitoring its doctors on everything from infection rates to operating times to patient outcomes, minimizing the art of medicine and maximizing the science. “We try to drive out variation wherever we can,” says Charles (Mike) Harper, a neurologist who oversees Mayo’s clinical practice in Rochester. “Practicing medicine is not the same as building Toyotas, but you can still standardize. Uncertainty shouldn’t be an excuse to ignore data.”

Billions of almost-zeros

“Priced to Sell”
by Malcolm Gladwell, The New Yorker, July 6 & 13, 2009

In his new book, Free: The Future of a Radical Price, author Chris Anderson states:

Distribution [of online videos] is now close enough to free to round down. Today, it costs about $0.25 to stream one hour of video to one person. Next year, it will be $0.15. A year later it will be less than a dime. Which is why YouTube’s founders decided to give it away.

In this book review, Malcolm Gladwell notes, however:

Although the magic of Free technology means that the cost of serving up each video is “close enough to free to round down,” “close enough to free” multiplied by seventy-five billion is still a very large number.

A critique of Anderson's Free, and Ellen Ruppel Shell's Cheap, can be found in The New York Times, July 5, 2009 [9].

Love (food) and marriage?

“First Comes Love, Then Comes Obesity?”
by Bonnie Rochman, TIME, July 6, 2009

This article discusses a University of North Carolina study of the relationship between romance and obesity. Published in the July issue of Obesity, the study found that “married individuals are twice as likely to become obese as are people who are merely dating.” The study “tracked changes over a handful of years in the weight and relationship status of 6,949 individuals.” Among people who lived together, whether married or not, the increased risk of obesity appears to have been greater for women than for men.

When in the course of human events ...

“Two Centuries On, a Cryptologist Cracks a Presidential Code”
by Rachel Silverman, The Wall Street Journal, July 2, 2009

The author reports that Lawren Smithline, a mathematician at the Center for Communications Research in Princeton, NJ, has deciphered a coded message in an 1801 letter to President Thomas Jefferson from Robert Patterson, a math professor at the University of Pennsylvania.

The code [10] was not a “simple substitution cipher,” in which one letter of the alphabet is replaced with another, and so could not be cracked using ordinary frequency analysis. Nor was the code a “nomenclator,” which is a “catalog of numbers, each standing for a word, syllable, phrase or letter,” or a “wheel cipher,” which involves letters inscribed on the edge of a wheel that can be turned to scramble words.

Mr. Patterson claimed “the utter impossibility of deciphering” his code, which involved a grid of the text, broken into sections. He estimated that a de-coder might have to try “upwards of ninety millions of millions” of potential combinations in order to solve his coded message to Jefferson.

Dr. Smithline analyzed Jefferson’s State of the Union addresses and counted the frequency of every possible pair of letters in the speeches. He used a “dynamic programming” algorithm to test some “educated guesses.” Fewer than 100,000 calculations were needed to solve the cipher.
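The letter-pair counting step can be sketched in a few lines. This is only an illustration of tallying the relative frequency of adjacent letter pairs in a body of text; it is not Dr. Smithline's program, the function name is made up for this example, the sample text is just the opening of the decoded message rather than Jefferson's addresses, and the "dynamic programming" search itself is not shown.

```python
from collections import Counter


def letter_pair_frequencies(text):
    """Relative frequency of each pair of adjacent letters, ignoring case,
    spaces, digits, and punctuation."""
    letters = [c for c in text.lower() if "a" <= c <= "z"]
    pairs = Counter(a + b for a, b in zip(letters, letters[1:]))
    total = sum(pairs.values())
    return {pair: count / total for pair, count in pairs.items()}


# Stand-in corpus (an assumption for illustration only).
sample = "When in the course of human events"
freqs = letter_pair_frequencies(sample)

# Show the five most common letter pairs in the sample.
for pair, freq in sorted(freqs.items(), key=lambda item: -item[1])[:5]:
    print(pair, round(freq, 3))
```

With a large enough corpus, a table like this tells a code-breaker which candidate rearrangements of the cipher text produce plausible English letter pairs and which do not.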

The following message emerged, a “little joke on Thomas Jefferson,” according to Dr. Smithline:

In Congress, July Fourth, one thousand seven hundred and seventy six. A declaration by the Representatives of the United States of America in Congress assembled. When in the course of human events ....

"Patterson played this little joke on Thomas Jefferson," says Dr. Smithline. "And nobody knew until now."

Two bloggers[11] commented.

  • Ms. Silverman should have mentioned the fact that she picked up the story from the March-April 2009 edition of American Scientist, "A Cipher to Thomas Jefferson" [12].
  • If you'd like to read a fun story that involves a replacement code, frequency analysis, and buried treasure, see Poe's short story, "The Gold-Bug" [13].

Joltin’ Joe

“The Triumph of the Random”
by Leonard Mlodinow, The Wall Street Journal, July 3-5, 2009

This article discusses “streaks,” especially the 56 consecutive baseball games in which Joe DiMaggio had at least one hit, and people’s intuitions about them. The author [14] is a Caltech professor, who wrote The Drunkard’s Walk: How Randomness Rules Our Lives.

[R]andom processes do display periods of order. In a toss of 100 coins, for example, the chances are more than 75% that you will see a streak of six or more heads or tails, and almost 10% that you’ll produce a streak of 10 or more. As a result a streak can look quite impressive even if it is due to nothing more than chance. .... A few years ago Bill Miller of the Legg Mason Value Trust Fund was the most celebrated fund manager on Wall Street because his fund outperformed the broad market for 15 years straight. It was a feat compared regularly to DiMaggio’s, but if all the comparable fund managers over the past 40 years had been doing nothing but flipping coins, the chances are 75% that one of them would have matched or exceeded Mr. Miller’s streak.

The author argues that DiMaggio’s streak could have occurred by chance alone, based on DiMaggio’s lifetime batting average of 0.325, and the fact that hundreds of players had been trying for such a streak over a hundred years.
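To get a feel for the size of the numbers, here is a rough sketch of the simplest possible coin-flipping model of a single career. The 0.325 average comes from the article; everything else is an assumption made for illustration: four at-bats in every game, independent at-bats, a constant average, and a career length of 1,736 games. It is not the author's calculation.

```python
def prob_streak(n_games, streak, p_game):
    """Probability of at least one run of `streak` consecutive successes
    in `n_games` independent games, each a success with probability p_game."""
    # state[j]: probability that no qualifying streak has occurred yet and the
    # current run of successes has length j (0 <= j < streak).
    state = [0.0] * streak
    state[0] = 1.0
    for _ in range(n_games):
        new = [0.0] * streak
        for j, p in enumerate(state):
            new[0] += p * (1 - p_game)      # hitless game resets the run
            if j + 1 < streak:
                new[j + 1] += p * p_game    # game with a hit extends the run
            # otherwise the run reaches `streak` and leaves the "no streak" states
        state = new
    return 1 - sum(state)


batting_average = 0.325   # from the article
at_bats_per_game = 4      # assumption
career_games = 1736       # assumption

# Chance of at least one hit in a game under the independence assumption.
p_game = 1 - (1 - batting_average) ** at_bats_per_game

print("P(hit in a given game):", p_game)
print("P(56-game streak in one career):", prob_streak(career_games, 56, p_game))
```

For any single player the chance is tiny under these assumptions; the author's argument rests on the large number of players who have had the opportunity over many seasons.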

The author points out that there are many factors involved in analyzing baseball streaks, e.g., variations in batting averages over time. Samuel Arbesman and Stephen H. Strogatz, of Cornell, carried out a 10,000-case computer simulation based on baseball players’ actual statistics from each year 1871-2005. They found that the longest streak in each simulated baseball history ranged from 39 games to 109 games, with 42% of the simulations producing a streak of DiMaggio’s length or longer.

In discussing people’s misconceptions about streaks, the author cites Thomas Gilovich, Robert Vallone, and Amos Tversky’s paper, “The Hot Hand in Basketball: On the Misperception of Random Sequences.” [15]

Other resources not cited in this article include Thomas Gilovich’s 1998 Chance video lecture "Streaks in Sports" [16], and Stephen Jay Gould’s 1988 book review "The Streak of Streaks" [17].

Two bloggers [18] commented:

  • Strogatz's simulation had Cobb out-hitting DiMaggio 300 out of 10000 times, or 3%. Dunno how long he played, but much longer than 3% of baseball. 10000 "seasons" is a sample 100 times greater than reality.
  • …. “Don’t give me brilliant generals; give me lucky generals.” –Caesar. …. As a former baseball player, I know how hard it is to get a hit on those days when you're just not feeling it. I don't think coins have those days.

Discussion

1. In a toss of 100 coins, what is the probability of seeing a streak of 6 or more heads? Here [19] is a website with an applet calculator and an explanation of the reasoning behind the calculations.

2. Do you agree that the probability of seeing a streak of 6 or more heads or 6 or more tails in 100 coin tosses is more than 75%?

3. See the website [20], which contains graphs of the probabilities of a streak of N heads or N tails out of 10 to 1,000 coin tosses. (Scroll down to see graphs.) Note that the graph for 100 coin tosses shows that the probability of getting a streak of 6 heads or 6 tails is about 80%. How does 80% compare to your calculations in your answer to #2 above?

4. Comment on the first blogger’s response to the article.
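Readers who want to check the coin-tossing figures for themselves can do so exactly, without simulation. The sketch below assumes a fair coin and independent tosses; the function names are invented for this example and it is not the method used by either website mentioned above. It tracks the length of the current run of heads, and handles "heads or tails" by noting that a run of k identical outcomes in n tosses is the same as a run of k-1 "same as the previous toss" results among the n-1 adjacent pairs.

```python
from fractions import Fraction


def prob_run_of_heads(n, k):
    """P(at least one run of k or more heads in n fair coin tosses)."""
    half = Fraction(1, 2)
    # state[j]: probability that no run of k heads has occurred yet and the
    # tosses so far end in exactly j consecutive heads (0 <= j < k).
    state = [Fraction(0)] * k
    state[0] = Fraction(1)
    for _ in range(n):
        new = [Fraction(0)] * k
        for j, p in enumerate(state):
            new[0] += p * half           # a tail resets the run
            if j + 1 < k:
                new[j + 1] += p * half   # a head extends the run
        state = new
    return 1 - sum(state)


def prob_run_of_either(n, k):
    """P(at least one run of k or more heads OR k or more tails in n tosses)."""
    return prob_run_of_heads(n - 1, k - 1)


print("6+ heads in 100 tosses:          ", float(prob_run_of_heads(100, 6)))
print("6+ heads or tails in 100 tosses: ", float(prob_run_of_either(100, 6)))
print("10+ heads or tails in 100 tosses:", float(prob_run_of_either(100, 10)))
```

The last two printed values can be compared with the "more than 75%" and "almost 10%" figures quoted from the article, and with the graph mentioned in question 3.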

Confidence intervals as public policy

“School Districts Find Loopholes in No Child Left Behind Law”
PBS TV program, originally aired on August 14, 2007

Note: This PBS program may be two years old; however, it has provided good class discussions for the contributor.

This round table program about the 2002 No Child Left Behind Act covered two important aspects of the national assessment of student progress mandated by this act: (1) each state has the discretion to set its own passing percentages and must raise its bars annually; and (2) states may use a confidence interval to capture the passing percentage of a subgroup of a school system that meets a pre-set minimum size. Discussants were Jim Lehrer (moderator), John Merrow (PBS Special Correspondent for Education), Margaret Spellings (U.S. Secretary of Education), Kevin Carey (Education Sector Policy Director), and Chester Finn (Fordham Institute President).

Merrow compared the assessment-of-progress system to 100-meter hurdle events, in which “all the hurdles are the same height.” He stated that 9 states set the early NCLB bars “very close to the ground,” in order to show more progress toward what Finn called the unrealistic national goal of 100 percent proficiency by 2014.

Merrow noted that, unlike hurdle events, states are evaluated by how well traditionally underserved groups of students progress; however, if a subgroup does not meet a minimum size requirement, those results are not reported. Finn estimated that about 2 million minority students are not counted because, as subgroups in various municipalities, their sizes do not warrant reporting results. With the Department of Education’s approval, states can increase their subgroup size and avoid having to report a group’s progress.

Merrow described how schools, unlike athletic event judges, may use confidence intervals to capture a passing percentage for a subgroup, and Carey claimed that some margins of error are as large as 30 points.

JOHN MERROW: So if my school scored 30 and passing is 55, but the confidence interval is 30 points, we can say we passed?

KEVIN CAREY: Yes.

JOHN MERROW: Nearly all states use confidence intervals. In Illinois, 509 schools were saved from failing because confidence intervals added up to 12 points to their scores.

Carey felt that percent-passing scores measured in this way give the public a false impression of the performance of their students.

Spellings defended the Act as the beginning of educational accountability, flawed as it may be, and she foresaw revisions in the requirements.

Prior to this 2007 PBS program, in the 2005 article “State gives schools extra leeway” (Milwaukee Journal Sentinel, June 15), reporter Jamaal Abdul-Alim quoted an Illinois education official: “We have to ensure that we are as accurate as we can be…. That’s the reason we’re using a 99% confidence interval as opposed to a 95% confidence interval.” A mathematics professor at the University of Wisconsin-Milwaukee stated, “The charitable way to view this is to say they chose 99% to make sure that anybody who they said was bad, really, really is bad…. The uncharitable way to view this is to say they chose 99% so they would have to say as few people are bad as possible.” A statistician at that university felt that the use of confidence intervals “appears to be reasonable, given the consequences of being flagged as a school failing to make progress.”
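For class discussion, it may help to see how subgroup size and confidence level drive the margin of error. The sketch below uses the textbook normal-approximation interval for a proportion; the program and the article do not say which formula any state actually uses, and the observed passing rate and subgroup sizes are made-up values.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(p_hat, n, confidence):
    """Half-width of the normal-approximation confidence interval
    for a proportion p_hat observed in a group of n students."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # e.g. about 1.96 for 95%
    return z * sqrt(p_hat * (1 - p_hat) / n)


p_hat = 0.45   # hypothetical observed passing rate
for n in (30, 100, 1000):   # hypothetical subgroup sizes
    for confidence in (0.95, 0.99):
        points = 100 * margin_of_error(p_hat, n, confidence)
        print(f"n={n:5d}  {confidence:.0%} interval: +/- {points:.1f} points")
```

Moving from 95% to 99% always widens the interval, and small subgroups widen it further, which is why the choice of confidence level affects how many schools are flagged.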