# Chance News 33

## Quotation

It is the mark of a truly intelligent person to be moved by statistics.

George Bernard Shaw

## Forsooth

The following Forsooths are from the January 2008 issue of RSS NEWS.

In terms of platform use trends among the respondents, 53% cited Windows as their primary technical computing platform, with Linux following closely at 51%.
NAGNews email (NAG User Survey 2006 on technical computing trends)
August 2006

Clearly, any product with a large user base is going to throw up some problems. Dell, for example, is shipping almost 40m PCs a year, so even if 95% of its users are happy, there could still be 6m or so with significant gripes.
The Guardian
25 January 2007

## High altitude effects on athletic performance

Effect of altitude on physiological performance: a statistical analysis using results of international football games. Patrick E McSharry. BMJ 2007; 335: 1278-1281 (22 December).

There is a strong belief that athletes who live and train at high altitudes have an unfair advantage over those athletes visiting from lower altitudes. In response, football’s governing body, the Federation of International Football Associations (FIFA), banned international matches from being played at more than 2500 m above sea level.

There is a plausible mechanistic explanation for this concern.

At high altitude hypoxia, cold, and dehydration can lead to breathlessness, headaches, nausea, dizziness, and fatigue, and possibly altitude illness including syndromes such as acute mountain sickness, high altitude pulmonary oedema, and cerebral oedema. Activities such as football can exacerbate symptoms, preventing players from performing at full capacity.

What do the data say? An ideal database exists to explore whether high altitude has a detrimental effect on athletes visiting from lower altitudes. In South America, which has three large cities at high altitude (Bogotá, Colombia; Quito, Ecuador; and La Paz, Bolivia), there are records of 1460 football matches played over a 100 year period at a wide range of altitudes. This data set included four variables:

(i) the probability of a win, (ii) the number of goals scored, (iii) the number of goals conceded, and (iv) the altitude difference between the home venue of a specific team and that of the opposition.

as well as indicators for individual countries. This study used a logistic regression model to predict the probability of a win by the home team, and two Poisson regression models: one to predict number of goals scored by the home team and a second to predict the number of goals conceded by the home team.

The graph of the predicted equations appears above. These graphs show clearly that a thousand meter difference in altitude between the home team and the opposition produces a large change in the estimated probability of a win for the home team, the expected number of goals scored by the home team, and the expected number of goals allowed by the home team.
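The two kinds of model named above can be sketched as follows. This is only an illustration of the functional forms (a logistic curve for the win probability and a log-link Poisson mean for goals); the coefficients are hypothetical placeholders, not the estimates reported in the BMJ paper, and the country indicators used in the study are omitted.

```python
import math

def win_probability(altitude_diff_km, intercept, slope):
    """Logistic regression: the log-odds of a home win are linear in the altitude difference."""
    log_odds = intercept + slope * altitude_diff_km
    return 1.0 / (1.0 + math.exp(-log_odds))

def expected_goals(altitude_diff_km, intercept, slope):
    """Poisson regression with a log link: the log of the expected goal count is linear."""
    return math.exp(intercept + slope * altitude_diff_km)

# Hypothetical coefficients, chosen only to illustrate the shapes of the curves.
HYPOTHETICAL_WIN = (0.0, 0.3)      # (intercept, slope per 1,000 m)
HYPOTHETICAL_SCORED = (0.3, 0.1)   # (intercept, slope per 1,000 m)

for diff_km in (-2, -1, 0, 1, 2):
    p = win_probability(diff_km, *HYPOTHETICAL_WIN)
    g = expected_goals(diff_km, *HYPOTHETICAL_SCORED)
    print(f"altitude difference {diff_km:+d} km: "
          f"P(home win) = {p:.3f}, E[goals scored] = {g:.2f}")
```

Both curves are non-linear in the altitude difference, but over a range of a few thousand meters they are close to straight lines, which is why a linear approximation is a reasonable way to read the fitted graphs.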

### Questions

1. Although the graphs are non-linear, a linear approximation is quite reasonable for the predicted values. Estimate how much the probability of the home team winning, the number of goals scored by the home team, and the number of goals allowed by the home team change for each 1,000 meter change in altitude.

2. There are many variables that were not considered in this analysis. List some of the more important variables that were not included. Consider whether these variables are easy to measure or hard to measure.

3. Is there an alternate explanation other than change in altitude that could account for the differential in home team win probability, goals scored by the home team, and goals allowed by the home team?

4. Should international football matches be allowed in high altitude locations?

Submitted by Steve Simon

## What happened to the margin of error in New Hampshire?

We will not be able to answer this question until the pollsters have had time to analyze their data, and perhaps not even then.

At the Pollster.com website we find the following graphs, which show how far the polls were from the classical 5% margin of error. The origin represents the percentages of the vote that Obama and Clinton actually obtained: 39.1% for Clinton and 36.4% for Obama. The dots indicate the percentages predicted by the polls for Obama and Clinton. To have an error of 5% or less, a poll's dot would have to fall within the first circle, a feat not accomplished by any of the polls.

http://www.pollster.com/blogs/1NHPollErrorDem19.png

And here is the corresponding graphic for the Republican candidates with the highest percentage of votes: McCain 37% and Romney 31.5%:

http://www.pollster.com/blogs/2NHPollErrorRep19.png

From this you see that, for the Republican race, the pollsters did a pretty good job.
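The circle criterion used in the Democratic graphic can be sketched in a few lines. This sketch assumes that the circles measure straight-line (Euclidean) distance, in percentage points, from the actual result; that reading of the graphic is an assumption, and the poll figures below are hypothetical, not taken from any actual New Hampshire poll.

```python
import math

# Actual Democratic primary percentages quoted in the text.
ACTUAL_OBAMA = 36.4
ACTUAL_CLINTON = 39.1

def poll_error(obama_pred, clinton_pred):
    """Return the two signed errors and their Euclidean distance from the actual result."""
    dx = obama_pred - ACTUAL_OBAMA
    dy = clinton_pred - ACTUAL_CLINTON
    return dx, dy, math.hypot(dx, dy)

def inside_first_circle(obama_pred, clinton_pred, radius=5.0):
    """True if the poll's dot lies within `radius` percentage points of the origin."""
    return poll_error(obama_pred, clinton_pred)[2] <= radius

# Hypothetical poll: Obama 39%, Clinton 30% (illustrative numbers only).
dx, dy, dist = poll_error(39.0, 30.0)
print(f"Obama error {dx:+.1f}, Clinton error {dy:+.1f}, distance {dist:.1f}")
print("inside first circle:", inside_first_circle(39.0, 30.0))
```

A poll that overstates one candidate while understating the other can miss the circle even when neither error looks dramatic on its own, because the two errors combine in the distance.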

### Discussion

Andrew Kohut wrote an op-ed, "Getting it Wrong," for the New York Times, January 10, 2008. Kohut is a well-known independent pollster and President of the Pew Research Center. In this op-ed he considers possible explanations and concludes that the most likely cause was race. He writes:

Poorer, less well-educated white people refuse surveys more often than affluent, better-educated whites. Polls generally adjust their samples for this tendency. But here’s the problem: these whites who do not respond to surveys tend to have more unfavorable views of blacks than respondents who do the interviews.

I’ve experienced this myself. In 1989, as a Gallup pollster, I overestimated the support for David Dinkins in his first race for New York City mayor against Rudolph Giuliani; Mr. Dinkins was elected, but with a two percentage point margin of victory, not the 15 I had predicted.

(1) What do you think of this explanation?

(2) How did later polls do in predicting the Obama and Clinton votes? See "Where was the error greater, NH or SC?" by Mark Blumenthal.

(3) What other factors might explain the pollsters getting it wrong?

Submitted by Laurie Snell

## Cholesterol Significance

The distinction between “statistical significance” and “practical significance” seems to evade the lay public despite valiant efforts on the part of instructors and textbook writers. Alex Berenson, writing in the New York Times on January 15, 2008, provides an excellent example for distinguishing the two forms of significance. The article also reveals the tenuous connection between a cholesterol count and heart attacks, and the strong connection between pharmaceutical companies and a profit motive.

The clinical trial known as Enhance “covered 720 patients and lasted two years.” Those in the control arm received Zocor, “an older cholesterol drug,” and those in the treatment arm received “a combination of Zocor and Zetia, in the pill form known as Vytorin.” During the course of “the two years of the trial, patients who took Zocor alone reduced their LDL [the bad cholesterol] by 41% on average, while patients who took Vytorin reduced their [LDL] cholesterol by 58%.” Assuming that 360 patients were in each arm and, for illustration, treating the two average reductions as if they were proportions of patients, Minitab shows that the results are clearly statistically significant:

Test and CI for Two Proportions

| Sample | X | N | Sample p |
|---|---|---|---|
| 1 | 148 | 360 | 0.411111 |
| 2 | 209 | 360 | 0.580556 |

Difference = p (1) - p (2)
Estimate for difference: -0.169444
95% CI for difference: (-0.241429, -0.0974597)
Test for difference = 0 (vs not = 0): Z = -4.61 P-Value = 0.000

Fisher's exact test: P-Value = 0.000
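The Minitab figures can be checked by hand. The sketch below assumes the standard two-sample z procedure with an unpooled standard error (which is what reproduces the estimate, interval, and Z value shown); the function name is our own and is not part of Minitab.

```python
import math

def two_proportion_test(x1, n1, x2, n2, z_crit=1.959964):
    """Two-sample z test and 95% CI for p1 - p2, using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = diff / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    ci = (diff - z_crit * se, diff + z_crit * se)
    return diff, ci, z, p_value

# Counts from the Minitab output: 148 of 360 and 209 of 360.
diff, ci, z, p_value = two_proportion_test(148, 360, 209, 360)
print(f"Estimate for difference: {diff:.6f}")
print(f"95% CI for difference: ({ci[0]:.6f}, {ci[1]:.6f})")
print(f"Z = {z:.2f}, P-Value = {p_value:.3f}")
```

Minitab reports the P-value as 0.000 only because it rounds to three decimal places; the computed value is small but not literally zero.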

Nevertheless, in spite of “the larger cholesterol reduction, patients taking Vytorin actually had more growth of fatty plaque in their carotid arteries than those patients on Zocor” alone and were thus more likely to experience heart attacks. More strongly, a leading cardiologist warns that “Millions of patients may be taking a drug that does not benefit them, raising their risk of heart attacks and exposing them to potential side effects.” Clearly this is a situation in which the practical (i.e., clinical) significance is suspect.

### Discussion

1. “Sales of the two drugs were \$5 billion in 2007.” In addition, “Worldwide, about one million prescriptions are written for Zetia and Vytorin each week, and about five million people are now taking the drugs worldwide.” The Enhance trial ended in April, 2006 but the announcement of the results was delayed until January, 2008. “The drug companies blamed the complexity of the data for the delay.” A spokesman for one of the drug companies “said the delay was unrelated to the negative findings and that the companies had not known the result until two weeks ago.” Determine the cost of a one month supply of Zocor and Vytorin, respectively, to see how that helps in explaining the announcement delay.

2. Zocor is a statin drug and Zetia is a drug with a different mechanism; hence the notion that combining the two would have a serendipitous effect. Do some research on prostate cancer, glaucoma or other afflictions to see if treatment combinations with different mechanisms are recommended and what side effects the patients are warned about due to the combination.

3. Merck and Schering-Plough placed a large two-page advertisement in many newspapers on January 22, 2008 stating, "you may be worried about recent news stories questioning the benefit of these medicines...on the basis of a single study that has generated a lot of confusion." Find the advertisement and determine whether Zocor is ever mentioned. If not, why not? If so, how? Ask your local newspaper what the cost of such an advertisement is.

Submitted by Paul Alper