Chance News 13

From ChanceWiki
Revision as of 09:14, 30 January 2006 by Gavinj (talk | contribs) (new item - Using bill-tracking data to predict epidemics)

Quotation

Do not put your faith in what statistics say until you have carefully considered what they do not say.

William W. Watt

Forsooth

Here is a Forsooth from the January 2006 issue of RSS News with a comment by the editor.

Alcohol is now 49% more affordable than it was in 1978

Sky News
20 November 2005

[readers are invited to send in their
suggestions as to the exact definitions
of a and b in the equation b = 1.49a]


Note from JLS.

The concept of an affordability index is used in many fields, but it is often difficult to find a definition. Here are a couple of examples:

In the Ottawa Citizen Dec 23, 2005 we read:

The RBC (Royal Bank of Canada) housing affordability index, which measures the proportion of pre-tax household income needed to cover the costs of owning a home, was 24.6 per cent for a condominium, which remains the most affordable type of housing; 28.8 for a standard townhouse, 35.5 for a standard bungalow, and 41.3 for a standard two-storey home.

And in the U.S. Congress's College Access and Opportunity Act of 2005 we read:

The college affordability index shall be equal to

(A) the percentage increase in the tuition and fees charged for a first-time, full-time, full-year undergraduate student between the first of the 3 most recent preceding academic years and the last of those 3 academic years; divided by

(B) the percentage increase in the Consumer Price Index-All Urban Consumers (Current Series) from July of the first of those 3 academic years to July of the last of those 3 academic years.
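The definition in clauses (A) and (B) is a ratio of two percentage increases, so a short calculation may make it concrete. The sketch below is only an illustration: the tuition and CPI figures are invented, and the function names are ours, not the Act's.

```python
def pct_increase(start, end):
    """Percentage increase from start to end."""
    return (end - start) * 100.0 / start


def college_affordability_index(tuition_first, tuition_last, cpi_first, cpi_last):
    """Clause (A) divided by clause (B): tuition growth relative to CPI growth."""
    return pct_increase(tuition_first, tuition_last) / pct_increase(cpi_first, cpi_last)


# Hypothetical figures, chosen only to make the arithmetic easy to follow:
# tuition and fees rise from $10,000 to $11,500 (a 15% increase) while the
# CPI rises from 180.0 to 189.0 (a 5% increase) over the same three years.
index = college_affordability_index(10_000, 11_500, 180.0, 189.0)
print(f"college affordability index: {index:.2f}")
```

Under this definition a value above 1 means tuition rose faster than consumer prices, so a larger "affordability" index corresponds to a less affordable college.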

Do Superfluous Medical Studies Exist?

Chance News 12 includes an item, Superfluous Medical Studies, which references David Brown's article in the Washington Post of January 2, 2006, in which he "looks at several instances where...the evidence is so convincing that no more studies need or should be done." His phrase is "What part of 'yes' don't doctors understand." In particular, "he cites the use of aprotinin in heart surgery," which since 1987 had been the subject of 64 studies, each conclusively showing that aprotinin reduced bleeding. Researchers were criticized for persisting in evaluating aprotinin without being fully aware of the previous research.

Less than four weeks later, on January 26, 2006, the New York Times and the Wall Street Journal ran the respective headlines "Doctors Urge Ending Use of Heart Surgery Drug" and "Serious Risks Are Found In Heart Drug." The heart drug, Trasylol, is, as you might guess, aprotinin! Some 4374 patients were in the study published by the New England Journal of Medicine: "1295 were given aprotinin and 1705 one of two other drugs, both generics [of older drugs]." There was also "A control group, 1374 patients" who "had no drugs to prevent bleeding."

According to the Wall Street Journal, "About 29% of the Trasylol patients suffered a stroke or heart-related complication, compared with about 21% of the patients taking the generic drugs and 19% getting no drugs." The breakdown comparison between Trasylol and the alternatives is as follows:

5% of patients on Trasylol required kidney dialysis vs. 1% of those on one of the two alternative drugs.

16% of Trasylol patients had a myocardial infarction (heart attack) vs. 12% and 13% of those on alternatives.

9% of Trasylol patients experienced heart failure vs. 6% and 5% on alternatives.
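To get a feel for the size of these differences, the headline percentages quoted above can be turned into crude risk differences and relative risks. This is only back-of-the-envelope arithmetic on the rounded figures reported by the Wall Street Journal, with no adjustment for patient differences between groups; the variable and function names are ours.

```python
# Rounded rates of "a stroke or heart-related complication" and group sizes,
# as quoted in the newspaper accounts above.
EVENT_RATE = {"Trasylol": 0.29, "generic alternatives": 0.21, "no drug": 0.19}
GROUP_SIZE = {"Trasylol": 1295, "generic alternatives": 1705, "no drug": 1374}


def crude_comparison(treated_rate, reference_rate):
    """Unadjusted comparison of two event rates."""
    difference = treated_rate - reference_rate
    return {
        "risk_difference": difference,
        "relative_risk": treated_rate / reference_rate,
        # Patients treated per one additional event, if the difference were causal.
        "patients_per_extra_event": 1.0 / difference,
    }


total_patients = sum(GROUP_SIZE.values())  # should match the 4374 reported
vs_generics = crude_comparison(EVENT_RATE["Trasylol"], EVENT_RATE["generic alternatives"])
vs_no_drug = crude_comparison(EVENT_RATE["Trasylol"], EVENT_RATE["no drug"])

print("total patients:", total_patients)
for label, result in [("vs generics", vs_generics), ("vs no drug", vs_no_drug)]:
    print(label, {name: round(value, 2) for name, value in result.items()})
```

With these rounded inputs the crude relative risk is about 1.4 against the generic drugs and about 1.5 against no drug. Because the study was observational, these raw ratios are not estimates of the drug's causal effect.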

Furthermore, "The researchers also found that cheaper alternatives ("$10 to $50" per patient) to Trasylol ($1000 per patient) were just as effective in limiting blood loss." The New York Times wrote, "The [New England Journal of Medicine] article said that halting aprotinin use globally would prevent 10,000 to 11,000 cases of kidney failure a year and save more than $1 billion a year in dialysis costs as well as nearly $250 million spent on the drug itself." In addition, "The study is significant because it was conducted without drug-industry funding at 69 medical centers, including many of the top U.S. hospitals."

Naturally, Bayer, the manufacturer of Trasylol, a drug which "in the first nine months [of 2005]" had "world-wide sales of just less than $200 million" and is used in 150,000 patients in the U.S., "believes that Trasylol is a safe and effective treatment." Though the study was large, Bayer points out that it was observational and not a randomized trial. According to the New York Times, the implication is, "Doctors might have assigned sicker patients to a particular drug that could make the results for that drug look bad." However, the Wall Street Journal claims that a randomized trial could not be done because "many hospitals have deemed Trasylol effective and would consider it unethical to allow a patient to get a placebo instead." The study thus resorted to an attempt to "'match' like patients across the study groups to control for other risk factors, such as age, gender and additional health problems."

About the only unambiguous conclusion that can be reached is to take good care of your heart so that you avoid the need for by-pass surgery and the associated medication.

Submitted by Paul Alper.

Using bill-tracking data to predict epidemics

Web game provides breakthrough in predicting spread of epidemics, BJS, Science Blog, 6-Jan-06.
Money-tracking web-game informs mathematical model of epidemics, BoingBoing, 26-Jan-06.
Statisticians Count Euros and Find More Than Money, Otto Pohl, New York Times, 2 Jul 2002.

The study of how epidemics spread worldwide is critical for controlling diseases, particularly pandemics like AIDS, bubonic plague, SARS and avian flu. Scientists hope to improve their models for epidemics by unravelling the statistical laws of human travel in the United States and elsewhere using novel data gathering techniques that exploit the ubiquity of the internet.

For the US source, the data is collected at the www.wheresgeorge.com bill-tracking website. This is a web-game that encourages people to track the serial numbers of dollar bills as they move around the US. Where's George? players mark their bills with WHERESGEORGE.COM; visitors to the site are then encouraged to enter the serial number of any marked bill they have found and where they got it. In this way, the passage of a dollar bill (or, by analogy, of an infection) can be tracked around the country. One of the authors of the study, which was published in Nature (Brockmann), says:

We recognized that the enormous amount of data, as well as the geographical and temporal resolution of bill-tracking, allowed us to draw conclusions about the statistical characteristics of human travel, independent of which means of transportation people use.

The movement of money is similar to phenomena like "on hold" times at call centers and stock price movements. These are systems whose development depends largely or entirely on the previous state the system was in. But how does one model this mingling? Until recently, models of the geographic spread of disease were based on the assumption that viruses disperse over geographic areas in a way similar to the diffusion of fine dust particles on the surface of water or gas molecules mixing in a room. Competing models use a complex set of equations to describe interactions between people that depend on the distance between them, the number of people involved and physical or economic constraints.

Using the US game data, the Science Blog article claims that scientists have developed a mathematical theory that describes the observed movements of travelers amazingly well over distances from just a few kilometers to a few thousand. By analysing the data from the bill-tracking website, they found that money follows what are known as universal scaling laws (from local to regional to long-distance scales). Like money, viruses are transported by people from place to place. Because the mechanisms of transmission of diseases from human to human are already well understood, the scientists can use these novel data sources to develop models that better explain the global spread of a disease during an epidemic. Another of the authors (Hufnagel) explained:

Since we can't track people with tracking devices, like we do animals, we needed to get data that provided us with millions of movements of individuals. What is amazing about these particular scaling laws is the fact that they are determined by two universal parameters only. This result surprised us all.
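The difference between the older diffusion picture and a scaling-law picture can be illustrated with a small simulation. The accounts summarized here do not give the functional form of the scaling laws, so the sketch below makes an assumption: it stands in for them with a power-law (Pareto) tail on step lengths, using an illustrative exponent of 0.6, and compares that with Gaussian steps of the kind ordinary diffusion would produce. It is not the authors' model.

```python
import random
import statistics


def far_jump_fraction(steps, multiple=100):
    """Fraction of steps longer than `multiple` times the median step length."""
    typical = statistics.median(steps)
    return sum(step > multiple * typical for step in steps) / len(steps)


rng = random.Random(2006)  # fixed seed so the run is repeatable
n_steps = 100_000

# Diffusion-like movement: step lengths are magnitudes of Gaussian draws,
# so very long jumps are essentially impossible.
diffusive_steps = [abs(rng.gauss(0.0, 1.0)) for _ in range(n_steps)]

# Heavy-tailed movement (assumed form): Pareto step lengths with
# P(step > x) = x ** (-TAIL_EXPONENT) for x >= 1.
TAIL_EXPONENT = 0.6  # illustrative value, an assumption of this sketch
heavy_tailed_steps = [rng.paretovariate(TAIL_EXPONENT) for _ in range(n_steps)]

diffusive_far = far_jump_fraction(diffusive_steps)
heavy_far = far_jump_fraction(heavy_tailed_steps)
print(f"diffusive steps beyond 100x the median:    {diffusive_far:.4f}")
print(f"heavy-tailed steps beyond 100x the median: {heavy_far:.4f}")
```

Under the Gaussian assumption no step comes anywhere near a hundred times the typical distance, whereas under the heavy-tailed assumption a few percent do. Those rare long jumps are the kind of movement that can carry a bill, or an infection, across a continent in one step.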

In Europe, a similar effort has been underway since 2002, when the Euro was introduced. On New Year's Day, the notes and coins, all of them valid across the entire euro zone, began to spread across national borders. While the euro is worth the same in every country that uses it, each country mints its own euro coins, with a distinctive design on the reverse, and each country introduced its own coins exclusively within its borders. Every time a euro coin from Finland appears in Greece, for example, it provides a tiny but precise data point about the relationship between the countries. Moreover, the total amount minted was roughly proportional to each country's economy compared with the overall European economy. So Germany, an economic powerhouse, minted 32.6 percent of all coins, France about 15 percent, and tiny Luxembourg just two-tenths of 1 percent. This Euro diffusion process sheds light on how epidemics spread, to what extent Europeans are integrating, and what their travel patterns are.

Dr. Dietrich Stoyan, a statistics professor at the University of Freiberg in Germany who is in charge of the Euro coin-counting project, comments:

I hope that studying this process will help people studying epidemics. What makes this special is that the precise launching date of the coins is known. We know when this `epidemic' broke out.

In contrast to the analysis of the US dollar data, Dr. Stoyan has gone the complex route: he assumes that the relationship between each pair of Euro countries is governed by a set of equations that takes into account the distance between the countries and the number of commuters, travelers and bank trucks going back and forth, among other factors. His model is composed of 144 interdependent differential equations that take as many of the known variables into account as possible. Another group, from the University of Amsterdam, has chosen a high-level model of money flow based on a branch of probability theory called Markov chains. For example, they assume that a relatively constant percentage of Dutch Euros will leave the Netherlands each month, and that a different, smaller percentage of Dutch Euros that have already left the country will return.
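The Amsterdam group's assumption can be sketched as a two-state Markov chain in which each Dutch coin is either at home or abroad. The monthly rates below are invented for illustration (the group's actual estimates are not given here), and the sketch follows only Dutch coins, so it says nothing directly about the share of foreign coins inside the Netherlands.

```python
def simulate_home_fraction(leave_rate, return_rate, months):
    """Expected share of Dutch coins still in the Netherlands, month by month."""
    at_home = 1.0  # at launch every Dutch coin is in the Netherlands
    history = [at_home]
    for _ in range(months):
        abroad = 1.0 - at_home
        # Coins at home stay with probability (1 - leave_rate);
        # coins abroad come back with probability return_rate.
        at_home = at_home * (1.0 - leave_rate) + abroad * return_rate
        history.append(at_home)
    return history


def equilibrium_home_fraction(leave_rate, return_rate):
    """Long-run share at home: the stationary distribution of the two-state chain."""
    return return_rate / (leave_rate + return_rate)


# Hypothetical monthly rates: 3% of coins at home leave, 1% of coins abroad return.
LEAVE_RATE = 0.03
RETURN_RATE = 0.01

history = simulate_home_fraction(LEAVE_RATE, RETURN_RATE, months=240)
equilibrium = equilibrium_home_fraction(LEAVE_RATE, RETURN_RATE)
print(f"share at home after 12 months:  {history[12]:.3f}")
print(f"share at home after 240 months: {history[240]:.3f}")
print(f"equilibrium share at home:      {equilibrium:.3f}")
```

In this simple chain the gap between the current share and its equilibrium value shrinks by a constant factor of (1 - leave rate - return rate) each month, so the time needed to reach "statistical equilibrium" depends on the sum of the two rates.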

So far, Dr. Stoyan and the Amsterdam group have been surprised to learn that the large-denomination coins, worth one and two euros, move much faster than smaller ones. Neither knows exactly why that is. Dr. Stoyan hypothesizes that people tend to dump their small coins out of their pockets at the end of the day and are therefore less likely to take them traveling. Mr. Nuyens, of the Amsterdam group, believes that the larger coins are simply used much more often.

When will the Euro coins reach statistical equilibrium? The models have yielded different results. The Dutch group believes that half of all coins in Holland will be of foreign origin a year from now and that statistical equilibrium across Europe will be reached in five to seven years (from the 2002 launch date).

Questions

  • Coin collectors take rare coins out of circulation disproportionately often; for example, Luxembourg Euro coins are rare relative to German Euro coins. Is this likely to affect the overall conclusions of the Euro 'experiment'? Is there an analogous problem for US dollars?
  • Is the self-selected group of participants who log their US dollar bills' serial numbers each month a statistically representative sample?
  • Science and math teachers have latched onto these projects as a way to illustrate their subjects. They either ask all of the students to examine the change in their pockets or buy rolls of coins at the bank and count them in class. How are the results of these class experiments likely to differ from the web-based experiments?
  • Are people likely to travel further in summer, when on holiday? If so, might this cause a seasonal effect to emerge in the data, and how might an adjustment be made to allow for it?

Further reading

To take part in one of the data gathering exercises:

Submitted by John Gavin.