## Quotation

Statistics are no substitute for judgment.

Henry Clay

## Forsooth

The following Forsooths are from the Nov. 2007 issue of RSS NEWS.

The odds of an \$18 million Lotto win are one in 30 million but in the tiny Northland town of Kaeo they've been slashed to just one in 500. The town is abuzz with gossip that it could be home to New Zealand's biggest ever Lotto winner but Far North district councillor Sue Shepherd says the 500 residents are keeping their cards, and their tickets, close to their chest.

The Dominion Post, New Zealand
22 May 2006

Note: This article is available from Lexis Nexis. Later in the article it is stated that there was a single winner and the ticket was bought at Patel's Price Cutter in Kaeo but not yet claimed. (It was claimed later by a couple who do not live in Kaeo). So why is this a Forsooth? Laurie Snell

Of Italy's 151 Series A players, 52 are non-white, with Inter fielding 19, Juventus 12, AC Milan 13, AS Roma 12 and Udinese 10. Messina has eight.

The Times
30 November 2005

## Using Statistics to bust myths

The MythBusters Answer Your Questions, Stephen J. Dubner, Freakonomics Blog, October 25, 2007.

"The MythBusters" is a television show on The Discovery Channel in which Jamie Hyneman and Adam Savage examine commonly held myths to see if they have any validity. Their prior experience was in movie special effects and stunts, and sometimes their experiments lead to big (but carefully controlled) explosions. They were interviewed on the Freakonomics blog, where a pair of questions asked why they didn't use more statistics in their investigations.

"Q: Often, when testing a myth, you conduct one full scale test and then draw your conclusions. I know you are both aware of the scientific method and the need to run multiple trials to fully prove or disprove a theory. How confident are you that when you’ve run one test on a myth, you can then accurately capture whether or not it is true?"

and

"Q: How much statistics training do you guys have, and how much statistics do you use off camera? I get frustrated with the show over what appears to be a lack of statistical knowledge and rigor. (I’m thinking of the “football kick with helium” episode in particular, but the issue is sort of endemic to the show.) I realize that statistics makes for bad TV, while building machines that shoot things and break things make good TV. So the Freakonomics-y question would be: how much of this type of stuff is hidden off-camera?"

Both Jamie and Adam point out their time and budget limitations and remind us that the show has to be entertaining as well as illustrate a scientific approach to investigation. Adam does admit that he'd like to include more statistics, though.

ADAM: These two (very difficult) questions are similar, so I’ll answer them together. I would love to get more statistics into the show, and I’ve been talking to a statistician friend about just that. It’s true that statistics are not very telegenic, and are often difficult to get across.

We do worry about consistency, and it’s usually because our data sets are so small. With larger sets, we can work with things like standard deviation; but with a data set of 2, we don’t have that luxury.

Also, I sense a frustration in some of these questions. I’ll say this: I don’t pretend to be a scientist. We’re not deliverers of scientific truth. But I am curious. And if there’s one complaint I have about people, it’s that most of them aren’t curious enough to look around and figure stuff out for themselves. So if you’re yelling at me at the TV, you’re involved, and as such, I’ve done my job.
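Adam's remark about a data set of 2 can be illustrated with a small simulation. The sketch below is not from the show or the interview. It assumes, purely for illustration, that measurements are normally distributed with a true standard deviation of 1, and the helper name `spread_of_sd_estimates` is hypothetical. It compares how much the sample standard deviation bounces around between repeated experiments of size 2 and of size 30.

```python
import random
import statistics

def spread_of_sd_estimates(sample_size, n_repeats=2000, seed=1):
    """Simulate repeated experiments and report how much the
    sample standard deviation varies from one experiment to the next."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_repeats):
        # Assumed model: measurements are normal with true sd = 1.
        sample = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
        estimates.append(statistics.stdev(sample))
    # The standard deviation of the estimates measures their instability.
    return statistics.stdev(estimates)

spread_n2 = spread_of_sd_estimates(2)    # two trials, as on the show
spread_n30 = spread_of_sd_estimates(30)  # a more comfortable sample size

print(f"spread of sd estimates with n=2:  {spread_n2:.2f}")
print(f"spread of sd estimates with n=30: {spread_n30:.2f}")
```

A standard deviation can technically be computed from two values, but under these assumptions the estimate is several times less stable than one based on 30 values, which is one way to read "we don't have that luxury".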

### Questions

1. Is it true that statistics are not very telegenic? Are there any aspects of Statistics that would lend themselves to a medium like television?

2. The Discovery Channel website has an episode guide. Select a show and explain how statistics could be used to investigate the myth(s) on that episode.

Submitted by Steve Simon

## Migration statistics

Stats office to improve data on migration flows, Reuters, 30 October 2007.
Smith apologises for foreign workers error, Guardian Unlimited, 30 October 2007.
Undercounted and over here, The Economist, 1 November 2007.
How many people live in Britain? We haven't the foggiest idea, The Guardian, 3 November 2007.

UK politicians were recently asked how many foreign workers were in the country, but were unable to answer. The initial estimate (800,000) had to be revised upwards not once but twice (first to 1.1 million, then to about 1.5 million according to the government's chief statistician), much to the government's embarrassment.

The shadow pensions secretary, Chris Grayling, said

This situation just gets worse. It's clear we simply can't trust the figures or statements put out by the Government on migrant workers in the UK. Ministers need to carry out an urgent review of how they handle this data and need to clear up once and for all how many people come to work in Britain.

Then, just a few hours after the government was forced to admit it had hugely underestimated the number of immigrant workers, the UK's Office for National Statistics (ONS) announced changes to the way it collects migration data. Publishing an interim report into the issue, the ONS said it would increase the sample sizes for its International Passenger Survey and consider making better use of administrative data, such as school and patient registers. The International Passenger Survey currently samples around 0.3 percent of people entering and leaving the country at 16 airports, 21 ferry routes and the Channel Tunnel. The ONS said extra "filter shifts" would be introduced at specific airports from next April to reflect the higher number of migrants who arrived and departed from these airports in 2006.

How does the survey work? According to Michael Blastland, writing in the Evening Standard:

For ferry passengers, a team in blue blazers stands at the top of each of the stairs into the passenger deck and scribbles a quick description of every 10th [passenger] aboard. As the ship sails, the blazers go hunting for their sample, the woman in the green hat, the trucker in overalls by the slot machine, and ask them if they plan to stay, then extrapolate.

One objective of this survey is to say how many of the 2.17m jobs created since 1997 have been filled by foreign nationals, the statistic that caused the furore.

Richard Alldritt, the Statistics Commission's chief executive, wants the government to spend more money on improved monitoring of travel movements: the international passenger survey has become a key estimate of migration levels, but Alldritt said it didn't cover every port and that there was

no guarantee that those surveyed give accurate answers and the results have to be scaled up enormously.

The lack of reliable data on migrant flows has been a major headache for policymakers, complicating everything from the allocation of government resources to the setting of interest rates.

US-born National Statistician Karen Dunnell said

The ONS is engaged in a major programme to improve further the quality of its migration statistics. The International Passenger Survey is a vital source of data on this, so improving the sampling of migrants is a step forward in this very important area of our work.

This week on BBC's Question Time, David Dimbleby asked the audience if they would believe any statistic mentioned by a politician and the audience roared 'No!'.

### Questions

• Speculate on what questions might be asked in such a survey.
• What criteria might the ONS use to decide which airports to locate their extra 'filter shifts' at?
• The revised figure of 1.5m included children. What is the implication of counting them as 'workers'?
• Sir Andrew Green, chairman of Migration Watch, which campaigns against mass immigration, claimed that the rise was equivalent to a city the size of Coventry. Is it fair and unbiased to compare the size of the error in the initial estimate to a specific city? Can you think of alternative analogies?

• The International Passenger Survey is a survey of a random sample of passengers entering and leaving the UK by air, sea or the Channel Tunnel.
• Over a quarter of a million face-to-face interviews are carried out each year with passengers entering and leaving the UK through the main airports, seaports and the Channel Tunnel.
• There are six versions of the questionnaire depending on the mode of transport (air, sea or Eurostar) and which direction the passenger is travelling in (arrivals or departures).
• The sampling procedures for air, sea and tunnel passengers are slightly different but the underlying principle for each is similar. In the absence of a readily available sampling frame, time shifts or crossings are sampled at the first stage. During these shifts or crossings, the travellers are counted as they pass a particular point (for example, after passing through passport control) then travellers are systematically chosen at fixed intervals from a random start.
• Interviewing is carried out throughout the year. The quarter of a million or more interviews conducted each year represent about 1 in every 500 passengers.
• The interview usually takes 3-5 minutes and contains questions about passengers’ country of residence (for overseas residents) or country of visit (for UK residents), the reason for their visit, and details of their expenditure and fares.
• There are additional questions for passengers migrating to or from the UK.
• While much of the content of the interview remains the same from one year to the next, new questions are sometimes added or appear periodically on the survey.
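
The selection step described in the list above (count travellers as they pass a point, then choose them at fixed intervals from a random start) can be sketched in a few lines of Python. This is only an illustration of systematic sampling in general, not the ONS's actual procedure. The function name `systematic_sample`, the shift size of 1,237 passengers and the interval of 10 are all assumptions made for the example.

```python
import random

def systematic_sample(n_passengers, interval, rng=None):
    """Return the 0-based positions of passengers chosen by
    systematic sampling: a random start, then every `interval`-th."""
    rng = rng or random.Random()
    # Random start somewhere within the first interval.
    start = rng.randrange(interval)
    return list(range(start, n_passengers, interval))

# Hypothetical shift: 1,237 passengers pass the counting point
# and every 10th one is selected for interview.
interval = 10
rng = random.Random(2007)  # fixed seed so the example is repeatable
chosen = systematic_sample(1237, interval, rng)

# Scaling up: each interviewed passenger stands in for `interval` travellers.
estimated_total = len(chosen) * interval

print(f"first position sampled: {chosen[0]}")
print(f"passengers sampled: {len(chosen)}")
print(f"scaled-up estimate of passengers on the shift: {estimated_total}")
```

With a 1-in-10 interval the scaled-up total is always within 10 of the true count. With an overall sampling rate closer to 1 in 500, each answer is multiplied by a much larger factor, which is the scaling-up that Alldritt refers to.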

Submitted by John Gavin.

## The Unbreakable Wikipedia?

Creating, Destroying, and Restoring Value in Wikipedia, Department of Computer Science and Engineering at the University of Minnesota, 2007.
Univ. of Minnesota: Less Than 1/2 Percent of Wikipedia Content is Damaged, Fox News (Twin Cities), November 5, 2007.

The University of Minnesota computer science and engineering faculty and students found that only a few edits inflict damage on the integrity of content within Wikipedia and that damage is typically fixed quickly. The study estimated a probability of less than one-half percent (0.0037) that the typical viewing of a Wikipedia article would find it in a damaged state.

(This submission to Chance News 31 will be expanded, but it is important to ask incisive questions about this study, especially to demand a definition of what constitutes "vandalism" and "damage". The following passage from Wikipedia is downright horrid, but would it constitute a "damaged" piece of content? Our guess is that the Minnesota study would have accepted a passage like this as "undamaged", but we still need to read the white paper itself.)

From the "History of western Eurasia" article in Wikipedia:

As the Viking raids subsided the Magyars arrived. Crossing the Carpathians they, in 896, occupied the Upper Tisza river, from which they conducted raids through much of Western Europe. However, in 955 they were defeated by Otto of Germany at the Battle of Lechfeld. The defeat was so crushing that the Magyars decided that 'if you can't beat them join them' and in 1000 their King was accepting his royal regalia from the Pope. Otto on the strength of that victory was able to secure the tittle of Emperor. This German based Holy Roman Empire was to be the major power in Christian Europe for some time to come. As well as this "rebirth" of Western Roman Empire, the Eastern Roman Empire continued to be the up.

