From ChanceWiki
Revision as of 19:16, 24 September 2009 by Jls

The Bulgarian Toto 6 of 42 lottery was the subject of an investigation after the same set of six numbers, {4, 15, 23, 24, 35, 42}, was drawn in two successive lotteries on September 6 and September 10. The article cites a mathematician as stating that the odds against picking the same six numbers twice in a row are 4,200,000:1. We wondered how he arrived at this number. Suppose that the lottery has been running continuously for m draws since its inception. What is the probability that a specified set of six numbers will repeat?

There are $\binom{42}{6} = 5245786$ different sets of six numbers, and the probability that a SPECIFIED set will occur in two given consecutive draws is $1/5245786^2$.
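As a quick check of the counting step, the following minimal sketch computes the number of six-number sets and the chance that one specified set comes up on two given consecutive draws. The names `N_SETS` and `p_repeat_fixed` are illustrative and not from the article; the draws are assumed independent and uniform over all sets.

```python
from math import comb

# Number of distinct 6-number sets that can be drawn from 42 numbers.
N_SETS = comb(42, 6)

# Assumption: draws are independent and uniform over all sets, so a
# specified set occurs on two given consecutive draws with probability
# (1/N_SETS) * (1/N_SETS).
p_repeat_fixed = 1 / N_SETS**2

print(N_SETS)
print(p_repeat_fixed)
```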

Suppose the lottery has been running continuously for m draws and consider a fixed set of six numbers.

There are $m-1$ opportunities for this set to be drawn twice in succession (beginning with the second drawing). The probability that this will happen is then the probability of the union $P(A) = P(\bigcup_i A_i A_{i+1})$, where $A_i$ is the event that this set of numbers is drawn on the $i$th draw.

Bonferroni's inequality gives the upper bound $P(A) \le \sum_i P(A_i A_{i+1})$, while Hunter's inequality gives the lower bound $P(A) \ge \sum_i P(A_i A_{i+1}) - \sum_i P(A_i A_{i+1} A_{i+2})$.

We assume (!) that the events $A_i$ are independent and identically distributed with probability $p = 1/5245786$, leading to $(m-1)p^2 - (m-2)p^3 \le P(A) \le (m-1)p^2$. Since p is very small, the $p^3$ term can be ignored, giving $P(A) \approx (m-1)/5245786^2$.

It appears that the draws are held twice per week, so for one year m = 104, giving the probability $3.74 \times 10^{-12}$ that a specified set of numbers will be drawn twice in succession. According to a spokeswoman, the lottery has been taking place for 52 years. Using m = 104 × 52 = 5408, the probability that a specified set of numbers will be drawn twice in succession over this period is $1.96 \times 10^{-10}$, still very small.
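The arithmetic for a fixed set can be reproduced with a short sketch. It evaluates the two bounds on P(A), with upper bound (m-1)p^2 and lower bound (m-1)p^2 - (m-2)p^3, for one year and for 52 years of twice-weekly draws. The function name `fixed_set_bounds` is hypothetical, and the i.i.d. assumption on the draws is taken from the text.

```python
from math import comb

N_SETS = comb(42, 6)   # 6-number sets out of 42
p = 1 / N_SETS         # probability a specified set is drawn on one draw


def fixed_set_bounds(m):
    """Return (lower, upper) bounds on P(A) for m draws.

    upper = (m-1) p^2            (first-order Bonferroni bound)
    lower = upper - (m-2) p^3    (the correction term used in the text)
    """
    upper = (m - 1) * p**2
    lower = upper - (m - 2) * p**3
    return lower, upper


# Assumption from the text: two draws per week, i.e. 104 draws per year.
for years in (1, 52):
    m = 104 * years
    lower, upper = fixed_set_bounds(m)
    print(f"m = {m}: {lower:.4e} <= P(A) <= {upper:.4e}")
```

The two bounds agree to many significant digits, which is why the p^3 term can be dropped.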

But now let's ask the question, not for a fixed set of numbers but for some set of numbers. After all, in discussing this coincidence, the repeated set arises by chance alone and is not specified in advance.

In m drawings, what is the probability that SOME set of six numbers will be repeated in consecutive draws?

There are 5245786 possible sets of numbers that could be repeated. Enumerate the sets by integers $1 \le k \le 5245786$, with $E_k$ the event that set k repeats consecutively sometime during these m drawings. The probability of the union $P(\bigcup_k E_k)$ is needed. Each of the 5245786 events $E_k$ has probability approximately $(m-1)/5245786^2$, and if they were independent we could evaluate the probability using complements as $P(\bigcup_k E_k) = 1 - (1 - (m-1)/5245786^2)^{5245786} \approx 1 - e^{-(m-1)/5245786}$. However, they are dependent, but as long as m is small relative to 5245786, Bonferroni's and Hunter's bounds can once again be used to estimate $P(\bigcup_k E_k) \approx (m-1)/5245786$. For m = 5408 this is 0.0010307. (Note that assuming independence gives 0.0010302.)
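The estimates for "some set repeats" can be compared numerically. The sketch below computes the first-order value (m-1)/N and the value obtained by treating the events E_k as independent. As an extra cross-check that is not in the article, it also computes 1 - (1 - 1/N)^(m-1), which is the exact probability if successive draws are i.i.d. uniform, because each draw then matches its predecessor with probability 1/N independently of earlier matches. The function name `some_set_repeats` is hypothetical.

```python
from math import comb, expm1, log1p

N_SETS = comb(42, 6)


def some_set_repeats(m, n_sets=N_SETS):
    """Three estimates of P(some set is drawn on two consecutive draws)."""
    # P(E_k) for one fixed set, using the first-order approximation.
    q = (m - 1) / n_sets**2
    # First-order (Bonferroni-style) estimate: sum of the P(E_k).
    first_order = n_sets * q
    # Treating the E_k as independent: 1 - (1 - q)^n_sets.
    # log1p/expm1 avoid cancellation error for very small q.
    independent = -expm1(n_sets * log1p(-q))
    # Assumption: i.i.d. uniform draws, so each of the m-1 consecutive
    # pairs matches with probability 1/n_sets, independently.
    exact_iid = -expm1((m - 1) * log1p(-1 / n_sets))
    return first_order, independent, exact_iid


first_order, independent, exact_iid = some_set_repeats(5408)
print(f"first-order estimate : {first_order:.7f}")
print(f"independence formula : {independent:.7f}")
print(f"exact under i.i.d.   : {exact_iid:.7f}")
```

Using `log1p` and `expm1` rather than direct powers keeps the results accurate when the per-set probability is far below floating-point resolution near 1.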

This probability relates to one lottery. Suppose we consider all lotteries worldwide and ask for the probability that in some lottery, somewhere, some set of numbers will be repeated consecutively. All lotteries are variants of Toto with different numbers involved. Each lottery will have had its own cumulative number of drawings. In order to gauge the magnitude of the probability wanted, assume that there are x lotteries, each one sharing the same numerical characteristics as the Bulgarian one.

This time we can use independence. The probability that some set will be repeated is 1 minus the probability that in no lottery is a set of numbers selected on two consecutive drawings, that is, $1 - (1 - (m-1)/5245786)^x$. For x = 50 this is 0.0503, while for x = 100 the probability is 0.0980. (An approximation to one significant digit for this range of values of interest is $x(m-1)/5245786$.)
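The multi-lottery figure is easy to tabulate. This sketch assumes, as the text does, x independent lotteries that each share the Bulgarian parameters (42 numbers, 6 drawn, m = 5408 draws), and prints the complement formula next to the linear approximation x(m-1)/N. The function name `some_lottery_repeats` is illustrative only.

```python
from math import comb, expm1, log1p

N_SETS = comb(42, 6)


def some_lottery_repeats(x, m, n_sets=N_SETS):
    """P(at least one of x independent lotteries shows a consecutive repeat)."""
    # Per-lottery probability, using the (m-1)/n_sets estimate.
    q = (m - 1) / n_sets
    # 1 - (1 - q)^x, computed stably.
    return -expm1(x * log1p(-q))


m = 5408  # assumption from the text: 52 years of twice-weekly draws
for x in (50, 100):
    exact = some_lottery_repeats(x, m)
    linear = x * (m - 1) / N_SETS
    print(f"x = {x:3d}: 1-(1-q)^x = {exact:.4f}, x*q = {linear:.4f}")
```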

For a different problem that discusses "very big numbers" see the article about double lottery winners.

Questions

(1) Instead of Hunter's lower bound, what would the second Bonferroni bound give?

(2) How many years would the Bulgarian lottery need to be running in order to have the same probability that some set of numbers will appear three times in succession?

(3) Instead of demanding that the same set of numbers appear twice in succession, what is the probability that some set of numbers will repeat during m drawings? (This is simpler and is the famous birthday problem.)

(4) The second application of Hunter's bound requires estimating $\sum_k P(E_k E_{k+1})$, which involves terms of the form $P(A_i A_{i+1} B_j B_{j+1})$, where $A_i$ is the event that set k occurs on draw i and $B_j$ is the event that set k+1 occurs on draw j. Each of these has probability $1/5245786^4$. Count the number of terms to validate the claim that $P(\bigcup_k E_k) \approx (m-1)/5245786$.