Difference between revisions of "Sandbox"

From ChanceWiki

Revision as of 02:27, 25 September 2009

The Bulgarian Toto 6 of 42 lottery

The Bulgarian Toto 6 of 42 lottery was the subject of an investigation after the same set of six numbers {4, 15, 23, 24, 35, 42} was drawn in two successive lotteries on September 6 and September 10, 2009. The article cites a mathematician as stating that the odds against picking the same six numbers twice in a row are 4,200,000:1. We wondered how he arrived at this number. Suppose that the lottery has been running continuously for <math>m</math> draws since its inception. What is the probability that a specified set of six numbers will repeat?

There are <math>{42 \choose 6} = 5245786</math> different sets of six numbers and the probability that a SPECIFIED set will occur in two given consecutive draws is <math>1/5245786^2</math>. Suppose the lottery has been running continuously for <math>m</math> draws and consider a fixed set of six numbers.
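The count and the squared probability above can be checked directly. This is a minimal sketch; the names <code>n_sets</code>, <code>p</code> and <code>p_pair</code> are ours, and it assumes every six-number set is equally likely on each draw.

```python
from math import comb

# Number of distinct six-number sets that can be drawn from 42 numbers.
n_sets = comb(42, 6)

# Assumption: each set is equally likely on any single draw.
p = 1 / n_sets

# Probability that one SPECIFIED set comes up on two given consecutive draws,
# assuming the two draws are independent.
p_pair = p ** 2

print(n_sets)
print(p_pair)
```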

There are <math>m-1</math> opportunities for this set to be drawn twice in succession (beginning with the second drawing). The probability that this will happen is then the probability of the union <math>P(A) = P(\cup_i A_i A_{i+1}) </math> where <math>A_i</math> is the event that this set of numbers is drawn on the <math>i</math>th draw.

Bonferroni's inequality gives the upper bound <math>P(A) \le \sum_i P(A_i A_{i+1})</math> while Hunter's inequality gives the lower bound <math>P(A) \ge \sum_i P(A_i A_{i+1}) - \sum_i P(A_i A_{i+1}A_{i+2}).</math>

We assume (!) that the events <math>A_i</math> are independent and identically distributed with probability <math>p = 1/5245786</math> leading to <math>(m-1) p^2 - (m-2) p^3 \le P(A) \le (m-1) p^2</math>. Since <math>p</math> is very small the <math>p^3</math> term can be ignored, giving <math>P(A) \approx (m-1)/5245786^2.</math>

It appears that the draws are held twice per week so for one year <math>m = 104</math>, giving the probability <math>3.74 \times 10^{-12}</math> that a specified set of numbers will be drawn twice in succession. According to a spokeswoman the lottery has been taking place for 52 years. Using <math>m = 104 \times 52 = 5408</math>, the probability that a specified set of numbers will be drawn twice in succession over this period is <math>1.96 \times 10^{-10}</math>, still very small.
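The two bounds can be evaluated numerically for one year and for 52 years of draws. The sketch below assumes independent draws with a common probability <math>p</math>; the function name <code>pair_repeat_bounds</code> is hypothetical, introduced only for illustration.

```python
from math import comb

def pair_repeat_bounds(m, p):
    """Bounds on the probability that one fixed set is drawn on two
    consecutive draws somewhere in m draws (draws assumed independent)."""
    upper = (m - 1) * p ** 2                # Bonferroni upper bound
    lower = upper - (m - 2) * p ** 3        # Hunter lower bound
    return lower, upper

p = 1 / comb(42, 6)

# One year (104 draws) and 52 years (104 * 52 = 5408 draws).
for m in (104, 104 * 52):
    lower, upper = pair_repeat_bounds(m, p)
    print(m, lower, upper)
```

The two bounds agree to about six significant digits, which is why the <math>p^3</math> term can be dropped.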

But now let's ask the question, not for a fixed set of numbers but for some set of numbers. After all, in discussing this coincidence the repeated set arises by chance alone and is not specified in advance.

In <math>m</math> drawings, what is the probability that SOME set of six numbers will be repeated in consecutive draws?

There are 5245786 possible sets of numbers that could be repeated. Enumerate the sets by integers <math>1 \le k \le 5245786</math> with <math>E_k</math> the event that set <math>k</math> repeats consecutively sometime during these <math>m</math> drawings. The probability of the union <math>P(\cup E_k)</math> is needed. Each of the 5245786 events <math>E_k</math> has probability approximately <math>(m-1)/5245786^2</math> and if they were independent we could evaluate the probability using complements as <math>P(\cup E_k) = 1 - (1- (m-1)/5245786^2)^{5245786} \approx 1 - e^{-(m-1)/5245786}</math>. However, they are dependent, but as long as <math>m</math> is small relative to 5245786, Bonferroni's and Hunter's bounds can once again be used to estimate <math>P(\cup E_k) \approx (m-1)/5245786.</math> For <math>m = 5408</math> this is 0.0010307. (Note that assuming independence gives 0.0010302.)
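The linear estimate and the value obtained under the (incorrect) independence assumption can be compared numerically. This is a sketch only; the variable names are ours, and <code>log1p</code>/<code>expm1</code> are used to avoid rounding error when raising a number very close to 1 to a large power.

```python
from math import comb, expm1, log1p

n_sets = comb(42, 6)
m = 104 * 52                      # 52 years of twice-weekly draws

# Approximate probability that one fixed set repeats consecutively.
q = (m - 1) / n_sets ** 2

# Linear (Bonferroni-type) estimate for the union over all sets.
linear = (m - 1) / n_sets

# Value if the events E_k were independent: 1 - (1 - q)^n_sets.
independent = -expm1(n_sets * log1p(-q))

print(round(linear, 7), round(independent, 7))
```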

This probability relates to one lottery. Suppose we consider all lotteries worldwide and ask for the probability that in some lottery, somewhere, some set of numbers will be repeated consecutively. All lotteries are variants of Toto with different numbers involved. Each lottery will have had its own cumulative number of drawings. In order to gauge the magnitude of the probability wanted, assume that there are <math>x</math> lotteries, each one sharing the same numerical characteristics as the Bulgarian one.

This time we can use independence. The probability that some set will be repeated is 1 minus the probability that in no lottery is a set of numbers selected on two consecutive drawings <math>= 1 - (1 - (m-1)/5245786)^x</math>. For <math>x = 50</math> this is 0.0503 while for <math>x = 100</math> the probability is 0.0980. (An approximation to one significant digit for this range of values of interest is <math>x(m-1)/5245786.</math>)
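The formula for <math>x</math> lotteries is easy to tabulate. The sketch below assumes, as in the text, that every lottery has the same <math>m = 5408</math> draws and the same 6-of-42 format; the function name <code>some_lottery_repeats</code> is hypothetical.

```python
from math import comb

N_SETS = comb(42, 6)
M = 104 * 52                       # draws per lottery (assumed equal for all)
Q = (M - 1) / N_SETS               # per-lottery probability of some consecutive repeat

def some_lottery_repeats(x, q=Q):
    """Probability that at least one of x independent lotteries
    sees some set drawn on two consecutive drawings."""
    return 1 - (1 - q) ** x

for x in (50, 100):
    # Exact value under the stated assumptions, then the linear approximation x*q.
    print(x, round(some_lottery_repeats(x), 4), round(x * Q, 4))
```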

For a different problem that discusses "very big numbers", see the article about double lottery winners.

Questions.

1. Instead of Hunter's lower bound, what would the second Bonferroni bound give?

2. How many years would the Bulgarian lottery need to be running in order to have the same probability that some set of numbers will appear three times in succession?

3. Instead of demanding that the same set of numbers appear twice in succession, what is the probability that some set of numbers will repeat during <math>m</math> drawings? (This is simpler and is the famous birthday problem.)

4. The second application of Hunter's bound requires estimating <math>\sum_k P( E_k E_{k+1} )</math> which involves terms of the form <math>P(A_i A_{i+1} B_j B_{j+1} )</math> where <math>A_i</math> is the event that the set <math>k</math> occurs on draw <math>i</math> and <math>B_j</math> is the event that the set <math>k+1</math> occurs on draw <math>j</math>. Each of these has probability <math>1/5245786^4</math>. Count the number of terms to validate the claim that <math>P(\cup_k E_k) \approx (m-1)/5245786</math>.