Stochasticity
June 19, 2009 at 11:40 AM by Dr. Drang
The most recent episode of Radio Lab was entitled “Stochasticity,” and had several stories concerning randomness, probability, noise, and chance. Something in one of the stories bothered me as I was listening to it, and I had to dig into it further.
The story started in the classroom of Deborah Nolan, a statistics professor at UC Berkeley. One of her classroom exercises is an attempt to teach her students the true nature of random data. She selects two groups of students. One is to flip a coin a hundred times and record that sequence. The other is to dream up a hundred coin flips and record that sequence. Prof. Nolan leaves the room while this is going on and returns when the two sequences have been written out on the blackboard. She looks at the two long strings of Hs and Ts and judges which one is the real record of coin flips and which is made up. Her assessment, which is almost always correct, is based on how smooth the sequence looks. Real data streams are almost always clumpy, with relatively long runs of heads or tails. People who are faking data tend to alternate regularly between heads and tails, with only short runs of consecutive Hs or Ts.
In the Radio Lab story, the hosts, Robert Krulwich and Jad Abumrad, take part in the demonstration and do the actual coin flipping. At one point in their flipping, they had a run of seven consecutive tails. That run—which they thought looked unrealistic—was, in fact, what told Prof. Nolan that theirs was the real sequence.
So far, the story is proceeding nicely. Anyone who’s had to deal with real data, or runs of computer-generated pseudorandom data, knows that random data tends to be clumpy. One of my old professors used to tell us about his daughter having a clear container with several colors of glitter in it. When she shook the container to mix the glitter, she didn’t get a homogeneous blend; she got patterns of the different colors. It’s an image that’s stayed with me for over 25 years; I’m sure Prof. Nolan’s coin flipping exercise stays with her students, too.
Then came the part that bothered me. They went to another professor, Jonathan Koehler of Arizona State, and asked him how likely it was to get a run of seven tails. He does the simple calculation of seven tails in seven flips and gets the expected answer: \((\frac{1}{2})^7 = \frac{1}{128} = 0.008\), or a little less than one percent. When informed that this was a run of seven tails during a string of one hundred coin flips, he quickly said that that’s different and would be much more likely. After an edit, he comes back to say that the chance of getting a run of seven tails in one hundred flips would be about one in six.
This answer bothered me for two reasons. First, the one in six seemed too small. Second, and more important, I didn’t know how he calculated it. So I went searching for the solution method^{1}.
The answer is going to come from dividing the number of ways of getting seven consecutive tails in one hundred flips by the total number of ways of doing one hundred flips. The denominator is obviously \(2^{100}\) which is over \(1.2 \times 10^{30}\). But the numerator was not so obvious. The Run page from Eric Weisstein’s MathWorld web site gave the best information for generating a solution, and what follows is an expansion on what you’ll find there.
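The denominator, at least, is easy to verify. Here’s a quick check (my addition, not part of the original post) using Python’s arbitrary-precision integers:

```python
# Total number of possible sequences of 100 coin flips.
n = 2 ** 100
print(n)            # 1267650600228229401496703205376
print('%.3e' % n)   # 1.268e+30
```

So “over \(1.2 \times 10^{30}\)” indeed, and nowhere near enumerable by brute force.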
We’ll start by doing two things that are pretty common in probability problems of this sort:
- First, we’ll solve the inverse problem. Instead of looking for how many possible sequences of 100 coin flips have runs of seven tails, we’ll look for sequences that don’t have runs of seven tails.
- Second, we’ll start by looking at small problems, ones for which we can easily generate all the possible coin flips and count. We’ll use the pattern developed in these small problems to generalize to the large problem.
Let’s start with the simplest case: runs of two tails. With a single coin flip, there are 2 possibilities, T and H, and neither of them has a run of two tails (duh), so our answer for this problem is 2. With two coin flips, there are four possibilities, TT, TH, HT, and HH, and 3 of them have no runs of two tails. With three coin flips, there are eight possibilities,
TTT HTT
TTH HTH
THT HHT
THH HHH
and 5 of them have no runs of two tails. With four coin flips, there are 16 possibilities,
TTTT THTT HTTT HHTT
TTTH THTH HTTH HHTH
TTHT THHT HTHT HHHT
TTHH THHH HTHH HHHH
and 8 have no runs of two tails. With five coin flips, the 32 possibilities are
TTTTT THTTT HTTTT HHTTT
TTTTH THTTH HTTTH HHTTH
TTTHT THTHT HTTHT HHTHT
TTTHH THTHH HTTHH HHTHH
TTHTT THHTT HTHTT HHHTT
TTHTH THHTH HTHTH HHHTH
TTHHT THHHT HTHHT HHHHT
TTHHH THHHH HTHHH HHHHH
and 13 have no runs of two tails. Finally, we’ll do six coin flips, which have 64 possibilities,
TTTTTT TTHTTT THTTTT THHTTT HTTTTT HTHTTT HHTTTT HHHTTT
TTTTTH TTHTTH THTTTH THHTTH HTTTTH HTHTTH HHTTTH HHHTTH
TTTTHT TTHTHT THTTHT THHTHT HTTTHT HTHTHT HHTTHT HHHTHT
TTTTHH TTHTHH THTTHH THHTHH HTTTHH HTHTHH HHTTHH HHHTHH
TTTHTT TTHHTT THTHTT THHHTT HTTHTT HTHHTT HHTHTT HHHHTT
TTTHTH TTHHTH THTHTH THHHTH HTTHTH HTHHTH HHTHTH HHHHTH
TTTHHT TTHHHT THTHHT THHHHT HTTHHT HTHHHT HHTHHT HHHHHT
TTTHHH TTHHHH THTHHH THHHHH HTTHHH HTHHHH HHTHHH HHHHHH
and 21 have no runs of two tails.
Summarizing, the answers we got were
2, 3, 5, 8, 13, 21
which is our old friend, the Fibonacci sequence without its two initial numbers, 1, 1. Each number in the sequence is the sum of the two previous numbers. If we were to continue to generate possibilities and count the ones with no runs of two tails, we’d get
34, 55, 89, 144, 233, etc.
The Fibonacci sequence is the Zelig of mathematics, always popping up in unusual places.
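Counts like these can be checked by machine. Here’s a short brute-force sketch (my addition, in modern Python rather than the Python 2 of the scripts below) that enumerates every sequence and counts the ones without a run of two tails:

```python
from itertools import product

def count_no_run(n, k):
    """Count the sequences of n coin flips with no run of k (or more) tails."""
    run = 'T' * k
    return sum(1 for seq in product('HT', repeat=n)
               if run not in ''.join(seq))

# Matches the Fibonacci sequence with its initial 1, 1 removed.
print([count_no_run(n, 2) for n in range(1, 9)])
# [2, 3, 5, 8, 13, 21, 34, 55]
```

Enumeration is only practical for small \(n\), of course, since the work doubles with every added flip.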
Let’s now consider runs of three tails. We’ve already generated all the possibilities for 1, 2, 3, 4, 5, and 6 coin tosses, so we just need to go back and count how many do not have runs of three tails. The answers are
2, 4, 7, 13, 24, 44
which, if we prepend 1, 1, is the tribonacci sequence, a generalization of Fibonacci in which each number is the sum of the previous three numbers. The tribonacci sequence is also called the Fibonacci 3-step sequence.
What about four tails? Counting the sequences that don’t have runs of four tails, we get
2, 4, 8, 15, 29, 56
which, you will probably not be surprised to hear, is part of the tetranacci sequence, where each number is the sum of the previous four numbers. The tetranacci sequence is also called the Fibonacci 4-step sequence. It, too, is missing its initial 1, 1.
We’re now at the point where we can generalize:
Of all the possible sequences of \(n\) coin flips, \(F_{n+2}^{(k)}\) will not have a run of \(k\) tails, where \(F_{n+2}^{(k)}\) is the \((n+2)^\mathrm{th}\) Fibonacci \(k\)-step number.
From this we can derive the following:
- The probability of having no runs of \(k\) tails in \(n\) coin flips is \(F_{n+2}^{(k)}/2^n\).
- The number of sequences that have one or more runs of at least \(k\) tails is \(2^n - F_{n+2}^{(k)}\).
- The probability of having one or more runs of at least \(k\) tails in \(n\) coin flips is \(1 - F_{n+2}^{(k)}/2^n = (2^n - F_{n+2}^{(k)})/2^n\).
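Before trusting the generalization for \(n = 100\), we can check it for small cases. Here’s a sketch of my own (not from the original post) that builds the Fibonacci \(k\)-step sequence and compares it, term by term, against a direct enumeration:

```python
from itertools import product

def fib_kstep(m, k):
    """First m terms of the Fibonacci k-step sequence, seeded with 1, 1."""
    F = [1, 1]
    for i in range(2, m):
        F.append(sum(F[max(0, i - k):i]))   # sum of up to k previous terms
    return F

def count_no_run(n, k):
    """Brute-force count of n-flip sequences with no run of k tails."""
    run = 'T' * k
    return sum(1 for seq in product('HT', repeat=n)
               if run not in ''.join(seq))

# F_(n+2)^(k) should equal the brute-force count for every small case.
for k in (2, 3, 4):
    F = fib_kstep(12, k)
    for n in range(1, 10):
        assert F[n + 1] == count_no_run(n, k)   # F is 0-indexed here
```

The assertion passes for all three \(k\) values, which is what gave me confidence in the formula before applying it to 100 flips.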
Phew! Let’s return to Radio Lab.
The probability that Jad and Robert would not get a run of 7 or more tails in 100 coin flips is
\[\frac{F_{102}^{(7)}}{2^{100}} = \frac{8.65 \times 10^{29}}{1.268 \times 10^{30}} = 0.682\]

and the probability that they would get at least one run of 7 or more tails in 100 flips is

\[\frac{2^{100} - F_{102}^{(7)}}{2^{100}} = 1 - \frac{F_{102}^{(7)}}{2^{100}} = 1 - 0.682 = 0.318\]

Obviously, I didn’t calculate \(F_{102}^{(7)}\) by hand, and I don’t know of any tables of Fibonacci 7-step (heptanacci) numbers that go up to the 102^{nd} term, so I had to write a quick script to do the calculations.
1: #!/usr/bin/python
2: from __future__ import division
3:
4: r = 7
5: f = 100
6: F = [1,1]
7:
8: for i in range(2,f+2):
9:     F.append(0)
10:     for j in range(1, min(len(F),r+1)):
11:         F[i] += F[i-j]
12:
13: n = 2**f
14: a = n - F[-1]
15: print F[-1]
16: print a
17: print n
18: print a/n
19: print F[-1]/n
The parameters are set up in Lines 4–6, the sequence is generated in the loop in Lines 8–11, and the counts and ratios are calculated in Lines 13–19. The import statement on Line 2 allows me to divide integers and get a floating point result. Without that line, the ratios would have been performed via integer division and would have both been zero.
I was a bit worried that I’d made some fundamental error in my solution, so I decided to estimate the probability through Monte Carlo simulation. Here’s my Monte Carlo program.
#!/usr/bin/python
from __future__ import division
from random import choice

f = 100
r = 7
N = 10000
runT = 'T' * r
n = 0

for i in range(N):
    seq = []
    for j in range(f):
        seq.append(choice('HT'))
    s = ''.join(seq)
    if s.find(runT) >= 0:
        n += 1

print '%d/%d = %.3f' % (n, N, n/N)
It generates a set of 100 simulated coin flips as a string of Hs and Ts, then looks to see if there’s a run of TTTTTTT in that string. It does this 10,000 times, keeping track of how many of the simulations had a seventail run. When I ran it, I got output of
3196/10000 = 0.320
which is close to the result using Fibonacci numbers, so I feel pretty confident in the results.
You’ll note my answer of 0.318 is about twice the probability given by Prof. Koehler. I think that’s because he was calculating the chance of getting a run of exactly seven tails, whereas I’m calculating the chance of getting a run of at least seven tails (this turned out to be wrong; see the update below). I prefer my way of looking at the problem, because the idea is to calculate the probability of a pattern that looks unusual. Certainly Jad and Robert would have considered runs of more than seven tails to be rare, so I think it’s fair to include those in the calculation. Perhaps it’s because I’m used to thinking in terms of “at least as unusual” rather than “exactly as unusual” that Prof. Koehler’s calculation seemed low to me^{2}.
In fact, since runs of seven or more heads would have seemed just as unusual as seven or more tails, a calculation that includes runs of heads as well as tails seems appropriate. I’m not sure how to do this calculation exactly^{3}, but it’s an easy matter to modify the Monte Carlo program to handle heads.
#!/usr/bin/python
from __future__ import division
from random import choice

f = 100
r = 7
N = 10000
runT = 'T' * r
runH = 'H' * r
n = 0

for i in range(N):
    seq = []
    for j in range(f):
        seq.append(choice('HT'))
    s = ''.join(seq)
    if (s.find(runT) >= 0) or (s.find(runH) >= 0):
        n += 1

print '%d/%d = %.3f' % (n, N, n/N)
Running this, I get an output of
5432/10000 = 0.543
which means that in a sequence of 100 coin flips, runs of seven or more are about as likely to occur as not. Keep that in mind if you take a class from Prof. Nolan.
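It turns out the heads-or-tails case can also be computed exactly with a small dynamic program. This sketch is my own addition, not something from the broadcast: we count the sequences that avoid runs of seven entirely, tracking only the length of the run the sequence currently ends in (by symmetry, it doesn’t matter which face that run is made of):

```python
def prob_run_either(flips, run):
    """Exact probability of a run of `run` or more heads OR tails in `flips` flips."""
    # a[j] counts sequences whose final run of identical faces has length
    # j + 1, with no run of length `run` anywhere. Both faces are counted,
    # so the single-flip starting state is 2 ('H' and 'T').
    a = [0] * (run - 1)
    a[0] = 2
    for _ in range(flips - 1):
        new = [0] * (run - 1)
        new[0] = sum(a)            # next flip is the other face: run resets to 1
        for j in range(run - 2):
            new[j + 1] = a[j]      # same face again: run grows by 1
        a = new
    return 1 - sum(a) / 2 ** flips

print(round(prob_run_either(100, 7), 3))
```

For 100 flips and runs of seven, this comes out to about 0.54, in good agreement with the Monte Carlo estimate above.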
Update 6/20/09
A few things in the post need to be corrected. My math is fine, as far as I know, but my interpretation of Prof. Jay Koehler’s calculations is off. Based on a comment made by Soren Wheeler, the producer of the Radio Lab segment, on the Radio Lab blog and a brief email exchange with Prof. Koehler, here’s what really happened in his portion of the story:
Initially, he calculated the probability of seven tails or seven heads in a row. He took the \((\frac{1}{2})^7 = \frac{1}{128}\) for seven tails in a row and doubled it to \(\frac{1}{64}\) to account for seven heads in a row. This is why you’ll hear him say “little more than one percent” instead of “about one percent,” which is what I remembered him saying as I wrote the post. According to both Soren and Jay, the editing of this part cut out the doubling and the discussion about how a run of seven heads would be considered just as unusual.
As for the “about one in six” probability for a run of seven in 100 coin flips, that too was intended to handle runs of both heads and tails, but it was not based on a formal calculation. It was just a quick estimate made by treating 100 flips as 14 independent events, each with a probability of \(\frac{1}{64}\). Jay wanted to do \(1 - (\frac{63}{64})^{14}\), but knew he couldn’t work that out in his head, so he just did \(\frac{14}{64}\) and, because he was doing this in his head while talking to the Radio Lab guys at the same time, thought this was closer to \(\frac{1}{6}\) than \(\frac{1}{5}\). He did not, as I had thought, interrupt the conversation to do a serious calculation. He knew his result would be on the low side because he was considering only runs in discrete groups of seven—flips 1–7, flips 8–14, flips 15–21, and so on. But that was OK, because he just wanted to get across the notion that a run of seven out of a hundred flips is no big deal. And, as we’ve seen, it isn’t.
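The numbers behind Jay’s estimate are easy to check. Here’s the arithmetic described above, as a quick sketch of my own:

```python
# The calculation Jay wanted to do: the chance that at least one of 14
# independent groups of 7 flips is all tails or all heads, where each
# group has probability 1/64 of being uniform.
wanted = 1 - (63 / 64) ** 14
print(round(wanted, 3))   # 0.198

# The quick approximation he actually used in his head.
estimate = 14 / 64
print(estimate)           # 0.21875
```

Both figures sit well below the 0.543 from the full calculation, consistent with his expectation that an estimate based on discrete groups of seven would run low.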

1. Warning! Math ahead! As always, the equations in this post are rendered using jsMath. They’ll look best if you have the TeX fonts installed. ↩

2. The theme song for this post is Marvin Gaye’s “Ain’t That Peculiar.” ↩

3. You can’t just double the probability for tails because there are sequences that have long runs of both heads and tails, and doubling would count those sequences twice. ↩