Solutions to Exercises Marked with s
from the book
Introduction to Probability by
Joseph K. Blitzstein and Jessica Hwang
c Chapman & Hall/CRC Press, 2014
Joseph K. Blitzstein and Jessica Hwang
Departments of Statistics, Harvard University and Stanford University
Chapter 1: Probability and counting
Counting
s (a) How many ways are there to split a dozen people into 3 teams, where one team
has 2 people, and the other two teams have 5 people each?
8.
(b) How many ways are there to split a dozen people into 3 teams, where each team has
4 people?
Solution:
(a) Pick any 2 of the 12 people to make the 2 person team, and then any 5 of the
remaining 10 for the first team of 5, and then the remaining 5 are on the other team of
5; this overcounts by a factor of 2 though, since there is no designated “first” team of
/2 = 8316. Alternatively, politely ask the 12
5. So the number of possibilities is12
10
people to line up, and then let the first 2 be the team of 2, the next 5 be a team of 5,
and then last 5 be a team of 5. There are 12! ways for them to line up, but it does not
matter which order they line up in within each group, nor does the order of the 2 teams
of 5 matter, so the number of possibilities is
12!
2!5!5!·2 = 8316.
2
5
12!
4!4!4! ways to divide the people into a
(b) By either of the approaches above, there are
Team A, a Team B, and a Team C, if we care about which team is which (this is called
a multinomial coefficient). Since here it doesn’t matter which team is which, this over
counts by a factor of 3!, so the number of possibilities is
s (a) How many paths are there from the point (0, 0) to the point (110, 111) in the
plane such that each step either consists of going one unit up or one unit to the right?
4!4!4!3! = 5775.
12!
(b) How many paths are there from (0, 0) to (210, 211), where each step consists of going
one unit up or one unit to the right, and the path has to go through (110, 111)?
Solution:
9.
15.
U ’s to get to (210, 211), so by the multiplication rule the number of possible paths is
(a) Encode a path as a sequence of U ’s and R’s, like U RU RU RU U U R . . . U R, where
U and R stand for “up” and “right” respectively. The sequence must consist of 110 R’s
and 111 U ’s, and to determine the sequence we just need to specify where the R’s are
110
paths to (110, 111), as above. From there, we need 100 R’s and 100
110
possible paths.
located. So there are221
(b) There are221
·200
221
.
s Give a story proof thatn
Solution: Consider picking a subset of n people. There are n
= 2n.
Story proofs
n
k=0
110
100
k
the one hand, and on the other hand there are 2n subsets by the multiplication rule.
choices with size k, on
k
1
2
16.
18.
s Show that for all positive integers n and k with n ≥ k,
Chapter 1: Probability and counting
n
k
+
n
k − 1
=
n + 1
k
,
doing this in two ways: (a) algebraically and (b) with a story, giving an interpretation
for why both sides count the same thing.
Hint for the story proof: Imagine n + 1 people, with one of them pre-designated as
“president”.
Solution:
(a) For the algebraic proof, start with the definition of the binomial coefficients in the
left-hand side, and do some algebraic manipulation as follows:
n
k
+
n
k − 1
=
=
=
=
n!
n!
+
k!(n − k)!
(n − k + 1)n! + (k)n!
(k − 1)!(n − k + 1)!
k!(n − k + 1)!
n!(n + 1)
k!(n − k + 1)!
n + 1
.
k
(b) For the story proof, consider n + 1 people, with one of them pre-designated as
“president”. The right-hand side is the number of ways to choose k out of these n + 1
people, with order not mattering. The left-hand side counts the same thing in a different
way, by considering two cases: the president is or isn’t in the chosen group.
The number of groups of size k which include the president is n
out of the remaining n people. Similarly, there aren
the president as a member of the group, we only need to choose another k − 1 members
k−1
, since once we fix
groups of size k that don’t include
the president. Thus, the two sides of the equation are equal.
s (a) Show using a story proof that
k
k
k
+
k + 1
k
+
k + 2
k
+ ··· +
n
k
=
n + 1
k + 1
,
where n and k are positive integers with n ≥ k. This is called the hockey stick identity.
Hint: Imagine arranging a group of people by age, and then think about the oldest
person in a chosen subgroup.
(b) Suppose that a large pack of Haribo gummi bears can have anywhere between 30
and 50 gummi bears. There are 5 delicious flavors: pineapple (clear), raspberry (red),
orange (orange), strawberry (green, mysteriously), and lemon (yellow). There are 0 non-
delicious flavors. How many possibilities are there for the composition of such a pack of
gummi bears? You can leave your answer in terms of a couple binomial coefficients, but
not a sum of lots of binomial coefficients.
Solution:
(a) Consider choosing k + 1 people out of a group of n + 1 people. Call the oldest person
in the subgroup “Aemon.” If Aemon is also the oldest person in the full group, then
choices for the rest of the subgroup. If Aemon is the second oldest in the
choices since the oldest person in the full group can’t be
there aren
full group, then there aren−1
k
k
k
j
k
n
(b) For a pack of i gummi bears, there are 5+i−1
54
50
i + 4
j=k
=
i
n + 1
k + 1
.
=i+4
i
the situation is equivalent to getting a sample of size i from the n = 5 flavors (with
replacement, and with order not mattering). So the total number of possibilities is
=i+4
possibilities since
4
Chapter 1: Probability and counting
then there arej
possible choices for the rest of the subgroup. Thus,
chosen. In general, if there are j people in the full group who are younger than Aemon,
3
Applying the previous part, we can simplify this by writing
=
i=30
4
j
4
54
j=4
− 33
j=4
j=34
j
4
j
4
.
=
.
34
5
−
55
5
j
4
54
j=34
=
(This works out to 3200505 possibilities!)
22.
Naive definition of probability
s A certain family has 6 children, consisting of 3 boys and 3 girls. Assuming that all
birth orders are equally likely, what is the probability that the 3 eldest children are the
3 girls?
Solution: Label the girls as 1, 2, 3 and the boys as 4, 5, 6. Think of the birth order is a
permutation of 1, 2, 3, 4, 5, 6, e.g., we can interpret 314265 as meaning that child 3 was
born first, then child 1, etc. The number of possible permutations of the birth orders
is 6!. Now we need to count how many of these have all of 1, 2, 3 appear before all of
4, 5, 6. This means that the sequence must be a permutation of 1, 2, 3 followed by a
permutation of 4, 5, 6. So with all birth orders equally likely, we have
(3!)2
6!
P (the 3 girls are the 3 eldest children) =
= 0.05.
Alternatively, we can use the fact that there are 6
ways to choose where the girls
appear in the birth order (without taking into account the ordering of the girls amongst
themselves). These are all equally likely. Of these possibilities, there is only 1 where the
3 girls are the 3 eldest children. So again the probability is
s A city with 6 districts has 6 robberies in a particular week. Assume the robberies are
located randomly, with all possibilities for which robbery occurred where equally likely.
What is the probability that some district had more than 1 robbery?
= 0.05.
1
(6
3)
3
23.
Solution: There are 66 possible configurations for which robbery occurred where. There
are 6! configurations where each district had exactly 1 of the 6, so the probability of
the complement of the desired event is 6!/66. So the probability of some district having
more than 1 robbery is
1 − 6!/66 ≈ 0.9846.
Note that this also says that if a fair die is rolled 6 times, there’s over a 98% chance
that some value is repeated!
4
26.
27.
29.
Chapter 1: Probability and counting
s A college has 10 (non-overlapping) time slots for its courses, and blithely assigns
courses to time slots randomly and independently. A student randomly chooses 3 of the
courses to enroll in. What is the probability that there is a conflict in the student’s
schedule?
Solution: The probability of no conflict is 10·9·8
103 = 0.72. So the probability of there being
at least one scheduling conflict is 0.28.
s For each part, decide whether the blank should be filled in with =, <, or >, and give
a clear explanation.
(a) (probability that the total after rolling 4 fair dice is 21)
total after rolling 4 fair dice is 22)
(probability that the
(b) (probability that a random 2-letter word is a palindrome1)
random 3-letter word is a palindrome)
(probability that a
Solution:
(a) > . All ordered outcomes are equally likely here. So for example with two dice,
obtaining a total of 9 is more likely than obtaining a total of 10 since there are two ways
to get a 5 and a 4, and only one way to get two 5’s. To get a 21, the outcome must
be a permutation of (6, 6, 6, 3) (4 possibilities), (6, 5, 5, 5) (4 possibilities), or (6, 6, 5, 4)
(4!/2 = 12 possibilities). To get a 22, the outcome must be a permutation of (6, 6, 6, 4)
(4 possibilities), or (6, 6, 5, 5) (4!/22 = 6 possibilities). So getting a 21 is more likely; in
fact, it is exactly twice as likely as getting a 22.
(b) = . The probabilities are equal, since for both 2-letter and 3-letter words, being a
palindrome means that the first and last letter are the same.
s Elk dwell in a certain forest. There are N elk, of which a simple random sample of
size n are captured and tagged (“simple random sample” means that allN
sets of n
elk are equally likely). The captured elk are returned to the population, and then a new
sample is drawn, this time with size m. This is an important method that is widely used
in ecology, known as capture-recapture. What is the probability that exactly k of the m
elk in the new sample were previously tagged? (Assume that an elk that was captured
before doesn’t become more or less likely to be captured again.)
n
Solution: We can use the naive definition here since we’re assuming all samples of size m
are equally likely. To have exactly k be tagged elk, we need to choose k of the n tagged
elk, and then m − k from the N − n untagged elk. So the probability is
n
k
·N−n
N
m−k
,
m
for k such that 0 ≤ k ≤ n and 0 ≤ m − k ≤ N − n, and the probability is 0 for all
other values of k (for example, if k > n the probability is 0 since then there aren’t even
k tagged elk in the entire population!). This is known as a Hypergeometric probability;
we will encounter it again in Chapter 3.
s A jar contains r red balls and g green balls, where r and g are fixed positive integers.
A ball is drawn from the jar randomly (with all possibilities equally likely), and then a
second ball is drawn randomly.
31.
1A palindrome is an expression such as “A man, a plan, a canal: Panama” that reads the same
backwards as forwards (ignoring spaces, capitalization, and punctuation). Assume for this problem
that all words of the specified length are equally likely, that there are no spaces or punctuation,
and that the alphabet consists of the lowercase letters a,b,. . . ,z.
Chapter 1: Probability and counting
5
(a) Explain intuitively why the probability of the second ball being green is the same
as the probability of the first ball being green.
(b) Define notation for the sample space of the problem, and use this to compute the
probabilities from (a) and show that they are the same.
(c) Suppose that there are 16 balls in total, and that the probability that the two balls
are the same color is the same as the probability that they are different colors. What
are r and g (list all possibilities)?
Solution:
(a) This is true by symmetry. The first ball is equally likely to be any of the g + r balls,
so the probability of it being green is g/(g + r). But the second ball is also equally likely
to be any of the g + r balls (there aren’t certain balls that enjoy being chosen second
and others that have an aversion to being chosen second); once we know whether the
first ball is green we have information that affects our uncertainty about the second
ball, but before we have this information, the second ball is equally likely to be any of
the balls.
Alternatively, intuitively it shouldn’t matter if we pick one ball at a time, or take one
ball with the left hand and one with the right hand at the same time. By symmetry,
the probabilities for the ball drawn with the left hand should be the same as those for
the ball drawn with the right hand.
(b) Label the balls as 1, 2, . . . , g + r, such that 1, 2, . . . , g are green and g + 1, . . . , g + r
are red. The sample space can be taken to be the set of all pairs (a, b) with a, b ∈
{1, . . . , g + r} and a = b (there are other possible ways to define the sample space, but it
is important to specify all possible outcomes using clear notation, and it make sense to
be as richly detailed as possible in the specification of possible outcomes, to avoid losing
information). Each of these pairs is equally likely, so we can use the naive definition of
probability. Let Gi be the event that the ith ball drawn is green.
The denominator is (g +r)(g +r−1) by the multiplication rule. For G1, the numerator is
g(g + r − 1), again by the multiplication rule. For G2, the numerator is also g(g + r − 1),
since in counting favorable cases, there are g possibilities for the second ball, and for
each of those there are g + r − 1 favorable possibilities for the first ball (note that the
multiplication rule doesn’t require the experiments to be listed in chronological order!);
alternatively, there are g(g − 1) + rg = g(g + r − 1) favorable possibilities for the second
ball being green, as seen by considering 2 cases (first ball green and first ball red). Thus,
P (Gi) =
g(g + r − 1)
(g + r)(g + r − 1)
=
g
g + r
,
for i ∈ {1, 2}, which concurs with (a).
(c) Let A be the event of getting one ball of each color. In set notation, we can write
A = (G1 ∩ Gc
1 ∩ G2) . We are given that P (A) = P (Ac), so P (A) = 1/2. Then
2) ∪ (Gc
P (A) =
2gr
(g + r)(g + r − 1)
=
1
2
,
giving the quadratic equation
g2 + r2 − 2gr − g − r = 0,
i.e.,
(g − r)2 = g + r.
But g + r = 16, so g − r is 4 or −4. Thus, either g = 10, r = 6, or g = 6, r = 10.
6
Chapter 1: Probability and counting
32.
s A random 5-card poker hand is dealt from a standard deck of cards. Find the prob-
ability of each of the following possibilities (in terms of binomial coefficients).
(a) A flush (all 5 cards being of the same suit; do not count a royal flush, which is a
flush with an ace, king, queen, jack, and 10).
(b) Two pair (e.g., two 3’s, two 7’s, and an ace).
Solution:
the suit is Hearts); there are13
5
ways to choose the cards in that suit, except for one
(a) A flush can occur in any of the 4 suits (imagine the tree, and for concreteness suppose
way to have a royal flush in that suit. So the probability is
413
− 1
52
5
.
5
(b) Choose the two ranks of the pairs, which specific cards to have for those 4 cards,
and then choose the extraneous card (which can be any of the 52 − 8 cards not of the
two chosen ranks). This gives that the probability of getting two pairs is
13
2
2 · 44
·4
52
2
.
5
40.
s A norepeatword is a sequence of at least one (and possibly all) of the usual 26 letters
a,b,c,. . . ,z, with repetitions not allowed. For example, “course” is a norepeatword, but
“statistics” is not. Order matters, e.g., “course” is not the same as “source”.
A norepeatword is chosen randomly, with all norepeatwords equally likely. Show that
the probability that it uses all 26 letters is very close to 1/e.
Solution: The number of norepeatwords having all 26 letters is the number of ordered
arrangements of 26 letters: 26!. To construct a norepeatword with k ≤ 26 letters, we
first select k letters from the alphabet (26
word (k! arrangements). Hence there are 26
selections) and then arrange them into a
k! norepeatwords with k letters, with k
k
k
ranging from 1 to 26. With all norepeatwords equally likely, we have
P (norepeatword having all 26 letters) =
=
=
# norepeatwords having all 26 letters
# norepeatwords
26
26!26
k=1
k
k!
26
k=1
=
26!
k!(26−k)! k!
26!
1
1
25! + 1
24! + . . . + 1
1! + 1
.
The denominator is the first 26 terms in the Taylor series ex = 1 + x + x2/2! + . . .,
evaluated at x = 1. Thus the probability is approximately 1/e (this is an extremely
good approximation since the series for e converges very quickly; the approximation for
e differs from the truth by less than 10−26).
46.
Axioms of probability
s Arby has a belief system assigning a number PArby(A) between 0 and 1 to every event
A (for some sample space). This represents Arby’s degree of belief about how likely A
is to occur. For any event A, Arby is willing to pay a price of 1000 · PArby(A) dollars to
buy a certificate such as the one shown below: