Solutions of homework 3
Math 5652: Introduction to Stochastic Processes
Homework 3: due Tuesday, March 10
(1) (10 points) This problem is taken from: J. Norris, Markov Chains, Cambridge University
Press 1997.
Construct a sequence of non-negative integers as follows. Let F0 = 0 and F1 = 1. Once
F0 , . . . , Fn are known, let Fn+1 be either the sum of Fn and Fn−1 , or (the absolute value
of) their difference, each with probability 1/2.
(a) Is (Fn )n≥0 a Markov chain?
(b) Let Xn = (Fn−1 , Fn ). Find the transition probabilities for this Markov chain. (Give
a formula, don’t try to write them into a matrix.) Using your formula, find the
probability that (Fn )n≥0 reaches 3 before first returning to 0.
Hint 1: draw the first few transitions of the chain Xn . What states correspond to “Fn
reaches 3”? What states correspond to “Fn reaches 0”?
Hint 2: to figure out the probability of reaching one set of states before another, consider
the Markov chain in which the states of interest are all absorbing.
Solution:
(a) Let’s look at some of the possible beginnings of the sequence:
F0 = 0, F1 = 1, F2 = 1 (whether you add or subtract), F3 = 2 or 0, . . .
This is more or less enough to tell us that it’s not a Markov chain: from F1 = 1 we
went to F2 = 1 with probability 1, and from F2 = 1 we went to F3 = 2 or 0, not 1,
so
P(Fn+1 = 1|Fn = 1, Fn−1 = fn−1 , . . . , F0 = f0 )
definitely depends on history.
Actually, so far we only know that it’s not a time-homogeneous Markov chain, but
it’s clear that after 1,1 I will go to 2 or 0, and after 0,1 I will go to 1 at any time.
To prove that it’s not a Markov chain, we need to construct an example where
transition probabilities depend on history for a single value of n. For example:
F0 = 0, F1 = 1, F2 = 1, F3 = 2 or 0, F4 = (1 or 3) or 1
Now
P(F5 = x|F4 = 1, F3 = 2) ≠ P(F5 = x|F4 = 1, F3 = 0).
(b) The transitions are of the form
(a, b) → (b, a + b) or (b, |a − b|), each with probability 1/2,
unless a = 0 or b = 0, in which case a+b = |a − b| and the corresponding probability
is 1. Equivalently,


P(Xn+1 = (c, d) | Xn = (a, b)) =
  0,    if c ≠ b, or d is neither a + b nor |a − b|;
  1/2,  if c = b, d = a + b or |a − b|, and a, b ≠ 0;
  1,    if c = b, d = a + b, and one of a and b is 0.
Since we’re interested in whether we hit 0 or 3 first, we can stop drawing transitions
of X once F hits one of those states. Hence, the state transitions look like this: All
(0, 1)
(1, 1)
(1, 2)
(2, 3)
(1, 0)
(2, 1)
(1, 3)
arrows except the one from (0, 1) have probability 1/2, and the states without any
arrows coming out are absorbing: once we got to them, we’ve hit either 0 or 3.
Now the question is to find the probability, starting from (0, 1), of hitting (2, 3) or
(1, 3) (“before returning to 0” will be automatic: if I hit (1, 0) instead, it won’t be
possible for me to hit one of these two states). Let
h(x) = Px (hit one of (2, 3) or (1, 3)),
then
h(0, 1) = h(1, 1)
h(1, 1) = 1/2h(1, 2)
h(1, 2) = 1/2 + 1/2h(2, 1)
h(2, 1) = 1/2 + 1/2h(1, 1)
Solving this system of equations, we find h(0, 1) = 3/7, so the probability that Fn
reaches 3 before returning to 0 is 3/7.
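As a quick sanity check (not part of the required solution), here is a short Monte Carlo sketch in Python that simulates Fn directly and estimates the probability of reaching 3 before returning to 0; the function name is just illustrative.

    import random

    def reaches_3_before_0():
        # Simulate F_n from F0 = 0, F1 = 1 until it hits 3 or returns to 0.
        prev, cur = 0, 1
        while True:
            nxt = prev + cur if random.random() < 0.5 else abs(cur - prev)
            if nxt == 3:
                return True
            if nxt == 0:
                return False
            prev, cur = cur, nxt

    trials = 200_000
    print(sum(reaches_3_before_0() for _ in range(trials)) / trials)  # close to 3/7 ≈ 0.4286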
(2) (20 points) In this problem, we will look at the Wright-Fisher model of population genetics. Consider a Markov chain with state space {0, 1, . . . , m}, and transition probabilities
pij = C(m, j) (i/m)^j ((m − i)/m)^(m−j),
where C(m, j) = m!/(j!(m − j)!) is the binomial coefficient.
The biological interpretation is as follows. We have k individuals, each of whom has one
gene, but two copies of it (as is usual for people). The total number of genes floating
around is m = 2k. Let’s call the two versions (alleles) of the gene A and a: so each
individual has either AA, Aa, or aa (the order doesn’t matter). Xn is the total number
of A alleles in the nth generation. To get from one generation to the next, pick two genes
uniformly at random, with replacement, from the gene pool in the parent generation:
equivalently, pick two individuals with replacement, and take one allele from each of
them. (This works well as a model of plant reproduction for self-pollinating plants, and
is only an approximation in bisexual species.)
For example, if in generation n the population has genotypes
AA Aa AA AA aa
then in generation n + 1, each gene will be A with probability 0.7 (the proportion of
A’s in the nth generation), and a with probability 0.3. For the Markov chain, we’re just
counting the proportion of each allele in the population, so the structure of the pairs
isn’t important.
(a) What are the closed irreducible classes for this Markov chain?
(b) In the long-run in this model, genetic diversity disappears, and we arrive into one
of the absorbing states. Calculate all the hitting probabilities of arriving into state
Xn = m for m = 3 and m = 4. That is, compute ρim for all i = 0, . . . , m for those
two cases. (Note: I’m asking for m = 3 or 4, not k = 3 or 4.) Can you guess the
form of the answer in general?
(c) Check that your guess in part (b) satisfies the equations for hitting probabilities for
general m. In a finite-state chain, the solution is unique, so you have just figured
out the probability that the final population will be all-A starting from every initial
distribution of genes.
It’s possible to extend this model to take various other biological features into account.
Solution: Note: there is a question here of what the transition probability should be
from 0 to 0, and from m to m, since that requires figuring out what 0^0 should be.
However, the other transition probabilities out of 0 and m are unambiguously 0, so here
we take the convention 0^0 = 1.
(a) The two closed irreducible classes contain the two absorbing states: the classes are
{0} and {m}. (A class is, by definition, a set of states – here they happen to be
one-element sets. This is why I’m enclosing single states in curly braces here.) All
other states are transient: if you transition to 0 (all alleles are a), you will stay at
0 (all alleles will stay a), and if you transition to m, you will stay at m.
(b) When m = 3, the transition probability matrix is
         j = 0      j = 1            j = 2            j = 3
i = 0  [  1          0                0                0        ]
i = 1  [  (2/3)^3    3·(1/3)(2/3)^2   3·(1/3)^2(2/3)   (1/3)^3  ]
i = 2  [  (1/3)^3    3·(2/3)(1/3)^2   3·(2/3)^2(1/3)   (2/3)^3  ]
i = 3  [  0          0                0                1        ]
When m = 4, the transition probability matrix is
         j = 0      j = 1            j = 2              j = 3            j = 4
i = 0  [  1          0                0                  0                0        ]
i = 1  [  (3/4)^4    4·(1/4)(3/4)^3   6·(1/4)^2(3/4)^2   4·(1/4)^3(3/4)   (1/4)^4  ]
i = 2  [  (1/2)^4    4·(1/2)^4        6·(1/2)^4          4·(1/2)^4        (1/2)^4  ]
i = 3  [  (1/4)^4    4·(3/4)(1/4)^3   6·(3/4)^2(1/4)^2   4·(3/4)^3(1/4)   (3/4)^4  ]
i = 4  [  0          0                0                  0                1        ]
We are solving for ρim = Pi (ever reach state m). Let h(i) = Pi (ever reach state m),
then for m = 3 we are solving
h(0) = 0
h(1) = 4/9h(1) + 2/9h(2) + 1/27
h(2) = 2/9h(1) + 4/9h(2) + 8/27
h(3) = 1
which gives h(i) = i/3 for i = 0, 1, 2, 3.
For m = 4 we’re solving
h(0) = 0
h(4) = 1
h(1) = 27/64h(1) + 27/128h(2) + 3/64h(3) + 1/256
h(2) = 1/4h(1) + 3/8h(2) + 1/4h(3) + 1/16
h(3) = 3/64h(1) + 27/128h(2) + 27/64h(3) + 81/256
which gives h(i) = i/4 for i = 0, 1, 2, 3, 4.
Note: it is much, much easier to enter this into WolframAlpha or another computer
algebra program of your choice than to try to solve the equations by hand!
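Alternatively, here is a small numpy sketch (one possible way to do the computation, not the only one) that builds the Wright-Fisher transition matrix and solves the hitting-probability equations numerically for any m; the function names are my own, chosen for illustration.

    import numpy as np
    from math import comb

    def wright_fisher_matrix(m):
        # p_ij = C(m, j) (i/m)^j ((m-i)/m)^(m-j); Python's 0**0 == 1 matches our convention.
        P = np.zeros((m + 1, m + 1))
        for i in range(m + 1):
            for j in range(m + 1):
                P[i, j] = comb(m, j) * (i / m) ** j * ((m - i) / m) ** (m - j)
        return P

    def hitting_prob_all_A(m):
        # Solve h(i) = sum_j p(i,j) h(j) with h(0) = 0, h(m) = 1 for transient states 1..m-1.
        P = wright_fisher_matrix(m)
        Q = P[1:m, 1:m]          # transitions among the transient states
        b = P[1:m, m]            # one-step probability of jumping straight to m
        h = np.linalg.solve(np.eye(m - 1) - Q, b)
        return np.concatenate(([0.0], h, [1.0]))

    for m in (3, 4, 10):
        print(m, hitting_prob_all_A(m))   # each output is (0, 1/m, 2/m, ..., 1)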
(c) We guess h(i) = i/m, and verify that these satisfy the equations: h(0) = 0 and
h(m) = 1 are ok, so we need to check the rest,
h(i) ?= Σ_{j=0}^{m} p(i, j) h(j) = Σ_{j=0}^{m} C(m, j) (i/m)^j ((m − i)/m)^(m−j) · (j/m),
where "?=" marks the equality we need to verify.
First, let’s change the summation to run from j = 1 to m, since the j = 0 term
is 0 anyway. Next, let’s factor out i/m from the right-hand side, and cancel j/m
against the factorials in the binomial coefficient:
RHS = (i/m) Σ_{j=1}^{m} [(m − 1)!/((j − 1)!(m − j)!)] (i/m)^(j−1) ((m − i)/m)^(m−j),
since C(m, j) · (j/m) = [m!/(j!(m − j)!)] · (j/m) = (m − 1)!/((j − 1)!(m − j)!).
We’re now trying to show that the sum is equal to 1. Notice that there’s a bunch
of m − 1 and j − 1 hanging around, so let’s try to rewrite other things in terms of
m − 1 and j − 1: in particular, m − j = (m − 1) − (j − 1). Then we get
RHS = (i/m) Σ_{j=1}^{m} [(m − 1)!/((j − 1)!((m − 1) − (j − 1))!)] (i/m)^(j−1) ((m − i)/m)^((m−1)−(j−1)).
Notice that as j runs from 1 to m, the j − 1 runs from 0 to m − 1. Thus, the
summation is the binomial expansion of (i/m + (m − i)/m)^(m−1) = 1^(m−1) = 1. So, the
sum is equal to 1, and we have shown that the numbers h(i) = i/m satisfy the same
equations as the hitting probabilities. Since there are as many equations as unknowns,
the solution should be unique, and we conclude that
h(i) = ρim = P(ever reach m | start in i) = i/m.
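The identity we just verified (that Σ_j p(i, j)·(j/m) equals i/m) also admits a one-line numerical check; the sketch below is only for illustration, with a function name of my choosing.

    from math import comb

    def expected_next_fraction(m, i):
        # sum_j p(i, j) * (j/m); by the computation above this should equal i/m.
        return sum(comb(m, j) * (i / m) ** j * ((m - i) / m) ** (m - j) * (j / m)
                   for j in range(m + 1))

    print(all(abs(expected_next_fraction(m, i) - i / m) < 1e-12
              for m in range(1, 20) for i in range(m + 1)))   # True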
(3) (10 points) Let Y1 , Y2 , ... be a sequence of iid random variables with all moments finite.
Let N be a nonnegative-integer-valued random variable, independent from all the Yi ,
also with all moments finite. Let S = Y1 + . . . + YN , and define S = 0 if N = 0. Find the
mean E[S] and the variance Var(S) in terms of the moments (mean, variance, second
moment) of Y and N .
Hint: begin by conditioning on the value of N , i.e. write
E[S] = Σ_{n=0}^{∞} P(N = n) E[S | N = n],      E[S^2] = Σ_{n=0}^{∞} P(N = n) E[S^2 | N = n].
Solution: Let µY, σY^2, µN, σN^2 be the mean and variance of Y and the mean and variance
of N respectively.
Since N and Yi are independent,
E[S|N = n] = E[Y1 + . . . + Yn ] = nµY ,
and
E[S^2 | N = n] = E[(Y1 + . . . + Yn)^2] = Var(Y1 + . . . + Yn) + (E[Y1 + . . . + Yn])^2 = n·σY^2 + n^2·µY^2.
Consequently,
E[S] = Σ_{n=0}^{∞} P(N = n)·n·µY = µY Σ_{n=0}^{∞} n·P(N = n) = µY·µN.
Also,
E[S^2] = Σ_{n=0}^{∞} P(N = n)(n·σY^2 + n^2·µY^2) = σY^2 Σ_{n=0}^{∞} n·P(N = n) + µY^2 Σ_{n=0}^{∞} n^2·P(N = n)
       = σY^2·µN + µY^2·E[N^2].
Consequently,
Var(S) = E[S^2] − (E[S])^2 = σY^2·µN + µY^2·(E[N^2] − µN^2) = σY^2·µN + µY^2·σN^2.
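Both formulas are easy to double-check numerically; here is a small simulation sketch (the particular distributions chosen for Y and N are mine, picked only for illustration):

    import numpy as np

    rng = np.random.default_rng(0)
    mu_Y, var_Y = 2.0, 4.0      # Y ~ Exponential with mean 2, hence variance 4
    mu_N = var_N = 5.0          # N ~ Poisson(5), so mean = variance = 5

    N = rng.poisson(mu_N, size=200_000)
    S = np.array([rng.exponential(mu_Y, n).sum() for n in N])   # S = 0 when n = 0

    print(S.mean(), mu_Y * mu_N)                      # both ≈ 10
    print(S.var(), var_Y * mu_N + mu_Y**2 * var_N)    # both ≈ 40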
(4) (10 points) Durrett 2.10: Consider a bank with two tellers. Three people – Alice, Bob,
and Carol – walk into the bank at almost the same time, but in that order. Alice and
Bob go directly into service, while Carol waits for the first available teller. Suppose that
the service time of each of the three customers is independent, exponentially distributed,
with a mean of 4 minutes. (Careful: this is the mean, not the rate!)
(a) What is the expected amount of time Carol spends in the bank, including both the
wait and her service time?
(b) What is the expected time until the last of the customers leaves? (The last customer
may or may not be Carol.)
(c) What is the probability that Carol is the last customer to leave?
Solution:
(a) The time Carol waits to start service is the smaller of two independent exponentials
with parameter 1/(4min), so Carol’s waiting time is itself exponential with parameter λAlice + λBob = 1/(2min). She then spends an independent exponential time in
service, with parameter 1/(4min). Consequently,
E[time Carol in the bank] = E[waiting] + E[service] = 2min + 4min = 6min.
(b) Let’s split up the time for the last customer to leave into three pieces:
(i) Carol waits for service – this is exponential with mean 2 minutes;
(ii) The second-to-last customer leaves (this could be Carol or the person who
was there when she started service) – the time from Carol starting service to
one of her and the other person leaving is again the smaller of two independent exponentials with parameter 1/(4min), so the time for this to happen is
exponential with mean 2 minutes;
(iii) The last person around leaves – starting from when the second-to-last person left,
the remaining service time of whoever is still being served is exponential with
parameter 1/(4min) by the memoryless property (the time left in an exponential,
given that it hasn't finished yet, has the same exponential distribution), so this
piece has mean 4 minutes.
Consequently,
E[time for last customer to leave] = 2min + 2min + 4min = 8min.
(c) The probability that Carol is the last customer to leave is the probability that,
when she starts service, the other person who is still there finishes first. However,
since their service time is exponential, they still have an exponential time left with
parameter 1/(4min), which means that this is the probability that Carol’s is the
bigger of the two exponentials,
1 − λCarol/(λCarol + λother person) = 1/2.
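All three answers can be confirmed with a quick simulation; the sketch below is just an illustration, not part of the solution.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 500_000
    A, B, C = rng.exponential(4.0, size=(3, n))    # three service times, mean 4 minutes each

    carol_in_bank = np.minimum(A, B) + C           # wait for the first free teller, then serve
    last_departure = np.maximum(np.maximum(A, B), carol_in_bank)
    carol_is_last = carol_in_bank > np.maximum(A, B)

    print(carol_in_bank.mean())    # ≈ 6 minutes
    print(last_departure.mean())   # ≈ 8 minutes
    print(carol_is_last.mean())    # ≈ 0.5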
(5) (10 points) Durrett 2.24: Suppose that the number of calls to an answering service follows
a Poisson process with a rate of 4 per hour.
(a) What is the probability that (strictly) fewer than two calls come in the first hour?
(b) Suppose that six calls come in the first hour. What is the probability that (strictly)
fewer than two come in the second hour?
(c) Suppose that the operator gets to take a break after she has answered ten calls.
How long on average are her work periods?
Solution:
(a) The number of calls in the first hour is Poisson with parameter (4/hr) · (1hr) = 4.
The probability that 0 or 1 calls come is then
e^(−4)·4^0/0! + e^(−4)·4^1/1! = 5e^(−4) ≈ 0.0916.
(b) The number of calls in the second hour is independent of the number of calls in the
first hour, since these are nonoverlapping time intervals. (A much more interesting
question is “given 6 calls in the first hour, what’s the distribution of the number
of calls from 30min to 90min” – you should think about it!) Consequently, the
probability of 0 or 1 calls in the second hour is still 5e−4 ≈ 0.0916.
(c) The operator's work period lasts until the tenth call arrives. Since the times between
calls are exponential with parameter 4/hr, the expected time between calls is 1/4 hr,
and the expected time to get 10 calls is
10 × (1/4) hr = 2.5 hours.
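For completeness, both numbers are a couple of lines to reproduce (again just a sketch):

    from math import exp, factorial

    lam = 4  # expected number of calls per hour
    print(sum(exp(-lam) * lam**k / factorial(k) for k in range(2)))   # 5 e^(-4) ≈ 0.0916
    print(10 * (1 / lam))                                             # expected work period: 2.5 hours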
(6) (10 points) Durrett 2.34: Edwin catches trout at the times of a Poisson process with a
rate of 3 per hour. Suppose that the trout weigh an average of 4 pounds, with a standard
deviation (careful: not variance!) of 2 pounds. Find the mean and standard deviation
(careful: not variance!) of the total weight of fish Edwin catches in two hours. (Note:
you will want to use your answer to problem 3 in this problem.)
Solution: The total weight of Edwin's fish is
W = Σ_{j=1}^{N(2hrs)} wj,
where wj is the weight of the jth fish he catches, and N(t) is the number of fish he has
caught by time t. We know that N(2hrs) is a Poisson random variable with parameter
2hrs · 3/hr = 6, which means that (in the terminology of problem 3) µN = 6 = σN^2. We also
know, in that terminology, µY = 4 pounds and σY = 2 pounds. Consequently,
E[W] = µY·µN = 4 pounds × 6 = 24 pounds
and
√Var(W) = √(σY^2·µN + µY^2·σN^2) = √(4 pounds^2 × 6 + 16 pounds^2 × 6) = √(120) pounds = 2√30 pounds ≈ 10.95 pounds.
This is a rather large standard deviation compared to the mean: the reason is that it’s quite
likely that Edwin will catch a different number of fish than the mean, and the variability in
the number of fish introduces large variability into the total weight.
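As with problem 3, a quick simulation sketch confirms these two numbers; the Gamma distribution for the weights is my own illustrative choice, picked only because it has mean 4 and standard deviation 2.

    import numpy as np

    rng = np.random.default_rng(0)
    N = rng.poisson(6, size=200_000)                           # fish caught in two hours
    W = np.array([rng.gamma(4.0, 1.0, n).sum() for n in N])    # Gamma(4, 1): mean 4 lb, sd 2 lb

    print(W.mean(), W.std())   # ≈ 24 and ≈ 10.95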