MATH 382
Chebyshev's Inequality
Dr. Neal, WKU

Let X be an arbitrary random variable with mean µ and variance σ². What is the
probability that X is within t of its average µ? If we knew the exact distribution and
pdf of X, then we could compute this probability: P(|X − µ| ≤ t) = P(µ − t ≤ X ≤ µ + t).
But there is another way to find a lower bound for this probability. For instance, we
may obtain an expression like P(|X − µ| ≤ 2) ≥ 0.60. That is, there is at least a 60%
chance for an obtained measurement of this X to be within 2 of its mean.
Theorem (Chebyshev's Inequality). Let X be a random variable with mean µ and
variance σ². For all t > 0,

    P(|X − µ| > t) ≤ σ²/t²   and   P(|X − µ| ≤ t) ≥ 1 − σ²/t².

Proof. Consider

    Y = t² if |X − µ| > t, and Y = 0 otherwise,

so that Y ≤ |X − µ|². Then

    t² × P(|X − µ| > t) = E[Y] ≤ E[|X − µ|²] = Var(X) = σ²;

thus, P(|X − µ| > t) ≤ σ²/t². Therefore, −P(|X − µ| > t) ≥ −σ²/t², which gives

    P(|X − µ| ≤ t) = 1 − P(|X − µ| > t) ≥ 1 − σ²/t².
Chebyshev's Inequality is meaningless when t ≤ σ. For instance, when t = σ it simply
says P(|X − µ| > t) ≤ 1 and P(|X − µ| ≤ t) ≥ 0, which are already obvious. So we must
use t > σ to apply the inequalities. We illustrate next with some standard
distributions.
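Before turning to those distributions, both inequalities can be checked by simulation. The following sketch (not part of the original notes) uses an exponential distribution with mean 1, so µ = σ = 1, and takes t = 2 > σ; the distribution and t are our choices for illustration.

```python
import random

random.seed(1)

# X ~ Exponential with rate 1, so mu = sigma = 1; take t = 2 > sigma.
mu, sigma, t = 1.0, 1.0, 2.0
n = 200_000
samples = [random.expovariate(1.0) for _ in range(n)]

# Empirical P(|X - mu| <= t) versus the Chebyshev lower bound 1 - sigma^2/t^2.
within = sum(abs(x - mu) <= t for x in samples) / n
bound = 1 - sigma**2 / t**2   # = 0.75

print(f"empirical P(|X - mu| <= 2) = {within:.4f}")   # near 1 - e^(-3) = 0.9502
print(f"Chebyshev lower bound      = {bound:.4f}")
```

As the theorem promises, the empirical probability comfortably exceeds the bound; here, as in the examples below, the bound is far from tight.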
Example. (a) Let X ~ Poi(9). Give a lower bound for P(|X − µ| ≤ 5).
(b) Let X ~ N(100, 15). Give a lower bound for P(|X − µ| ≤ 20).
Solution. (a) For X ~ Poi(9), µ = 9 = σ²; so σ = 3. Then

    P(4 ≤ X ≤ 14) = P(|X − 9| ≤ 5) = P(|X − µ| ≤ 5) ≥ 1 − 9/5² = 1 − 9/25 = 0.64.

Note: Using the pdf of X ~ Poi(9) we obtain P(4 ≤ X ≤ 14) ≈ 0.9373.
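The exact value quoted in the note can be reproduced from the Poisson pmf with only the standard library; this is our own sketch, not part of the original notes.

```python
from math import exp, factorial

# pmf of X ~ Poi(9): P(X = k) = e^(-9) * 9^k / k!
def poisson_pmf(k, lam=9):
    return exp(-lam) * lam**k / factorial(k)

# Exact P(4 <= X <= 14) versus the Chebyshev lower bound 1 - 9/25.
exact = sum(poisson_pmf(k) for k in range(4, 15))
bound = 1 - 9 / 5**2

print(f"exact P(4 <= X <= 14) = {exact:.4f}")   # about 0.9373
print(f"Chebyshev lower bound = {bound:.2f}")   # 0.64
```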
(b) For X ~ N(100, 15), we have

    P(80 ≤ X ≤ 120) = P(|X − 100| ≤ 20) = P(|X − µ| ≤ 20) ≥ 1 − 15²/20² = 0.4375.

Note: Using a calculator, we obtain P(80 ≤ X ≤ 120) ≈ 0.817577.
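The calculator value can also be checked with the standard normal cdf, written via the error function from Python's standard library; a sketch of ours, not part of the notes.

```python
from math import erf, sqrt

# Standard normal cdf via the error function (stdlib only).
def phi(z):
    return 0.5 * (1 + erf(z / sqrt(2)))

mu, sigma = 100, 15
# Exact P(80 <= X <= 120) for X ~ N(100, 15).
exact = phi((120 - mu) / sigma) - phi((80 - mu) / sigma)
bound = 1 - sigma**2 / 20**2

print(f"exact P(80 <= X <= 120) = {exact:.6f}")   # about 0.817577
print(f"Chebyshev lower bound   = {bound:.4f}")   # 0.4375
```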
From these examples, we see that the lower bound provided by Chebyshev's
Inequality is not very accurate. However, the inequality is very useful when applied to
the sample mean x̄ from a large random sample.
Recall that if X is an arbitrary measurement with mean µ and variance σ², and x̄
is the sample mean from random samples of size n, then

    µ_x̄ = µ  and  σ²_x̄ = σ²/n.
Applying Chebyshev's Inequality, we obtain a lower bound for the probability that x̄ is
within t of µ:

    P(|x̄ − µ| ≤ t) = P(|x̄ − µ_x̄| ≤ t) ≥ 1 − σ²_x̄/t² = 1 − σ²/(n t²).
Suppose X is an arbitrary measurement with unknown mean and variance but with
known range such that c ≤ X ≤ d. Then σ ≤ (d − c)/2 and σ² ≤ (d − c)²/4. Thus,

    P(|x̄ − µ| ≤ t) ≥ 1 − (d − c)²/(4 n t²).
A special case of x̄ is a sample proportion p̂ for a proportion p, for which

    µ_p̂ = p  and  σ²_p̂ = p(1 − p)/n ≤ 0.25/n.

We then have

    P(|p̂ − p| ≤ t) ≥ 1 − p(1 − p)/(n t²) ≥ 1 − 0.25/(n t²).
Example. Let X ~ N(100, 15). Let x̄ be the sample mean from random samples of size
400. Give a lower bound for P(|x̄ − µ| ≤ 2).
Solution. For random samples of size 400, we have

    P(98 ≤ x̄ ≤ 102) = P(|x̄ − 100| ≤ 2) = P(|x̄ − µ| ≤ 2) ≥ 1 − 15²/(400 × 2²) = 0.859375.

Thus, for samples of size 400, there is a relatively high chance that x̄ will be within 2 of
the average µ = 100.
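Since X is normal here, x̄ is exactly N(100, 15/√400), so the true probability can be computed and compared with the Chebyshev bound; a stdlib-only sketch of ours, not part of the notes.

```python
from math import erf, sqrt

# Standard normal cdf via the error function.
def phi(z):
    return 0.5 * (1 + erf(z / sqrt(2)))

mu, sigma, n, t = 100, 15, 400, 2
se = sigma / sqrt(n)   # standard deviation of x-bar: 15/20 = 0.75

# Chebyshev: P(|x-bar - mu| <= t) >= 1 - sigma^2/(n t^2).
bound = 1 - sigma**2 / (n * t**2)
# Exact, since x-bar ~ N(100, 0.75) when X itself is normal.
exact = phi(t / se) - phi(-t / se)

print(f"Chebyshev lower bound = {bound:.6f}")   # 0.859375
print(f"exact probability     = {exact:.5f}")
```

As before, the bound holds but understates the true probability, which is above 0.99 here.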
Example. Let X be an arbitrary measurement with unknown distribution but with
known range such that 10 ≤ X ≤ 30. For random samples of size 1000, give a lower
bound for P(|x̄ − µ| ≤ 1).

Solution. Here µ and σ are unknown, but we do know that σ ≤ (30 − 10)/2 = 10, so
that σ² ≤ 100. Then

    P(|x̄ − µ| ≤ 1) ≥ 1 − σ²/(1000 × 1²) ≥ 1 − 100/(1000 × 1²) = 0.90.

So there is at least a 90% chance that a sample mean x̄ will be within 1 of the
unknown mean µ.
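The range-only bound 1 − (d − c)²/(4 n t²) is easy to package as a small helper; the function name below is our own invention for illustration.

```python
# Chebyshev lower bound for P(|x-bar - mu| <= t) when only the range
# c <= X <= d is known, using sigma^2 <= (d - c)^2 / 4.
def range_bound(c, d, n, t):
    return 1 - (d - c)**2 / (4 * n * t**2)

# The example above: 10 <= X <= 30, n = 1000, t = 1.
print(range_bound(10, 30, 1000, 1))   # 0.9
```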
Example. Let p be an unknown proportion that we are estimating with sample
proportions p̂ from computer simulations with samples of size 4000. Give a lower
bound for P(|p̂ − p| ≤ 0.02).
Solution. For the proportion p and trials of size 4000, we have

    P(|p̂ − p| ≤ 0.02) ≥ 1 − 0.25/(n t²) = 1 − 0.25/(4000 × 0.02²) = 0.84375.
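The worst-case proportion bound 1 − 0.25/(n t²) can be wrapped the same way; again the helper name is hypothetical, a sketch rather than part of the notes.

```python
# Worst-case Chebyshev lower bound for P(|p-hat - p| <= t),
# using p(1 - p) <= 0.25 for every p in [0, 1].
def proportion_bound(n, t):
    return 1 - 0.25 / (n * t**2)

# The example above: n = 4000, t = 0.02.
print(proportion_bound(4000, 0.02))   # about 0.84375
```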
Law of Large Numbers (a.k.a. Law of Averages)

Let x̄ be the sample mean from random samples of size n for a measurement with
mean µ, and let p̂ be the sample proportion for a proportion p.

As the sample size n increases,
the probability that x̄ is within t of µ increases to 1, and
the probability that p̂ is within t of p increases to 1.

So for very large n and small t,
we can say that virtually all x̄ are good approximations of µ,
and virtually all p̂ are good approximations of p.
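This statement follows directly from the bound 1 − σ²/(n t²): for fixed t, the bound climbs toward 1 as n grows. A tiny sketch (σ = 15 and t = 2 are our choices, echoing the earlier normal example):

```python
# For fixed t, the Chebyshev bound 1 - sigma^2/(n t^2) increases to 1
# as the sample size n grows: the Law of Large Numbers.
sigma, t = 15, 2
for n in [100, 400, 1600, 6400, 25600]:
    bound = 1 - sigma**2 / (n * t**2)
    print(f"n = {n:6d}: P(|x-bar - mu| <= {t}) >= {bound:.6f}")
```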
Exercises

1. Let X ~ exp(20). (a) Use Chebyshev's Inequality to give a lower bound for
P(|X − µ| ≤ 25). (b) Use the cdf of X to give a precise value for P(|X − µ| ≤ 25).

2. Let X be a measurement with range 2 ≤ X ≤ 10. For random samples of size 400,
give a lower bound for P(|x̄ − µ| ≤ 0.5).

3. With samples of size 1200, let p̂ be the sample proportion for an unknown
proportion p. Give a lower bound for P(|p̂ − p| ≤ 0.03).