The t Distribution

Chapter 8
Statistical Intervals
for a Single Sample
LEARNING OBJECTIVES
• Construct confidence intervals on the
mean of a normal distribution
• Construct confidence intervals on the
variance and standard deviation of a
normal distribution
• Construct confidence intervals on a
population proportion
Confidence Interval
• Learned how a parameter can be
estimated from sample data
• Confidence interval construction and
hypothesis testing are the two
fundamental techniques of statistical
inference
• Use a sample from the full population to
compute the point estimate and the
interval
Confidence Interval On The Mean of a
Normal Distribution, Variance Known
– From the sampling distribution, find statistics L and U such that
$P(L \le \mu \le U) = 1-\alpha$
– That is, with probability 1-α the random interval from L to U will contain the true value of μ
– After selecting the sample and computing l and u, the resulting CI for μ is
$l \le \mu \le u$
– l and u are called the lower- and upper-confidence limits
Confidence Interval On The Mean of a Normal
Distribution, Variance Known
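• Suppose X1, X2, …, Xn is a random sample from a normal distribution with unknown mean μ and known variance σ²
• Then Z has a standard normal distribution:
$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$
• Writing z_{α/2} for the z-value with upper-tail area α/2
• Hence
$P\left( -z_{\alpha/2} \le \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le z_{\alpha/2} \right) = 1-\alpha$
[Figure: standard normal density with area 1-α between -z_{α/2} and z_{α/2}]
• Multiplying each term by σ/√n and rearranging
$\bar{x} - z_{\alpha/2}\,\sigma/\sqrt{n} \le \mu \le \bar{x} + z_{\alpha/2}\,\sigma/\sqrt{n}$
• This is a 100(1-α)% CI on μ when the variance is known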
Example
• A confidence interval estimate is desired for the gain in a circuit on a semiconductor device
• Assume that gain is normally distributed with standard deviation σ = 20
a) Find a 95% CI for μ when n = 10 and x̄ = 1000
b) Find a 95% CI for μ when n = 25 and x̄ = 1000
c) Find a 99% CI for μ when n = 10 and x̄ = 1000
d) Find a 99% CI for μ when n = 25 and x̄ = 1000
Example
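All four parts use the confidence interval
$\bar{x} - z_{\alpha/2}\,\sigma/\sqrt{n} \le \mu \le \bar{x} + z_{\alpha/2}\,\sigma/\sqrt{n}$
a) 95% CI for μ: α = 0.05, z_{0.05/2} = z_{0.025} = 1.96. Substituting n = 10, σ = 20, x̄ = 1000, z = 1.96:
$1000 - 1.96(20/\sqrt{10}) \le \mu \le 1000 + 1.96(20/\sqrt{10})$
987.6 ≤ μ ≤ 1012.4
b) 95% CI for μ: n = 25, σ = 20, x̄ = 1000, z = 1.96
$1000 - 1.96(20/\sqrt{25}) \le \mu \le 1000 + 1.96(20/\sqrt{25})$
992.2 ≤ μ ≤ 1007.8
c) 99% CI for μ: α = 0.01, z_{0.005} = 2.58, n = 10, σ = 20, x̄ = 1000
$1000 - 2.58(20/\sqrt{10}) \le \mu \le 1000 + 2.58(20/\sqrt{10})$
983.7 ≤ μ ≤ 1016.3
d) 99% CI for μ: n = 25, σ = 20, x̄ = 1000, z = 2.58
$1000 - 2.58(20/\sqrt{25}) \le \mu \le 1000 + 2.58(20/\sqrt{25})$
989.7 ≤ μ ≤ 1010.3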
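The four intervals can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `z_interval` is an illustrative choice, not something from the slides.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, conf):
    """Two-sided CI on the mean of a normal population with known sigma."""
    alpha = 1.0 - conf
    # Upper alpha/2 percentage point of the standard normal distribution.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width

# Parts a) to d) of the gain example: (sample size, confidence level).
cases = {"a": (10, 0.95), "b": (25, 0.95), "c": (10, 0.99), "d": (25, 0.99)}
for label, (n, conf) in cases.items():
    lo, hi = z_interval(1000.0, 20.0, n, conf)
    print(f"{label}) n={n}, {conf:.0%} CI: {lo:.1f} <= mu <= {hi:.1f}")
```

The code uses unrounded z-values (about 1.95996 and 2.57583) instead of the table values 1.96 and 2.58, so the limits agree with the hand calculation to the precision shown. Raising n narrows the interval, while raising the confidence level widens it.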
Choice of Sample Size
• A 100(1-α)% C.I. provides an interval estimate of μ
• Most of the time, the sample mean X̄ is not equal to μ
• Error E = |X̄ - μ|
• Choose n such that z_{α/2} σ/√n = E
• Solving for n gives n = (z_{α/2} σ / E)², rounded up to the next integer
• 2E is the length of the resulting C.I.
Example
• Consider the gain estimation problem in
previous example
• How large must n be if the length of the
95% CI is to be 40?
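• Solution
– α = 0.05, then z_{α/2} = z_{0.025} = 1.96
– A 95% CI of length 40 has E = 40/2 = 20, and σ = 20 from the previous example
– n = [(1.96)(20)/20]² = 3.84, so n = 4 after rounding up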
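The sample-size formula n = (z_{α/2} σ / E)² can be sketched as a small helper. This is an illustration under the slide's assumptions (normal population, known σ); the name `sample_size_for_mean` is hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, error, conf):
    """Smallest n for which the CI half-width z*sigma/sqrt(n) is at most `error`."""
    alpha = 1.0 - conf
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    # Round up, because n must be an integer and the error bound must still hold.
    return ceil((z * sigma / error) ** 2)

# Gain example: the CI length of 40 corresponds to E = 20.
n = sample_size_for_mean(sigma=20.0, error=40.0 / 2.0, conf=0.95)
print(n)
```

Because E enters the formula squared, halving the allowed error roughly quadruples the required sample size.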
One-Sided Confidence Bounds
• Two-sided CI gives both a lower and upper
bound for μ
• Also possible to obtain one-sided confidence
bounds for μ
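• A 100(1-α)% lower-confidence bound for μ
$\bar{x} - z_{\alpha}\,\sigma/\sqrt{n} = l \le \mu$
• A 100(1-α)% upper-confidence bound for μ
$\mu \le u = \bar{x} + z_{\alpha}\,\sigma/\sqrt{n}$
• Note that z_α replaces z_{α/2}, since all of α is placed in one tail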
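As a sketch of the one-sided bounds, the helpers below apply them to the gain example (n = 10, x̄ = 1000, σ = 20). The function names are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

def z_lower_bound(xbar, sigma, n, conf):
    """One-sided lower confidence bound on the mean, sigma known."""
    z = NormalDist().inv_cdf(conf)  # z_alpha, where alpha = 1 - conf
    return xbar - z * sigma / sqrt(n)

def z_upper_bound(xbar, sigma, n, conf):
    """One-sided upper confidence bound on the mean, sigma known."""
    z = NormalDist().inv_cdf(conf)
    return xbar + z * sigma / sqrt(n)

lower = z_lower_bound(1000.0, 20.0, 10, 0.95)
upper = z_upper_bound(1000.0, 20.0, 10, 0.95)
print(f"95% lower bound: {lower:.1f} <= mu")
print(f"95% upper bound: mu <= {upper:.1f}")
```

At the same confidence level, the one-sided lower bound sits above the lower limit of the two-sided interval, because z_α is smaller than z_{α/2}.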
A Large-Sample Confidence Interval for μ
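• So far we assumed a normal population with unknown μ and known σ²
• A large-sample CI relaxes both assumptions
• It applies when normality cannot be assumed and n ≥ 40
• S replaces the unknown σ
• Let X1, X2, …, Xn be a random sample from a population with unknown μ and σ²
• Using the CLT, the quantity
$Z = \frac{\bar{X} - \mu}{S/\sqrt{n}}$
• is approximately standard normal for large n
• An approximate 100(1-α)% CI on μ:
$\bar{x} - z_{\alpha/2}\,\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\,\frac{s}{\sqrt{n}}$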
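A minimal sketch of the large-sample interval computed from raw data follows. The data set is hypothetical: 50 values drawn from an exponential distribution with a fixed seed, chosen only because it is clearly non-normal. The function name and the explicit n ≥ 40 check are illustrative choices.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev

def large_sample_interval(data, conf):
    """Approximate two-sided CI on the mean, with s in place of sigma (n >= 40)."""
    n = len(data)
    if n < 40:
        raise ValueError("the large-sample interval assumes n >= 40")
    xbar = mean(data)
    s = stdev(data)  # sample standard deviation (divisor n - 1)
    z = NormalDist().inv_cdf(1.0 - (1.0 - conf) / 2.0)
    half_width = z * s / sqrt(n)
    return xbar - half_width, xbar + half_width

# Hypothetical, skewed sample; the true mean of this generator is 1/0.5 = 2.
rng = random.Random(1)
data = [rng.expovariate(0.5) for _ in range(50)]
lo, hi = large_sample_interval(data, 0.95)
print(f"{lo:.3f} <= mu <= {hi:.3f}")
```

The interval is only approximate: its actual coverage approaches 1-α as n grows.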
C.I. on the Mean of a Normal
Distribution, Variance Unknown
• Sample is small and σ² is unknown
• Wish to construct a two-sided CI on μ
• When σ² is known, we used the standard normal distribution, Z
• When σ² is unknown and the sample size is ≥ 40
– Replace σ with the sample standard deviation S
• Under the normality assumption, with small n and unknown σ, Z is replaced by T = (X̄-μ)/(S/√n)
• There is little difference between the two when n is large
The t Distribution
• Let X1, X2, …, Xn be a random sample from a normal distribution with unknown μ and σ²
• The random variable
$T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$
• Has a t distribution with n-1 degrees of freedom (d.o.f.)
• The number of d.o.f. is the number of observations that can be chosen freely
• Also called Student's t distribution
• Similar in some respects to the normal distribution
• Flatter than the standard normal distribution
• With k d.o.f., μ = 0 and σ² = k/(k-2) for k > 2
The t Distribution
• Several t distributions
• Similar to the standard
normal distribution
• Has heavier tails than the
normal
• Has more probability in the
tails than the normal
• As the number of d.o.f. approaches infinity, the t distribution becomes the standard normal distribution
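The heavier tails and the convergence to the standard normal can be checked numerically. The sketch below evaluates the standard density formula for the t distribution, which is not given in the slides; the helper name `t_pdf` is hypothetical.

```python
from math import exp, lgamma, log, pi
from statistics import NormalDist

def t_pdf(x, k):
    """Density of the t distribution with k degrees of freedom at x."""
    # Work on the log scale so that large k does not overflow the gamma function.
    log_const = lgamma((k + 1) / 2.0) - lgamma(k / 2.0) - 0.5 * log(k * pi)
    return exp(log_const - (k + 1) / 2.0 * log(1.0 + x * x / k))

std_normal = NormalDist()
print(f"normal : f(0)={std_normal.pdf(0.0):.4f}  f(3)={std_normal.pdf(3.0):.5f}")
for k in (1, 5, 30, 1000):
    print(f"t, k={k:<4}: f(0)={t_pdf(0.0, k):.4f}  f(3)={t_pdf(3.0, k):.5f}")
```

For small k the density is lower at the center and higher at x = 3 than the normal density, which is the "flatter, heavier tails" behavior described above. The difference shrinks as k increases.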
The t Distribution
• Table IV provides
percentage points of
the t distribution
• Let t_{α,k} be the value of the random variable T with k d.o.f. above which the area (probability) is α
• Then t_{α,k} is an upper-tail 100α percentage point of the t distribution with k d.o.f.
The t Confidence Interval on μ
• A 100(1-α)% C.I. on the mean of a normal distribution with unknown σ²
$\bar{x} - t_{\alpha/2,n-1}\,s/\sqrt{n} \le \mu \le \bar{x} + t_{\alpha/2,n-1}\,s/\sqrt{n}$
• t_{α/2,n-1} is the upper 100α/2 percentage point of the t distribution with n-1 d.o.f.
Example
• An Izod impact test was performed on 20
specimens of PVC pipe
• The sample mean is 1.25 and the sample
standard deviation is s=0.25
• Find a 99% lower confidence bound on Izod
impact strength
Solution
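• A one-sided lower bound uses t_{α,n-1} rather than t_{α/2,n-1}
• α = 0.01 and n = 20, so t_{α,n-1} = t_{0.01,19} = 2.539
$\bar{x} - t_{0.01,19}\,\frac{s}{\sqrt{n}} \le \mu$
$1.25 - 2.539\left(\frac{0.25}{\sqrt{20}}\right) \le \mu$
1.108 ≤ μ
• For comparison, the two-sided 99% CI uses t_{0.005,19} = 2.861
$1.25 - 2.861\left(\frac{0.25}{\sqrt{20}}\right) \le \mu \le 1.25 + 2.861\left(\frac{0.25}{\sqrt{20}}\right)$
1.090 ≤ μ ≤ 1.410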
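The Izod calculation can be sketched in code. The Python standard library has no t quantile function, so the critical values are passed in as arguments. The values 2.539 and 2.861 are taken from a standard t table for 19 d.o.f. (an assumption, since Table IV is not reproduced in the slides), and the function names are illustrative.

```python
from math import sqrt

def t_lower_bound(xbar, s, n, t_crit):
    """One-sided lower bound on the mean; t_crit should be t_{alpha, n-1}."""
    return xbar - t_crit * s / sqrt(n)

def t_interval(xbar, s, n, t_crit):
    """Two-sided CI on the mean; t_crit should be t_{alpha/2, n-1}."""
    half_width = t_crit * s / sqrt(n)
    return xbar - half_width, xbar + half_width

T_01_19 = 2.539   # upper 1% point, 19 d.o.f. (standard t-table value)
T_005_19 = 2.861  # upper 0.5% point, 19 d.o.f. (standard t-table value)

bound = t_lower_bound(1.25, 0.25, 20, T_01_19)
lo, hi = t_interval(1.25, 0.25, 20, T_005_19)
print(f"99% lower bound: {bound:.3f} <= mu")
print(f"99% two-sided CI: {lo:.3f} <= mu <= {hi:.3f}")
```

The one-sided bound sits above the lower limit of the two-sided interval because t_{0.01,19} is smaller than t_{0.005,19}.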
Chi-square Distribution
• Sometimes a C.I. on the population variance is needed
• The chi-square distribution is the basis for constructing this C.I.
• Let X1, X2, …, Xn be a random sample from a normal distribution with mean μ and variance σ²
• Let S² be the sample variance
• Then the random variable
$X^2 = \frac{(n-1)S^2}{\sigma^2}$
• Has a chi-square (χ²) distribution with n-1 d.o.f.
Shape of Chi-square Distribution
• The mean and variance of the χ² distribution with k d.o.f. are k and 2k
• Several chi-square distributions
• The probability distribution is skewed to the right
• As k→∞, the limiting form of the χ² distribution is the normal distribution
Percentage Points of Chi-square
Distribution
• Table III provides percentage points of the χ² distribution
• Let χ²_{α,k} be the value of the random variable X² with k d.o.f. that is exceeded with probability α
• Then
$P(X^2 > \chi^2_{\alpha,k}) = \int_{\chi^2_{\alpha,k}}^{\infty} f(u)\,du = \alpha$
C.I. on the Variance of A Normal
Population
• A 100(1-α)% C.I. on σ²
$\frac{(n-1)s^2}{\chi^2_{\alpha/2,n-1}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,n-1}}$
• χ²_{α/2,n-1} and χ²_{1-α/2,n-1} are the upper and lower 100α/2 percentage points of the chi-square distribution with n-1 degrees of freedom
One-sided C.I.
• The 100(1-α)% lower and upper confidence bounds on σ² are
$\frac{(n-1)s^2}{\chi^2_{\alpha,n-1}} \le \sigma^2$
and
$\sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha,n-1}}$
Example
• A rivet is to be inserted into a hole. A random sample of
n=15 parts is selected, and the hole diameter is measured
• The sample standard deviation of the hole diameter
measurements is s=0.008 millimeters
• Construct a 99% lower confidence bound for σ²
• Solution
– For α = 0.01, χ²_{0.01,14} = 29.14
$\frac{14(0.008)^2}{29.14} \le \sigma^2$
0.00003075 ≤ σ²
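The rivet calculation can be reproduced with a short sketch. The chi-square percentage point 29.14 is the table value quoted in the solution and is passed in as an argument, because the Python standard library has no chi-square quantile function. The function name is an illustrative assumption.

```python
from math import sqrt

def variance_lower_bound(s, n, chi2_upper):
    """Lower confidence bound on sigma^2; chi2_upper should be chi^2_{alpha, n-1}."""
    return (n - 1) * s ** 2 / chi2_upper

CHI2_01_14 = 29.14  # upper 1% point, 14 d.o.f., as quoted in the solution

var_bound = variance_lower_bound(s=0.008, n=15, chi2_upper=CHI2_01_14)
print(f"{var_bound:.8f} <= sigma^2")
# A bound on sigma follows by taking the square root of the variance bound.
print(f"{sqrt(var_bound):.5f} <= sigma")
```

Taking the square root is valid because the square root is an increasing function, so the inequality is preserved.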
A Large Sample C.I. For A
Population Proportion
• Interested in constructing confidence intervals on a population proportion
• p̂ = X/n is a point estimator of the proportion p
• Recall that if p is not close to 1 or 0 and n is relatively large, the sampling distribution of p̂ is approximately normal
• If n is large, the distribution of
$Z = \frac{X - np}{\sqrt{np(1-p)}} = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}}$
• is approximately standard normal
Confidence Interval on p
• Approximate 100(1-α)% C.I. on the proportion p of the population
$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \le p \le \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
where z_{α/2} is the upper α/2 percentage point of the standard normal distribution
• Choice of sample size
– Define the error in estimating p by p̂ as E = |p - p̂|
– We are 100(1-α)% confident that this error is less than
$z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}$
– Thus, setting
$E = z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}$
and solving for n gives
$n = \left(\frac{z_{\alpha/2}}{E}\right)^2 p(1-p)$
Example
• Of 1000 randomly selected cases of lung cancer,
823 resulted in death within 10 years
• Construct a 95% two-sided confidence interval on
the death rate from lung cancer
• Solution
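– 95% confidence interval on the death rate from lung cancer, with n = 1000 and z_{α/2} = z_{0.025} = 1.96
$\hat{p} = \frac{823}{1000} = 0.823$
$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \le p \le \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
$0.823 - 1.96\sqrt{\frac{0.823(0.177)}{1000}} \le p \le 0.823 + 1.96\sqrt{\frac{0.823(0.177)}{1000}}$
0.7993 ≤ p ≤ 0.8467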
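A minimal sketch of the large-sample interval on a proportion follows, applied to the 823 deaths in 1000 cases stated in the problem. The function name `proportion_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(x, n, conf):
    """Approximate two-sided CI on a population proportion (large n)."""
    p_hat = x / n
    z = NormalDist().inv_cdf(1.0 - (1.0 - conf) / 2.0)
    half_width = z * sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

lo, hi = proportion_interval(x=823, n=1000, conf=0.95)
print(f"{lo:.4f} <= p <= {hi:.4f}")
```

The normal approximation behind this interval is the one described earlier: it is reasonable when n is large and p is not close to 0 or 1.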
Example
• How large a sample would be required in previous
example to be at least 95% confident that the error
in estimating the 10-year death rate from lung
cancer is less than 0.03?
• Solution
– E = 0.03, α = 0.05, z_{α/2} = z_{0.025} = 1.96 and p̂ = 0.823 as the initial estimate of p
$n = \left(\frac{z_{\alpha/2}}{E}\right)^2 \hat{p}(1-\hat{p}) = \left(\frac{1.96}{0.03}\right)^2 0.823(1-0.823) = 621.79$
– Rounding up, n = 622
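The sample-size calculation for a proportion can be sketched as below. The default `p_guess=0.5` is an assumption added here for the case where no initial estimate is available; it is conservative because p(1-p) is largest at p = 0.5. The function name is illustrative.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(error, conf, p_guess=0.5):
    """Smallest n with z_{alpha/2} * sqrt(p(1-p)/n) <= error, for a given guess of p."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - conf) / 2.0)
    # Round up so that the error bound is still met with an integer n.
    return ceil((z / error) ** 2 * p_guess * (1.0 - p_guess))

# Using the initial estimate from the lung cancer example.
print(sample_size_for_proportion(error=0.03, conf=0.95, p_guess=0.823))
# Conservative choice when no estimate of p is available.
print(sample_size_for_proportion(error=0.03, conf=0.95))
```

With a good initial estimate the required sample is considerably smaller than the conservative value, which is why a pilot estimate of p is worth using when one exists.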
