DISTRIBUTION OF THE SAMPLE MEAN , X …
DISTRIBUTION OF THE SAMPLE MEAN

$X_1, X_2, \ldots, X_n$: sample from a distribution/population with mean $\mu$ and standard deviation $\sigma$.

$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$$

We know: $\bar{X}$ can take different values for different samples, so it has a distribution of its own: the sampling distribution.

FACT 1. The mean and standard deviation for the distribution of $\bar{X}$ are given by:

$$\mu_{\bar{X}} = \mu \quad\text{and}\quad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$

The mean of $\bar{X}$ is the same as the population mean $\mu$. So, $\bar{X}$ is unbiased for $\mu$.
The standard deviation of $\bar{X}$ is $\sqrt{n}$ times smaller than the population standard deviation.
Averages have smaller variability than single observations!

Law of Large Numbers

Closer look at the standard deviation of $\bar{X}$: $\sigma_{\bar{X}} = \sigma/\sqrt{n}$.
As n = sample size increases, $\sigma_{\bar{X}} \to 0$; i.e. as n increases, the spread of the sample mean decreases to zero. What random variable has spread zero? Constant!
Conclusion, Law of Large Numbers: $\bar{X}$ comes arbitrarily close to $\mu$ for large enough n.

[Figure: density curves over the range 5 to 15. Panel titles: "Distribution with mean=10, st.dv.=2"; "Distribution with mean=10, st.dv.=2/10^0.5=0.63"; "Distribution with mean=10, st.dv.=2/100^0.5=0.2". Curve labels: n=10, n=100.]

DISTRIBUTION OF THE SAMPLE MEAN – NORMAL DATA

$X_1, X_2, \ldots, X_n$: sample from a Normal distribution, $N(\mu, \sigma)$.
We know: from FACT 1, $\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \sigma/\sqrt{n}$.

FACT 2: If $X_1, X_2, \ldots, X_n$ are from $N(\mu, \sigma)$, then $\bar{X}$ has a $N(\mu, \sigma/\sqrt{n})$ distribution.

NOTE: Since $\bar{X}$ is normally distributed, with $\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \sigma/\sqrt{n}$, we may standardize it:

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{\sqrt{n}\,(\bar{X} - \mu)}{\sigma}.$$

SAMPLING DISTRIBUTION OF THE SAMPLE MEAN

EXAMPLE: Students in a university have a weight distribution that is known to be $N(150, 20)$. Let $X_1, X_2, \ldots, X_{16}$ represent the weights of 16 randomly selected students from this university. If $\bar{X}$ is the average weight for this sample, find $P(\bar{X} > 160)$.

Solution: Since the sample came from a normal distribution, by Fact 2, the sample mean has a normal distribution as well: $\bar{X} \sim N(\mu, \sigma/\sqrt{n}) = N(150, 20/\sqrt{16}) = N(150, 5)$. Thus,

$$P(\bar{X} > 160) = P\left(\frac{\bar{X} - 150}{5} > \frac{160 - 150}{5}\right) = P(Z > 2) = 1 - P(Z \le 2) = 1 - 0.9772 = 0.0228.$$
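The standardization in the weights example can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names `normal_cdf`, `sd_of_mean`, and `prob_mean_greater` are hypothetical, chosen here for illustration, and the normal CDF is computed from the error function rather than read from a table.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, P(Z <= z), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def sd_of_mean(sigma, n):
    """Standard deviation of the sample mean, sigma / sqrt(n) (Fact 1)."""
    return sigma / math.sqrt(n)

def prob_mean_greater(x, mu, sigma, n):
    """P(Xbar > x) when Xbar ~ N(mu, sigma / sqrt(n)) (Fact 2)."""
    z = (x - mu) / sd_of_mean(sigma, n)
    return 1.0 - normal_cdf(z)

# Spread of the sample mean shrinks as n grows (population st. dev. 2).
for n in (1, 10, 100):
    print(n, round(sd_of_mean(2, n), 2))  # 1 2.0, then 10 0.63, then 100 0.2

# Weights example: population N(150, 20), n = 16, find P(Xbar > 160).
p = prob_mean_greater(160, mu=150, sigma=20, n=16)
print(round(p, 4))  # 0.0228
```

The loop reproduces the standard deviations quoted for n = 10 and n = 100, and the last line agrees with the table-based answer 0.0228.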
EXAMPLE, CONTD.

An elevator at this university has a capacity of 1500 pounds. What is the probability that 9 students who enter the elevator will have a safe ride, i.e. their total weight is less than 1500 lb?

Solution: Again, by Fact 2, the sample mean has a normal distribution: $\bar{X} \sim N(\mu, \sigma/\sqrt{n}) = N(150, 20/\sqrt{9}) = N(150, 6.67)$. Also, $P(\text{Total weight} < 1500) = P(\bar{X} < 1500/9) = P(\bar{X} < 166.67)$. So,

$$P(\bar{X} < 166.67) = P\left(\frac{\bar{X} - 150}{6.67} < \frac{166.67 - 150}{6.67}\right) = P(Z < 2.5) = 0.9938.$$

DISTRIBUTION OF THE SAMPLE MEAN, CONTD.

EXAMPLE. Suppose X is the score on a test and $X \sim N(500, 100)$. Let $X_1, X_2, \ldots, X_{16}$ be a sample of scores for 16 individuals and $\bar{X}$ their average score. Find $P(550 < \bar{X} \le 600)$.

Solution: Since the data come from a normal distribution, by Fact 2, $\bar{X}$ has a normal distribution with mean $\mu_{\bar{X}} = \mu = 500$ and $\sigma_{\bar{X}} = \sigma/\sqrt{n} = 100/\sqrt{16} = 25$. Thus,

$$P(550 < \bar{X} \le 600) = P\left(\frac{550 - 500}{25} < \frac{\bar{X} - 500}{25} \le \frac{600 - 500}{25}\right) = P(2 < Z \le 4) = P(Z \le 4) - P(Z \le 2) \approx 1 - 0.9772 = 0.0228.$$

The CENTRAL LIMIT THEOREM (CLT)

What if the data do not come from a normal distribution?

FACT 3 (CLT): If $X_1, X_2, \ldots, X_n$ are independent observations from any distribution with mean $\mu$ and standard deviation $\sigma$, their sample mean $\bar{X}$ has approximately a normal $N(\mu, \sigma/\sqrt{n})$ distribution, if n is sufficiently large.

How large is sufficiently large? It depends on the distribution the data come from. Definitely n should be at least 20 before we use this approximation.

Difference between Fact 2 and Fact 3: Fact 2 holds only for samples from a Normal distribution and gives the exact distribution of $\bar{X}$. Fact 3 holds for samples from any distribution, but gives an approximate distribution for $\bar{X}$.

The Central Limit Theorem contd.

Example. Suppose $X_1, X_2, \ldots, X_{25}$ are lifetimes of electronic components, with $\mu = 700$ hours and $\sigma = 10$ hours. Find $P(\bar{X} \le 702)$, where $\bar{X}$ is the sample mean of the lifetimes of 25 components.

Solution. Usually lifetime data is skewed to the right, so not normal (Why?). Since n = 25 (reasonably large), we will use the CLT and the normal approximation of the distribution of the sample mean: $\bar{X}$ has approximately a $N(\mu, \sigma/\sqrt{n}) = N(700, 10/\sqrt{25}) = N(700, 2)$ distribution. So,

$$P(\bar{X} \le 702) = P\left(\frac{\bar{X} - 700}{2} \le \frac{702 - 700}{2}\right) \approx P(Z \le 1) = 0.8413.$$
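To see how well the CLT approximation can work for skewed data, the lifetime example can be simulated. The example does not specify the lifetime distribution, so the sketch below assumes, purely for illustration, that each lifetime is 690 hours plus an exponential amount with mean 10 hours; this is right-skewed and has mean 700 and standard deviation 10. The function name `simulate_prob_mean_le`, the number of repetitions, and the seed are likewise hypothetical choices.

```python
import math
import random

def normal_cdf(z):
    """Standard normal CDF, P(Z <= z), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def simulate_prob_mean_le(x, n, reps, seed=0):
    """Estimate P(Xbar <= x) by drawing `reps` samples of size n."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # ASSUMPTION (not in the example): lifetime = 690 + Exponential(mean 10),
        # a right-skewed distribution with mean 700 and st. dev. 10.
        total = sum(690.0 + rng.expovariate(1.0 / 10.0) for _ in range(n))
        if total / n <= x:
            hits += 1
    return hits / reps

# CLT answer: Xbar is approximately N(700, 10 / sqrt(25)) = N(700, 2).
clt_value = normal_cdf((702 - 700) / (10 / math.sqrt(25)))
sim_value = simulate_prob_mean_le(702, n=25, reps=20000)

print(round(clt_value, 4))  # 0.8413
print(sim_value)            # simulated estimate, should be close to the CLT value
```

Under this assumed distribution the simulated proportion should land near the normal-approximation answer even though single lifetimes are far from normal; a more heavily skewed distribution or a smaller n would generally give a poorer match.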