Lecture 18 - One sample t-test and inference for the population mean
Previously we introduced estimation, confidence intervals and tests of significance by assuming a known population standard deviation σ.
This lecture formalises the procedures for the situation when σ is unknown.
This is the usual scenario: it is very rare that we know the population standard deviation.
In light of this, we estimate σ with the sample standard deviation s, and inference for the population mean(s) is usually carried out using the t-distribution.
Inference for one population mean - formalities
Let Y1, Y2, ..., Yn be a random sample from a population with mean µ and variance σ². In this section we shall use the data to make inferences about µ.
We will assume that the sampling distribution of Ȳ is N(µ, σ²/n). This assumption is valid if either (or both) of the following are true:
Y1, Y2, ..., Yn are normally distributed;
n is sufficiently large for the Central Limit Theorem to apply.
One Sample t-Test
Aim: to investigate the credibility of some claim regarding µ
Hypotheses: H1 is a statement of the claim about µ; H0 is a
statement negating the claim.
Test Statistic: Suppose that we are testing
H0 : µ = µ0
against any one of the alternatives
H1 : µ > µ0
or
H1 : µ < µ0
or
H1 : µ ≠ µ0
The appropriate test statistic is the t-statistic
$$T = \frac{\bar{Y} - \mu_0}{S/\sqrt{n}} = \frac{\bar{Y} - \mu_0}{SE(\bar{Y})}$$
One Sample t-tests cont.
If H0 is true then T has a t sampling distribution with
ν = n − 1 degrees of freedom.
T is likely to take extreme values when H1 is correct.
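The claim about the sampling distribution of T can be checked informally by simulation. The sketch below is only an illustration: it assumes normally distributed data, a sample size of n = 12, and uses 2.201 as the two-sided 5% critical value for 11 degrees of freedom (a standard t-table value). The function name and its arguments are hypothetical.

```python
import math
import random
import statistics

def simulate_t_statistics(n, n_sims, mu0=0.0, sigma=1.0, seed=1):
    """Draw n_sims samples of size n from N(mu0, sigma^2), so H0 is true,
    and return the t-statistic of each sample."""
    rng = random.Random(seed)
    t_values = []
    for _ in range(n_sims):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        ybar = statistics.fmean(sample)
        s = statistics.stdev(sample)      # sample standard deviation (divisor n - 1)
        t_values.append((ybar - mu0) / (s / math.sqrt(n)))
    return t_values

t_values = simulate_t_statistics(n=12, n_sims=20000)

# If T really follows a t distribution with 11 degrees of freedom,
# roughly 5% of the simulated values should lie beyond +/- 2.201.
share_extreme = sum(abs(t) > 2.201 for t in t_values) / len(t_values)
print(f"Share of |T| > 2.201 under H0: {share_extreme:.3f}")
```

The printed share should sit close to 0.05, which is what a t distribution with n − 1 = 11 degrees of freedom predicts.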
P-Values for T-test
P-Value: Suppose that the observed value of the t-statistic is t.
The way of calculating the P-value depends on H1 :
H1          P-value
µ > µ0      P(T ≥ t)
µ < µ0      P(T ≤ t)
µ ≠ µ0      2P(T ≥ |t|)
where T has a t distribution with ν = n − 1 degrees of freedom.
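In practice these tail probabilities come from statistical software. As a rough cross-check, the three rules can be sketched in Python using only the standard library, integrating the t density numerically with Simpson's rule. The function names and the labels 'greater', 'less' and 'two-sided' are hypothetical choices for this illustration, not part of the lecture.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def upper_tail(t, df, steps=2000):
    """P(T >= t), using Simpson's rule on the density between 0 and |t|."""
    a = abs(t)
    h = a / steps                          # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_a = total * h / 3            # P(0 <= T <= |t|)
    tail = 0.5 - area_0_to_a               # P(T >= |t|), by symmetry about 0
    return tail if t >= 0 else 1.0 - tail

def p_value(t, df, alternative):
    """P-value for H1: mu > mu0 ('greater'), mu < mu0 ('less') or mu != mu0 ('two-sided')."""
    if alternative == "greater":
        return upper_tail(t, df)
    if alternative == "less":
        return 1.0 - upper_tail(t, df)     # P(T <= t), since T is continuous
    if alternative == "two-sided":
        return 2.0 * upper_tail(abs(t), df)
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")
```

A call such as p_value(t, n - 1, "two-sided") then mirrors the last row of the table; a dedicated statistics package will be more accurate and should be preferred for real analyses.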
P-Value Interpretation and Significance Levels: The interpretation of P-values, and the choice and application of significance levels, are exactly the same as described in lectures 15 and 16.
One sample t-test — simple example
Twelve king-size chocolate bars were weighed (in grams), giving
ȳ = 149.10
s = 1.73
The producer wants to know whether the (population) mean weight differs from the advertised value of 150.0 grams. Write down appropriate hypotheses and calculate the t-statistic to test this. Write down a formula for the P-value and calculate a 95% confidence interval for the true population mean µ.
One sample t-test — simple example — solution
1. State hypotheses
We wish to test the null hypothesis
H0 : µ = 150
against
H1 : µ ≠ 150
2. Specify significance level to test at
We will test this hypothesis at the α = 0.05 (5%) level of significance.
One sample t-test — simple example — solution cont.
3. Calculate an appropriate test statistic
In the inference for one mean scenario the appropriate test statistic is
$$t = \frac{\bar{y} - \mu_0}{s/\sqrt{n}} = \frac{149.1 - 150}{1.73/\sqrt{12}} = -1.802134$$
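The arithmetic in step 3 can be reproduced from the summary statistics alone. The following is a minimal sketch; the function name t_statistic is a hypothetical helper introduced for illustration.

```python
import math

def t_statistic(ybar, mu0, s, n):
    """One sample t-statistic from the sample mean, sample SD and sample size."""
    standard_error = s / math.sqrt(n)      # SE of the sample mean
    return (ybar - mu0) / standard_error

# Chocolate bar example: ybar = 149.1, s = 1.73, n = 12, mu0 = 150
t = t_statistic(ybar=149.1, mu0=150.0, s=1.73, n=12)
print(round(t, 6))
```

The negative sign simply reflects that the sample mean lies below the hypothesised value of 150.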
One sample t-test — simple example — solution cont.
4. Give a formula for the P-value
P = P(T < −1.802134) + P(T > 1.802134)
= 2P(T > 1.802134)
5. Calculate the P-value using standard statistical software
such as R Commander
Using R Commander we calculate the P-value to be 0.099, where T has a t distribution with 11 degrees of freedom.
One sample t-test — simple example — solution cont.
6. Compare the P-value to the α level and make a
conclusion relating to the hypotheses
At the 5% significance level we do not have sufficient evidence to reject the null hypothesis (P = 0.099 > 0.05). We conclude that our data do not provide evidence that the true mean weight of the chocolate bars is different to the advertised value of 150 grams.
One sample t-test — simple example — solution cont.
7. Calculate a confidence interval for the true population parameter
Finally we would provide a confidence interval for the true population mean. A 95% confidence interval is given by:
$$\bar{y} \pm t_{crit}\, SE(\bar{y}) = 149.1 \pm 2.201 \times \frac{s}{\sqrt{n}} = 149.1 \pm 2.201 \times 0.499408 = (148.0,\ 150.2)$$
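The interval in step 7 can be checked with a few lines of Python. This sketch takes the critical value 2.201 (11 degrees of freedom) directly from the lecture rather than computing it; the variable names are illustrative.

```python
import math

ybar, s, n = 149.1, 1.73, 12
t_crit = 2.201                       # critical value used in the lecture (11 df)

standard_error = s / math.sqrt(n)    # approximately 0.499408
margin = t_crit * standard_error     # half-width of the interval
lower, upper = ybar - margin, ybar + margin

print(f"95% CI: ({lower:.1f}, {upper:.1f})")   # 95% CI: (148.0, 150.2)
```

The interval contains the advertised value of 150, which agrees with the decision not to reject H0 at the 5% level.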
One sample t-test using R commander
Of course, all of this becomes much easier if you use statistical software to do such analyses.
Let us revisit the Shoshoni rectangles example seen in lecture 15. Recall we wish to investigate whether the Shoshoni Indians created their beads in a rectangular shape with the golden ratio. We set up our hypotheses in lecture 16:
Let µ be the population mean width-to-length ratio of Shoshoni rectangles. We wish to test:
H0 : µ = 0.618
against
H1 : µ ≠ 0.618
One sample t-test using R commander
The data tabulated below are the width-to-length ratios for eighteen such rectangles sampled:
0.693  0.672  0.668  0.662  0.628  0.601
0.690  0.609  0.576  0.606  0.844  0.670
0.570  0.654  0.606  0.749  0.615  0.611
Do these data provide evidence against the width-to-length ratio being the golden ratio?
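Before turning to R Commander, the summary statistics and the t-statistic for these eighteen ratios can be computed by hand. The sketch below uses Python's standard library rather than the lecture's software; the variable names are illustrative.

```python
import math
import statistics

ratios = [0.693, 0.672, 0.668, 0.662, 0.628, 0.601,
          0.690, 0.609, 0.576, 0.606, 0.844, 0.670,
          0.570, 0.654, 0.606, 0.749, 0.615, 0.611]
mu0 = 0.618                            # hypothesised mean under H0

n = len(ratios)
ybar = statistics.fmean(ratios)        # sample mean
s = statistics.stdev(ratios)           # sample standard deviation (divisor n - 1)
se = s / math.sqrt(n)                  # standard error of the mean
t = (ybar - mu0) / se                  # one sample t-statistic, n - 1 = 17 df

print(f"n = {n}, mean = {ybar:.7f}, SE = {se:.4f}, t = {t:.4f}")
```

The P-value itself still needs the t distribution with 17 degrees of freedom, which is where statistical software comes in.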
Summarising the data
In any analysis we would first summarise the data numerically and visually.
[Figure: two plots of the width-to-length ratio (w2lR): a histogram with frequency on the vertical axis, and a second plot of w2lR on a 0.55 to 0.85 scale showing one outlying point.]
One sample t-test using R commander
We can use the menus in R Commander to carry out a one sample t-test.
We need to specify both the value of µ under the assumption that H0 is true and the alternative hypothesis.
One sample t-test using R commander - output
The output is shown below:
One sample t-test using R commander - output
We can see the following things from this output:
The test statistic is t = 2.1252
The degrees of freedom used is 17
The P-value for this test is 0.04853
We are using a two sided alternative hypothesis (not equal)
The estimated sample mean is 0.6513333
A 95% confidence interval for µ is given by (0.6182409, 0.6844258)
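The numbers in this output are tied together by the formulae from earlier in the lecture, and a short calculation shows how. The sketch below uses only the values reported above; the variable names are illustrative.

```python
mean, mu0, t_stat = 0.6513333, 0.618, 2.1252
ci_lower, ci_upper = 0.6182409, 0.6844258

# t = (mean - mu0) / SE, so the standard error can be recovered from the output
se = (mean - mu0) / t_stat

# The confidence interval is mean +/- t_crit * SE, so it is centred on the mean
centre = (ci_lower + ci_upper) / 2
half_width = (ci_upper - ci_lower) / 2
implied_t_crit = half_width / se       # critical value the software must have used

print(f"SE = {se:.4f}, centre = {centre:.7f}, implied t_crit = {implied_t_crit:.2f}")
```

The implied critical value is about 2.11, in line with the tabulated two-sided 5% value for 17 degrees of freedom. The lower limit sits only just above 0.618, which matches a P-value only just below 0.05.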
One sample t-test using R commander - assumptions
Whenever we do any inference we usually make assumptions. In the case here we make the following assumptions:
The sampling distribution of the estimator Ȳ is normal (or the data themselves are normal)
The observations are independent
There is no reason to doubt the independence assumption in this example.
The normality assumption is reasonable if either (or both) of the following apply:
1. The data are normally distributed (do a plot and check)
2. The sample size is large enough for the central limit theorem to apply
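One quick way to "do a plot and check" without any graphics software is a text histogram. The sketch below bins the eighteen Shoshoni ratios into intervals of width 0.05 starting at 0.55; the bin choices are assumptions made for this illustration and are not taken from the lecture's figure.

```python
ratios = [0.693, 0.672, 0.668, 0.662, 0.628, 0.601,
          0.690, 0.609, 0.576, 0.606, 0.844, 0.670,
          0.570, 0.654, 0.606, 0.749, 0.615, 0.611]

bin_start, bin_width, n_bins = 0.55, 0.05, 6   # illustrative binning choices
counts = [0] * n_bins
for y in ratios:
    counts[int((y - bin_start) / bin_width)] += 1

# Print one row of stars per bin
for i, count in enumerate(counts):
    low = bin_start + i * bin_width
    print(f"{low:.2f} to {low + bin_width:.2f} | {'*' * count}")
```

Most of the ratios fall between 0.60 and 0.70, with a single value above 0.80 standing apart from the rest. With only 18 observations, a display like this is a rough guide rather than a formal check of normality.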
Write up
A formal write up would include methods and results. We would probably write this up in the following way:
Methods
We used a one sample t-test (two sided) to determine whether there was evidence that the Shoshoni beads were not created with the golden ratio. We carried out the test at the 5% significance level and presented a 95% confidence interval for the true population mean ratio of width to length.
Results
The observed mean was statistically significantly different to the hypothesised value of 0.618 (t = 2.13, 17 degrees of freedom, P = 0.049). The sample mean (SE) was 0.651 (0.0157) and a 95% confidence interval for the true mean was calculated to be (0.6182409, 0.6844258).
Conclusion
Evidence is provided that the mean width to length ratio of the Shoshoni rectangles is not the golden ratio.
What about the potential outlier?
Redoing the analysis without the outlier gives the following results
We should be very careful when writing this up, as it seems the significance we have observed can be changed by the removal of one value.
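The effect of dropping one value can be sketched directly from the data. This illustration assumes that the potential outlier is the largest ratio, 0.844. The critical values 2.110 (17 degrees of freedom) and 2.120 (16 degrees of freedom) are standard two-sided 5% t-table values, not numbers taken from the lecture output, and the helper one_sample_t is hypothetical.

```python
import math
import statistics

ratios = [0.693, 0.672, 0.668, 0.662, 0.628, 0.601,
          0.690, 0.609, 0.576, 0.606, 0.844, 0.670,
          0.570, 0.654, 0.606, 0.749, 0.615, 0.611]
mu0 = 0.618

def one_sample_t(data, mu0):
    """Return the one sample t-statistic and its degrees of freedom."""
    n = len(data)
    se = statistics.stdev(data) / math.sqrt(n)
    return (statistics.fmean(data) - mu0) / se, n - 1

# Assumption: the potential outlier is the largest observation
without_max = [y for y in ratios if y != max(ratios)]

t_all, df_all = one_sample_t(ratios, mu0)
t_trim, df_trim = one_sample_t(without_max, mu0)

print(f"All data:        t = {t_all:.3f} on {df_all} df (5% critical value 2.110)")
print(f"Without maximum: t = {t_trim:.3f} on {df_trim} df (5% critical value 2.120)")
```

Under these assumptions the t-statistic falls from just above its critical value to below it, which is the kind of sensitivity to a single observation that the write-up should mention.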
What happens when our assumptions are not satisfied?
Sometimes we cannot justify an analysis like a t-test because it is obvious that the assumptions are not satisfied. What do we do in these instances?
Fortunately there are many other techniques that can be applied to examine our research question of interest, for example non-parametric tests or permutation tests.
All of these are beyond the scope of this course, but it is worthwhile knowing that we are not limited to the tests we see here!
Statistical significance vs practical significance
Even if we carry out our test in an appropriate way, we may still have a result that is not of practical importance.
As an example, let us think about a medical situation where a GP has concerns over the blood pressure of his patients. He arbitrarily sets a value of 150, and wants to know whether on average his patients have a blood pressure different to this value. He takes the blood pressure of his next 1,000 patients.
We could set up a significance test that examines
H0 : µ = µ0
against
H1 : µ ≠ µ0
Statistical significance vs practical significance
His data looked like this:
[Figure: Histogram of y, with y on the horizontal axis (144 to 156) and Frequency on the vertical axis (0 to 200).]
Results from an analysis using R Commander give the following
This suggests that at the 5% significance level there is evidence against the null hypothesis, in favour of the alternative that the mean differs from 150.
How useful is this?
The sample mean is 149.8519, less than 0.2 below the hypothesised mean.
Is this of practical significance?
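The role of the sample size can be seen by holding the observed difference fixed and varying n. In the sketch below the difference 149.8519 − 150 comes from the example, but the sample standard deviation s = 2 is a purely hypothetical value chosen for illustration, since the lecture does not report it.

```python
import math

difference = 149.8519 - 150.0    # observed mean minus hypothesised mean
s = 2.0                          # hypothetical sample SD (not given in the lecture)

# Same difference, same spread, increasing sample size
t_by_n = {}
for n in (10, 100, 1000):
    t_by_n[n] = difference / (s / math.sqrt(n))
    print(f"n = {n:4d}: t = {t_by_n[n]:.3f}")
```

With this assumed spread, the same tiny difference is far from significant at n = 10 or n = 100, yet it passes the usual 5% threshold of about 1.96 in absolute value at n = 1000. The difference has not become more important; only the standard error has shrunk.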
Margin of error and sample size calculations
To get useful statistical results, we sometimes design our experiments and studies so that they are able to show, statistically, a certain difference from the null value.
We can make use of our confidence interval formula:
estimate ± critical value × SE(estimate)
Margin of error and sample size calculations
The term to the right of the ± is called the margin of error. This is usually defined as the half-width of the confidence interval. We can set this to be a certain value and determine how many samples we would need to achieve it.
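For a single mean, the margin of error is critical value × s/√n, so solving for n gives n = (critical value × s / margin)². The sketch below is only a rough planning calculation: it assumes the normal critical value 1.96 in place of the t value (which itself depends on n), and the inputs s = 2 and a margin of 0.5 are hypothetical. The function name is also illustrative.

```python
import math

def sample_size_for_margin(s, margin, critical_value=1.96):
    """Smallest n for which critical_value * s / sqrt(n) is at most margin."""
    return math.ceil((critical_value * s / margin) ** 2)

# Hypothetical planning values: guessed SD of 2, desired margin of error 0.5
n_needed = sample_size_for_margin(s=2.0, margin=0.5)
print(n_needed)   # 62
```

Because s must be guessed before the data are collected, the answer is best treated as a guide to the order of magnitude of the sample size required.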
For instance, if in the previous example we set the margin of error to 10 and work out the sample size that would give that margin of error, we are essentially removing the chance of getting a statistically significant result just because of having a large sample size.
We will examine sample size calculations and power more formally when we look at two sample t-tests. For now, please be aware:
STATISTICAL SIGNIFICANCE DOES NOT ALWAYS IMPLY PRACTICAL SIGNIFICANCE
Summary
In this lecture we have
Brought together the formal inference for one mean from the
previous lecture
Shown how to do a 1 sample t-test by hand and using R
commander
Discussed assumptions of the test and inference for one mean
Given an example analysis and write up for such a situation
Shown that we can get a statistically significant result just by
increasing the sample size