Lecture 8: Parametric Analysis of Variance

Transcription

Lecture 8: Parametric Analysis of Variance
Parametric Analysis of Variance
The same principles used in the two-sample means test can be
applied to more than two sample means.
In these cases we call the test analysis of variance.
The null hypothesis for analysis of variance (AOV or
ANOVA) is:
H0 : μ1 = μ2 … = μk
And this may be applied over both time and space.
Assumptions of analysis of variance testing:
1. Observations between and within samples are random and
independent (methodological issue).
2. The observations in each category are normally distributed (run
normality tests on each group).
3. The population variances are assumed to be equal (homogeneity
of variances test).
j
i
Obs 1
Obs 2
Obs 3
Obs 4
Obs 5
Generic Data Set
Group1
X1,1
X1,2
X1,3
X1,4
X1,5
Group2
X2,1
X2,2
X2,3
X2,4
X2,5
Group3
X3,1
X3,2
X3,3
X3,4
X3,5
Note that i denotes rows and j denotes columns.
AOV:
• Compares the variation within the groups (columns)
to the variation among the group means.
− If the variation among the group means is much
greater than within the groups, reject H0.
− If the variation among the group means is not
much greater than within the groups, accept H0.
There are 3 pieces of information that need to be determined to
perform AOV:
TSS (total sum of squares) – the sum of the squared deviations of
the observation from the overall mean.
BSS (between sum of squares) – the sum of the squared deviations
between the groups.
WSS (within sum of squares) – the sum of the squared deviations
with the groups.
We will use this set of equations because they are
easier to calculate.
k
TSS = ∑
i =1
ni
2
X
∑ ij − C
j =1
2

 ni
 X ij 

ni  ∑
j =1
 −C

BSS = ∑
ni
i =1
WSS = TSS − BSS
(
X )
∑∑
where C =
ij
N
2
We test for differences among the groups using the F test.
• The standard F test is simply a ratio of variances.
• Here the F test is slightly modified to take into account:
1. The number of groups.
2. Differences in the number of group members.
The F statistic is calculated as:
F=
BSS
k −1
WSS
N −k
where k is the number of groups and N is the total sample size.
This is basically the sample size weighted ratio of the within group
variation to the between group variation.
There are 2 types of degrees of freedom for the F statistic:
• k – 1 (where k is the number of columns) for the numerator.
• N – k (where N is the total sample size) for the denominator.
Note that the table from the book is different than the web table.
• Book p values are < or > a set alpha level.
• Web table p values can be a range.
A few words on Sum of Squares…
• In all cases, the sum of squares is a measure of total variation
from the mean in a data set.
• If the sum of squares is large, then the data set has a lot of
variation. This makes it more difficult to reject Ho (find a
difference).
• If the sum of squares is small, then the data set has little
variation. This makes it earier to reject Ho (find a difference).
• Once the sum of squares has been determined there are
many ways of using the information.
In AOV the sum of squares is partitioned into 3 distinct types.
• This is because we are working with subsets of the total
data set (3+ groups).
It can be thought of as performing multiple and simultaneous
difference in means tests (t-tests).
Understanding the role of the total sum of squares (TSS) is
straightforward: it is the total amount of variation in the data
set.
TSS = ∑ ∑ ( X ij − X ij )
i
2
j
is equivalent to
k
TSS = ∑
i =1
ni
∑X
j =1
2
ij
−C
(
X )
∑∑
where C =
ij
N
2
For TSS the data are pooled together.
The between sum of squares (BSS) is the “treatment”
variance.
• We want to compare these to each other.
BSS = ∑ n j ( X j − X ij ) 2
j
is equivalent to
2

 ni
 ∑ X ij 

ni 
j =1
 −C

BSS = ∑
ni
i =1
We are examining the distance between the means of each group (A and B).
The within sum of squares (WSS) can be thought of
as the “error” variance.
• We want to factor out variation within the groups.
WSS =
∑∑
i
( X ij − X j ) 2
j
is equivalent to
WSS = TSS − BSS
The F test is simply the ratio between the error variance (within
groups) and the treatment variance (between groups).
F=
BSS
k −1
WSS
N −k
IF the ratio is small then the difference is also small (accept Ho).
IF the ratio is large then the difference is also large (reject Ho).
Tarascan Village Percent Population Employed
Mountain Group
Meseta Group
Lake Group
Mountain
Meseta
Lake
We will drop the towns along the highways since we are interested only in rural towns.
Lake
Quiroga
San Andres
San Jeronimo
Erongaricuaro
Ihuatzio
Jaracuaro
Puacuaro
Santa Fe
Uricho
Zurumutaro
Tzintzuntzan
Ajuno
51.55
54.12
36.61
48.73
59.91
49.09
50.52
71.19
33.22
54.68
46.38
37.52
Meseta
Arantepacua
Comachuen
La Mojonera
Nahuatzen
Pichataro
Quinceo
San Isidro
San Juan
Sevina
Tingambato
Turicuaro
Mountain
37.88
36.72
31.08
44.39
46.90
32.64
44.32
32.76
40.76
44.72
37.63
Ahuiran
Angahuan
Charapan
Cocucho
Corupo
Nurio
Pomacuaran
San Felipe
San Lorenzo
Urapicho
47.56
47.21
45.92
24.41
38.92
39.14
48.09
36.11
38.64
29.90
Normality Tests by Group
Ho: Village economic activity measurements are not significantly
different than normal.
qLake
37.97
=
= 3.61
10.52
nLake= 12
qcritical = 2.80 – 3.91
Accept Ho.
qMeseta
15.82
=
= 2.86
5.53
nMeseta= 11
qcritical = 2.74 – 3.80
Accept Ho.
qMountain =
23.68
= 2.97
7.97
nMountain= 10
qcritical = 2.67 – 3.69
Accept Ho.
All village groups are not significantly different than normal (w/s
qLake 3.61, p > 0.05, qMeseta 2.86, p > 0.05, qMountain 2.97, p > 0.05)
Equality of Variances Test (Hartley Test1)
Ho: The group variances are not significantly different.
FMax
Max s 2
=
Min s 2
Group variances: Lake = 110.67
Meseta = 30.60
Mountain = 63.54
df = k − 1, nMax
df = 2, 12
110.67
FMax =
= 3.62
30.60
FCritical = 4.16
The group variances are not significantly different (F3.89, p > 0.05).
1
SPSS uses the less conservative Levene’s Test.
• The Hartley test is VERY sensitive to the normality assumption.
Even small deviations from normal can influence the results.
• The test works best when the group sample sizes are equal. If
the group sizes are roughly equal, use the group with the
largest sample for determining the df and the smaller of the
two dfs reported.
• If in doubt, double check using the SPSS Levene’s test.
Hypotheses
Ho: There is no significant difference in the percent working
population among the three Tarascan village groups.
Ha: There is a significant difference in the percent working
population among the three Tarascan village groups.
Use between for two-group testing and among for 3+ group
testing.
Lake
Quiroga
San Andres
San Jeronimo
Erongaricuaro
Ihuatzio
Jaracuaro
Puacuaro
Santa Fe
Uricho
Zurumutaro
Tzintzuntzan
Ajuno
Column Sums
n1 = 12
n2 = 11
n3 = 10
ntotal = 33
51.55
54.12
36.61
48.73
59.91
49.09
50.52
71.19
33.22
54.68
46.38
37.52
593.5
Meseta
Arantepacua
Comachuen
La Mojonera
Nahuatzen
Pichataro
Quinceo
San Isidro
San Juan
Sevina
Tingambato
Turicuaro
Mountain
37.88
36.72
31.08
44.39
46.90
32.64
44.32
32.76
40.76
44.72
37.63
429.8
Ahuiran
Angahuan
Charapan
Cocucho
Corupo
Nurio
Pomacuaran
San Felipe
San Lorenzo
Urapicho
47.56
47.21
45.92
24.41
38.92
39.14
48.09
36.11
38.64
29.90
395.9
k
TSS = ∑
i =1
ni
∑X
j =1
2
ij
−C
Lake
Quiroga
San Andres
San Jeronimo
Erongaricuaro
Ihuatzio
Jaracuaro
Puacuaro
Santa Fe
Uricho
Zurumutaro
Tzintzuntzan
Ajuno
Column Sums
51.55
54.12
36.61
48.73
59.91
49.09
50.52
71.19
33.22
54.68
46.38
37.52
593.5
Sum of the observations
(∑∑ X )
where C =
2
ij
N
x2
2657.4
2928.9
1340.4
2374.9
3589.6
2409.9
2552.2
5068.2
1103.8
2990.1
2151.5
1407.8
30574.6
Meseta
Arantepacua
Comachuen
La Mojonera
Nahuatzen
Pichataro
Quinceo
San Isidro
San Juan
Sevina
Tingambato
Turicuaro
37.88
36.72
31.08
44.39
46.90
32.64
44.32
32.76
40.76
44.72
37.63
x2
1434.9
1348.7
965.7
1970.4
2199.6
1065.7
1964.5
1072.9
1661.3
1999.8
1416.0
Mountain
Ahuiran
Angahuan
Charapan
Cocucho
Corupo
Nurio
Pomacuaran
San Felipe
San Lorenzo
Urapicho
47.56
47.21
45.92
24.41
38.92
39.14
48.09
36.11
38.64
29.90
429.8 17099.5
395.9 16246.9
Sum of the squared observations
Total observation sum = 593.5 + 429.8 + 395.9 = 1419.2
(∑∑ X )
ij
k
Total squared sum = 30574.6 + 17099.5 + 16246.9 = 63921
x2
2262.2
2228.4
2108.9
596.0
1515.1
1532.3
2313.0
1303.9
1493.3
893.9
ni
∑ ∑X
i =1
j =1
2
ij
k
TSS = ∑
i =1
ni
2
X
∑ ij − C
where
j =1
(1419.2) 2
C=
= 61034.2
33
TSS = 63921 − 61034.2 = 2886.8
(
X )
∑∑
C=
ij
N
2
2

 ni
 ∑ X ij 

ni 
j =1
 −C

BSS = ∑
ni
i =1
Lake n=12
x2
Meseta n=11
x2
Mountain n=10
x2
Quiroga
51.55
2657.4
Arantepacua
37.88
1434.9
Ahuiran
47.56
2262.2
San Andres
54.12
2928.9
Comachuen
36.72
1348.7
Angahuan
47.21
2228.4
San Jeronimo
36.61
1340.4
La Mojonera
31.08
965.7
Charapan
45.92
2108.9
Erongaricuaro
48.73
2374.9
Nahuatzen
44.39
1970.4
Cocucho
24.41
596.0
Ihuatzio
59.91
3589.6
Pichataro
46.90
2199.6
Corupo
38.92
1515.1
Jaracuaro
49.09
2409.9
Quinceo
32.64
1065.7
Nurio
39.14
1532.3
Puacuaro
50.52
2552.2
San Isidro
44.32
1964.5
Pomacuaran
48.09
2313.0
Santa Fe
71.19
5068.2
San Juan
32.76
1072.9
San Felipe
36.11
1303.9
Uricho
33.22
1103.8
Sevina
40.76
1661.3
San Lorenzo
38.64
1493.3
Zurumutaro
54.68
2990.1
Tingambato
44.72
1999.8
Urapicho
29.90
893.9
Tzintzuntzan
46.38
2151.5
Turicuaro
37.63
1416.0
Ajuno
37.52
1407.8
593.5
30574.6
395.9
16246.9
Column Sums
429.8 17099.5
(593.5) 2 (429.8) 2 (395.9) 2
BSS =
+
+
− 61034.2 = 786.5
12
11
10
Only used to calculate the WSS
TSS = 2886.8
BSS = 786.5
WSS = 2886.8 − 786.5 = 2100.3
F=
BSS
k −1
WSS
N −k
786.5
393.25
3
−
1
=
= 5.62
F=
2100.3
70.01
33 − 3
df =
(k-1) = (3-1) = 2,
(n-k) = (33-3) = 30
To get the probability range on this table, read DOWN the column.
F = 5.62
Since 5.62 > 3.32 reject Ho
There is a significant difference in the percent employed population
among the three groups of Tarascan villages (F5.62, 0.01 > p > 0.005).
Test of Homogeneity of Variances
PctAct
Levene
Statistic
.876
df1
df2
2
Sig.
.427
30
ANOVA
Pc tAct
Between Groups
W ithin Groups
Total
Sum of
Squares
786.816
2095.240
2882.056
df
2
30
32
Mean Square
393.408
69.841
F
5.633
Sig.
.008
Multiple Comparisons
Dependent Variable: PctAct
LSD
(I) Group
1
2
3
(J) Group
2
3
1
3
1
2
Mean
Difference
(I-J)
10.3886*
9.8698*
-10.3886*
-.5188
-9.8698*
.5188
Std. Error
3.4885
3.5783
3.4885
3.6515
3.5783
3.6515
*. The mean difference is s ignificant at the .05 level.
Sig.
.006
.010
.006
.888
.010
.888
95% Confidence Interval
Lower Bound Upper Bound
3.264
17.513
2.562
17.178
-17.513
-3.264
6.938
-7.976
-17.178
-2.562
-6.938
7.976
Comparing our results to those of SPSS:
TSS
BSS
WSS
F
p
SPSS
2882.1
786.8
2095.2
5.63
0.008
Us
2886.6
786.8
2100.3
5.62
0.01 - 0.005
Multiple Comparison (AOV Post Hoc) Tests
• Available in SPSS to compare all group pairs.
• Used to determine which means are significantly different.
• Similar to (but not exactly like) performing two group means
tests while holding the other group(s) constant.
• The post hoc test listed below are useful for non-paired
comparisons and non treatment-control comparisons.
For treatment-control (e.g. before- after) comparisons, use the
Dunnet test.
LSD (Fisher test)
• The Least Significant Difference test assumes that if the null
hypothesis is incorrect, as indicated by a significant F-test, Type I
errors are not really possible.
• LSD test has been criticized for not sufficiently controlling for Type
I errors.
• Although this is the first test on the list in SPSS, it is probably not
the best.
Bonferroni
• Probably the most commonly used post hoc test because it is
flexible, simple to compute, and can be used with any type of
statistical test.
• The traditional Bonferroni tends to lack power.
• The Bonferroni overcorrects for Type I error.
Sidak
• A relatively simple modification of the Bonferroni formula that
would have less of an impact on statistical power but retain much
of the flexibility of the Bonferroni method.
• This approach is convenient but has not received any systematic
study.
• It is unlikely that a single, simple correction will result in the most
efficient balance of Type I and Type II errors.
Tukey (Tukey’s HSD)
• The HSD (honest significant difference) has greater power than
the other tests under most circumstances.
• Requires equal sample sizes.
• If sample sizes are unequal, the Tukey-Kramer test is used by
SPSS which computes the harmonic mean of the sample sizes
for each group, slightly lowering the power of the test.
Games-Howell
• This test can be used when variances are unequal and also
takes into account unequal group sizes.
• Severely unequal variances can lead to increased Type I error,
and with smaller sample sizes, more moderate differences in
group variance can also lead to increases in Type I error.
• This test appears to do better than the Tukey HSD if variances
are very unequal (or moderately so in combination with small
sample size) or can be used if the sample size per cell is very
small (e.g., <6).
• This is probably the most reliable post hoc test, but also the most
conservative.
Probabilities from 5 Post Hoc Tests
Group
Comparisons
1–2
GamesHowell
0.021
Tukey HSD
0.015
LSD
0.006
Bonferroni
0.017
Sidak
0.017
1–3
0.053
0.026
0.010
0.029
0.029
2–3
0.984
0.989
0.888
1.000
0.999
Note that the differences are slight, except for the GamesHowell test.