Analysis of Variance STAT E-150 Statistical Methods
Transcription
In Analysis of Variance, we are testing for the equality of the means of several levels of a variable. The technique is to compare the variation between the levels and the variation within each level. (The levels of the variable are also referred to as groups, or treatments.) If the variation due to the level (the variation between levels) is significantly larger than the variation within each level, then we can conclude that the means of the levels are not all equal.

We will test the hypothesis

H0: μ1 = μ2 = ∙∙∙ = μk
vs.
Ha: the means are not all equal

using the ratio

F = (variability between groups)/(variability within the groups) = MSGroups/MSError

When the numerator is large compared to the denominator, we will reject the null hypothesis.

∙ The numerator of F measures the variation between groups; this is called the Mean Square for groups: MSGroups = SSGroups/df = SSGroups/(k - 1)
∙ The denominator of F measures the variation within groups; this is called the Error Mean Square: MSE = SSError/df = SSError/(n - k)
∙ The test statistic is F = MSGroups/MSError. We will reject H0 when F is large.
∙ MSGroups has k - 1 degrees of freedom, where k = the number of groups.
∙ MSError has n - k degrees of freedom, where n is the total sample size.

The ANOVA table:

Source   df      SS         MS         F                  p
Model    k - 1   SSGroups   MSGroups   MSGroups/MSError
Error    n - k   SSError    MSError
Total    n - 1   SSTotal

If the null hypothesis is true, the groups have a common mean, μ. Each group mean μk may differ from the grand mean, μ, by some value. This difference is called the group effect, and we denote this value for the kth group by αk.
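The quantities in the ANOVA table can be illustrated with a short calculation. Here is a minimal pure-Python sketch; the three small groups below are made-up values for illustration only, not data from this course:

```python
# Hypothetical data: three groups (made-up values for illustration only)
groups = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [5.0, 7.0, 9.0]]

n = sum(len(g) for g in groups)   # total sample size
k = len(groups)                   # number of groups
grand_mean = sum(x for g in groups for x in g) / n
group_means = [sum(g) / len(g) for g in groups]

# SSGroups: variation of the group means around the grand mean
ss_groups = sum(len(g) * (m - grand_mean) ** 2
                for g, m in zip(groups, group_means))
# SSError: variation of each observation around its own group mean
ss_error = sum((x - m) ** 2
               for g, m in zip(groups, group_means) for x in g)

ms_groups = ss_groups / (k - 1)   # Mean Square for groups, k - 1 df
ms_error = ss_error / (n - k)     # Error Mean Square, n - k df
f_stat = ms_groups / ms_error     # reject H0 when F is large

print(f"SSGroups = {ss_groups:.3f}, SSError = {ss_error:.3f}, F = {f_stat:.3f}")
```

For these made-up groups, SSGroups + SSError also equals SSTotal, the additivity reflected in the table's SS column.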
One-Way Analysis of Variance Model

The ANOVA model for a quantitative response variable and a single categorical explanatory variable with K values is

Response = Grand Mean + Group Effect + Error Term
Y = μ + αk + ε

The Grand Mean (μ) is the part of the model that is common to all observations. The Group Effect is the variability between groups. The residual, or error, is the variability within groups. Since μk = μ + αk, we can write this model as

Y = μk + ε, where ε ~ N(0, σε) and the errors are independent.

That is, the errors are approximately normally distributed with a mean of 0 and a common standard deviation, and are independent.

The assumptions for a One-Way ANOVA are:

1. Independence Assumption
The groups must be independent of each other, and the subjects within each group must be randomly assigned. Think about how the data were collected: Were the data collected randomly or generated from a randomized experiment? Were the treatments randomly assigned to experimental groups?

2. Equal Variance Assumption
The variances of the treatment groups are equal. Look at side-by-side boxplots of the data to see if the spreads are similar; also check that the spreads don't change systematically with the centers and that the data are not skewed in each group. If either of these is true, a transformation of the data may be appropriate. Also plot the residuals against the predicted values to see if larger predicted values lead to larger residuals; this may also suggest that a reexpression should be considered.

3. Normal Population Assumption
The values for each treatment group are normally distributed. Again, check side-by-side boxplots of the data for indications of skewness and outliers.

Example: A study reported in 1994 compared different psychological therapies for teenaged girls with anorexia. Each girl's weight was measured before and after a period of therapy designed to aid weight gain.
One group used a cognitive-behavioral treatment, a second group received family therapy, and the third group was a control group which received no therapy. The subjects in this study were randomly assigned to these groups. The weight change was calculated as weight at the end of the study minus weight at the beginning of the study; the weight change was positive if the subject gained weight and negative if she lost weight. What does this data indicate about the relative success of the three treatments?

Note that in this analysis, the explanatory variable (type of therapy) is categorical and the response variable (weight change) is quantitative.

The hypotheses are:
H0: μ1 = μ2 = μ3
Ha: the means are not all equal
Note that the null hypothesis is not H0: μ1 ≠ μ2 ≠ μ3

Some of the data is shown below. For SPSS analysis, the data should be entered with the group in one column and the data in a second column:

Case  Group  WeightGain     Case  Group  WeightGain
 1    1       -0.5          37    2       11.7
 2    1       -9.3          38    2        6.1
 3    1       -5.4          39    2        1.1
 4    1       12.3          40    2       -4
 5    1       -2            41    2       20.9
 6    1      -10.2          42    2       -9.1
 7    1      -12.2          43    2        2.1
 8    1       11.6          44    2       -1.4
 9    1       -7.1          45    2        1.4
10    1        6.2          46    2       -0.3
11    1       -0.2          47    2       -3.7
12    1       -9.2          48    2       -0.8
13    1        8.3          49    2        2.4
14    1        3.3          50    2       12.6
15    1       11.3          51    2        1.9
16    1        0            52    2        3.9
17    1       -1            53    2        0.1
18    1      -10.6          54    2       15.4
19    1       -4.6          55    2       -0.7
20    1       -6.7          56    3       11.4
21    1        2.8          57    3       11

First we will see if the equal variance condition is met, by comparing side-by-side boxplots of the data. The boxplots do not show a great deal of difference in the spread of the data, but are not conclusive.

We can compare the largest standard deviation and the smallest standard deviation; if this ratio is less than or equal to 2, then we can assume that the variances are similar. In this case Smax = 7.99 and Smin = 7.16. The ratio is 7.99/7.16 = 1.116, which is less than 2, and so we can assume that the equal variance condition is met.
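This rule-of-thumb check is easy to script. A sketch with made-up groups (not the study data; the helper computes the usual sample standard deviation with an n - 1 denominator):

```python
import math

def sample_sd(xs):
    """Sample standard deviation (n - 1 in the denominator)."""
    m = sum(xs) / len(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))

# Hypothetical groups (made-up values for illustration only)
groups = [
    [-0.5, -9.3, 12.3, -2.0],
    [11.7, 6.1, -4.0, 2.1],
    [11.4, 11.0, -5.3, 7.0],
]
sds = [sample_sd(g) for g in groups]
ratio = max(sds) / min(sds)

# Rule of thumb: the spreads are similar enough if Smax/Smin <= 2
equal_variance_ok = ratio <= 2
print(f"Smax/Smin = {ratio:.3f}; equal-variance condition met: {equal_variance_ok}")
```

For these made-up groups the ratio is below 2, so the quick check would not flag a problem with the equal variance condition.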
We can also use Levene's test:

Test of Homogeneity of Variances (WeightGain)
Levene Statistic   df1   df2   Sig.
.314               2     69    .731

This test for homogeneity of variances tests the null hypothesis that the population variances are equal:
H0: σ1² = σ2² = σ3²
Ha: the variances are not all equal
Since the p-value is very large (.731), we cannot reject this null hypothesis, and we can conclude that the data does not violate the equal variance assumption.

We can check the Normality condition with Normal Probability Plots of the three groups.

We can also use the table shown below to assess Normality, using a hypothesis test where the null hypothesis is that the distribution is normal. The p-values for groups 1 and 3 are larger than .05, so this null hypothesis is not rejected for these groups.

Tests of Normality (WeightGain)
            Kolmogorov-Smirnov(a)        Shapiro-Wilk
Treatment   Statistic   df   Sig.        Statistic   df   Sig.
1           .094        26   .200*       .952        26   .257
2           .223        29   .001        .896        29   .008
3           .129        17   .200*       .954        17   .516
a. Lilliefors Significance Correction
*. This is a lower bound of the true significance.

For the moment, we will assume that the conditions are met.

The SPSS output includes the following ANOVA table:

ANOVA (Gain)
                 Sum of Squares   df   Mean Square   F       Sig.
Between Groups    614.644          2   307.322       5.422   .006
Within Groups    3910.742         69    56.677
Total            4525.386         71

You can see that F = 5.422 and the p-value is .006. Since p is small, we reject the null hypothesis that the means are all equal. This data provides evidence of a difference in the mean weight gain for the three groups. But where is this difference?

Descriptives (Gain)
                                             95% CI for Mean
Group   N    Mean    Std. Dev.  Std. Error   Lower    Upper     Min     Max
1       26   -.450   7.9887     1.5667       -3.677    2.777    -12.2   15.9
2       29   3.007   7.3085     1.3572         .227    5.787     -9.1   20.9
3       17   7.265   7.1574     1.7359        3.585   10.945     -5.3   21.5
Total   72   2.764   7.9836      .9409         .888    4.640    -12.2   21.5

Which group had the greatest mean weight gain?
Group 3

Which group had the lowest mean weight gain?

Group 1

Is either of these values significantly different from the other group means? Are all three groups different in terms of weight gain? We can answer these questions using a post-hoc test, Tukey's Honestly Significant Difference test, which compares all pairs of group means.

Here is one result of this test:

Multiple Comparisons (Gain, Tukey HSD)
                                                          95% Confidence Interval
(I) Group  (J) Group  Mean Difference (I-J)  Std. Error  Sig.   Lower Bound  Upper Bound
1          2          -3.4569                2.0333      .212    -8.327        1.413
           3          -7.7147*               2.3482      .005   -13.339       -2.090
2          1           3.4569                2.0333      .212    -1.413        8.327
           3          -4.2578                2.2996      .161    -9.766        1.251
3          1           7.7147*               2.3482      .005     2.090       13.339
           2           4.2578                2.2996      .161    -1.251        9.766
*. The mean difference is significant at the 0.05 level.

The first line shows the comparison between Group 1 and Group 2. The mean difference is -3.4569, but it is not significant since p = .212.
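The Mean Difference and Std. Error columns of this table can be reproduced from the output already shown: the group means and sizes come from the Descriptives table, and MSError = 56.677 comes from the ANOVA table. A sketch (the Tukey significance values themselves require the studentized range distribution, which is not computed here):

```python
import math

# Values taken from the SPSS output above
means = {1: -0.450, 2: 3.007, 3: 7.265}
sizes = {1: 26, 2: 29, 3: 17}
ms_error = 56.677   # Within-groups Mean Square from the ANOVA table

results = {}
for i, j in [(1, 2), (1, 3), (2, 3)]:
    diff = means[i] - means[j]
    # Standard error of a difference of two group means in ANOVA:
    # SE = sqrt(MSE * (1/n_i + 1/n_j))
    se = math.sqrt(ms_error * (1 / sizes[i] + 1 / sizes[j]))
    results[(i, j)] = (diff, se)
    print(f"Group {i} vs Group {j}: mean difference = {diff:.4f}, SE = {se:.4f}")
```

The computed differences (-3.457, -7.715, -4.258) and standard errors (2.033, 2.348, 2.300) match the table to rounding.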
The next line shows that the difference between Group 1 and Group 3 is significant; not only is p = .005, but SPSS shows an asterisk beside the mean difference of -7.7147 to indicate that the difference is significant.

What conclusions can you draw about the difference between Group 2 and Group 3?

Since p = .161, the difference between Group 2 and Group 3 is not significant.

Are the means different for all three groups?
The three means are not all different; the only significant difference is between the means of Group 1 and Group 3.

Hypothesis Tests and Confidence Intervals

A pair of means can be considered significantly different at a .05 level of significance if and only if zero is not contained in a 95% confidence interval for their difference. We can use Fisher's Least Significant Difference to determine where any differences lie by identifying any confidence intervals which do not contain 0.

Are the means different for all three groups? Are the results the same as when we used Tukey's HSD?

The only confidence interval that does not contain 0 is the CI for the difference of the means of Group 1 and Group 3. This indicates that the means for these two groups are different.

How else can we follow up on this analysis? Since the groups are independent, we can do our own pairwise t-tests for the difference of the means.

Two-sample t-tests
H0: μ1 - μ2 = 0
Ha: μ1 - μ2 ≠ 0 (or > 0 or < 0)
Assumptions: independent random samples; approximately Normal distributions for both samples

Here are the results for the test of
H0: μ1 - μ2 = 0
Ha: μ1 - μ2 ≠ 0
Is there a significant difference between the means of these groups? What is your statistical conclusion? Be sure to state the p-value.

p = .100. Since p > .05, the null hypothesis is not rejected.

What is your conclusion in context?
The data does not indicate that there is a significant difference between the mean weight gain with cognitive-behavioral treatment and family therapy.

What else can be concluded? If the data was gathered in a well-designed experiment in which subjects were randomly assigned to treatment groups, then we can conclude causality. In an observational study in which random samples are taken from the populations, the results can be extended to the associated populations.

SPSS Instructions for ANOVA

To create side-by-side boxplots of the data:
Assume that your file has the groups in one column and the values of the variable in a second column.
Choose Graphs > Chart Builder.
Choose Boxplot and drag the first boxplot (Simple) to the preview area.
Drag the column with the groups to the x-axis, and the column with the values of the response variable to the y-axis.
Click OK.

To create Normal Probability Plots of the data:
Choose Analyze > Descriptive Statistics > Explore.
In the Explore dialog box, choose the Dependent List variable and the Factor List variable.
Click on Plots. Click OK.

To perform a One-Way Analysis of Variance:
Choose Analyze > Compare Means > One-Way ANOVA.
Choose the Dependent List variable and the Factor List variable.
Click on Options, and under Statistics, choose Descriptive and Homogeneity of Variance Test.
Click on Continue and then OK.

To perform Tukey's Honestly Significant Difference test:
Choose Analyze > Compare Means > One-Way ANOVA. (The variables may still be selected, so you may not have to enter the Dependent List variable and the Factor List variable.)
Click on Post-Hoc, and select Tukey. Note that you can also select LSD to choose Fisher's Least Significant Difference test.
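As a closing check, the pooled two-sample t-test for Groups 1 and 2 (the p = .100 result discussed above) can be reproduced from the Descriptives table. The p-value itself requires the t distribution with n1 + n2 - 2 = 53 degrees of freedom, so this sketch computes only the test statistic:

```python
import math

# Summary statistics for Groups 1 and 2 from the Descriptives table
n1, mean1, sd1 = 26, -0.450, 7.9887
n2, mean2, sd2 = 29, 3.007, 7.3085

# Pooled variance: the two sample variances combined, weighted by their df
sp2 = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
se = math.sqrt(sp2 * (1 / n1 + 1 / n2))   # standard error of the difference
t_stat = (mean1 - mean2) / se

print(f"t = {t_stat:.3f} with {n1 + n2 - 2} degrees of freedom")
```

Comparing |t| ≈ 1.68 with the t distribution on 53 degrees of freedom gives a two-sided p-value of about .10, consistent with the output described above.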