EdPsych/Psych/Soc 589 C.J. Anderson Homework 5: Answer Key 1
1. The negative binomial is fit by changing only “dist=Poisson” to “dist=NegBin” (regardless of whether you use Method I or II). I’ll show Method II here.

                         Poisson                                  Negative Binomial
Parameter      Estimate   SE       Wald      p-value     Est.      se       Wald      p
$\alpha$       −7.8180    0.0216   130,556   < .0001     −7.8129   0.1200   4239.77   < .0001
$1/\phi = D$    0.0000    .                               0.3189   0.0936

Fit Statistics   Poisson     Negative Binomial
df               22          21
$G^2$            669.4458    24.1502
$X^2$            658.4846    21.6263
AIC              812.6197    244.2357

Note: 95% CI for dispersion parameter is (0.1355, 0.5023)

The parameter estimates of $\alpha$ for the two models are very similar in value (i.e., −7.8180 versus −7.8129); however, their standard errors differ considerably (i.e., 0.0216 for the Poisson and 0.1200 for the Negative Binomial). This is consistent with the need for the Negative Binomial due to overdispersion; that is, the standard errors from the Poisson are too small.

There is evidence in support of the Negative Binomial model being the better one:
• The estimated standard errors for the Poisson and the Negative Binomial are very different, and the one for the Negative Binomial is much larger. This is consistent with there being overdispersion in the data that is a problem for the Poisson.
• The 95% CI for the dispersion parameter (0.14, 0.50) does not include 0 and suggests there is overdispersion in the data.
• $G^2$ and $X^2$ indicate an acceptable fit of the Negative Binomial model to the data (i.e., comparing them to a $\chi^2$ with $\nu = 21$ would yield a large p-value). However, this is not the case for the Poisson.
• The various information criteria (only AIC reported above) are all smaller for the Negative Binomial than for the Poisson. The smaller the value, the better the model.
• It is reasonable to expect that the crowds across teams are heterogeneous, perhaps because the teams are in different cities with different SES, crowding, etc.

Before starting to analyze the data for the next three problems (i.e., 3.13, 3.14, and the zero-inflated model), do a little bit of exploratory data analysis:

1. Compute the mean and variance of number of satellites. Compare.
The mean is less than the variance (i.e., $2.92 < (3.15)^2$), which suggests overdispersion.

2. Plot a histogram of the number of satellites. Comment.
Below is a graph of the distribution of satellites. Notice that there are a lot of crabs with 0 satellites. Other than at this end of the distribution, the Poisson may be OK. A lot of 0s could explain why we have overdispersion (model fitting will help us decide for sure). These data might be best fit using a zero-inflated Poisson.

Figure 1: The distribution of the counts.

3. Look at the relationship between number of satellites (or log of the number) and weight. Comment.
Figure 2 also gives a look at the number of satellites versus weight with a smooth curve (actually a cubic regression).

Figure 2: Initial look at the data: counts versus explanatory variable with a cubic regression curve, and log(count) versus explanatory variable with a linear regression curve overlaid.

It appears that there is an outlier in terms of weight (i.e., weight > 5). I deleted it and recomputed the mean and variance, but this doesn’t change the results much, so I left it in for the homework answers.

And now to do the problems...

Problem 3.13 on page 94 of Agresti (2007). The data with SAS code to create a SAS data set is on the course web-site. Note that I re-scaled weight to kg.

1. The prediction equation is
$$\hat\mu_i = \exp(-0.4284 + 0.5893\,\text{weight}_i)$$

2. The estimated mean for a female weighing 2.44 kg is
$$\hat\mu = \exp(-0.4284 + 0.5893(2.44)) = \exp(1.0095) = 2.7442$$

3. For a one kg increase in weight, the (mean) number of satellites is $\exp(0.5893) = 1.80$ times as large (i.e., 80% larger). Although a 95% confidence interval for $\beta$ is given in the SAS output, it comes from $\hat\beta \pm 1.96(\widehat{se})$:
$$0.5893 \pm 1.96(0.0650) = 0.5893 \pm 0.1274 \longrightarrow (0.4619, 0.7167)$$
The 95% confidence interval for the multiplicative effect (i.e., $\exp(\beta)$) is found by taking exp of the end-points of the interval for $\beta$:
$$(\exp(0.4619), \exp(0.7167)) \longrightarrow (1.59, 2.05)$$

4. A Wald test: $H_0: \beta = 0$ versus $H_a: \beta \neq 0$.
$$X^2 = \left(\frac{0.5893}{0.0650}\right)^2 = 82.15.$$
Comparing 82.15 to a chi-square distribution with df = 1 yields a very small p-value; therefore, reject $H_0$ and conclude the data support the hypothesis that the number of satellites is related to the weight of the female crab.

5. A likelihood ratio test: $-2(35.9898 - 71.9524) = 71.93$, which has a very small p-value (compare 71.93 to a chi-square with df = 1). The conclusion is the same as in part 4.

Problem 3.14 on page 94 of Agresti (2007). Fitting a negative binomial model...

1. The prediction equation is
$$\hat\mu = \exp(-0.8647 + 0.7603\,\text{weight}).$$
The dispersion parameter is $1/\phi = D$ (in Agresti notation) $= 1.0740$. The estimated standard error of the dispersion parameter is 0.1935.

There is evidence that the Negative Binomial gives a better fit than the Poisson:
• The 95% confidence interval for $1/\phi = D$ is (0.6948, 1.4533). The value 0 is not in this interval, which suggests we need the dispersion parameter.
• All of the global fit statistics are much better for the Negative Binomial than the Poisson:

                                     Poisson                 Negative Binomial
Criterion                   DF       Value      Value/DF     Value      Value/DF
Deviance                    171      560.8664   3.2799       196.1603   1.1471
Pearson Chi-Square          171      535.8957   3.1339       147.9588   0.8653
AIC (smaller is better)              920.1641                754.6437
AICC (smaller is better)             920.2347                754.7857
BIC (smaller is better)              926.4707                764.1036

Since the Poisson is a special case of the Negative Binomial, we could do a likelihood ratio test (i.e., LR = 560.8664 − 196.1603 = 364.71, df = 1, p is tiny). Also, according to the information criteria, the Negative Binomial has smaller values, and this indicates it’s better than the Poisson model.
• Graphics indicate that the Negative Binomial out-performs the Poisson (I didn’t expect graphs, but if you did them, great!). The Negative Binomial includes more points within the 95% confidence bands and fits the distribution of counts better than the Poisson (however, there is room for improvement; the ZIP does the best).
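The test statistics from Problem 3.13 and the Value/DF ratios in the table above can be re-checked with a few lines of Python. This is only a sketch for checking the arithmetic, not part of the SAS analysis: the variable and function names are made up for illustration, and the inputs are the rounded values printed in this key, so the Wald statistic comes out near 82.20 rather than the reported 82.15 (which presumably used unrounded estimates).

```python
import math

def chi2_sf_df1(x):
    """Upper-tail probability P(X > x) for a chi-square variable with df = 1."""
    # For df = 1, X = Z^2 with Z standard normal, so P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(x / 2.0))

# Problem 3.13, part 4: Wald test of beta = 0 (rounded estimate and se from the key).
beta_hat, se_beta = 0.5893, 0.0650
wald = (beta_hat / se_beta) ** 2

# Problem 3.13, part 5: likelihood ratio statistic from the two reported log likelihoods.
lr = -2.0 * (35.9898 - 71.9524)

# Problem 3.14: deviance divided by its degrees of freedom for each model.
df = 171
poisson_dev_ratio = 560.8664 / df
negbin_dev_ratio = 196.1603 / df

print(f"Wald X^2 = {wald:.2f}, p = {chi2_sf_df1(wald):.3g}")
print(f"LR       = {lr:.2f}, p = {chi2_sf_df1(lr):.3g}")
print(f"Deviance/DF: Poisson = {poisson_dev_ratio:.4f}, NegBin = {negbin_dev_ratio:.4f}")
```

A Value/DF ratio near 1 is what one hopes to see; the Poisson ratio is well above 1, while the Negative Binomial ratio is close to 1.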
Figure 3: The models were fit to data and then grouped to “see” how well the models are fitting the data.

Figure 4: To see how well the various models are doing in terms of fitting the distribution of number of satellites. Neither the Poisson nor the Negative Binomial is really doing that well; however, the ZIP does pretty well.

2. A 95% confidence interval for $\beta$ with the Negative Binomial is
$$\hat\beta \pm 1.96(\widehat{se}) = 0.7603 \pm 1.96(0.1769) = 0.7603 \pm 0.3467 \longrightarrow (0.4136, 1.1070)$$
The one from the Poisson regression was (0.4619, 0.7167), which has half-length equal to 0.1274. The interval from the Negative Binomial is wider than the one from the Poisson because the greater estimated variance with the Negative Binomial (i.e., $\hat\mu_i + 1.0740\,\hat\mu_i^2$) results in a greater estimated standard error for $\beta$ (see page 82 of the text).

Fit a zero-inflated Poisson regression using weight as a predictor of the mean and width as a predictor in a logit model for the mixing probability.

I fit several ZIP models, but the one that seemed to be the best, in terms of fit of the model to the data and significance of the parameter estimates, is the one with weight as a predictor in the Poisson regression and width as a predictor in a logit model for the mixing probability. The results are

Criteria For Assessing Goodness Of Fit

Criterion                   DF     Value        Value/DF
Deviance                           725.7859
Scaled Deviance                    725.7859
Pearson Chi-Square          169    229.9698     1.3608
Scaled Pearson X2           169    229.9698     1.3608
Log Likelihood                     167.1414
Full Log Likelihood                -362.8930
AIC (smaller is better)            733.7859
AICC (smaller is better)           734.0240
BIC (smaller is better)            746.3991

Algorithm converged.

Analysis Of Maximum Likelihood Parameter Estimates

Parameter   DF   Estimate   Standard Error   Wald 95% Confidence Limits   Wald Chi-Square   Pr > ChiSq
Intercept   1    0.9901     0.2092           0.5800     1.4002            22.39             <.0001
weight      1    0.1945     0.0761           0.0454     0.3436            6.54              0.0106
Scale       0    1.0000     0.0000           1.0000     1.0000

NOTE: The scale parameter was held fixed.
Analysis Of Maximum Likelihood Zero Inflation Parameter Estimates

Parameter   DF   Estimate   Standard Error   Wald 95% Confidence Limits   Wald Chi-Square   Pr > ChiSq
Intercept   1    12.3902    2.6937           7.1106     17.6698           21.16             <.0001
width       1    -0.5005    0.1044           -0.7051    -0.2959           22.98             <.0001

So the estimated model for the probability is
$$\hat\pi_i = \frac{\exp(12.3902 - 0.5005\,\text{width}_i)}{1 + \exp(12.3902 - 0.5005\,\text{width}_i)}.$$
For a one unit increase in width, the odds of being in the “zero class” are multiplied by $\exp(-0.5005) = 0.61$. In other words, the wider the crab, the less likely it is to be in the zero class.

The estimated probability of a count uses
$$\hat\mu_i = \exp(0.9901 + 0.1945\,\text{weight}_i)$$
and
$$P(Y_i = y) = \begin{cases} \hat\pi_i + (1 - \hat\pi_i)\exp(-\hat\mu_i) & \text{for } y = 0 \\ (1 - \hat\pi_i)\,\dfrac{\exp(-\hat\mu_i)\,\hat\mu_i^{\,y}}{y!} & \text{for } y > 0 \end{cases}$$
Since $\exp(0.1945) = 1.21$, the expected number of satellites in the Poisson part of the model is 1.21 times the expected number for a crab weighing one unit less.

Figure 5: Observed and fitted from ZIP with logit model.
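To see how the two parts of the fitted ZIP model combine, the estimated probabilities can be evaluated directly from the reported coefficients. The following is a minimal sketch, assuming the estimates in the two parameter tables above; the function name `zip_pmf`, the constant names, and the example crab (weight 2.44, width 26.3) are illustrative choices, not values taken from the SAS run.

```python
import math

# Estimates copied from the two parameter tables above.
B0, B1 = 0.9901, 0.1945      # Poisson part: intercept and weight
G0, G1 = 12.3902, -0.5005    # logit (zero-inflation) part: intercept and width

def zip_pmf(y, weight, width):
    """Estimated P(Y = y) under the fitted zero-inflated Poisson model."""
    eta = G0 + G1 * width
    pi_hat = 1.0 / (1.0 + math.exp(-eta))          # probability of the zero class
    mu_hat = math.exp(B0 + B1 * weight)            # Poisson mean
    poisson = math.exp(-mu_hat) * mu_hat ** y / math.factorial(y)
    if y == 0:
        # A zero can come from the zero class or from the Poisson part.
        return pi_hat + (1.0 - pi_hat) * poisson
    return (1.0 - pi_hat) * poisson

# Hypothetical crab used only for illustration.
weight, width = 2.44, 26.3
probs = [zip_pmf(y, weight, width) for y in range(61)]

print(f"P(Y = 0) = {probs[0]:.4f}")
print(f"Sum of P(Y = y) for y = 0..60 = {sum(probs):.6f}")
```

Summing the probabilities over a long enough range of counts should give a value very close to 1, which is a quick check that the two cases of the formula were coded consistently.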