BE640 Exam II 2015 - University of Massachusetts Amherst
PubHlth 640 Exam 2 – Spring 2015          Name __________________________________________________

PubHlth 640 Intermediate Biostatistics
Spring 2015
Examination 2
Units 3, 4 and 5 – Discrete Distributions, Categorical Data Analysis & Logistic Regression
Due: Wednesday April 22, 2015

Before you begin:
This is a “take-home” exam. You are welcome to use any reference materials you wish. You are welcome to use the computer as you wish, too. However, you MUST work this exam by yourself and you may not consult with anyone.

Instructions and Checklist:
__ 1. Start each problem on a new page.
__ 2. Write your name on every page.
__ 3. Make a photocopy of your exam for safekeeping prior to submission.
__ 4. Complete the signature page.
__ 5. Please DO NOT submit a copy of the exam questions.

How to submit your exam (sorry – faxed exams are NOT permitted):

(1) ONLINE Students
Please be sure your name is somewhere on your submission. Next, save it as a SINGLE FILE pdf using the naming convention lastname_exam2.pdf. Email it to me at: [email protected]

(2) Worcester Section
The UMass calendar says that Wednesday April 22 is a “Monday class schedule”. Tentatively, please bring your exam (stapled, please) to class on Wednesday April 22, 2015. If you are unable to come to class, I will accept a pdf (see instructions for online students). We need to choose a night to meet during the week April 20-24, 2015.

(3) Amherst Section
The UMass calendar says that Wednesday April 22 is a “Monday class schedule”. Tentatively, please bring your exam (stapled, please) to class on Wednesday April 22, 2015. If you are not coming to class, please put your exam in my mailbox, located in the mail room on the 4th floor of Arnold House. Tentatively, we will have an optional “lab” session on Wednesday April 22, 2015.

(4) ALL
I will also accept exams sent by U.S. Post.
Please mail with postmark no later than April 22, 2015 to:

Carol Bigelow
School of Public Health/402 Arnold House
University of Massachusetts/Amherst
715 North Pleasant Street
Amherst, MA 01003-9304
Tel. 413-545-1319

\...\2015\docu\exams & solutions\BE640 Exam 2 2015.docx Page 1 of 12

Signature

This is to confirm that in completing this exam, I worked independently and did not consult with anyone.

Name: ___________________________________________________________ Date: ___________________________

Thank you!

1. (10 points total)
It is believed that 25% of children exposed to a particular infectious agent become ill with the disease. In 100 playgroups of 4 children each, the following frequencies of disease were observed:

___________________________________________________________________
Number of Cases      Frequency      Expected Frequency
___________________________________________________________________
0                    38             31.6
1                    35             42.2
2                    15             21.1
3                     7              4.7
4                     5              0.4
___________________________________________________________________

Set up the computations and verify the expected frequency numbers shown in the third column of this table.

2. (10 points)
Suppose it is known that a certain genetic mutation occurs in an insect population, on average, in 20 out of 10,000 insects. Next suppose that 8,000 insects are sampled and a count is obtained of the number of insects that have the genetic mutation. Let X be the appropriately defined Binomial distribution for this setting and let Y be the appropriately defined Poisson distribution for this setting.
Thus, the count of insects with the genetic mutation is either X distributed Binomial or it is Y distributed Poisson.

2a. (6 points) Using the appropriately defined Binomial and Poisson distributions, complete the following table of probabilities:

X distributed Binomial        Y distributed Poisson
Pr [ X = 0 ] = _____          Pr [ Y = 0 ] = _____
Pr [ X > 1 ] = _____          Pr [ Y > 1 ] = _____
Pr [ X < 5 ] = _____          Pr [ Y < 5 ] = _____

2b. (2 points) What are the values of the mean and standard deviation of the random variable X?

2c. (2 points) What are the values of the mean and standard deviation of the random variable Y?

3. (10 points total)
Twelve ants and eighteen flies were placed in a container with insecticide and observed. After sixteen insects had died, there were nine ants alive and five flies alive. Apply Fisher’s exact test to test the null hypothesis that ants and flies are equally susceptible to the insecticide. Carry out the appropriate statistical test to address this question.

3a. (2 points) The null and alternative hypotheses. (Be sure to define your terms).
3b. (5 points) The achieved level of significance (p-value).
3c. (3 points) An interpretation of your findings in terms that a layperson can understand.

4. (10 points total)
A study was made of 100 terminal cancer patients who were given either vitamin C or placebo as part of their therapy. Patients differed in their age (AGE), gender (SEX), and location of tumor (SITE). Of interest is the outcome 0/1 remission (REMISS) at 30 days. A subset of the data, which includes 40 patients, is presented here.
Vitamin C Group                      Placebo Group
SITE      SEX  AGE  REMISS           SITE      SEX  AGE  REMISS
Stomach   F    61   Yes              Stomach   F    58   No
Stomach   M    69   No               Stomach   F    71   No
Stomach   F    62   Yes              Stomach   M    63   Yes
Stomach   F    66   No               Stomach   M    45   Yes
Stomach   M    63   Yes              Stomach   M    57   No
Bronchus  M    74   No               Bronchus  M    74   Yes
Bronchus  M    74   Yes              Bronchus  F    50   Yes
Bronchus  M    66   No               Bronchus  F    66   No
Bronchus  M    52   No               Bronchus  M    50   Yes
Bronchus  F    48   No               Bronchus  M    87   No
Colon     F    76   Yes              Colon     F    35   Yes
Colon     F    58   Yes              Colon     M    50   No
Colon     M    49   Yes              Colon     M    89   No
Colon     M    69   Yes              Colon     F    67   Yes
Colon     F    70   No               Colon     F    55   No
Rectum    F    56   No               Rectum    M    82   No
Rectum    F    75   Yes              Rectum    M    51   Yes
Rectum    F    57   Yes              Rectum    F    73   No
Rectum    M    56   Yes              Rectum    M    85   No
Rectum    M    68   No               Rectum    F    64   Yes

Carry out the appropriate statistical test to assess whether, overall, the data suggest that supplemental treatment with vitamin C is effective with respect to the outcome of remission.
Hint – This exercise also asks you to use the information provided to construct the 2x2 table that you then analyze.

5. (20 points)
Dear class – The data for this question are fictitious.
Consider the following case-control study investigation of the relationship of asbestos and lung cancer. An important covariate is smoking. You are given the 2x2 table distribution of asbestos exposure (yes/no) and lung cancer (yes/no), overall and separately for strata defined by smoking (smokers and non-smokers).

Overall
                       Asbestos Exposure
                       Yes      No
Lung Cancer   Yes       80      15
              No        38     152

Stratum = 1 (Smokers)
                       Asbestos Exposure
                       Yes      No
Lung Cancer   Yes       75       5
              No        20      80

Stratum = 2 (Non-Smokers)
                       Asbestos Exposure
                       Yes      No
Lung Cancer   Yes        5      10
              No        18      72

5a. (4 points, subtotal) What are the values of
(i) (1 point) the “overall” odds ratio?
(ii) (1 point) the Mantel-Haenszel estimate of the “overall” odds ratio?
(iii) (1 point) the stratum-specific odds ratio for stratum=1 (Smokers)?
(iv) (1 point) the stratum-specific odds ratio for stratum=2 (Non-smokers)?

5b. (4 points) Perform the appropriate statistical test of the null hypothesis of homogeneity of association.

5c. (4 points) Using your answer to question #5b, in your opinion, is there statistically significant evidence that the relationship between asbestos exposure and lung cancer differs (is modified) by smoking status?

5d. (4 points) Perform the Mantel-Haenszel test of the null hypothesis of no association.

5e. (4 points) In 2-3 sentences at most, what do you conclude?

6. (10 points)
Suppose we want to learn about the relationship between education and prevalence of smoking in a particular community. Consider a study (it’s hypothetical) of a simple random sample of 585 adults, all of whom have completed at least a high school education and all of whom are of the same socio-economic status. The explanatory variable is education with 5 levels. The outcome variable is current smoking with 2 levels.

                                        Current Smoker
Education completed                     Yes      No     Total
High School                              12      38        50
Associate Degree                         18      67        85
More than Associate, Some College        27      95       122
Undergraduate Degree                     32     239       271
More than Undergraduate                   5      52        57
Total                                    94     491       585

Is there any statistically significant evidence of a downward trend in smoking prevalence associated with higher level of education completed? Carry out the appropriate statistical test to address this question. In reporting your answer, please state

6a. (2 points) The null and alternative hypotheses. (Be sure to define your terms).
6b. (2 points) The test statistic and its calculated value.
6c. (3 points) The achieved level of significance.
6d. (3 points) An interpretation of your findings in terms that a layperson can understand.

7. (10 points total)
In a logistic regression analysis of the likelihood (π) of mortality that considered several variables, a one predictor model was fit to malnutrition (MALNUT) coded 1=malnutrition, 0=NO malnutrition. The following was obtained:

logit[π̂] = -1.8563 + 1.210[malnut]

The 2x2 table associated with these data is the following:

                              Mortality
MALNUT                   1=Dead    0=Alive    Total
1=Malnourished             11        21         32
0=NOT malnourished         10        64         74
Total                      21        85        106

7a. (4 points) Verify that the regression coefficient (beta) for MALNUT in the logistic regression model is the natural logarithm of the odds ratio for MALNUT in the 2x2 table. Show all work.

7b. (3 points) Using the logistic regression model, what is the formula for the predicted probability of death for a person who is malnourished? What is its calculated numeric value?

7c. (3 points) Using the 2x2 table, what is the formula for the empirical estimate of the probability of death for a person who is malnourished? Hint – the empirical estimate is simply the observed proportion. What is its calculated numeric value?

8. (10 points total)
A logistic regression model analysis was performed to investigate the relationship of sex, age, and income with event of clinical depression (1=yes). The following results were obtained.
                          β̂         SE(β̂)     p-value
Sex (1=Female)            0.925     0.393     0.02
Age (per year)           -0.024     0.009     0.01
Income (per $1,000)      -0.040     0.014     0.01
Constant (intercept)     -0.477     0.867     0.19

Using this model, what is the estimated relative odds (odds ratio, OR) of clinical depression for a female aged 60 with income $50,000 compared to a reference person who is male aged 45 with income $75,000?

9. (10 points total)
A survey of senior high-school students queried the use of each of cigarettes, alcohol, and marijuana. In one analysis it was of interest to explore the association of cigarette use and/or alcohol use as predictors of marijuana use. Thus, in this analysis, marijuana use (yes or no) was treated as the response variable Y. Y was coded 1=yes and 0=no. The two other variables (cigarette use and alcohol use) were treated as predictor variables. Each of these was also coded as 1=yes and 0=no. The following table shows the output for a logistic regression model containing the two predictors ALCOHOL and CIGARETTES.

Predictor      Coefficient, β̂     SE Coeff, se(β̂)     Wald Z     p-value
Intercept      -5.30904           0.475190            -11.17     < .0001
ALCOHOL         2.98601           0.464671              6.43     < .0001
CIGARETTES      2.84789           0.163839             17.38     < .0001

9a. (2 points) For the model summarized in the table, state the prediction equation for the estimated probability (π̂) of marijuana use.

9b. (2 points) What is the estimated probability of marijuana use (π̂ alcohol=0, cigarettes=0) for a senior high-school student who does not drink and who does not smoke cigarettes?

9c. (2 points) What is the estimated probability of marijuana use (π̂ alcohol=1, cigarettes=1) for a senior high-school student who drinks and who also smokes cigarettes?

9d. (2 points) Using the model fit summarized in the table below, complete the following table.
Estimated Probability of Marijuana Use (π̂), by Alcohol Use and Cigarette Use, Based on Model

                      Cigarette Use
Alcohol Use           Yes         No
Yes                   ______      ______
No                    ______      ______

9e. (2 points) In 1-3 sentences, what conclusions do you draw from these analyses?
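
Note on the calculation in question 9: the sketch below shows one way the fitted log-odds from a two-predictor logistic model can be converted into an estimated probability with the inverse-logit transformation. It is a minimal illustration and not part of the original exam. The coefficient values are copied from the question 9 output table as transcribed; the names inverse_logit and predicted_probability are hypothetical helpers chosen for this example.

```python
import math

# Coefficients as transcribed from the question 9 output table.
INTERCEPT = -5.30904
B_ALCOHOL = 2.98601
B_CIGARETTES = 2.84789


def inverse_logit(eta: float) -> float:
    """Map a log-odds value eta to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))


def predicted_probability(alcohol: int, cigarettes: int) -> float:
    """Estimated probability of marijuana use for 0/1 coded predictors."""
    # Linear predictor (log-odds) for the given predictor values.
    eta = INTERCEPT + B_ALCOHOL * alcohol + B_CIGARETTES * cigarettes
    return inverse_logit(eta)


if __name__ == "__main__":
    # Print one estimate per cell of the alcohol-by-cigarette table.
    for alcohol in (1, 0):
        for cigarettes in (1, 0):
            p = predicted_probability(alcohol, cigarettes)
            print(f"alcohol={alcohol} cigarettes={cigarettes} -> {p:.4f}")
```

The same two functions would apply to any model of this form by changing the three constants; only the Python standard library is assumed.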