ECON4950 Problem Set 1 Georgia State University
Questions on Background Material

1. A random sample of 22 business economists were asked to predict the percentage growth in the consumer price index over the next year. The forecasts were:

   3.6, 3.1, 3.9, 3.7, 3.5, 3.7, 3.4, 3.0, 3.6, 3.4, 3.1,
   2.9, 3.0, 4.0, 2.8, 3.8, 4.2, 2.5, 3.1, 3.9, 2.9, 2.6

   (a) What are the sample mean, minimum, and maximum?
   (b) What are the sample variance and standard deviation?

The following table displays data on annual solid waste collection for nine cities in the U.S. The data include the number of households in the city, the total tons of solid waste (per year), and the revenue generated by the solid waste haulers (per year).

   City   Number of Households   Total Tons   Revenue
   A      2200                   3080         118800
   B      2500                   3500         200000
   C      2700                   3780         250560
   D      4000                   5600         201600
   E      4000                   5600         308800
   F      4000                   5600         268800
   G      5500                   7700         452100
   H      6000                   8400         277200
   I      9000                   12600        358200

   (a) Calculate the covariance between revenue and number of households.
   (b) Calculate the covariance between revenue and total tons collected.

2. A large consumer goods company has been studying the effect of advertising on total profits. As part of this study, data on advertising expenditures and total sales were collected for a six-month period and are as follows: (10, 100), (15, 200), (7, 80), (12, 120), (14, 150).

   (a) Plot the data and compute the correlation coefficient.
   (b) Do these results provide conclusive evidence that advertising has a positive effect on sales? Explain your reasoning.

Questions on Chapter 2 (Simple Regression) from Wooldridge

Answer the following questions from the end of the chapter of the textbook. Please show your work, and attach your log file for the computer problems. All data can be found at http://gsu-econ4950.s3.amazonaws.com.

Problem 2.3

The following table contains the ACT score and the GPA (grade point average) for eight college students.
Grade point average is based on a four-point scale and has been rounded to one digit after the decimal.

   Student   GPA   ACT
   1         2.8   21
   2         3.4   24
   3         3.0   26
   4         3.5   27
   5         3.6   29
   6         3.0   25
   7         2.7   25
   8         3.7   30

1. Estimate the relationship between GPA and ACT using OLS; that is, obtain the intercept and slope estimates in the equation

   $\widehat{GPA} = \hat{\beta}_0 + \hat{\beta}_1 ACT$    (1)

   Comment on the direction of the relationship. Does the intercept have a useful interpretation here? Explain. How much higher is the GPA predicted to be if the ACT score is increased by five points?
2. Compute the fitted values and residuals for each observation, and verify that the residuals (approximately) sum to zero.
3. What is the predicted value of GPA when ACT = 20?
4. How much of the variation in GPA for these eight students is explained by ACT? Explain.

Problem 2.4

The data set bwght.csv contains data on births to women in the United States. Two variables of interest are the dependent variable, infant birth weight in ounces (bwght), and an explanatory variable, average number of cigarettes the mother smoked per day during pregnancy (cigs). The following simple regression was estimated using data on n = 1,388 births.

   $\widehat{bwght} = 119.77 - 0.514\, cigs$    (2)

1. What is the predicted birth weight when cigs = 0? What about when cigs = 20 (one pack per day)? Comment on the difference.
2. Does this simple regression necessarily capture a causal relationship between the child's birth weight and the mother's smoking habits? Explain.
3. To predict a birth weight of 125 ounces, what would cigs have to be? Comment.
4. The proportion of women in the sample who do not smoke while pregnant is about .85. Does this help reconcile your finding from part 3?
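The estimation steps that Problem 2.3 asks for can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the expected form of a submission: the function name `ols_simple` and the variable names are made up here, and the formulas are the textbook ones (slope = sample covariance of x and y divided by sample variance of x, intercept = $\bar{y} - \hat{\beta}_1 \bar{x}$). The data are the eight (ACT, GPA) pairs from the table.

```python
# Sketch: simple OLS by hand on the Problem 2.3 table (GPA regressed on ACT).
# ols_simple and all variable names are illustrative, not from the problem set.
act = [21, 24, 26, 27, 29, 25, 25, 30]
gpa = [2.8, 3.4, 3.0, 3.5, 3.6, 3.0, 2.7, 3.7]


def ols_simple(x, y):
    """Return (intercept, slope) from regressing y on x with an intercept."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of cross-products and squares in deviations from the means.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


b0, b1 = ols_simple(act, gpa)

# Fitted values and residuals for each student.
fitted = [b0 + b1 * a for a in act]
residuals = [g - f for g, f in zip(gpa, fitted)]

# R-squared = 1 - SSR/SST.
gpa_bar = sum(gpa) / len(gpa)
sst = sum((g - gpa_bar) ** 2 for g in gpa)
ssr = sum(e ** 2 for e in residuals)
r_squared = 1 - ssr / sst

print(f"intercept = {b0:.4f}, slope = {b1:.4f}")
print(f"sum of residuals = {sum(residuals):.2e}")
print(f"R-squared = {r_squared:.3f}")
```

The residual sum printed by this sketch should be zero up to floating-point rounding, which is the check part 2 of the problem asks for. The same function could be reused for any of the simple regressions in this problem set once the data are loaded into two lists.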
Problem 2.6

Using data from 1988 for houses sold in Andover, Massachusetts, from Kiel and McClain (1995), the following equation relates housing price (price) to the distance from a recently built garbage incinerator (dist):

   $\widehat{\log(price)} = 9.40 + 0.312 \log(dist)$
   $n = 135,\ R^2 = 0.162$

1. Interpret the coefficient on log(dist). Is the sign of this estimate what you expect it to be?
2. Do you think simple regression provides an unbiased estimator of the ceteris paribus elasticity of price with respect to dist? (Think about the city's decision on where to put the incinerator.)
3. What other factors about a house affect its price? Might these be correlated with distance from the incinerator?

Problem 2.9, First Part

1. Let $\hat{\beta}_0$ and $\hat{\beta}_1$ be the intercept and slope from the regression of $y_i$ on $x_i$, using n observations. Let $c_1$ and $c_2$, with $c_2 \neq 0$, be constants. Let $\tilde{\beta}_0$ and $\tilde{\beta}_1$ be the intercept and slope from the regression of $c_1 y_i$ on $c_2 x_i$. Show that $\tilde{\beta}_1 = (c_1/c_2)\hat{\beta}_1$ and $\tilde{\beta}_0 = c_1 \hat{\beta}_0$. [Hint: To obtain $\tilde{\beta}_1$, plug the scaled versions of x and y into the definition of $\hat{\beta}_1$, and then use $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$ for $\tilde{\beta}_0$.]

Computer Problem 2.1

The data in 401k.csv are a subset of data analyzed by Papke (1995) to study the relationship between participation in a 401(k) pension plan and the generosity of the plan. The variable prate is the percentage of eligible workers with an active account; this is the variable we would like to explain. The measure of generosity is the plan match rate, mrate. This variable gives the average amount the firm contributes to each worker's plan for each $1 contribution by the worker. For example, if mrate = 0.50, then a $1 contribution by the worker is matched by a 50 cent contribution by the firm.

1. Find the average participation rate and the average match rate in the sample of plans.
2. Now, estimate the simple regression equation

   $\widehat{prate} = \hat{\beta}_0 + \hat{\beta}_1 mrate$    (3)

   and report the results along with the sample size and R-squared.
3. Interpret the intercept in your equation. Interpret the coefficient on mrate.
4. Find the predicted prate when mrate = 3.5. Is this a reasonable prediction? Explain what is happening here.
5. How much of the variation in prate is explained by mrate? Is this a lot in your opinion?

Computer Problem 2.2

The data set in ceosal2.csv contains information on chief executive officers for U.S. corporations. The variable salary is annual compensation, in thousands of dollars, and ceoten is prior number of years as company CEO.

1. Find the average salary and the average tenure in the sample.
2. How many CEOs are in their first year as CEO (that is, ceoten = 0)? What is the longest tenure as a CEO?
3. Estimate the simple regression model

   $\log(salary) = \beta_0 + \beta_1 ceoten + u$    (4)

   and report your results in the usual form. What is the (approximate) predicted percentage increase in salary given one more year as a CEO?

Computer Problem 2.6

Use the data in meap93.csv to explore the relationship between the math pass rate among tenth graders at a high school (math10) and spending per student (expend).

1. Do you think each additional dollar spent has the same effect on the pass rate, or does a diminishing effect seem more appropriate? Explain.
2. In the population model

   $math10 = \beta_0 + \beta_1 \log(expend) + u$    (5)

   argue that $\beta_1/10$ is the percentage point change in math10 given a 10% increase in expend.
3. Use the data in meap93.csv to estimate the model from part 2. Report the estimated equation in the usual way, including the sample size and R-squared.
4. How big is the estimated spending effect? Namely, if spending increases by 10%, what is the estimated percentage point increase in math10?
5. One might worry that regression analysis can produce fitted values for math10 that are greater than 100. Why is this not much of a worry in this data set?
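The approximation behind part 2 of Computer Problem 2.6 can be checked numerically. The sketch below is only an illustration under stated assumptions: `beta1 = 10.0` is a hypothetical slope chosen for round numbers, not an estimate from meap93.csv, and the function name `change_in_math10` is made up here. It compares the exact change implied by the level-log model, $\beta_1 \log(1.10)$, with the rule-of-thumb value $\beta_1/10$.

```python
import math

# Hypothetical slope for illustration only; NOT an estimate from meap93.csv.
beta1 = 10.0


def change_in_math10(slope, pct_increase):
    """Exact change in math10 implied by math10 = b0 + b1*log(expend) + u
    when expend rises by pct_increase percent, holding u fixed."""
    return slope * math.log(1 + pct_increase / 100)


exact = change_in_math10(beta1, 10)   # beta1 * log(1.10)
approx = beta1 / 10                   # rule of thumb from part 2

print(f"exact change:  {exact:.4f} percentage points")
print(f"approximation: {approx:.4f} percentage points")
```

The two numbers are close but not identical because log(1 + x) is only approximately equal to x for small x; the gap widens for larger percentage increases in spending.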