1) An estimator is a. an estimate. b. a formula that gives an efficient
Transcription
1) An estimator is a. an estimate. b. a formula that gives an efficient
1 Prof. Òscar Jordà Due Date: Thursday, April 24 Problem Set 2 ECONOMETRICS STUDENT’S NAME: Multiple Choice Questions [20 pts] Please provide your answers to this section below: 6. 7. 8. 9. 10. 1. 2. 3. 4. 5. 1) An estimator is a. b. c. d. an estimate. a formula that gives an efficient guess of the true population value. a random variable. a nonrandom number. Answer: c 2) An estimate is a. b. c. d. efficient if it has the smallest variance possible. a nonrandom number. unbiased if its expected value equals the population value. another word for estimator. Answer: b 3) An estimator µˆ Y of the population value µY is more efficient when compared to another estimator µ% Y , if a. b. c. d. E( µˆ Y ) > E( µ% Y ). it has a smaller variance. its c.d.f. is flatter than that of the other estimator. both estimators are unbiased, and var( µˆ Y ) < var( µ% Y ). Answer: d 4) Among all unbiased estimators that are weighted averages of Y1 ,..., Yn , Y is a. b. c. d. the only consistent estimator of µY . the most efficient estimator of µY . a number which, by definition, cannot have a variance. the most unbiased estimator of µY . 2 Answer: b 5) The p-value is defined as follows: a. p = 0.05. b. PrH 0 [| Y − µY ,0 |>| Y act − µY ,0 |]. c. Pr( z > 1.96) . d. PrH 0 [| Y − µY ,0 |<| Y act − µY ,0 |]. . Answer: b 6) A large p-value implies a. b. c. d. rejection of the null hypothesis. a large t-statistic. a large Y act . that the observed value Y act is consistent with the null hypothesis. Answer: d 7) The power of the test a. is the probability that the test actually incorrectly rejects the null hypothesis when the null is true. b. depends on whether you use Y or Y 2 for the t-statistic. c. is one minus the size of the test. d. is the probability that the test correctly rejects the null when the alternative is true. Answer: d 8) The sample covariance can be calculated in any of the following ways, with the exception of: 1 n ∑ ( X i − X )(Yi − Y ) . n − 1 i =1 1 n n b. X iYi − XY . ∑ n − 1 i =1 n −1 1 n c. ∑ ( X i − µ X )(Yi − µY ) . n i =1 d. rXY sY sY , where rXY is the correlation coefficient. a. Answer: c 9) When the sample size n is large, the 90% confidence interval for µY is 3 a. b. c. d. Y Y Y Y ± 1.96SE (Y ) . ± 1.64SE (Y ) . ± 1.64σ Y . ± 1.96 . Answer: b 10) The following statement about the sample correlation coefficient is true. a. –1 ≤ rXY ≤ 1. p 2 b. rXY → corr ( X i , Yi ) . c. | rXY |< 1 . d. rXY = 2 s XY . s X2 sY2 Answer: a Problems [40 pts] Instructions: The goal of the problem set is to understand what you are doing rather than just getting the correct result. Please show your work clearly and neatly. Please write your answers in the space provided. 1) Adult males are taller, on average, than adult females. Visiting two recent American Youth Soccer Organization (AYSO) under-12-year-old (U12) soccer matches on a Saturday, you do not observe an obvious difference in the height of boys and girls of that age. You suggest to your little sister that she collect data on height and gender of children in 4th to 6th grade as part of her science project. The accompanying table shows her findings. Height of Young Boys and Girls, Grades 4-6, in inches Boys (a) YBoys sBoys nBoys YGirls sGirls nGirls 57.8 3.9 55 58.4 4.2 57 Let your null hypothesis be that there is no difference in the height of females and males at this age level. Specify the alternative hypothesis. Answer: (b) Girls H 0 : µ Boys − µGirls = 0 vs. H1 : µ Boys − µGirls ≠ 0 Find the difference in height and the standard error of the difference. 4 Answer: YBoys − YGirls = -0.6, SE( YBoys − YGirls ) = (c) 3.92 4.22 = 0.77. + 55 57 Generate a 95% confidence interval for the difference in height. Answer: -0.6 ± 1.96 × 0.77 = (-2.11, 0.91). (d) Calculate the t-statistic for comparing the two means. Is the difference statistically significant at the 1% level? Which critical value did you use? Why would this number be smaller if you had assumed a one-sided alternative hypothesis? What is the intuition behind this? Answer: t = -0.78, so | t | < 2.58, which is the critical value at the 1% level. Hence you cannot reject the null hypothesis. The critical value for the one-sided hypothesis would have been 2.33. Assuming a one-sided hypothesis implies that you have some information about the problem at hand, and, as a result, can be more easily convinced than if you had no prior expectation. 2) The development office and the registrar have provided you with anonymous matches of starting salaries and GPAs for 108 graduating economics majors. Your sample contains a variety of jobs, from church pastor to stockbroker. (a) The average starting salary for the 108 students was $38,644.86 with a standard deviation of $7,541.40. Construct a 95% confidence interval for the starting salary of all economics majors at your university/college. Answer: 38,644.86 ± 1.96 × 7,541.40 = 38,644.86 ± 1,422.32 = (37,222.54, 40,067.18). 108 (b) A similar sample for psychology majors indicates a significantly lower starting salary. Given that these students had the same number of years of education, does this indicate discrimination in the job market against psychology majors? Answer: It suggests that the market values certain qualifications more highly than others. Comparing means and identifying that one is significantly lower than others does not indicate discrimination. (c) You wonder if it pays (no pun intended) to get good grades by calculating the average salary for economics majors who graduated with a cumulative GPA of B+ or better, and those who had a B or worse. The data is as shown in the accompanying table: Cumulative GPA Average Earnings Y B+ or better B or worse $39,915.25 $37,083.33 Standard Deviation sY $8,330.21 $6,174.86 n 59 49 Conduct a t-test for the hypothesis that the two starting salaries are the same in the population. Given that this data was collected in 1999, do you think that your results will hold for other years, such as 2002? 5 Answer: Assuming unequal population variances, t = (39,915.25 − 37, 083.33) 8,330.212 6,174.862 + 59 49 = 2.03. The critical value for a one-sided test is 1.64, for a two-sided test 1.96, both at the 5% level. Hence you can reject the null hypothesis that the two starting salaries are equal. Presumably you would have chosen as an alternative that better students receive better starting salaries, so that this becomes your new working hypothesis. 1999 was a boom year. If better students receive better starting offers during a boom year, when the labor market for graduates is tight, then it is very likely that they receive a better offer during a recession year, assuming that they receive an offer at all. 3) Assume that under the null hypothesis, Y has an expected value of 500 and a standard deviation of 20. Under the alternative hypothesis, the expected value is 550. Sketch the probability density function for the null and the alternative hypothesis in the same figure. Pick a critical value such that the p-value is approximately 5%. Mark the areas, which show the size and the power of the test. What happens to the power of the test if the alternative hypothesis moves closer to the null hypothesis, i.e., µY = 540, 530, 520, etc.? Answer: For a given size of the test, the power of the test is lower. 4) Consider the following alternative estimator for the population mean: . 1 1 7 1 7 1 7 Y = ( Y1 + Y2 + Y3 + Y4 + ... + Yn −1 + Yn ) n 4 4 4 4 4 4 Prove that Y is unbiased and consistent, but not efficient when compared to Y . 1 1 7 1 7 1 7 ( E (Y1 ) + E (Y2 ) + E (Y3 ) + E (Y4 ) + ... + E (Yn −1 ) + E (Yn )) n 4 4 4 4 4 4 1 1 7 n is unbiased. = µY (2 + 2 + ... + + ) = µY = µY . Hence Y n 4 4 n ) = Answer: E (Y 6 var(Y ) = E (Y − µY ) 2 = 1 1 7 1 7 1 7 E[ ( Y1 + Y2 + Y3 + Y4 + ... + Yn −1 + Yn ) − µY ]2 n 4 4 4 4 4 4 1 1 7 1 7 2 = 2 E[ (Y1 − µY ) + (Y2 − µY ) + ... + (Yn −1 − µY ) + (Yn − µY )] n 4 4 4 4 = 1 1 49 1 49 [ E (Y1 − µY ) 2 + E (Y2 − µY ) 2 + ... + E (Yn −1 − µY ) 2 + E (Yn − µY ) 2 ] 2 n 16 16 16 16 1 1 2 49 2 1 2 49 2 σ Y2 n 1 49 σ Y2 . [ σ Y + σ Y + ... + σ Y + σ Y ] = 2 [ ( + )] = 1.5625 n 2 16 n 16 16 16 n 2 16 16 ) → 0 as n → ∞, Y is consistent. Y has a larger variance than Y and Since var(Y = is therefore not as efficient. EViews Exercise [40 pts] The data for this exercise is contained in the excel file “salary.XLS.” It contains 93 observations on employees for a Chicago Bank during the period 1969-1971. In particular, the variable “salary” refers to the starting salary, “education” refers to years of education, “experience” refers to the number of months of previous work experience, “seniority” refers to the number of months the person has been working at the current job with the bank, and “gender” takes the value of one for males and 0 for females. You will need to import the data into EViews. This can be easily accomplished by opening a new workfile. Select the option corresponding to undated or irregular observations and a range that goes from 1 to 93. Then from the “procs” button, select the import option, and the “excel” option thereafter. The rest is pretty straightforward. Answer the following questions: 1. Give summary statistics for all variables in one table. Then, subsample according to gender and report the summary statistics with an additional two tables. Next, select all the variables (make sure to undo the subsampling condition) and compute their correlations and include this table with the previous tables in the same page. Comment on all of these results, making particular emphasis on whether you see any evidence of discrimination (nothing formal yet, just hunches). Beware of the experience and seniority levels of each group. 2. Do a formal test of the null hypothesis that females are discriminated against at a 5% confidence level. Do your results hold at a 1% confidence level? Consider now a test of the null hypothesis that males earn $250 more than females on average at a 5% confidence level. Make sure to report these tests in one page and in one paragraph report on the economic significance of your tests (Is there gender bias in pay scales?). Hints: make sure you consider whether a onetailed test or a two-tailed test is appropriate. 3. Load data on the exchange rate between the U.S. dollar and the Japanese yen into EViews. Select monthly frequency and select a data source that begins at least in 1980. You will need to plot these data and compute summary statistics. Regarding the plot, make sure that you label important economic dates (such as oil crises, or other political events) that may help explain some 7 of the long swings in these data. Make sure to label the graph appropriately, including title, units of Answers to Problem Set #2 (Empirical) 1. Basic Statistics SALARY Mean Median Maximum Minimum Std. Dev. Skewness Kurtosis Observations EDUCATION EXPERIENCE SENIORITY 5420.323 5400.000 8100.000 3900.000 709.5872 0.589159 4.172759 12.50538 12.00000 16.00000 8.000000 2.282369 -0.486804 2.684228 100.9269 70.00000 381.0000 0.000000 90.94755 1.146060 3.628779 16.72043 15.00000 34.00000 1.000000 10.25476 0.191436 1.881037 93 93 93 93 IF GENDER=0 (Females) SALARY EDUCATION EXPERIENCE SENIORITY Mean Median Maximum Minimum Std. Dev. Skewness Kurtosis 4979.1180 4950.0000 6300.0000 3900.0000 592.06040 0.2731300 2.4446870 11.617650 12.000000 16.000000 8.0000000 2.0451670 -0.3120040 3.2059770 82.491180 52.000000 244.00000 0.0000000 79.337440 0.7313630 2.1835760 15.529410 15.000000 33.000000 1.0000000 9.7615050 0.2388490 1.9262310 Observations 34.000000 34.000000 34.000000 34.000000 IF GENDER=1 (Males) SALARY EDUCATION EXPERIENCE SENIORITY Mean Median Maximum Minimum Std. Dev. Skewness Kurtosis 5674.5760 5580.0000 8100.0000 4620.0000 647.58260 1.0326770 4.8896460 13.016950 12.000000 16.000000 8.0000000 2.2704360 -0.7699380 2.9289280 111.55080 78.500000 381.00000 6.0000000 96.046220 1.2088630 3.5638260 17.406780 16.000000 34.000000 1.0000000 10.548930 0.1429750 1.8353130 Observations 59.000000 59.000000 59.000000 59.000000 Correlations SALARY EDUCATION EXPERIENCE SENIORITY SALARY 1.0000000 0.3837540 -0.1007780 0.1667530 EDUCATION EXPERIENCE SENIORITY 0.3837540 -0.1007780 0.1667530 1.0000000 -0.2981960 -0.1507460 -0.2981960 1.0000000 0.0001920 -0.1507460 0.0001920 1.0000000 There seems to be some evidence of discrimination. First, females earn lower salaries on average. However, females also tend to have less education, seniority, and experience, compared with men in the sample. This could explain why they receive lower salaries. We can see from the correlations, that education and seniority are positively correlated with salary, so this could explain the big differences between salaries for males versus females. 2. One test of whether females are discriminated against would be to ask whether their salaries differ significantly from those of males. Therefore, the hypothesis test would be: HO: µF ≥ µM (that the average salary for a female is greater than or equal to that of a male) HA: µF < µM (that the average salary for a female is less than that of a male) Hypothesis Testing for SALARY Sample(adjusted): 1 68 IF GENDER=0 Included observations: 34 after adjusting endpoints Test of Hypothesis: Mean = 5674.576 Sample Mean = 4979.118 Sample Std. Dev. = 592.0604 Method t-statistic Value -6.849274 Probability 0.0000 Note that Eviews reports the p-value for a two-tailed test. For the one-tailed test, the critical t-statistic approximately -1.667, so we reject the null hypothesis. This test reveals that females earn a lower salary than males. You should conduct a one-tailed test to see if women receive a lower salary compared to men. Alternatively, you could set up a two-tailed test to test whether the salaries differ. The t-statistic is the same as the one reported above, and the p-value reveals that we reject the null hypothesis. For the test of whether males make more than $250 above the average salary for females: HO: µM ≤ µF + 250 (that the average salary of a male is less than $250 greater than that of a female) HA: µM >µF + 250 (that the average salary of a male is more than $250 greater than that of a female) Hypothesis Testing for SALARY Sample(adjusted): 13 93 IF GENDER=1 Included observations: 59 after adjusting endpoints Test of Hypothesis: Mean = 5229.118 Sample Mean = 5674.576 Sample Std. Dev. = 647.5826 Method t-statistic Value 5.283697 Therefore, males earn an average salary that is as least $250 greater than that of females. 3. A few of the relevant events are labeled on the graph below. Notice, that an increase in the exchange rate implies an appreciation of the U.S. dollar and depreciation of the Japanese yen. Probability 0.0000 Exchange Rate: Japan (Yen/$US) 280 240 Gulf War 200 Oil prices fall 62% (1985-1986) 160 1990-91 U.S. Recession Asian Financial Crisis Current U.S. Recession 1980-83 U.S. 120 Recessions (2) 80 1980 1985 1990 1995 2000