Transcription
Karl Schmedders
MANAGERIAL ECONOMICS & DECISION SCIENCES
Visiting Professor of Managerial Economics & Decision Sciences
PhD, 1996, Operations Research, Stanford University
MS, 1992, Operations Research, Stanford University
Vordiplom, 1990, Business Engineering, Universität Karlsruhe, Highest Honors, Ranked first in a class of 350
EMAIL: [email protected]
OFFICE: Jacobs Center Room 528

Karl Schmedders is Associate Professor in the Department of Managerial Economics and Decision Sciences. He holds a PhD in Operations Research from Stanford University. Professor Schmedders' research interests include computational economics, general equilibrium theory, asset pricing, and portfolio selection. His work has been published in Econometrica, The Review of Economic Studies, The Journal of Finance, and many other academic journals. He teaches courses in decision science in both the MBA and the EMBA programs at Kellogg. Professor Schmedders has been named to the Faculty Honor Roll in every quarter he has taught at Kellogg. He has received numerous teaching awards, including the 2002 Lawrence G. Lavengood Outstanding Professor of the Year. Professor Schmedders is the only Kellogg faculty member to receive the 'Ehrenmedaille' (Honorary Medal) of Kellogg's partner school WHU.

Research Interests
Mathematical economics, in particular general equilibrium models involving time and uncertainty
Asset pricing
Mathematical programming

KH19, Course Description
Managerial Statistics

Course Description
In this course we will cover the following topics: Confidence Intervals, Hypothesis Tests, and Regression Analysis. Our objective is to cover the first two topics quickly. While they are important by themselves, many people describe them as rather "dry" course material. However, they will be of great help to us when we cover the main subject of the course, regression analysis. Regressions are extremely useful and can deliver eye-opening insights in many managerial situations. You will solve some entertaining case studies that show the power of regression analysis.

We will cover the material in this case packet as well as the following chapters of the textbook: Sections 13.1 and 13.2 of Chapter 13; Section 14.1 of Chapter 14; Chapter 15; Chapter 16; Chapter 19; Chapter 21; Chapter 23. Time permitting, we will also cover parts of Chapter 25. There will be several team assignments. After the conclusion of the course, there will be an in-class final exam on the first day of the following module, that is, on April 1, 2016.

The final grades in this course will be determined as follows:
Team assignments: 40%
Class participation: 10%
Final Exam: 50%

In case you would like to prepare for our course, you should start reading the relevant sections of Chapters 13 and 14 in our textbook. Before you do that, please also consider the following suggestions.
1) Review the material on the normal distribution from your probability course. In particular, you should review the use of the functions NORMDIST, NORMSDIST, NORMINV and NORMSINV in Excel (a short Python sketch of equivalent computations follows below).
2) We will use the software KStat that was developed at Kellogg. Ideally you should install KStat on your laptop before our first class.
I realize that all of you are very busy and may not have the time to prepare at length for our course. Please note, however, that the better you prepare, the faster we can cover the early parts of the course material and the more time we will have for the fun part, the coverage of regression analysis. Of course, I am happy to help you with your preparation. Please do not hesitate to contact me with any questions or concerns. My email address is [email protected].
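For those who prefer to practice outside Excel, the following sketch shows equivalent normal-distribution computations in Python (assuming the scipy package is installed; the example arguments are illustrative, not course data):

    # Python counterparts of the Excel normal-distribution functions named above.
    from scipy.stats import norm

    print(norm.cdf(1.96))                 # NORMSDIST(1.96): standard normal CDF, ~0.975
    print(norm.ppf(0.975))                # NORMSINV(0.975): inverse standard normal, ~1.96
    print(norm.cdf(60, loc=50, scale=10)) # NORMDIST(60,50,10,TRUE): CDF of N(50,10)
    print(norm.ppf(0.95, loc=50, scale=10)) # NORMINV(0.95,50,10): inverse CDF of N(50,10)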
When Scientific Predictions Are So Good They're Bad
The New York Times, September 29, 1998
By WILLIAM K. STEVENS

NOAH had it easy. He got his prediction straight from the horse's mouth and was left in no doubt about what to do. But when the Red River of the North was rising to record levels in the spring of 1997, the citizens and officials of Grand Forks, N.D., were not so privileged. They had to rely on scientists' predictions about how high the water would rise. And in this case, Federal experts say, the flood forecast may have been issued and used in a way that made things worse.

The problem, the experts said, was that more precision was assigned to the forecast than was warranted. Officials and citizens tended to take as gospel an oft-repeated National Weather Service prediction that the river would crest at a record 49 feet. Actually, there was a wider range of probabilities; the river ultimately crested at 54 feet, forcing 50,000 people to abandon their homes fast. The 49-foot forecast had lulled the town into a false sense of security, said Dr. Roger A. Pielke Jr. of the National Center for Atmospheric Research in Boulder, Colo., a consultant on a subsequent inquiry by the weather service.

In fixating on the single number of 49 feet, the people involved in the Grand Forks disaster made a common error in the use of predictions and forecasts, experts who have studied the case say. It was, they say, a case of what Alfred North Whitehead, the mathematician and philosopher, once termed "misplaced concreteness." And whether the problem is climate change, earthquakes, droughts or floods, they say the tendency to overlook uncertainties, margins of error and ranges of probability can lead to damaging misjudgments.

The problem was the topic of a workshop this month at Estes Park, Colo. In part, participants said, the problem arises because decision makers sometimes want to avoid making hard choices in uncertain situations. They would rather place responsibility on the predictors. Scientifically based predictions, typically using computerized mathematical models, have become pervasive in modern society. But only recently has much attention been paid to the proper use -- and misuse -- of predictions. The Estes Park workshop, of which Dr. Pielke was an organizer, was an attempt to come to grips with the question. The workshop was sponsored by the Geological Society of America and the National Center for Atmospheric Research.

People have predicted and prophesied for millenniums, of course, through means ranging from the visions of shamans and the warnings of biblical prophets to the examination of animal entrails. With the arrival of modern science, people teased out fundamental laws of physical and chemical behavior and used them to make better and better predictions. But once science moves beyond the relatively deterministic processes of physics and chemistry, prediction gets more complicated and chancier. The earth's atmosphere, for instance, often frustrates efforts to predict the weather and long-term climatic changes because scientists have not nailed down all of its physical workings and because a substantial measure of chaotic unpredictability is inherent in the climate system.
The result is a considerable range of uncertainty, much more so than is popularly associated with science. So while computer modeling has often made reasonable predictions possible, they are always uncertain; results are by definition a model of reality, not reality itself.

The accuracy of predictions varies widely. Some, like earthquake forecasts, have proved so disappointing that experts have turned instead to forecasting longer-term earthquake potential in a general sense and issuing last-second warnings to distant communities once a quake has begun. In some cases, the success of a prediction is nearly impossible to judge. For instance, it will take thousands of years to know whether the environmental effects of buried radioactive waste will be as predicted. On the other hand, daily weather forecasts are checked almost instantly and are used to improve the next day's forecast. But weather forecasting is also a success, the assembled experts agreed, because people know its shortcomings and take them into consideration. Weather forecasts "are wrong a lot of the time, but people expect that and they use them accordingly," said Robert Ravenscroft, a Nebraska rancher who attended the workshop as a "user" of predictions.

A prediction is to be distrusted, workshop participants said, when it is made by the group that will use it as a basis for policy making -- especially when the prediction is made after the policy decision has been taken. In one example offered at the workshop, modeling studies purported to show no harmful environmental effects from a gold mine that a company had decided to dig.

Another type of prediction miscue emerged last March in connection with asteroids, the workshop participants were told by Dr. Clark R. Chapman, a planetary scientist at the Southwest Research Institute in Boulder. An astronomer erroneously calculated that there was a chance of one-tenth of 1 percent that a mile-wide asteroid would strike Earth in 30 years. The prediction created an international stir but was withdrawn a day later after further evidence turned up. This "uncharacteristically bad" prediction, said Dr. Chapman, would not have been issued had it been subjected to normal review by the forecaster's scientific peers. But, he said, there was no peer-review apparatus set up to make sure that "off-the-wall predictions don't get out." (Such a committee has since been established by NASA.)

Most sins committed in the name of prediction, however, appear to stem from the uncertainty inherent in almost all forecasts. "People don't understand error bars," said one scientist, referring to margins of error. Global climate change and the Red River flood offer two cases in point. Computer models of the climate system are the major instruments used by scientists to project changes in climate that might result from increasing atmospheric concentrations of heat-trapping gases, like carbon dioxide, emitted by the burning of fossil fuels. Basing its forecast on the models, a panel of scientists set up by the United Nations has projected that the average surface temperature of the globe will rise by 2 to 6 degrees Fahrenheit, with a best estimate of 3.5 degrees, in the next century, and more after that. This compares with a rise of 5 to 9 degrees since the depths of the last ice age.
The temperature has increased by about 1 degree over the last century. But the magnitude and nature of any climate changes produced by any given amount of carbon dioxide are uncertain. Moreover, it is unclear how much of the gas will be emitted over the next few years, said Dr. Jerry D. Mahlman, a workshop participant who directs the National Oceanic and Atmospheric Administration's Geophysical Fluid Dynamics Laboratory at Princeton, N.J. The laboratory is one of the world's major climate modeling centers, and the oldest.

This uncertainty opens the way for two equal and opposite sins of misinterpretation. "The uncertainty is used as a reason for governments not to act," in the words of Dr. Ronald D. Brunner, a political scientist at the University of Colorado at Boulder. On the other hand, people often put too much reliance on the precise numbers. In the debate over climate change, the tendency is to state all the uncertainties and caveats associated with the climate model projections -- and then forget about them, said Dr. Steve Rayner, a specialist in global climate change in the District of Columbia office of the Pacific Northwest National Laboratory. This creates a "fallacy of misplaced confidence," he said, explaining that the specific numbers in the model forecasts "take on a validity not allowed by the caveats." This tendency to focus unwisely on specific numbers was termed "fallacious quantification" by Dr. Naomi Oreskes, a historian at the University of California at San Diego.

Where uncertainty rules, many at the workshop said, it might be better to stay away from specific numbers altogether and issue a more generalized forecast. In climate change, this might mean using the models as a general indication of the direction in which the climate is going (whether it is warming, for instance) and of the approximate magnitude of the change, while taking the numbers with a grain of salt. None of which means that the models are not a helpful guide to public policy, said Dr. Mahlman and other experts. For example, the models say that a warming atmosphere, like today's, will produce heavier rains and snows, and some evidence suggests that this is already happening in the United States, possibly contributing to damaging floods. Local planners might be well advised to consider this, Dr. Mahlman said.

One problem in Grand Forks was that lack of experience with such a damaging flood aggravated the uncertainty of the flood forecast. Because the river had never before been observed at the 54-foot level, the models on which the prediction was based were "flying blind," said Dr. Pielke; there was no historical basis on which to produce a reliable forecast. But this was apparently lost on local officials and the public, who focused on the specific forecast of a 49-foot crest. This number was repeated so often, according to the report of an inquiry by the National Weather Service, that it "contributed to an impression of certainty." Actually, the report said, the 49-foot figure "created a sense of complacency," because it was only a fraction of a foot higher than the record flood of 1979, which the city had survived. "They came down with this number and people fixated on it," Tom Mulhern, the Grand Forks communications officer, said in an interview.
The dikes protecting the city had been built up with sandbags to contain a 52-foot crest, and everyone figured the town was safe, he said. It is difficult to know what might have happened had the uncertainty of the forecast been better communicated. But it is possible, said Mr. Mulhern, that the dikes might have been sufficiently enlarged and people might have taken more steps to preserve their possessions. As it was, he said, "some people didn't leave till the water was coming down the street."

Photo: Petty Officer Tim Harris patrolled an area of Grand Forks, N.D., in April 1997, where the Red River flooded the houses up to the second story. Residents, relying on the precision of forecasts, were forced to flee quickly. (Reuters)

Managerial Statistics KH 19
1 – Sampling
Course material adapted from Chapters 13.1, 13.2, and 14.1 of our textbook, Statistics for Business, 2e, © 2013 Pearson Education, Inc.

Learning Objectives
Describe why sampling is important. Understand the implications of sampling variation. Explain the flaw of averages. Define the concept of a sampling distribution. Determine the mean and standard deviation for the sampling distribution of the sample mean. Describe the Central Limit Theorem and its importance. Determine the mean and standard deviation for the sampling distribution of the sample proportion.

Tools of Business Statistics
Descriptive statistics: collecting, presenting, and describing data. Inferential statistics: drawing conclusions and/or making decisions concerning a population based only on sample data.

Populations and Samples
A Population is the set of all items or individuals of interest. Examples: all likely voters in the next election; all parts produced today; all sales receipts for March. A Sample is a subset of the population. Examples: 1000 voters selected at random for interview; a few parts selected for destructive testing; random receipts selected for audit.

Properties of Samples
A representative sample is a sample that reflects the composition of the entire population. A sample is biased if a systematic error occurs in the selection of the sample. For example, the sample may systematically omit a portion of the population.

Population vs. Sample
(Figure: the population consists of the items a through z; the sample is the subset c, g, i, n, o, r, u, y.)

Why Sample?
Sampling is less time consuming and less costly to administer than a census. It is possible to obtain statistical results of sufficiently high precision based on samples.

Two Surprising Properties
Surprise 1: The best way to obtain a representative sample is to pick members of the population at random. Surprise 2: Larger populations do not require larger samples.

Randomization
A randomly selected sample is representative of the whole population (avoids bias). Randomization ensures that on average a sample mimics the population. Randomization enables us to infer characteristics of the population from a sample.

Comparison of Two Random Samples
(Figure: two large samples, each with 8,000 data points, drawn at random from a population of 3.5 million customers of a bank.)

(In)Famous Biased Sample
The Literary Digest predicted a landslide defeat for Franklin D. Roosevelt in the 1936 presidential election. They selected their sample from, among others, a list of telephone numbers.
The size of their sample was about 2.4 million! Telephones were a luxury during and soon after the Great Depression. Roosevelt's supporters tended to be poor and were grossly underrepresented in the sample.

Simple Random Sample (SRS)
A Simple Random Sample (SRS) is a sample of n data points chosen by a method that has an equal chance of picking any sample of size n from the population. An SRS is the standard to which all other sampling methods are compared. An SRS is the foundation for virtually all of the theory of statistics.

Inferential Statistics
Making statements about a population by examining sample results: inference runs from the sample, whose statistics are known, to the population, whose parameters are unknown but can be estimated from sample evidence.

Tools of Inferential Statistics
Drawing conclusions and/or making decisions concerning a population based on sample results. Estimation, for example: estimate the population mean age using the sample mean age. Hypothesis testing, for example: use sample evidence to test the claim that the population mean age is 40.5 years.

Estimating Parameters
Parameter: a characteristic of the population (e.g., the mean µ). Statistic: an observed characteristic of a sample (e.g., the sample average x̄). Estimate: using a statistic to approximate a parameter.

Notation for Statistics and Parameters
(Notation table omitted in this transcription: sample statistics such as x̄ and p̂ estimate the population parameters µ and p.)

Sampling Variation
Sampling Variation is the variability in the value of a statistic from sample to sample. Two samples from the same population will rarely (if ever) yield the same estimate. Sampling variation is the price we pay for working with a sample rather than the population.

The Flaw of Averages
(Figure omitted in this transcription.)

The Flaw of Averages (continued)
"Our culture encodes a strong bias either to neglect or ignore variation. We tend to focus instead on measures of central tendency, and as a result we make some terrible mistakes, often with considerable practical import." (Stephen Jay Gould, 1941-2002, evolutionary biologist and historian of science)

Point Estimates
A sample statistic is a point estimate. It provides a single number (e.g., the sample mean) for an unknown population parameter (e.g., the population mean). A point estimate delivers no information on the possible sampling variation. A key step in any careful statistical analysis is to quantify the effect of sampling variation.

Definitions
An estimator of a population parameter is a random variable that depends on sample information and whose value provides an approximation to this unknown parameter. A specific value of that random variable is called an estimate.

Sampling Distributions
The sampling distribution is the probability distribution that describes how a statistic, such as the mean, varies from sample to sample.

Testing of GPS Chips
A manufacturer of GPS chips selects samples for highly accelerated life testing (HALT). HALT scores range from 1 (failure on first test) to 16 (chip endured all 15 tests without failure). Even when the production process is functioning normally, there is variation among HALT scores.

Testing 400 Chips
(Figure: distribution of individual HALT scores.)

Distribution of Daily Average Scores
(Figure: distribution of average HALT scores; 54 samples, each with sample size n = 20.)

Benefits of Averaging
Averaging reduces variation: the sample-to-sample variance among average HALT scores is smaller than the variance among individual HALT scores. The distribution of average HALT scores appears more "bell shaped" than the distribution of individual HALT scores.
Sampling Distributions
Two cases: the sampling distribution of the sample mean and the sampling distribution of the sample proportion.

Expected Value of Sample Mean
Let x1, x2, . . . , xn represent a random sample from a population. The sample mean value of these observations is defined as $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$. The random variable "sample mean" is denoted by $\bar{X}$ and its specific value in the sample by $\bar{x}$.

Standard Error of the Mean
Different samples from the same population will yield different sample means. A measure of the variability in the mean from sample to sample is given by the Standard Error of the Mean: $SE(\bar{X}) = \sigma/\sqrt{n}$. The standard error of the mean decreases as the sample size increases.

Standard Error of the Mean (continued)
The standard error is proportional to σ: as population data become more variable, sample averages become more variable. The standard error is inversely proportional to the square root of the sample size n: the larger the sample size, the smaller the sampling variation of the averages.

If the Population is Normal
If a population is normally distributed with mean µ and standard deviation σ, then the sampling distribution of the sample mean $\bar{X}$ is also normally distributed, with $E(\bar{X}) = \mu$ and $SE(\bar{X}) = \sigma/\sqrt{n}$.

Sampling Distribution Properties
$E(\bar{X}) = \mu$, so $\bar{X}$ is unbiased: a normal population distribution yields a normal sampling distribution with the same mean.

Sampling Distribution Properties (continued)
As n increases, $SE(\bar{X})$ decreases: a larger sample size gives a narrower sampling distribution than a smaller sample size.

If the Population is not Normal
We can apply the Central Limit Theorem: even if the population is not normal, sample means from the population will be approximately normal as long as the sample size is large enough. Properties of the sampling distribution: $E(\bar{X}) = \mu$ and $SE(\bar{X}) = \sigma/\sqrt{n}$.

Central Limit Theorem
As the sample size gets large enough, the sampling distribution becomes almost normal regardless of the shape of the population.

If the Population is not Normal (continued)
Sampling distribution properties: central tendency $E(\bar{X}) = \mu$; variation $SE(\bar{X}) = \sigma/\sqrt{n}$. The sampling distribution becomes normal as n increases, and a larger sample size gives a smaller standard error.

How Large is Large Enough?
For most distributions, a sample size of n > 30 will give a sampling distribution that is nearly normal. For normal population distributions, the sampling distribution of the mean is always normally distributed, regardless of the sample size.

More Formal Condition
Sample size condition for an application of the central limit theorem: a normal model provides an accurate approximation to the sampling distribution of $\bar{X}$ if the sample size n is larger than 10 times the squared skewness and larger than 10 times the absolute value of the kurtosis, $n > 10 K_3^2$ and $n > 10 |K_4|$.

Average HALT Scores
Design of the chip-making process indicates that the HALT score of a chip has a mean µ = 7 with a standard deviation σ = 4. Sampling distribution of average HALT scores (n = 20): $\bar{X} \sim N(\mu = 7,\ \sigma^2/n = 4^2/20 = 0.89^2)$.

Average HALT Scores (continued)
The sampling distribution of average HALT scores is (approximately) a normal distribution with mean 7 and standard deviation 0.89.
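A simulation sketch of these ideas in Python (numpy assumed). The skewed gamma population with mean 7 and SD 4 is an assumption chosen here to illustrate the central limit theorem, not the actual HALT data:

    import numpy as np

    # Draw many samples of size n = 20 from a skewed population with
    # mean 7 and SD 4, and record each sample mean.
    rng = np.random.default_rng(0)
    shape, scale = (7 / 4) ** 2, 16 / 7          # gamma with mean 7 and SD 4
    means = rng.gamma(shape, scale, size=(10_000, 20)).mean(axis=1)

    print(means.mean())   # close to mu = 7
    print(means.std())    # close to sigma / sqrt(n) = 4 / sqrt(20) ~ 0.89

A histogram of these 10,000 sample means looks much more bell shaped than the skewed population itself, which is exactly what the central limit theorem predicts.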
Sampling Distributions of Sample Proportions
Two cases: the sampling distribution of the sample mean and the sampling distribution of the sample proportion.

Population Proportions
Let p be the proportion of the population having some characteristic. The sample proportion p̂ provides an estimate of p: p̂ = (number of items in the sample with the characteristic of interest) / (sample size), with 0 ≤ p̂ ≤ 1. The sample proportion p̂ has a (scaled) binomial distribution but can be approximated by a normal distribution when n is large enough.

Sampling Distribution
Normal approximation to the sampling distribution of p̂, with properties $E(\hat{p}) = p$ and $\sigma_{\hat{p}}^2 = p(1-p)/n$, where p is the population proportion.

Sample Size Condition
Sample size condition for proportions: $n\hat{p} \ge 10$ and $n(1-\hat{p}) \ge 10$. If this condition holds, then the distribution of the sample proportion p̂ is approximately a normal distribution.

Take Aways
Understand the notion of sampling variation. Appreciate the dangers of the flaw of averages. Grasp the concept of a sampling distribution. Have an idea of the central limit theorem. Know the sampling distributions of a sample mean and of a sample proportion.

Pitfalls
Do not confuse a sample statistic for the population parameter. Do not fall for the flaw of averages.

Managerial Statistics KH 19
2 – Confidence Intervals
Course material adapted from Chapter 15 of our textbook, Statistics for Business, 2e, © 2013 Pearson Education, Inc.

Learning Objectives
Distinguish between a point estimate and a confidence interval estimate. Construct and interpret a confidence interval of a population proportion. Construct and interpret a confidence interval of a population mean.

Point and Interval Estimates
A point estimate is a single number. A Confidence Interval provides additional information about variability: it extends from a Lower Confidence Limit to an Upper Confidence Limit around the point estimate, and the distance between the limits is the width of the confidence interval.

Point Estimates
We can estimate a population parameter with a sample statistic (a point estimate): the mean µ with x̄, and the proportion p with p̂.

Confidence Interval Estimate
An interval gives a range of values. It takes into consideration variation in sample statistics from sample to sample; is based on observations from a single sample; provides more information about a population characteristic than does a point estimate; relies on the sampling distribution of the statistic; and is stated in terms of a level of confidence. We can never be 100% confident.

Estimation Process
From a population whose mean µ is unknown, draw a random sample with, say, mean x̄ = 50, and conclude: "I am 95% confident that µ is between 40 and 60."

General Formula
The general formula for all confidence intervals is: Point Estimate ± (Reliability Factor)(Standard Error). The value of the reliability factor depends on the desired level of confidence.

Confidence Intervals
Confidence intervals for the population mean and for the population proportion.

Confidence Interval for the Proportion
Recall that the Central Limit Theorem implies a normal model for the sampling distribution of p̂: $E(\hat{p}) = p$ and $SE(\hat{p}) = \sqrt{p(1-p)/n}$. SE(p̂) is called the Standard Error of the Proportion.

Interpretation
The sample statistic in 95% of samples lies within 1.96 standard errors of the population parameter.

Interpretation (continued)
The probability that the sample proportion p̂ deviates by less than 1.96 standard errors of the proportion from the true (but unknown) population proportion p is 95%: P(–1.96 SE(p̂) ≤ p – p̂ ≤ +1.96 SE(p̂)) = 0.95.
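A minimal simulation sketch of this interpretation (numpy assumed; note it uses the estimated standard error in place of the true one, just as the confidence intervals constructed below do):

    import numpy as np

    # Coverage check: in roughly 95% of samples, the true p lies within
    # 1.96 estimated standard errors of the sample proportion p-hat.
    rng = np.random.default_rng(1)
    p, n = 0.2, 1000
    phat = rng.binomial(n, p, size=10_000) / n
    se = np.sqrt(phat * (1 - phat) / n)
    covered = np.abs(phat - p) <= 1.96 * se
    print(covered.mean())   # ~0.95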
95% Confidence Interval for p
For 95% of samples, the interval formed by reaching 1.96 standard errors to the left and right of p̂ will contain p. Problem: we do not know the value of the standard error of the proportion, SE(p̂), since it depends on the true (but unknown) parameter p. We estimate this standard error using p̂ in place of p: $se(\hat{p}) = \sqrt{\hat{p}(1-\hat{p})/n}$.

Confidence Interval for p
The 100(1 – α)% confidence interval for p is $\hat{p} - z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n} \le p \le \hat{p} + z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$, where $z_{\alpha/2}$ is the standard normal value for the level of confidence desired (the "reliability factor"), p̂ is the sample proportion, and n is the sample size.

Finding the Reliability Factor, z_{α/2}
Consider a 95% confidence interval: 1 – α = 0.95, so α/2 = 0.025 lies in each tail. In z units, the lower confidence limit is at z = –1.96, the point estimate at z = 0, and the upper confidence limit at z = 1.96.

Common Levels of Confidence
The most commonly used confidence level is 95%. Confidence level (confidence coefficient 1 – α) and corresponding z_{α/2} value:
80% (0.80): 1.28
90% (0.90): 1.645
95% (0.95): 1.96
98% (0.98): 2.33
99% (0.99): 2.58
99.8% (0.998): 3.08
99.9% (0.999): 3.27

Affinity Credit Card
Before deciding to offer an affinity credit card to alumni of a university, the credit card company wants to know how many customers will accept the offer. Population: alumni of the university. Parameter of interest: the proportion p of alumni who will return the application for the credit card.

SRS of Alumni
Question: What should we conclude about the proportion p in the population of 100,000 alumni who will accept the offer if the card is launched on a wider scale? Method: construct a confidence interval based on the results of a simple random sample.

SRS of Alumni (continued)
The credit card issuer sent preapproved applications to a sample of 1000 alumni. Of these, 140 accepted the offer and received the card. Summary statistics: n = 1000 and p̂ = 140/1000 = 0.14.

Checklist for Application of Normal
SRS condition: the sample is a simple random sample from the relevant population. Sample size condition (for proportion): both np̂ and n(1 – p̂) are larger than 10.

Credit Card: Confidence Interval
The estimated standard error is $se(\hat{p}) = \sqrt{0.14(1-0.14)/1000} \approx 0.01097$. The 95% confidence interval is 0.14 ± 1.96 × 0.01097 ≈ [0.1185, 0.1615].

Credit Card: Conclusion
With 95% confidence, the population proportion that will accept the offer is between 11.85% and 16.15%. If the bank decides to launch the credit card, might 20% of the alumni accept the offer? It's not impossible but rather unlikely given the information in our sample; 20% is outside the 95% confidence interval for the unknown proportion p.

Margin of Error
The confidence interval $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$ can also be written as p̂ ± ME, where ME is called the Margin of Error: $ME = z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$.

Reducing the Margin of Error
The width of the confidence interval is equal to twice the margin of error. The margin of error can be reduced if the sample size is increased (n ↑) or the confidence level is decreased ((1 – α) ↓).

Margin of Error in the News
You often read in the news statements like the following: "The CNN/USA Today/Gallup poll taken March 7-10 showed that 52% of Americans say… . The poll had a margin of error of plus or minus four percentage points." No confidence level is given! The assumed confidence level is typically 95%. In addition, the 1.96 is rounded up to 2.

Margin of Error in the News (continued)
For an interpretation of this statement we use the confidence interval formula p̂ ± ME, where $ME = 0.04 \ge 2\sqrt{\hat{p}(1-\hat{p})/n}$.
We can have (slightly more than) 95% confidence that the true proportion of Americans saying … is between 48% and 56%.

Confidence Intervals
Confidence intervals for the population mean and for the population proportion.

Sampling Distribution of the Mean
Recall that the Central Limit Theorem implies a normal model for the sampling distribution of $\bar{X}$: $E(\bar{X}) = \mu$ and $SE(\bar{X}) = \sigma/\sqrt{n}$. SE(X̄) is called the Standard Error of the Mean.

Interpretation
The probability that the sample mean $\bar{X}$ deviates by less than 1.96 standard errors of the mean from the true (but unknown) population mean µ is 95%: P(–1.96 SE(X̄) ≤ µ – X̄ ≤ +1.96 SE(X̄)) = 0.95. Once again, the sample statistic lies within about two standard errors of the corresponding population parameter in 95% of samples.

Confidence Interval for µ
Since the population standard deviation σ is unknown, we estimate it using the sample standard deviation s: $s = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}}$. This step introduces extra uncertainty, since s varies from sample to sample. As an adjustment, we use the t-distribution instead of the normal distribution.

Student's t-Distribution
Consider an SRS of n observations with mean x̄ and standard deviation s from a normally distributed population with mean µ. Then the variable $T_{n-1} = \frac{\bar{X} - \mu}{S/\sqrt{n}}$ follows the Student's t-distribution with (n – 1) degrees of freedom.

Student's t-Distribution (continued)
The t-distribution is a family of distributions. The t-value depends on the degrees of freedom (df), the number of observations that are free to vary after the sample mean has been calculated: df = n – 1.

Student's t-Distribution (continued)
t-distributions are bell-shaped and symmetric but have "fatter" tails than the normal distribution (for example, t with df = 5 has fatter tails than t with df = 13). Note: t → Z as n increases; the standard normal is the t-distribution with df = ∞.

t-Distribution Values
With comparison to the Z value, by confidence level:
.80: t(df = 10) 1.372, t(df = 20) 1.325, t(df = 30) 1.310, Z 1.282
.90: t(df = 10) 1.812, t(df = 20) 1.725, t(df = 30) 1.697, Z 1.645
.95: t(df = 10) 2.228, t(df = 20) 2.086, t(df = 30) 2.042, Z 1.960
.99: t(df = 10) 3.169, t(df = 20) 2.845, t(df = 30) 2.750, Z 2.576
Note: t → Z as n increases.

Confidence Interval for µ
Assumptions: the population is normally distributed; if the population is not normal, use a "large" sample. Use the Student's t-distribution. The 100(1 – α)% confidence interval for µ is $\bar{x} - t_{\alpha/2,n-1}\,\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2,n-1}\,\frac{s}{\sqrt{n}}$, where $t_{\alpha/2,n-1}$ is the reliability factor from the t-distribution with n – 1 degrees of freedom and an area of α/2 in each tail.

Affinity Credit Card
Before deciding to offer an affinity credit card to alumni of a university, the credit card company wants to know how large a balance those alumni who accept the offer will carry. Population: (future) credit card balances of (future) customers among the alumni of the university. Parameter of interest: the mean µ of (future) balances carried by alumni on their affinity credit card.

SRS of Alumni
The 140 alumni who accepted the offer and received the affinity credit card have been carrying an average monthly balance of x̄ = $1,990.50 with a standard deviation of s = $2,833.33.

SRS of Alumni (continued)
Question: What should we conclude about the average future credit card balance µ on the new affinity credit card for this particular university? Method: construct a confidence interval.

Checklist for Application of Normal
SRS condition: the sample is a simple random sample from the relevant population. Sample size condition (for mean): the sample size is larger than 10 times the squared skewness and 10 times the absolute value of the kurtosis.

Credit Card: Confidence Interval
The estimated standard error is $se(\bar{X}) = 2{,}833.33/\sqrt{140} \approx 239.46$.
The t-value for a 95% confidence interval with 139 degrees of freedom is T.INV.2T(0.05,139) = 1.97718. The 95% confidence interval is 1,990.50 ± 1.97718 × 239.46 = [1517.04, 2463.96].

Credit Card: Conclusion
We are 95% confident that the true but unknown µ lies between $1,517.04 and $2,463.96. If the bank decides to launch the credit card, might the average balance be $1,250? It's not impossible, but based on the sample results it's rather unlikely.

Confidence Interval and Confidence Level
If P(a ≤ p ≤ b) = 1 – α, then the interval from a to b is called a 100(1 – α)% confidence interval of p. The quantity (1 – α) is called the confidence level of the interval (α is between 0 and 1). In repeated samples of the population, the true value of the parameter p would be contained in 100(1 – α)% of intervals calculated this way.

Intervals and Level of Confidence
The sampling distribution of the proportion has probability 1 – α in the center, around E(p̂) = p, and α/2 in each tail. Intervals extend from $\hat{p} - z_{\alpha/2}\,se(\hat{p})$ to $\hat{p} + z_{\alpha/2}\,se(\hat{p})$. 100(1 – α)% of the intervals constructed this way contain p; 100(α)% do not.

Confidence Level, (1 – α)
Suppose the confidence level is 95%, also written (1 – α) = 0.95. A relative frequency interpretation: from repeated samples, 95% of all the confidence intervals that can be constructed will contain the unknown true parameter.

Common Confusions: Wrong Interpretations
"95% of all customers keep a balance of $1,517 to $2,464." Wrong: the CI gives a range for the population mean µ, not the balance of individual customers.
"The mean balance of 95% of samples of 140 accounts will fall between $1,517 and $2,464." Wrong: the CI provides a range for µ, not the means of other samples.

Common Confusions: Wrong Interpretations (continued)
"The mean balance is between $1,517 and $2,464." Wrong: the average balance in the population may not fall within the CI. The confidence level of the interval is 95%; it may not contain µ.

Correct Interpretation
We are 95% confident that the mean monthly credit card balance for the population of customers who accept an application lies between $1,517 and $2,464. The phrase "95% confident" is our way of saying that we are using a procedure that produces an interval containing the unknown mean in 95% of samples.

Transforming Confidence Intervals
Obtaining ranges for related quantities: if [L, U] is a 100(1 – α)% confidence interval for µ, then [c×L, c×U] is a 100(1 – α)% confidence interval for c×µ, and [c+L, c+U] is a 100(1 – α)% confidence interval for c+µ.

Application: Property Taxes
Motivation: a mayor is considering a tax on business that is proportional to the amount spent to lease property in her city. How much revenue would a 1% tax generate?

Property Taxes
Method: we need a confidence interval for µ (the average cost of a lease) to obtain a confidence interval for the amount raised by the tax. Check the conditions (SRS and sample size) before proceeding.
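For reference, a minimal Python sketch (scipy assumed) that reproduces the two credit-card intervals from this chapter:

    import numpy as np
    from scipy.stats import norm, t

    # Proportion: 140 of 1000 sampled alumni accepted the offer.
    n, phat = 1000, 0.14
    se_p = np.sqrt(phat * (1 - phat) / n)
    z = norm.ppf(0.975)                         # ~1.96
    print(phat - z * se_p, phat + z * se_p)     # ~[0.1185, 0.1615]

    # Mean: balances of the 140 cardholders, x-bar = 1990.50, s = 2833.33.
    n, xbar, s = 140, 1990.50, 2833.33
    se_x = s / np.sqrt(n)                       # ~239.46
    tval = t.ppf(0.975, df=n - 1)               # ~1.97718, like T.INV.2T(0.05,139)
    print(xbar - tval * se_x, xbar + tval * se_x)   # ~[1517.04, 2463.96]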
Property Taxes (continued)
Mechanics: univariate statistics for Total Lease Cost:
mean 478,603.48
standard deviation 535,342.56
standard error of the mean 35,849.19
minimum 20,409.00
median 290,559.00
maximum 2,820,213.00
range 2,799,804.00
skewness 1.953
kurtosis 4.138
number of observations 223
t-statistic for computing 95%-confidence intervals 1.9707

Property Taxes (continued)
Mechanics: the 95% confidence interval for the average lease cost is 478,603 ± 1.9707 × 35,849 = [407,955, 549,252]. The 95% confidence interval for the average tax revenue per business is 0.01 × [407,955, 549,252] = [4,079.55, 5,492.52].

Conclusion
Message: we are 95% confident that the average cost of a lease is between $407,955 and $549,252. The 95% confidence interval for the tax raised per business is therefore [$4,079, $5,493]. Since there are 4,500 leased business properties in the city, we are 95% confident that the amount raised will be between $18,358,000 and $24,716,000.

Best Practices
Be sure that the data are an SRS from the population. Stick to 95% confidence intervals. Round the endpoints of intervals when presenting the results. Use full precision for intermediate calculations.

Pitfalls
Do not claim that a 95% confidence interval holds µ. Do not use a confidence interval to describe other samples. Do not manipulate the sampling to obtain a particular confidence interval.

Managerial Statistics KH 19
3 – Hypothesis Tests
Course material adapted from Chapter 16 of our textbook, Statistics for Business, 2e, © 2013 Pearson Education, Inc.

Learning Objectives
Formulate null and alternative hypotheses for applications involving a single population proportion or a single population mean. Execute the four steps of a hypothesis test. Know how to use and interpret p-values. Know what Type I and Type II errors are.

Motivating Example
An office manager is evaluating software to filter SPAM e-mails (cost $15,000). To make it profitable, the software must reduce SPAM to less than 20%. Should the manager buy the software? The manager wants to test the software.

Motivating Example (continued)
To demonstrate how well the software works, the software vendor applied its filtering system to e-mail arriving at the office. After passing through the filter, a sample of 100 messages contained only 11% spam (and no valid messages were removed).

Motivating Example (continued)
Question: Okay, 11% is better than 20%. But does that mean the manager should buy this software? Method: use a Hypothesis Test to answer this question. Idea: use the sample result, p̂ = 0.11, to decide whether the software will be profitable, p < 0.2.

What is a Hypothesis?
A hypothesis is a claim about the value of an unknown parameter. Population proportion example: the proportion of spam will be below 20%, that is, p < 0.2. Population mean example: the average monthly rent for all rental properties exceeds $500, that is, µ > 500.

The Null Hypothesis, H0
The Null Hypothesis, H0, states the claim to be tested; specifies a default course of action; preserves the status quo. Example: the proportion of spam that slips past the filter is at least 20% (H0: p ≥ 0.2). H0 is always about a population parameter, not about a sample statistic: H0: p ≥ 0.20, not H0: p̂ ≥ 0.20.

The Null Hypothesis, H0 (continued)
We begin with the assumption that the null hypothesis is true.
This is similar to the notion of innocent until proven guilty. H0 always contains an "=", "≤", or "≥" sign, and it may or may not be rejected.

The Alternative Hypothesis, Ha
The Alternative Hypothesis, Ha (also written H1), is the opposite of the null hypothesis. Example: the proportion of spam that slips past the filter is less than 20% (Ha: p < 0.2). Ha never contains an "=", "≤", or "≥" sign. Ha may or may not be supported. Ha is generally the hypothesis that the decision maker is trying to support.

Spam Filter: Hypotheses
Step 1 of a hypothesis test: define the hypotheses H0 and Ha. H0: p ≥ p0 = 0.20; Ha: p < p0 = 0.20.

Two Possible Options
We may decide to reject H0 (accept Ha). Alternatively, we may decide not to reject H0 (we do not accept Ha). There is no third option.

Reason for Rejecting H0
If it is unlikely that we would get a sample proportion as low as 0.11 if in fact the population proportion were p = 0.2, that is, if H0 were true, then we reject the null hypothesis that p ≥ 0.2.

Errors in Decision-Making
Type I Error: reject a true null hypothesis. Example: buy software that will not reduce spam to below 20% of incoming e-mails. This is considered a serious type of error. The threshold probability of a Type I error is α, called the level of significance or simply the level of the test; it is set in advance by the decision maker.

Errors in Making Decisions (continued)
Type II Error: fail to reject a false null hypothesis. Example: do not buy software that would have reduced spam to below 20% of incoming e-mails. The probability of a Type II error is β; 1 – β is also called the power of a test.

Outcomes and Probabilities
Possible hypothesis test outcomes (outcome and probability, by actual situation):
If H0 is true: do not reject H0, no error (1 – α); reject H0, Type I error (α).
If H0 is false: do not reject H0, Type II error (β); reject H0, no error (1 – β).

Type I & II Errors
Type I and Type II errors cannot happen at the same time. A Type I error can only occur if H0 is true. A Type II error can only occur if H0 is false.

Evaluation of Hypotheses
The sample proportion is p̂ = 0.11 < 0.2. Is this relationship sufficient to reject the null hypothesis? No! The claim is about the population proportion p. Maybe we just have a lucky (unlucky?) sample. That is, the test result may be due to sampling error.

Evaluation of Hypotheses (continued)
Hypothesis tests rely on the sampling distribution of the statistic that estimates the parameter specified in the null and the alternative. Key question: what is the chance of getting a sample that differs from H0 by as much as (or even more than) this one if H0 is true?

Spam Filter
A sample of size n = 100 delivered a sample proportion of p̂ = 0.11. Question: assuming H0: p ≥ 0.20 is true, how likely is this deviation of 0.09 (or more)? Assuming H0 is true, the sampling distribution of p̂ is approximately normal with mean p = 0.20 and SE(p̂) = 0.04 (note that the hypothesized "boundary" value p0 = 0.20 is used to calculate SE).

Spam Filter (continued)
What is the chance of finding a sample proportion of p̂ = 0.11 or even smaller?

Test Statistic
Step 2 of a hypothesis test: calculate the test statistic. $z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}} = \frac{0.11 - 0.20}{\sqrt{0.20(1-0.20)/100}} = -2.25$.

Meaning of Test Statistic
The test statistic measures the difference between the sample outcome and the boundary value of the null hypothesis in multiples of the standard error. Spam filter example: the sample proportion lies 2.25 standard errors of the proportion below the boundary value in the null hypothesis.
Since the sampling distribution is assumed to be normal, the test statistic for proportions is also called the z-statistic.

From Test Statistic to Probability
Since the sampling distribution of the sample proportion is (approximately) normal, we can calculate the probability of a sample outcome at least 2.25 standard errors below the mean. This probability is the famous p-value.

p-value
Step 3 of a hypothesis test: calculate the p-value. p = NORM.S.DIST(-2.25,1) ≈ 0.012, or equivalently p = NORM.DIST(0.11,0.2,0.04,1) ≈ 0.012.

Calculating the p-value
p-value = NORMSDIST(-2.25) = 0.012. Under the null hypothesis (H0: p ≥ 0.2), our sample proportion is at least 2.25 standard errors below the population proportion. The probability of such a sample outcome is 1.2% (the p-value).

Type I Error and p-value
Question: suppose we decide to reject H0. What is the probability of a Type I error? Answer: the p-value is the (maximal) chance of a Type I error if H0 is rejected based on the observed test statistic.

Level of Significance
Common practice is to reject H0 only if the p-value is less than a preset threshold. This threshold, which sets the maximum tolerance for a Type I error, is called the level of significance or α-level. A statistically significant difference from the null hypothesis means the data contradict H0 and lead us to reject H0 since p-value < α.

Decision
Step 4 of a hypothesis test: compare the p-value to α and make a decision. p-value = 0.012 < 0.05 = α. We reject H0 and accept the alternative hypothesis Ha. The spam software reduces the proportion of spam e-mails to less than 20%. The office manager should buy the software.

Summary
(Summary table of the test steps omitted in this transcription.)

Take Aways I
The four steps of a hypothesis test:
1. Define H0 and Ha.
2. Calculate the test statistic.
3. Calculate the p-value.
4. Compare the p-value to the significance level α and make a decision. Accept Ha if p-value < α.

Take Aways II
Hypothesis testing, the idea: we always try to prove the alternative hypothesis, Ha. We then assume that its opposite (the null hypothesis) is true. H0 and Ha must be totally exhaustive and mutually exclusive. We can never possibly prove H0!

Take Aways III
We ask the question: how likely is it to obtain our evidence, given that the null hypothesis is (supposedly) true? This probability is called the p-value. If the evidence is not likely (small p), we have statistically "proven" the alternative hypothesis, so we reject the null. If it is likely (p not small), we cannot reject the null.

Application: Burger King Ads
Motivation: the Burger King ad featuring Coq Roq won critical acclaim (and resulted in much controversy as well as several lawsuits). In a sample of 2,500 homes, MediaCheck found that only 6% saw the ad. An ad must be viewed by 5% or more of households to be effective. Based on these sample results, should the local sponsor run this ad?

Burger King Ads
Method: perform a hypothesis test. Set up the null and alternative hypotheses: H0: p ≤ 0.05; Ha: p > 0.05. Use α = 0.05. Note that p is the population proportion who watch this ad. (Both the SRS and sample size conditions are met.)

Burger King Ads (continued)
Mechanics: perform the necessary calculations for an evaluation of the null hypothesis. $z = \frac{0.06 - 0.05}{\sqrt{0.05(1-0.05)/2500}} \approx 2.294$. NORM.S.DIST(2.294,1) = 0.9891, so the p-value = 1 – 0.9891 = 0.0109 < 0.05 = α. Reject H0.

Conclusion
Message: the hypothesis test shows a statistically significant result. We can conclude that more than 5% of households watch this ad. The Burger King Coq Roq ad is cost effective and should be run.
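A minimal Python sketch (scipy assumed) of the z-test for a proportion, using the spam-filter numbers above:

    import numpy as np
    from scipy.stats import norm

    # H0: p >= 0.20 vs Ha: p < 0.20, with p-hat = 0.11 from n = 100 messages.
    p0, phat, n = 0.20, 0.11, 100
    z = (phat - p0) / np.sqrt(p0 * (1 - p0) / n)   # SE uses the boundary value p0
    print(z)              # ~ -2.25
    print(norm.cdf(z))    # one-tailed p-value ~0.012, like NORM.S.DIST(-2.25,1)

The same recipe with p0 = 0.05, p̂ = 0.06, n = 2,500 and the upper tail (1 - norm.cdf(z)) reproduces the Burger King p-value of about 0.0109.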
Hypothesis Test of a Mean
Hypothesis tests of the mean are similar to tests of proportions. H0 and Ha are claims about the unknown population mean µ, for example H0: µ ≤ µ0 and Ha: µ > µ0. The test statistic uses the random variable $\bar{X}$, the sample mean. Unlike in the test of proportions, the standard error is not specified, since σ is unknown.

Hypothesis Test of a Mean (continued)
Just as in the calculation of a CI, we estimate the unknown population standard deviation σ with the known sample standard deviation s: $SE(\bar{X}) = \sigma/\sqrt{n}$ is estimated by $se(\bar{X}) = s/\sqrt{n}$. The resulting test statistic is $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$.

Hypothesis Test of a Mean (continued)
In a hypothesis test of a mean, the test statistic is called a t-statistic since the appropriate sampling distribution is the t-distribution. Specifically, the distribution of the t-statistic in a hypothesis test of a mean is the t-distribution with n – 1 degrees of freedom. We use this distribution to calculate the p-value.

Denver Rental Properties
A firm is considering expanding into the Denver area. In order to cover costs, the firm needs rents in this area to average more than $500 per month. Are Denver rents high enough to justify the expansion?

Univariate Statistics
The firm obtained rents for a sample of size n = 45; the average rent was $647.33 with a sample standard deviation of s = $298.77. Univariate statistics for Rent ($/Month):
mean 647.3333333
standard deviation 298.7656424
standard error of the mean 44.53735239
minimum 140
median 610
maximum 1600
range 1460
skewness 0.617
kurtosis 0.992
number of observations 45
t-statistic for computing 95%-confidence intervals 2.0154

Hypotheses H0 and Ha
Let µ = the mean monthly rent for all rental properties in the Denver area. Step 1: set up the hypotheses. H0: µ ≤ µ0 = 500; Ha: µ > µ0 = 500.

Test Statistic
Step 2: compute the test statistic. $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{647.33 - 500}{44.5374} \approx 3.308$. The average rent in the sample is 3.308 standard errors of the mean above the boundary value in the null hypothesis.

p-value
Step 3: calculate the p-value. T.DIST.RT(3.308,44) = 0.0009394. The p-value is 0.09394% and thus below 0.1%.

Make a Decision
Step 4: compare the p-value to α and make a decision. p-value = 0.0009394 < 0.05 = α. We reject H0 and accept Ha. We conclude that the average rent in the Denver area exceeds the break-even value.

Summary: Tests of a Mean
(Summary table of the test steps omitted in this transcription.)

Checklist
SRS condition: the sample is a simple random sample from the relevant population. Sample size condition: unless the population is normally distributed, a normal model can be used to approximate the sampling distribution of X̄ if the sample size n is larger than 10 times both the squared skewness and the absolute value of the kurtosis.

Application: Returns on IBM Stock
Motivation: does stock in IBM return more, on average, than T-Bills? From 1980 through 2005, T-Bills returned 0.5% each month.

Returns on IBM Stock
Method: let µ = the mean of all future monthly returns for IBM stock. Set up the hypotheses as follows (Step 1): H0: µ ≤ 0.005; Ha: µ > 0.005. The sample consists of monthly returns on IBM for 312 months (January 1980 - December 2005).
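A one-sample t-test can be computed from summary statistics alone; a minimal Python sketch (scipy assumed), shown here with the Denver rent numbers from above. The same recipe applies to the IBM test that follows:

    import numpy as np
    from scipy.stats import t

    # H0: mu <= 500 vs Ha: mu > 500, with x-bar = 647.33, s = 298.77, n = 45.
    xbar, s, n, mu0 = 647.33, 298.77, 45, 500
    tstat = (xbar - mu0) / (s / np.sqrt(n))
    print(tstat)                   # ~3.308
    print(t.sf(tstat, df=n - 1))   # upper-tail p-value ~0.00094, like T.DIST.RT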
Returns on IBM Stock (continued)
Univariate statistics for IBM Return:
mean 0.01063365
standard deviation 0.08053206
standard error of the mean 0.00455923
minimum -0.2619
median 0.0065
maximum 0.3538
range 0.6157
skewness 0.303
kurtosis 1.624
number of observations 312
t-statistic for computing 95%-confidence intervals 1.9676
The sample yields x̄ = 0.01063 and s = 0.08053.

Returns on IBM Stock (continued)
Mechanics. Step 2: calculation of the test statistic. $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{0.0106 - 0.005}{0.004559} \approx 1.236$. Step 3: calculation of the p-value. T.DIST.RT(1.236,311) ≈ 0.1088. Step 4: compare the p-value to α = 0.05. p-value = 0.1088 > 0.05 = α. Do NOT reject H0.

Conclusion
Message: according to monthly IBM returns from 1980 through 2005, IBM stock does not generate statistically significantly higher earnings than comparable investments in US Treasury Bills.

Failure to Reject H0
Our failure to reject H0 and to prove Ha does not mean the null is true. We did not prove the null hypothesis. Our sample evidence is just too weak to prove Ha at a 5% or even 10% significance level. If we had rejected H0, then the chance of making a Type I error (a p-value of about 11%) would have been too high for the given level of significance. If the α-level had been 15%, then we could have proven Ha.

Significance vs. Importance
Statistical significance does not mean that you have made a practically important or meaningful discovery. The size of the sample affects the p-value of a test. With enough data, a trivial difference from H0 leads to a statistically significant outcome. Such a trivial difference may be practically unimportant.

Confidence Interval vs. Test
Confidence intervals make positive statements about the population: a confidence interval provides a range of parameter values that are compatible with the observed data. Hypothesis tests provide negative statements: a test gives a precise analysis of specific hypothesized values for a parameter and attempts to reject a specific hypothesis for a parameter.

Two-tailed Hypothesis Test
Hypotheses in a Two-tailed Hypothesis Test are of the following form. Mean: H0: µ = 0.005 and Ha: µ ≠ 0.005. Proportion: H0: p = 0.2 and Ha: p ≠ 0.2. The calculation of the test statistic is identical to the calculation in a One-tailed Hypothesis Test.

Two-Tailed Hypothesis Test (continued)
By convention, the p-value in a two-tailed test is defined as two times the p-value of the corresponding one-tailed test. As a consequence, the two-tailed p-value does not have the intuitive interpretation along the lines of "the probability of the sample result assuming the null is true." This convention leads to a paradox.

One-tailed Test on IBM Returns
Step 1: H0: µ ≤ 0.005; Ha: µ > 0.005.
Step 2: calculation of the test statistic. $t = \frac{0.0106 - 0.005}{0.004559} \approx 1.236$.
Step 3: calculation of the p-value. T.DIST.RT(1.236,311) ≈ 0.1088.
Step 4: compare the p-value to α = 0.15. p-value = 0.1088 < 0.15 = α. Reject H0.

Two-tailed Test on IBM Returns
Step 1: H0: µ = 0.005; Ha: µ ≠ 0.005.
Step 2: calculation of the test statistic. $t = \frac{0.0106 - 0.005}{0.004559} \approx 1.236$.
Step 3: calculation of the p-value. T.DIST.2T(1.236,311) ≈ 0.2175.
Step 4: compare the p-value to α = 0.15. p-value = 0.2175 > 0.15 = α. Do NOT reject H0.

Paradox
According to the one-tailed hypothesis test we can prove that µ > 0.005. But according to the two-tailed test we cannot prove that µ ≠ 0.005. That's the paradox!
The reason for the convention leading to the paradox is to obtain a sensible relation between two-tailed hypothesis tests and confidence intervals.

Two-tailed Tests and Confidence Interval
The hypothesis Ha: µ ≠ 0.005 can be proved at the significance level α if and only if the (1 – α)×100% confidence interval does not include 0.005.

Summary
Discussed hypothesis testing methodology. Introduced the four-step process of hypothesis testing. Defined the p-value. Performed the z-test for the proportion. Performed the t-test for the mean. Discussed the two-tailed hypothesis test.

Best Practices
Be sure that the data are an SRS from the population. Pick the hypotheses before looking at the data. Pick the α-level before you compute the test statistic and the p-value. Think about whether α = 0.05 is appropriate for each test. Report a p-value to summarize the outcome of a test.

Pitfalls
Do not confuse statistical significance with substantive importance. Do not think that the p-value is the probability that the null hypothesis is true. Avoid cluttering a test summary with jargon.

Managerial Statistics KH 19
4 – Simple Linear Regression
Course material adapted from Chapter 19 of our textbook, Statistics for Business, 2e, © 2013 Pearson Education, Inc.

Learning Objectives
Calculate and interpret the simple linear regression equation for a set of data. Describe the meaning of the coefficients of the regression equation in the context of business applications. Examine and interpret the scatterplot and the residual plot as they relate to a regression. Understand the meaning (and limitations) of the R-squared statistic.

Diamond Prices
Motivation: what is the relationship between the price and weight of diamonds? Method: using a sample of 320 emerald-cut diamonds of various weights, regression analysis produces an equation that relates price to weight. Mechanics: let y denote the response ("dependent") variable (price) and let x denote the explanatory ("independent") variable (weight).

Scatterplot of Price vs. Weight
(Figure: scatterplot of Price ($), from $0 to $2,000, against Weight (carats), from 0.3 to 0.55.)

Linear Equation
There appears to be a linear trend. We identify the trend line ("best-fit line" or "fitted line") by an intercept b0 and a slope b1. The equation of the fitted line is Estimated Price = b0 + b1 × Weight. In generic terms, ŷ = b0 + b1 x.

Residuals
Not all data points will lie on the best-fit line. The Residuals are the vertical deviations from the data points to the line (e = y − ŷ).

Method of Least Squares
The Method of Least Squares determines the best-fit line by minimizing the sum of squared residuals. The method uses differential calculus to obtain the values of the coefficients b0 and b1 that minimize the sum of squared residuals, also called the sum of squared errors, SSE.

Minimizing SSE
Let the index i indicate the i-th data point, (xi, yi).
min SSE = min Σᵢ eᵢ² = min Σᵢ (yᵢ − ŷᵢ)² = min Σᵢ [yᵢ − (b0 + b1 xᵢ)]²

8 Least Squares Regression
The method of least squares generates the following coefficient values:
b1 = Σᵢ (xᵢ − x̄)(yᵢ − ȳ) / Σᵢ (xᵢ − x̄)² = r · (sY / sX)
b0 = ȳ − b1 x̄

9 Diamonds: Fitted Line
The least squares regression equation relating diamond prices to weight is
Estimated Price = 43.5 + 2670 Weight
Regression: Price ($)
constant: coefficient 43.48910163, std error of coef 71.90155144, t-ratio 0.6048, p-value 54.5715%
Weight (carats): coefficient 2669.745803, std error of coef 172.4731816, t-ratio 15.4792, p-value 0.0000%, beta-weight 0.6555
standard error of regression 170.2149256; R-squared 42.97%; adjusted R-squared 42.79%; number of observations 320; residual degrees of freedom 318; t-statistic for computing 95%-confidence intervals 1.9675

10 Using the Fitted Line
The average price of a diamond that weighs 0.4 carat is
Estimated Price = 43.49 + 2669.75 × 0.4 ≈ 1111.39,
that is, the estimated price is about $1,111. A diamond that weighs 0.5 carat costs, on average, about $267 more than one that weighs 0.4 carat (0.1 carat × $2,670 per carat).

11 Illustration
[Figure: fitted line through the scatterplot of Price vs. Weight; figure omitted]

12 Interpreting the Slope
The slope coefficient b1 describes how differences in the explanatory variable x associate with differences in the response y. In the diamond example, we can interpret the slope b1 as the marginal cost of an additional carat (i.e., the marginal cost is $2,670 per carat).

13 Interpreting the Intercept
The intercept b0 estimates the average response when x = 0 (where the line crosses the y-axis). The intercept is the portion of y that is present for all values of x. In the diamond example we can interpret b0 as a fixed cost of $43.49 per diamond.

14 Interpreting the Intercept (continued)
In many applications, the intercept coefficient does not have a useful interpretation. Unless the range of x values includes zero, the value of b0 is the result of an extrapolation.

15 Residual Plot
A residual plot shows the variation that remains in the data after accounting for the linear relationship defined by the fitted line. Put differently, the plot shows the variation of the data points around the fitted line. The residuals should be plotted against the predicted values of y (or against x) to check for patterns.

16 Residual Plot (continued)
If the least squares line captures the association between x and y, then a plot of the residuals should stretch out horizontally with consistent vertical scatter; no particular pattern should be visible. Our task is to visually check for the absence of a pattern.

17 Residuals vs. Predicted Values
[Residual plot: residuals vs. predicted values of Price ($); figure omitted]

18 Variation of Residuals
The standard deviation of the residuals measures how much the residuals vary around the fitted line. This standard deviation is called the Standard Error of Regression or the Root Mean Squared Error (RMSE):
se = √(SSE/(n − 2)) = √((e₁² + e₂² + … + e_n²)/(n − 2))

19 Diamonds
For the diamond example, se = 170.21. The standard error of regression is $170.21.
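The slide formulas translate directly into code. Here is a minimal Python sketch with hypothetical diamond-like data (NumPy assumed; the parameter values are invented to mimic the example, not taken from the case data):

```python
# Minimal sketch: least squares via b1 = r*sY/sX and b0 = ybar - b1*xbar.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0.3, 0.55, 320)                   # hypothetical weights (carats)
y = 43.5 + 2670 * x + rng.normal(0, 170, 320)     # hypothetical prices with noise

b1 = np.corrcoef(x, y)[0, 1] * y.std(ddof=1) / x.std(ddof=1)
b0 = y.mean() - b1 * x.mean()

resid = y - (b0 + b1 * x)                          # e = y - y_hat
se = np.sqrt((resid**2).sum() / (len(x) - 2))      # RMSE = sqrt(SSE/(n-2))
print(f"b0 = {b0:.2f}, b1 = {b1:.1f}, se = {se:.2f}")
```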
[Regression output for Price ($) repeated from slide 9; standard error of regression 170.2149256.]

20 Measures of Variation
[Figure: for an observation (xᵢ, yᵢ), the deviation from the mean ȳ splits into an explained part and an error part; figure omitted]
SSE = Σ(yᵢ − ŷᵢ)², SST = Σ(yᵢ − ȳ)², SSR = Σ(ŷᵢ − ȳ)²

21 Measures of Variation (continued)
SST = total sum of squares: the variation of the yᵢ values around their mean ȳ.
SSR = regression sum of squares: the explained variation attributable to the linear relationship between x and y.
SSE = error sum of squares (sum of squared errors): the variation attributable to factors other than the linear relationship between x and y.

22 Measures of Variation (continued)
Total variation is made up of two parts: SST = SSR + SSE, where
SST (Total Sum of Squares) = Σ(yᵢ − ȳ)²
SSR (Regression Sum of Squares) = Σ(ŷᵢ − ȳ)²
SSE (Error Sum of Squares) = Σ(yᵢ − ŷᵢ)²
and where ȳ = the average value of the dependent variable, yᵢ = the observed values of the dependent variable, and ŷᵢ = the predicted value of y for the given xᵢ value.

23 Coefficient of Determination, R²
The coefficient of determination is the portion of the total variation in the dependent variable that is explained by the variation in the independent variable. The coefficient of determination is also called R-squared and is denoted by r² or R²:
R² = SSR/SST = regression sum of squares / total sum of squares, with 0 ≤ R² ≤ 1.

24 Examples of R-squared Values
r² = 1: a perfect linear relationship between X and Y; 100% of the variation in Y is explained by the variation in X. [figures omitted]

25 Examples of R-squared Values (continued)
0 < r² < 1: weaker linear relationships between X and Y; some but not all of the variation in Y is explained by the variation in X. [figures omitted]

26 Examples of R-squared Values (continued)
r² = 0: no linear relationship between X and Y; the value of Y does not depend on X (none of the variation in Y is explained by the variation in X). [figure omitted]

27 Diamonds
For the diamond example, r² = 0.4297. The R-squared is 43%; that is, the regression explains 43% of the variation in price. [Regression output for Price ($) repeated from slide 9.]

28 Checklist for Simple Regression
Linear: Examine the scatterplot to see whether the pattern resembles a straight line.
Random residual variation: Examine the residual plot to make sure no pattern exists.
(No obvious lurking variable: Think about whether other explanatory variables may better explain the linear association between x and y.)
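Continuing the hypothetical sketch from above, the decomposition SST = SSR + SSE and the R-squared can be computed directly (Python/NumPy assumed; the helper name r_squared is ours, not the textbook's):

```python
# Minimal sketch: variance decomposition and the coefficient of determination.
import numpy as np

def r_squared(y, y_hat):
    """R-squared = SSR/SST for observed values y and fitted values y_hat."""
    sst = ((y - y.mean())**2).sum()       # total sum of squares
    ssr = ((y_hat - y.mean())**2).sum()   # explained (regression) sum of squares
    sse = ((y - y_hat)**2).sum()          # error sum of squares
    assert np.isclose(sst, ssr + sse)     # SST = SSR + SSE for a least-squares fit
    return ssr / sst

# usage with the previous sketch's fit: r_squared(y, b0 + b1 * x)
```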
29 Application: Lease Costs
Motivation: How can a dealer anticipate the effect of age on the value of a used car? The dealer estimates that $4,000 per year is enough to cover the depreciation.

30 Lease Costs
Method: Use regression analysis to find the equation that relates y (resale value in dollars) to x (age of the car in years). The car dealer has data on the prices and ages of 218 used BMWs in the Philadelphia area.

31 Lease Costs (continued)
Mechanics: (Think about lurking variables.) Check the scatterplot. Run the regression. Check the residual plot.

32 Lease Costs: Scatterplot
[Scatterplot: Price vs. Age; figure omitted]
Regression equation: Price = 39851.7199 − 2905.5284 Age

33 Lease Costs: Regression
Regression: Price
constant: coefficient 39851.7199, std error of coef 758.460867, t-ratio 52.5429, p-value 0.0000%
Age: coefficient −2905.5284, std error of coef 219.3264, t-ratio −13.2475, p-value 0.0000%, beta-weight −0.6695
standard error of regression 3366.63713; R-squared 44.83%; adjusted R-squared 44.57%; number of observations 218; residual degrees of freedom 216; t-statistic for computing 95%-confidence intervals 1.9710

34 Lease Costs: Residual Plot
[Residual plot: residuals vs. predicted values of Price; figure omitted]

35 Lease Costs: Regression
Mechanics: The linear regression equation is
Estimated Price = 39,851.72 − 2,905.53 Age
The R-squared is 0.4483; the standard error of regression is se = $3,366.64.

36 Conclusion
Message: The results indicate that used BMWs decline in resale value by about $2,900 per year, so the current lease price of $4,000 per year appears profitable. However, the fitted line leaves more than half of the variation unexplained, and leases longer than 5 years would require extrapolation.

37 Best Practices
Always look at the scatterplot. Know the substantive context of the model. Describe the intercept and slope using the units of the data. Limit predictions to the range of observed conditions.

38 Pitfalls
Do not assume that changing x causes changes in y. Do not forget lurking variables. Do not trust summaries like R-squared without looking at plots. Do not call a regression with a high R-squared "good" or a regression with a low R-squared "bad".

39 Managerial Statistics KH 19
5 – Simple Regression Model
Course material adapted from Chapter 21 of our textbook, Statistics for Business, 2e © 2013 Pearson Education, Inc.

Learning Objectives
Understand the framework of the simple linear regression model. Calculate and interpret confidence intervals for the regression coefficients. Perform hypothesis tests on the regression coefficients. Understand the difference between confidence and prediction intervals for the predicted value.

2 Berkshire Hathaway
Motivation: How can we test the CAPM (Capital Asset Pricing Model) for Berkshire Hathaway stock?
Method: Formulate the simple regression with the percentage excess return on Berkshire Hathaway stock as y and the percentage excess return on the value of the whole stock market (the "value-weighted stock market index") as x.

3 From Description to Inference
We do not only want to describe the historical relationship between x and y that is evident in the data. In addition, we now want to make inferences about the underlying population. We have to think of our data as a sample from a population.

4 From Description to Inference (continued)
Naturally, the question arises: what conclusions can we derive from the sample about the population? The central idea is to use inference related to regression: standard errors, confidence intervals, and hypothesis tests.

5 Model of the Population
The Simple Linear Regression Model (SRM) is a model for the association in the population between an explanatory variable x and a response variable y. The SRM equation describes how the (conditional) mean of y depends on x.
The SRM assumes that these means lie on a straight line with intercept β0 and slope β1:
μ_y|x = E(Y | X = x) = β0 + β1 x

6 Model of the Population (continued)
The response variable y is a random variable; the actual values vary around the mean. The deviations of the responses around their (conditional) mean are called errors:
ε = y − μ_y|x
Errors ε can be positive or negative. They have zero mean; that is, the average deviation from the line is zero.

7 Simple Linear Regression Model
The population regression model is
y = β0 + β1 x + ε
where y is the dependent variable, β0 is the population Y-intercept, β1 is the population slope coefficient, x is the independent variable, and ε is the random error term. The term β0 + β1 x is the linear component; ε is the random error component.

8 Simple Linear Regression Model (continued)
[Figure: the population line with intercept β0 and slope β1; for a given xᵢ, the average value of y is β0 + β1 xᵢ and the observed value is yᵢ = β0 + β1 xᵢ + εᵢ, where εᵢ is the random error for this xᵢ value; figure omitted]

9 Data Generating Process
The "true regression line" is a characteristic of the population, not of the observed data. The true line's parameters β0 and β1 are (and will remain) unknown! The SRM is a model and offers a simplified view of the population. The observed data points are a simple random sample from the population. The fitted line provides an estimate of the population regression line.

10 Simple Linear Regression Equation
The simple linear regression equation provides an estimate of the population regression line:
ŷᵢ = b0 + b1 xᵢ
where ŷᵢ is the estimated (or predicted) y value for observation i, b0 is the estimate of the regression intercept, b1 is the estimate of the regression slope, and xᵢ is the value of x for observation i. The individual random error terms are
eᵢ = yᵢ − ŷᵢ = yᵢ − (b0 + b1 xᵢ)

11 Estimates vs. Parameters
[Figure omitted]

12 From Description to Inference
We want to use the estimated regression line to make inferences about the true relationship between the explanatory and the response variable. The central idea is to use the standard statistical tools: standard errors, confidence intervals, and hypothesis tests. The application of these tools requires us to make some assumptions.

13 SRM: Classical Assumptions
(1) The regression model is linear.
(2) The error term ε has zero mean, E(ε) = 0.
(3) The explanatory variable x and the error term ε are uncorrelated.
(4) The error terms are uncorrelated with each other.

14 SRM: Classical Assumptions (continued)
(5) The error term has a constant variance, Var(ε) = σe² for any value of x (homoskedasticity).
(6) The error terms are normally distributed. (This assumption is optional but usually invoked.)

15 Inference
If assumptions (1)–(6) hold, then we can easily compute confidence intervals for the unknown parameters β0 and β1. Similarly, we can perform hypothesis tests for these parameters.

16 Modeling Process: Practical Checklist
Before looking at plots or running a regression, ask the following questions: Does a linear relationship make sense to us? What type of relationship (sign of the coefficients) do we expect? Could there be lurking variables? Then begin working with the data.

17 Modeling Process: Practical Checklist (continued)
Plot y versus x and verify a linear association in the scatterplot. Compute the fitted line. Plot the residuals versus the predicted values (or x) and inspect the residual plot. Do the residuals appear to be independent? Do the residuals appear to have similar variances? (Do the residuals appear to be nearly normal?) (Time series require additional checks.)
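To make the estimates-versus-parameters distinction concrete, here is a minimal Python simulation of the SRM's data-generating process (NumPy assumed; all parameter values are made up for illustration):

```python
# Minimal sketch: simulate y = beta0 + beta1*x + eps, then re-estimate the line.
import numpy as np

rng = np.random.default_rng(1)
beta0, beta1, sigma = 2.0, 0.7, 6.5            # "true" population parameters

x = rng.normal(0, 4.5, 375)
eps = rng.normal(0, sigma, 375)                # errors: zero mean, constant variance
y = beta0 + beta1 * x + eps

b1, b0 = np.polyfit(x, y, 1)                   # fitted slope and intercept
print(f"true: ({beta0}, {beta1})  estimated: ({b0:.2f}, {b1:.2f})")
```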
18 CAPM: Berkshire Hathaway
Check the scatterplot: the relationship appears linear.
[Scatterplot: % Change Berk-Hath vs. % Change Market; figure omitted]

19 CAPM: Berkshire Hathaway (continued)
Run the simple linear regression.
Regression: % Change Berk-Hath
constant: coefficient 1.39620459, std error of coef 0.33968223, t-ratio 4.1103, p-value 0.0049%
% Change Market: coefficient 0.72234946, std error of coef 0.07776332, t-ratio 9.2891, p-value 0.0000%, beta-weight 0.4334
standard error of regression 6.51740865; R-squared 18.79%; adjusted R-squared 18.57%; number of observations 375; residual degrees of freedom 373; t-statistic for computing 95%-confidence intervals 1.9663

20 CAPM: Berkshire Hathaway (continued)
Check the residual plot: no pattern visible.
[Residual plot: residuals vs. predicted values of % Change Berk-Hath; figure omitted]

21 Standard Errors of the Coefficients
The standard errors of the coefficients describe the sample-to-sample variability of the coefficients b0 and b1. The estimated standard error of b1, se(b1), is
se(b1) = se / (√(n − 1) · sx)

22 Estimated Standard Error of b1
The estimated standard error of b1 depends on three factors. The standard deviation of the residuals, se: as se increases, the standard error se(b1) increases. The sample size n: as n increases, se(b1) decreases. The standard deviation sx of x: as sx increases, se(b1) decreases.

23 CAPM: Berkshire Hathaway
[CAPM regression output for Berkshire Hathaway repeated from slide 19.]

24 Confidence Intervals
Confidence intervals for the coefficients: the 95% confidence interval for β1 is
b1 ± t(0.025, n−2) · se(b1)
and the 95% confidence interval for β0 is
b0 ± t(0.025, n−2) · se(b0)

25 Confidence Intervals: CAPM
The 95% confidence interval for β1 is 0.72234 ± 1.9663 × 0.077763 = [0.5694, 0.8753]. The 95% confidence interval for β0 is 1.3962 ± 1.9663 × 0.33968 = [0.7283, 2.0641].

26 Hypothesis Tests
Hypothesis tests on the coefficients: the test statistic for H0: β1 = 0 is t = b1 / se(b1); the test statistic for H0: β0 = 0 is t = b0 / se(b0).

27 Hypothesis Tests: CAPM
Hypothesis test of statistical significance for β1: the t-statistic of 9.2891 with a p-value of less than 0.0001% indicates that the slope is significantly different from zero. Hypothesis test of statistical significance for β0: the t-statistic of 4.1103 with a p-value of 0.0049% indicates that the intercept is significantly different from zero.
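The intervals and t-ratios above follow mechanically from the regression output. A minimal Python check (SciPy assumed, using the coefficient estimates and standard errors quoted on the slides):

```python
# Minimal sketch: coefficient confidence intervals and t-ratios for the CAPM fit.
from scipy import stats

n = 375
b1, se_b1 = 0.72234946, 0.07776332   # slope and its standard error
b0, se_b0 = 1.39620459, 0.33968223   # intercept and its standard error

t_crit = stats.t.ppf(0.975, df=n - 2)                  # ≈ 1.9663
print(b1 - t_crit * se_b1, b1 + t_crit * se_b1)        # ≈ 0.5694, 0.8753
print(b0 - t_crit * se_b0, b0 + t_crit * se_b0)        # ≈ 0.7283, 2.0641
print(b1 / se_b1, b0 / se_b0)                          # t-ratios ≈ 9.2891, 4.1103
```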
28 Application: Locating a Gas Station
Motivation: Does traffic volume affect gasoline sales? How much more gasoline can a gas station with an average of 40,000 drive-bys a day be expected to sell compared to one with an average of 32,000 drive-bys?

29 Gas Station
Method: Use sales data from a recent month obtained from 80 gas stations (from the same franchise). Run a regression of sales against traffic volume. The 95% confidence interval for 8,000 times the estimated slope will indicate how much more gas the busier location is expected to sell.

30 Gas Station (continued)
Mechanics: (Think about lurking variables.) Check the scatterplot. Run the regression. Check the residual plot.

31 Gas Station: Scatterplot
Mechanics: Check the scatterplot; the relationship appears linear.
[Scatterplot: Sales (000 gal.) vs. Traffic Volume (000); figure omitted]

32 Gas Station: Regression
Mechanics: Run the regression.
Regression: Sales (000 gal.)
constant: coefficient −1.3380974, std error of coef 0.94584359, t-ratio −1.4147, p-value 16.1132%
Traffic Volume (000): coefficient 0.23672864, std error of coef 0.02431421, t-ratio 9.7362, p-value 0.0000%, beta-weight 0.7407
standard error of regression 1.5054068; R-squared 54.86%; adjusted R-squared 54.28%; number of observations 80; residual degrees of freedom 78; t-statistic for computing 95%-confidence intervals 1.9908

33 Gas Station: Residual Plot
Mechanics: Check the residual plot; no pattern.
[Residual plot: residuals vs. predicted values of Sales (000 gal.); figure omitted]

34 Gas Station: Regression
Mechanics: The linear regression equation is
Estimated Sales = −1.338 + 0.23673 Traffic Vol.
The 95% confidence interval for β1 is 0.23673 ± 1.9908 × 0.024314 = [0.1883, 0.2851]. The 95% confidence interval for 8000 × β1 is 8000 × [0.1883, 0.2851] ≈ [1507, 2281].

35 Conclusion
Message: Based on a sample of 80 gas stations, we expect that a station located at a site with 40,000 drive-bys will sell, on average, from 1,507 to 2,281 more gallons of gas daily than a location with 32,000 drive-bys.

36 Standard Errors of the Fitted Value
The fitted value ŷ for a given value of x is an estimator of two different unknown values: it is a point estimate for the average value of y for all data points with the particular x value, and it is a point estimate for the y value of a single observation with this particular x value. It is much more difficult to make a prediction about a single observation than to make a prediction about an average value.

37 SE Estimated Mean
[Figure: fitted line ŷ = b0 + b1·x with ŷ = 8.13 at Traffic Volume x = 40, and the confidence interval for average Sales at Traffic Volume = 40; figure omitted]
The standard error of ŷ for estimating μ_y|x is the SE of the estimated mean.

38 SE Prediction
[Figure: prediction interval for Sales at Traffic Volume = 40 around ŷ = 8.13; figure omitted]
The standard error of ŷ for estimating the average y at x is the SE of the estimated mean. The standard error of ŷ for estimating an individual y is the SE of prediction, with
(SE of prediction)² = (SE of est. mean)² + (SE of regression)²

39 Standard Errors of the Fitted Value
The Standard Error of the Estimated Mean captures the variability of the estimated mean of y around μ_y|x, the (true but unknown) population average of y at the given x. The fitted value ŷ = b0 + b1·x is our estimator for the average y at x; the SE of the estimated mean is a measure of its sample-by-sample variation.

40 Standard Errors of the Fitted Value (continued)
The Standard Error of Regression, se, measures the variability of the individual y around the fitted line. By SRM assumption (5) (homoskedasticity), the standard deviation of y around the average μ_y|x does not vary with x; this standard deviation is estimated by the SE of regression. (Note: it is not the standard error of any estimator.)

41 Standard Errors of the Fitted Value (continued)
The Standard Error of Prediction captures the variability of any individual observation y around μ_y|x, the (true but unknown) population average of y at any given x:
(SE of Prediction)² = (SE of Est. Mean)² + (SE of Regression)²
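Given the standard errors reported in the prediction output on the next slides, the confidence and prediction intervals can be recomputed in a few lines of Python (SciPy assumed; the numbers are those quoted on the slides):

```python
# Minimal sketch: 95% CI and PI for Sales at Traffic Volume = 40.
from scipy import stats

y_hat = 8.131048                              # predicted Sales (000 gal.)
se_mean, se_reg = 0.173427, 1.505407          # SE of estimated mean, SE of regression
se_pred = (se_mean**2 + se_reg**2) ** 0.5     # SE of prediction ≈ 1.515364

t_crit = stats.t.ppf(0.975, df=78)            # ≈ 1.9908
ci = (y_hat - t_crit * se_mean, y_hat + t_crit * se_mean)
pi = (y_hat - t_crit * se_pred, y_hat + t_crit * se_pred)
print(f"95% CI: [{ci[0]:.3f}, {ci[1]:.3f}]")  # ≈ [7.786, 8.476]
print(f"95% PI: [{pi[0]:.3f}, {pi[1]:.3f}]")  # ≈ [5.114, 11.148]
```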
42 Two Different Intervals
Confidence Interval: an interval designed to hold an unknown population parameter with some level (often 95%) of confidence. Prediction Interval: an interval designed to hold a fraction of the values of the variable y (for a given value of x). A prediction interval differs from a confidence interval because it makes a statement about the location of a new observation rather than about a parameter of a population.

43 CI vs. PI
The (1 − α) confidence interval for a mean is Predicted Value ± TINV(α, df) × SE Est. Mean. The prediction interval for a single observation is Predicted Value ± TINV(α, df) × SE Prediction. Prediction intervals are sensitive to SRM assumptions (5), constant variance, and (6), normal errors.

44 Gas Station: CI and PI
Prediction, using the most recent regression: constant −1.3381; Traffic Volume (000) coefficient 0.236729, value for prediction 40.
predicted value of Sales (000 gal.) 8.131048; standard error of prediction 1.515364; standard error of regression 1.505407; standard error of estimated mean 0.173427; confidence level 95.00%; t-statistic 1.9908; residual degrees of freedom 78
confidence limits for prediction: lower 5.114191, upper 11.14791 (95% PI: [5.114, 11.148])
confidence limits for estimated mean: lower 7.785781, upper 8.476316 (95% CI: [7.786, 8.476])

45 Interpretation of Intervals
We are 95% confident that average sales at gas stations with 40,000 drive-bys per day are between 7,786 gallons and 8,476 gallons. We are 95% confident that sales at an individual gas station with 40,000 drive-bys per day are between 5,114 gallons and 11,148 gallons.

46 Best Practices
Verify that your model makes sense, both visually and substantively. Consider other possible explanatory variables. Check the conditions, in the listed order. Use confidence intervals to express what you know about the slope and intercept. Check the assumptions of the SRM carefully before using prediction intervals. Be careful when extrapolating.

47 Pitfalls
Don't overreact to residual plots. Do not mistake varying amounts of data for unequal variances. Do not confuse confidence intervals with prediction intervals. Do not expect that r² and se must improve with a larger sample.

48 Managerial Statistics KH 19
6 – Multiple Regression
Course material adapted from Chapter 23 of our textbook, Statistics for Business, 2e © 2013 Pearson Education, Inc.

Learning Objectives
Apply multiple regression analysis to decision-making situations in business. Analyze and interpret multiple regression models. Understand the difference between partial and marginal slopes. Decide when to exclude variables from a regression model.

2 Chain of Women's Apparel Stores
Motivation: How are sales at a chain of women's apparel stores (annually, in dollars per square foot of retail space) affected by competition (the number of competing apparel stores in the same shopping mall)?
First approach: Formulate a simple regression with sales at stores of this chain as the response variable y and the number of competing stores as the explanatory variable x.

3 Scatterplot of Sales vs. Competitors
[Scatterplot: Sales ($/sq ft) vs. Competitors; figure omitted]

4 Simple Linear Regression
Regression: Sales ($/sq ft)
constant: coefficient 502.201557, std error of coef 25.4436616, t-ratio 19.7378, p-value 0.0000%
Competitors: coefficient 4.63517778, std error of coef 8.74691578, t-ratio 0.5299, p-value 59.8029%, beta-weight 0.0666
standard error of regression 105.778443; R-squared 0.44%; adjusted R-squared −1.14%; number of observations 65; residual degrees of freedom 63; t-statistic for computing 95%-confidence intervals 1.9983
A positive relationship: more competitors, higher sales! Does this make sense?

5 Interpretation
A large number of competitors is indicative of a shopping mall in a location with a high median household income.
Put differently, the number of competitors and the median household income are positively correlated. The simple regression of Sales on Competitors mixes the decrease in sales associated with increased competition with the increase in sales associated with the higher income levels that accompany a larger number of competitors.

6 Apparel Sales: Multiple Regression
Multiple regression with two explanatory variables: the median household income in the area (in thousands of dollars) and the number of competing apparel stores in the same mall. The response variable is as before: sales at stores of the chain (annually, in dollars per square foot of retail space).

7 Apparel Sales: Multiple Regression
Regression: Sales ($/sq ft)
constant: coefficient 60.3586702, std error of coef 49.290165, t-ratio 1.2246, p-value 22.5374%
Income ($000): coefficient 7.965979876, std error of coef 0.838249629, t-ratio 9.5031, p-value 0.0000%, beta-weight 0.8727
Competitors: coefficient −24.16503223, std error of coef 6.38991396, t-ratio −3.7817, p-value 0.0353%, beta-weight −0.3473
standard error of regression 68.03062709; R-squared 59.47%; adjusted R-squared 58.17%; number of observations 65; residual degrees of freedom 62; t-statistic for computing 95%-confidence intervals 1.9990
Estimated Sales = 60.359 + 7.966 Income − 24.165 Competitors

8 Sales: Residual Plot
Check the residual plot: no pattern.
[Residual plot: residuals vs. predicted values of Sales ($/sq ft); figure omitted]

9 Interpreting the Equation
The slope 7.966 for Income implies that a store in a location whose median household income is $10,000 higher sells, on average, $79.66 more per square foot than a store in a less affluent location with the same number of competitors. The slope −24.165 for Competitors implies that, among stores in equally affluent locations, each additional competitor lowers average sales by $24.165 per square foot.

10 Multiple Regression
The Multiple Regression Model (MRM) is a model for the association in the population between multiple explanatory variables x1, x2, …, xk and a response y. While the SRM bundles all but one explanatory variable into the error term, multiple regression allows for the inclusion of several variables in the model. Multiple regression separates the effects of each explanatory variable on the response and reveals which ones really matter.

11 Multiple Regression Model
Idea: Examine the linear relationship between a response (y) and two or more explanatory variables (xᵢ). The multiple regression model with k independent variables is
y = β0 + β1 x1 + β2 x2 + … + βk xk + ε
where β0 is the y-intercept, β1, …, βk are the population slopes, and ε is the random error term.

12 Multiple Regression Equation
The coefficients of the multiple regression model are estimated using sample data. The estimated multiple regression equation is
ŷ = b0 + b1 x1 + b2 x2 + … + bk xk
with estimated intercept b0 and estimated slope coefficients b1, …, bk.

13 Graph for Two-Variable Model
[Figure: the regression plane ŷ = b0 + b1 x1 + b2 x2 over the (x1, x2) plane; figure omitted]

14 Residuals in a Two-Variable Model
[Figure: a sample observation yᵢ and its residual eᵢ = yᵢ − ŷᵢ, the vertical deviation from the plane ŷ = b0 + b1 x1 + b2 x2; figure omitted]

15 MRM: Classical Assumptions
(1) The regression model is linear.
(2) The error term ε has zero mean, E(ε) = 0.
(3) All explanatory variables x1, x2, …, xk are uncorrelated with the error term ε.
(4) Observations of the error term are uncorrelated with each other.

16 MRM: Classical Assumptions (continued)
(5) The error term has a constant variance, Var(ε) = σe² for any value of x (homoskedasticity).
(6) No explanatory variable is a perfect linear function of any other explanatory variables.
(7) The error terms are normally distributed. (This assumption is optional but usually invoked.)
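A two-variable fit like the apparel regression can be reproduced with ordinary least squares on a design matrix. A minimal Python sketch with synthetic data (NumPy assumed; coefficients and distributions are invented to mimic the example, not the case data):

```python
# Minimal sketch: multiple regression y = b0 + b1*x1 + b2*x2 via least squares.
import numpy as np

rng = np.random.default_rng(2)
income = rng.normal(60, 10, 65)                       # median income ($000), made up
compet = np.clip(np.round(income / 12 - 2 + rng.normal(0, 1, 65)), 0, 7)
sales = 60 + 8.0 * income - 24.0 * compet + rng.normal(0, 68, 65)

X = np.column_stack([np.ones(65), income, compet])    # design matrix with intercept
b, *_ = np.linalg.lstsq(X, sales, rcond=None)         # least-squares coefficients
print(f"b0 = {b[0]:.1f}, b_Income = {b[1]:.2f}, b_Competitors = {b[2]:.2f}")
```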
17 Multiple vs. Simple Regressions
Partial slope: the slope of an explanatory variable in a multiple regression, which statistically excludes the effects of the other explanatory variables. Marginal slope: the slope of the explanatory variable in a simple regression. Partial and marginal slopes agree only when the explanatory variables are uncorrelated.

18 Partial Slopes: Women's Apparel
[Path diagram: Competitors → Sales (−); Income → Sales (+); Competitors and Income correlated (+); figure omitted]
Competitors has a direct negative effect on Sales. Income has a positive effect on Sales. Competitors and Income are positively correlated.

19 Marginal Slope: Women's Apparel
[Path diagram: direct effect Competitors → Sales (−); indirect effect Competitors → Income → Sales (+ × +); figure omitted]
The direct effect of Competitors on Sales is negative (−). The indirect effect (via Income) is positive (+ × +). The marginal slope of Competitors in the simple regression is the sum of these two effects.

20 Partial vs. Marginal Slopes
The MRM separates the individual effects of all explanatory variables (into the partial slopes); indirect effects (resulting from correlation among the explanatory variables) are not present. The SRM does not separate individual effects, so indirect effects are present: the marginal slope of the (single) explanatory variable reflects both the direct effect of this variable and the indirect effect(s) due to the missing explanatory variable(s).

21 Apparel Sales: Multiple Regression
[Regression output for Sales ($/sq ft) repeated from slide 7.]
Estimated Sales = 60.359 + 7.966 Income − 24.165 Competitors

22 Inference in Multiple Regression
Hypothesis test of statistical significance for β1: the t-ratio of 9.5031 with a p-value of less than 0.0001% indicates that the partial slope of Income is significantly different from zero. Hypothesis test of statistical significance for β2: the t-statistic of −3.7817 with a p-value of 0.0353% indicates that the partial slope of Competitors is significantly different from zero.

23 Inference in Multiple Regression (continued)
Both explanatory variables, Income and Competitors, have a statistically significant effect on the response, Sales. Hypothesis test of statistical significance for β0: the t-statistic of 1.2246 with a p-value of 22.5374% indicates that the constant coefficient is not significantly different from zero.

24 Prediction with a Multiple Regression
Prediction, using the most recent regression: constant 60.35867; Income ($000) coefficient 7.965979876, value for prediction 50; Competitors coefficient −24.16503223, value for prediction 3.
predicted value of Sales ($/sq ft) 386.1626; standard error of prediction 69.9607; standard error of regression 68.0306; standard error of estimated mean 16.3198; confidence level 95.00%; t-statistic 1.9990; residual degrees of freedom 62
confidence limits for prediction: lower 246.3131, upper 526.0120
confidence limits for estimated mean: lower 353.5398, upper 418.7853

25 Prediction with a Multiple Regression (continued)
The 95% prediction interval for annual sales per square foot at a location with a median household income of $50,000 and 3 competitors is [$246.31, $526.01]. The 95% confidence interval for average annual sales per square foot at locations with a median household income of $50,000 and 3 competitors is [$353.54, $418.79].
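The partial-versus-marginal distinction is easy to see in simulation: with correlated predictors, the simple-regression slope of Competitors mixes the direct and indirect effects, while the multiple-regression slope isolates the direct one. A minimal Python sketch (NumPy assumed; all numbers invented):

```python
# Minimal sketch: marginal slope (simple regression) vs. partial slope (MRM).
import numpy as np

rng = np.random.default_rng(3)
income = rng.normal(60, 10, 5000)
compet = 0.25 * income + rng.normal(0, 1.5, 5000)       # correlated with income
sales = 60 + 8.0 * income - 24.0 * compet + rng.normal(0, 68, 5000)

marginal = np.polyfit(compet, sales, 1)[0]              # mixes direct + indirect
X = np.column_stack([np.ones(5000), income, compet])
partial = np.linalg.lstsq(X, sales, rcond=None)[0][2]   # direct effect only
print(f"marginal ≈ {marginal:.1f}, partial ≈ {partial:.1f}")  # partial ≈ -24
```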
26 Application: Subprime Mortgages
Motivation: A banking regulator would like to verify how lenders use credit scores to determine the interest rate paid by subprime borrowers. The regulator would like to separate the effect of the credit score from that of other variables such as the loan-to-value (LTV) ratio, the income of the borrower, and the value of the home.

27 Subprime Mortgages
Method: Use multiple regression on data obtained for 372 mortgages from a credit bureau. The explanatory variables are the LTV, the credit score (FICO), the stated income of the borrower, and the home value. The response is the annual percentage rate of interest on the loan (APR).

28 Subprime Mortgages (continued)
Mechanics: Run the regression. Check the residual plot.

29 Subprime Mortgages: Regression
Regression: APR
constant: coefficient 23.7253652, std error of coef 0.6859028, t-ratio 34.5900, p-value 0.0000%
LTV: coefficient −1.588843, std error of coef 0.51971233, t-ratio −3.0572, p-value 0.2398%, beta-weight −0.1339
FICO: coefficient −0.0184318, std error of coef 0.00135016, t-ratio −13.6515, p-value 0.0000%, beta-weight −0.6008
Stated Income ($000): coefficient 0.000403212, std error of coef 0.003326563, t-ratio 0.1212, p-value 90.3591%, beta-weight 0.0047
Home Value ($000): coefficient −0.000752082, std error of coef 0.000818648, t-ratio −0.9187, p-value 35.8862%, beta-weight −0.0362
standard error of regression 1.24383566; R-squared 46.31%; adjusted R-squared 45.73%; number of observations 372; residual degrees of freedom 367; t-statistic for computing 95%-confidence intervals 1.9664

30 Subprime Mortgages: Residual Plot
Mechanics: Check the residual plot; no pattern.
[Residual plot: residuals vs. predicted values of APR; figure omitted]

31 Subprime Mortgages: Regression
Mechanics: The linear regression equation is
Estimated APR = 23.725 − 1.5888 LTV − 0.01843 FICO + 0.0004032 Stated Income − 0.000752 Home Value
The first two variables, LTV and credit score (FICO), have low p-values. The remaining two variables, Stated Income and Home Value, have high p-values.

32 Conclusion
Message: The regression analysis shows that the credit score (FICO) of the borrower and the loan's LTV affect interest rates in this market. Neither the income of the borrower nor the home value improves a model that already contains these two variables.

33 Dropping Variables
Since the variables Stated Income and Home Value have no statistically significant effect on the response variable APR, we may decide to drop them from the regression. We run a new regression with only two explanatory variables, LTV and credit score (FICO).

34 New Regression
Regression: APR
constant: coefficient 23.6913824, std error of coef 0.64984629, t-ratio 36.4569, p-value 0.0000%
LTV: coefficient −1.5773413, std error of coef 0.51842379, t-ratio −3.0426, p-value 0.2514%, beta-weight −0.1329
FICO: coefficient −0.0185656, std error of coef 0.00134003, t-ratio −13.8546, p-value 0.0000%, beta-weight −0.6051
standard error of regression 1.24189462; R-squared 46.19%; adjusted R-squared 45.90%; number of observations 372; residual degrees of freedom 369; t-statistic for computing 95%-confidence intervals 1.9664
Estimated APR = 23.691 − 1.5773 LTV − 0.018566 FICO

35 Removing Variables
Multiple regressions may often indicate that some of the explanatory variables are not statistically significant. Depending on the context of the analysis, we may decide to remove insignificant variables from the regression. If we remove such variables, we should do so one at a time to make sure that we don't omit a useful variable.
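The one-at-a-time rule can be mechanized as backward elimination. The following is a hypothetical Python sketch (pandas and statsmodels assumed, with an invented data frame df holding the response and predictor columns); it is one possible automation of the slide's advice, not the course's procedure:

```python
# Minimal sketch: drop the least significant predictor, refit, and repeat.
import pandas as pd
import statsmodels.api as sm

def backward_eliminate(df: pd.DataFrame, response: str, alpha: float = 0.05):
    predictors = [c for c in df.columns if c != response]
    while predictors:
        X = sm.add_constant(df[predictors])
        fit = sm.OLS(df[response], X).fit()
        pvals = fit.pvalues.drop("const")     # keep the intercept regardless
        worst = pvals.idxmax()                # least significant predictor
        if pvals[worst] <= alpha:             # everything left is significant
            return fit
        predictors.remove(worst)              # remove ONE variable, then refit
    return None
```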
36 Best Practices
Know the business context of your model. Distinguish marginal from partial slopes. Check the assumptions of the model before interpreting the output.

37 Pitfalls
Don't confuse a multiple regression with several simple regressions. Don't believe that you have all of the important variables. Do not think that you have found causal effects. Do not interpret an insignificant t-ratio to mean that an explanatory variable has no effect. Don't think that the order of the explanatory variables in a regression matters. Don't remove several explanatory variables from your model at once.

38 Managerial Statistics KH 19
7 – Dummy Variables
Course material adapted from Chapter 25 of our textbook, Statistics for Business, 2e © 2013 Pearson Education, Inc.

Learning Objectives
Incorporate qualitative variables into regression models by using dummy variables. Interpret the effect of a dummy variable on the regression equation. Analyze interaction effects by introducing slope dummy variables. Apply and interpret regression models with slope dummy variables.

2 Dummy Variable
A Dummy Variable is a variable that only takes the values 0 or 1. It usually expresses a qualitative difference, e.g., whether the observation is for a man or a woman, or from customer A or B, etc. For example, we can define a dummy variable Group as follows: Group = 0 if the data point is for a woman; Group = 1 if the data point is for a man.

3 Gender and Salaries
Motivation: How can we examine the impact of the variables "years of experience" and "gender (male/female)" on the average salaries of managers?
Method: Represent the categorical variable gender by a dummy variable. Then run a regression with the response variable Salary and two explanatory variables: years of experience and the new dummy variable.

4 Regression with a Dummy
Regression: Salary ($000)
constant: coefficient 133.467579, std error of coef 2.13151142, t-ratio 62.6164, p-value 0.0000%
Years of Experience: coefficient 0.853708343, std error of coef 0.192481379, t-ratio 4.4353, p-value 0.0016%, beta-weight 0.3449
Group: coefficient 1.024190096, std error of coef 2.057626623, t-ratio 0.4978, p-value 61.9298%, beta-weight 0.0387
standard error of regression 11.77881458; R-squared 13.11%; adjusted R-squared 12.09%; number of observations 174; residual degrees of freedom 171; t-statistic for computing 95%-confidence intervals 1.9739
Estimated Salary = 133.47 + 0.8537 Years + 1.024 Group

5 Substituting Values for the Dummy
Estimated Salary = 133.47 + 0.8537 Years + 1.024 Group
Equation for women (Group = 0): Estimated Salary = 133.47 + 0.8537 Years
Equation for men (Group = 1): Estimated Salary = 134.49 + 0.8537 Years

6 Effect of the Dummy Coefficient
After substituting the two values 0 and 1 for the dummy variable, we obtain two regression equations. The equation for Group = 0 yields the relationship between Salary and Years for women; the equation for Group = 1 yields the relationship for men. The two lines have different intercepts but identical slopes. The coefficient of the dummy variable, bGroup = 1.024, determines the difference between the intercepts of the two regression lines.

7 In General Terms
Regression with two variables, x1 and dum:
ŷ = b0 + b1 x1 + b2 dum
Substituting values for the dummy:
dum = 0: ŷ = b0 + b1 x1 + b2(0) = b0 + b1 x1
dum = 1: ŷ = b0 + b1 x1 + b2(1) = (b0 + b2) + b1 x1
Different intercepts, same slope.

8 Illustration
[Figure: two parallel lines with common slope b1 and intercepts b0 and b0 + b2; figure omitted]
If H0: β2 = 0 is rejected, then the dummy variable dum has a significant effect on the response y.

9 Dummy: Gender and Salaries
The coefficient of the dummy variable Group, bGroup, can be interpreted as the difference in starting salaries between men and women. The coefficient is bGroup = 1.024, so, on average, men have higher starting salaries than women. The p-value of this coefficient, however, is 61.9298%; therefore, the difference in starting salaries appears to be statistically insignificant.
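A dummy regression is an ordinary least-squares fit once the 0/1 column is in the design matrix. A minimal Python sketch with synthetic data (NumPy assumed; coefficients invented to mimic the salary example):

```python
# Minimal sketch: a 0/1 dummy shifts the intercept but not the slope.
import numpy as np

rng = np.random.default_rng(4)
years = rng.uniform(0, 20, 174)
group = rng.integers(0, 2, 174)                # 0 = woman, 1 = man
salary = 133.5 + 0.85 * years + 1.0 * group + rng.normal(0, 11.8, 174)

X = np.column_stack([np.ones(174), years, group])
b0, b_years, b_group = np.linalg.lstsq(X, salary, rcond=None)[0]
print(f"women: Salary = {b0:.2f} + {b_years:.3f} Years")
print(f"men:   Salary = {b0 + b_group:.2f} + {b_years:.3f} Years")
```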
10 Possible Interaction Effect
There is no significant difference between the starting salaries of men and women. But perhaps a significant difference arises during the time of employment; put differently, one group of employees may see larger pay increases than the other. Such an effect is called an Interaction Effect: the variables Group and Years interact in their respective effects on the response variable Salary.

11 Slope Dummy Variable
How can we detect the presence of such an interaction effect? We need to include an Interaction (Variable), also called a Slope Dummy Variable. This new variable is the product of an explanatory variable and a dummy variable.

12 In General Terms
Regression with the variables x1, dum, and x1×dum:
ŷ = b0 + b1 x1 + b2 dum + b3 (x1 × dum)
Substituting values for the dummy:
dum = 0: ŷ = b0 + b1 x1 + b2(0) + b3(x1 × 0) = b0 + b1 x1
dum = 1: ŷ = b0 + b1 x1 + b2(1) + b3(x1 × 1) = (b0 + b2) + (b1 + b3) x1
Different intercept, different slope.

13 Illustration
[Figure: two lines, one with intercept b0 and slope b1, the other with intercept b0 + b2 and slope b1 + b3; figure omitted]
If H0: β2 = 0 is rejected, then the dummy variable dum has a significant effect on the response y. If H0: β3 = 0 is rejected, then the slope dummy variable x1×dum has a significant effect on the response y.

14 Dummy and Slope Dummy
Regression: Salary ($000)
constant: coefficient 130.988793, std error of coef 3.49019381, t-ratio 37.5305, p-value 0.0000%
Years of Experience: coefficient 1.175983272, std error of coef 0.407570912, t-ratio 2.8853, p-value 0.4417%, beta-weight 0.4751
Group: coefficient 4.61128123, std error of coef 4.497011759, t-ratio 1.0254, p-value 30.6627%, beta-weight 0.1743
Group x Years: coefficient −0.41492239, std error of coef 0.462459128, t-ratio −0.8972, p-value 37.0876%, beta-weight −0.2314
standard error of regression 11.78553688; R-squared 13.52%; adjusted R-squared 11.99%; number of observations 174; residual degrees of freedom 170; t-statistic for computing 95%-confidence intervals 1.9740

15 Substituting Values for the Dummy
Estimated Salary = 130.99 + 1.176 Years + 4.611 Group − 0.4149 Group×Years
Equation for women (Group = 0): Estimated Salary = 130.99 + 1.176 Years
Equation for men (Group = 1): Estimated Salary = 135.60 + 0.7611 Years

16 Significance
Question: Is there a statistically significant difference between the salaries paid to women and the salaries paid to men?
Answer: The differences in salaries are statistically insignificant. The p-values of the dummy variable Group and the slope dummy variable Group×Years both exceed 30%.

17 Principle of Marginality
Principle of Marginality: if the slope dummy is statistically significant, retain it as well as both of its components, regardless of their level of significance. If the interaction is not statistically significant, remove it from the regression and re-estimate the equation. A model without an interaction term is simpler to interpret, since the lines fitted to the groups are parallel.

18 Prediction with Slope Dummy
Predictions, using the most recent regression, for two cases: Years = 10, Group = 0, Group×Years = 0 (a woman with 10 years of experience) and Years = 10, Group = 1, Group×Years = 10 (a man with 10 years of experience).
Coefficients: constant 130.98879; Years of Experience 1.1759833; Group 4.6112812; Group x Years −0.4149224.
predicted value of Salary ($000): 142.7486 and 143.2107
standard error of prediction: 11.92218 and 11.84443; standard error of regression 11.78554; standard error of estimated mean: 1.799847 and 1.179728
confidence level 95.00%; t-statistic 1.9740; residual degrees of freedom 170
confidence limits for prediction: [119.214, 166.2832] and [119.8296, 166.5918]
confidence limits for estimated mean: [139.1957, 146.3016] and [140.8819, 145.5395]
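Adding the product column turns the parallel-lines model into one with group-specific slopes. A minimal Python sketch with synthetic data (NumPy assumed; coefficients invented to mimic the example):

```python
# Minimal sketch: the interaction x1*dum lets intercept AND slope differ by group.
import numpy as np

rng = np.random.default_rng(5)
years = rng.uniform(0, 20, 174)
group = rng.integers(0, 2, 174)
salary = (131 + 1.18 * years + 4.6 * group
          - 0.41 * group * years + rng.normal(0, 11.8, 174))

X = np.column_stack([np.ones(174), years, group, group * years])
b0, b1, b2, b3 = np.linalg.lstsq(X, salary, rcond=None)[0]
print(f"women: Salary = {b0:.2f} + {b1:.3f} Years")
print(f"men:   Salary = {b0 + b2:.2f} + {b1 + b3:.3f} Years")
```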
19 Best Practices
Be thorough in your search for confounding variables. Consider interactions. Choose an appropriate baseline group. Write out the fits for the separate groups. Be careful interpreting the coefficient of the dummy variable. (Check for comparable variances in the groups.) (Use color-coding or different plot symbols to identify subsets of observations in plots.)

20 Pitfalls
Don't think that you have adjusted for all of the confounding factors. Don't confuse the different types of slopes. Don't forget to check the conditions of the MRM.

KEL754 / REVISED MARCH 19, 2014
KARL SCHMEDDERS

Germany's Bundesliga: Does Money Score Goals?

"Some people believe football is a matter of life and death; I am very disappointed with that attitude. I can assure you it is much, much more important than that."
William "Bill" Shankly (1913-1981), Scottish footballer and legendary Liverpool manager

"Tor!" ["Goal!"] yelled the jubilant announcer as 22-year-old midfielder Toni Kroos of FC Bayern München fired a blistering shot past Borussia Dortmund's goalkeeper. After sixty-six minutes of scoreless football (soccer in the United States) on December 1, 2012, Bayern had pulled ahead of the reigning German champion and Cup winner.

A sigh escaped Franz Dully, a financial analyst who covered football clubs belonging to the Union of European Football Associations (UEFA). He was disappointed for two reasons: Not only had a bout with the flu kept him home, but as a staunch Dortmund fan he had a decidedly nonprofessional interest in the outcome. The day's showdown between Germany's top professional teams and archrivals would possibly be the deciding match for the remainder of the season; with only three more matches before the mid-season break, FC Bayern had already obtained the coveted title of Herbstmeister (winter champion). History had shown that the league leader at the break often went on to win the coveted German Bundesliga Championship title. It was no guarantee, however, as Dortmund had demonstrated last season when the club had overcome Bayern's mid-season lead to take the title in May. This year Bayern, the league's traditional frontrunner, was determined to reclaim its glory (and trophy).

As the station cut to the delighted Bayern fans in the stands, the phone rang. Dully knew exactly who would be on the other end of the line.

"Tough break, comrade! Wish you were here!" yelled his friend Max Vogel. Dully could barely hear him over the Bayern fans celebrating at the Allianz Arena.

"Let's skip the schadenfreude, shall we? It's most unbecoming."

©2014 by the Kellogg School of Management at Northwestern University. This case was developed with support from the December 2009 graduates of the Executive MBA Program (EMP-76). This case was prepared by Professor Karl Schmedders with the assistance of Charlotte Snyder and Sophie Tinz. Cases are developed solely as the basis for class discussion. Cases are not intended to serve as endorsements, sources of primary data, or illustrations of effective or ineffective management. To order copies or request permission to reproduce materials, call 800-545-7685 (or 617-783-7600 outside the United States or Canada) or e-mail [email protected]. No part of this publication may be reproduced, stored in a retrieval system, used in a spreadsheet, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise) without the permission of Kellogg Case Publishing.

"Who, me?" Vogel asked. "Surely you jest. I would never take pleasure in my childhood friend's suffering. But disappointment is inevitable when you root for the underdog."
"That underdog, as you call it, has taken the title for the last two years, and we're going for three in a row."

Vogel was undeterred. "Fortunately, I had the foresight to move to Munich, city of champions. Remember the old saying: Money scores goals. And Bayern has the most."

"Money is no guarantee of success," Dully countered.

"Really?" his friend shot back. "Haven't billionaires from Russia, America, and Abu Dhabi bought the last three English Premier League titles for Chelsea, Manchester United, and Manchester City?"

"Well, money certainly helps," Dully conceded. "But you're using British examples, and German football is altogether different. To quote our mutual patron saint Sepp Herberger: The ball is round, and football is anything but predictable. This match isn't over until the whistle blows, and that's true for the season, too."

"Well, you're the numbers wizard. If anyone can calculate whether money offers an advantage, it's you. Your readers might find it interesting if you managed to prove what football fans think they already know."

"I'll see," said Dully, without enthusiasm.

"I'll drink a beer for you in the meantime! Feel better! Tschüss!"

Dully grunted and put the phone down, but his friend's offhand remark stuck with him. With one eye on the game, he leaned over the side of his chair and felt around for his laptop. He dreaded Vogel's gloating if Bayern held onto its lead to win the match; perhaps he could quiet him down if he met his friend's challenge to show that money correlated with winning football matches as surely as a talented striker.

The Bundesliga

Football was widely recognized as one of Germany's top pastimes. Since the German Football Association (DFB) was founded in 1900, it had grown to encompass nearly 27,000 clubs and 6.3 million people around the country.1

1 Deutscher Fussball-Bund, "History," http://www.dfb.de/index.php?id=311002 (accessed January 4, 2013).

Initially the game was played only at an amateur level, although semi-professional teams emerged after World War II. Professional football in Germany appeared later than in many of its international counterparts. The country's top professional league, known as the Bundesliga, was formed on July 28, 1962, after Yugoslavia stunned the German national team with a quarter-final World Cup defeat. Sixteen clubs initially were granted admission to the new league based on
In 2012 the top three teams of the 1 Bundesliga qualified for the prestigious European club championship known as the Champions League, and the fourth-place team was given the opportunity to compete in a playoff round for a Champions League spot. Within the league, the bottom two teams from the 1 Bundesliga were relegated to the 2 Bundesliga and the top two teams from the 2 Bundesliga were promoted. The team that came in third from the bottom in the 1 Bundesliga played the third-place team of the 2 Bundesliga for the final spot in the top league for the following season. Based on the number of spectators, German football was the most popular sport in the world after the U.S. National Football Leagueit had higher attendance per game than Major League Baseball, the National Basketball Association, and the National Hockey League in the United States. More people attended football games in Germany than in any other country (see Exhibit 1). From a performance perspective, the UEFA ranked the Bundesliga as the third best league in Europe after Spain and England.3 Germany had also distinguished itself as one of the two most successful participants in World Cup history.4 *** Dully roared with glee a few minutes later as Dortmund midfielder Mario Götze evened the score with a shot that sliced through a pack of players before finding the bottom corner of the Bayern goal. This is the magic of German football, he reflected. The neck-and-neck races between the top few teams, the surprises, the upsets, the legends like Franz Beckenbauer and Lothar Matthäus. And of course, there were the magical moments, perhaps none more so than that rainy 1954 day when Germanys David defeated the Hungarian Goliath and stunned the world by winning the World Cup in what came to be called the Miracle of Berne. Call me mad, call me crazy!5 the announcer had shrieked over the airwaves when Helmut Rahn nudged the ball past Hungarian goalkeeper Gyuli Grosics and gave Germany the lead over 2 Silvio Vella, The Birth of Professional Football in Germany, Malta Independent, July 28, 2012. UEFA Rankings, http://www.uefa.com/memberassociations/uefarankings/country/index.html (accessed January 4, 2013). 4 FIFA, All-Time FIFA World Cup Ranking 19302010, http://www.fifa.com/aboutfifa/officialdocuments/doclists/matches.html (accessed January 4, 2013). 5 Ulrich Hesse-Lichtenberger, Tor!: The Story of German Football (London: WSC Ltd, 2003), 126. 3 KELLOGG SCHOOL OF MANAGEMENT 3 GERMANYS BUNDESLIGA KEL754 the Hungarians, a team that had gone unbeaten for thirty-one straight games in the preceding four years and was considered the undisputed superpower of world football.6 Minutes later, the Germans raised the Jules Rimet World Cup trophy high for the first time. Bundesliga Finances: The Envy of International Football Most European football clubs wrestled with finances: In the 20102011 season, the twenty clubs in the English Premier League showed £2.4 billion in debt,7 a figure surpassed by the twenty Spanish La Liga clubs, which hit 3.53 billion (£2.9 billion).8 In contrast, the thirty-six Bundesliga clubs showed a net profit of 52.5 million in 20102011. The Bundesliga had the distinction of being the most profitable football league in the world. In 20102011 the Bundesliga had revenues of 2.29 billion, more than half of which came from advertising and media management (see Exhibit 2).9 Television was one of the largest sources of income. This money was split between the football clubs according to their performance during the season. 
Secrets of the Bundesligas success included club ownership policies, strict licensing rules, and low ticket costs. With a few notable exceptions, German football clubs were large membership associations with the same majority owner: their members. League regulations dictated a 50+1 rule, which meant that club members had to maintain control of 51 percent of shares. This left room for private investment without risking instability as a result of individual entrepreneurs with deep pockets taking over teams and jeopardizing long-term financial stability for short-term success on the field. Bundesliga licensing procedures mandated that clubs had to open their books to league accountants and not spend more than they made in order to avoid fines and be granted a license to play the following year. Among a host of other stipulations, precise rules established liquidity and debt requirements; Teutonic efficiency had little patience for inflated transfer fees and spiraling wages that could send clubs into financial ruin. Football player salaries were the highest of any sport in the world. A 2012 ESPN survey revealed that seven of the top ten highest-paying sports teams were football clubs, with U.S. major league baseball and basketball clubs rounding out the set. FC Barcelonas players led the worlds professional athletes with an average salary of $8.68 milliona weekly salary of $166,934. Real Madrid players followed close behind with an average salary of $7.80 million per year.10 While the salaries were impressive, the cost of transferring players between countries and leagues could be even more so. A transfer fee was paid to a club for relinquishing a player (either still under contract or with an expired contract) to an international counterpart, and such transfers 6 FIFA, 1954 World Cup Switzerland, http://www.fifa.com/worldcup/archive/edition=9/overview.html. Deloitte Annual Review of Football Finance, May 31, 2012. 8 La Liga Debt Crisis Casts a Shadow Over On-Pitch Domination, Daily Mail, April 19, 2012. 9 Bundesliga Annual Report 2012, p. 50. 10 Jeff Gold, Highest-Paying Teams in the World, ESPN, May 2, 2012. 7 4 KELLOGG SCHOOL OF MANAGEMENT KEL754 GERMANYS BUNDESLIGA were regulated by footballs world governing body, the Fédération Internationale de Football Association (FIFA). Historically, transfers were permitted twice a yearfor a longer period during the summer between seasons, and for a shorter period during the winter partway through the season. FIFA reported that $3 billion was spent transferring players between teams in 2011 and that a transfer was conducted every 45 minutes.11 Although the average transfer fee was $1.5 million in 2011, clubs often paid top dollar to secure star power. In 2011 thirty-five players transferred at fees exceeding 15 million,12 including Javier Pastore, who transferred from Palermo to Paris Saint-Germain for 42 million.13 The highest transfer fee ever paid was 94 million by Real Madrid to Manchester United for Cristiano Ronaldo in 2009. After financial crises in the business world demonstrated that no company was too big to fail and evidence to this effect began mounting in the football world, the UEFA approved fair play legislation in 2010 requiring teams to live within their means or face elimination from competition. 
The policies were designed to prevent football teams from crumpling under oppressive debt and to ensure a more stable economic future for the game.14 The legislation was to be phased in over several years, with some key components taking effect in the 20112012 season. Because the Bundesliga already operated under a system that linked expenditure with revenue, wealth was relatively evenly distributed among the clubs, and teams could not vastly outspend one another as was frequently the case in the Spanish La Liga and the British Premier League. As a result, a greater degree of competitive parity made for exciting matches and competition for the Deutsche Meisterschaft. The leagues reasonable ticket prices made Germany arguably one of the greatest places in the world to be a football fan. A BBC survey revealed that the average price of the cheapest match ticket in the Premier League was £28.30 ($46), but season tickets to Dortmund matches, for example, cost only 225 ($14 per game including three Champions League games) and included free rail travel. In comparison, season tickets to Arsenal matches (the most expensive in the Premier League) cost £1,955 ($3,154) for 20122013.15 Germany had some of the biggest and most modern stadiums in the world as the result of 1.4 billion spent by the government expanding and refurbishing them in preparation for hosting the 2006 World Cup.16 According to the London Times, two German stadiums made the list of the worlds ten best football venuesthe Signal Iduna Park (formerly known as Westfalenstadion) in Dortmund (ranked number one) and the Allianz Arena in Munich (number five). During the 20102011 season, more than 17 million people watched Bundesliga football matches live in stadiums, and the 1 Bundesliga attendance averaged a record-breaking 42,101 per game.17 The average attendance at Dortmunds Signal Iduna Park in the first half of the 2012 11 Tom McGowan, A FIFA First: Footballs Transfer Figures Released, CNN, March 6, 2012. Mark Chaplin, Financial Fair Plays Positive Effects, UEFA News, August 31, 2012. 13 PSG Complete Record-Breaking Pastore Transfer, UEFA News, August 6, 2011. 14 Financial Fair Play Regulations Are Approved, UEFA News, May 27, 2010. 15 Ticket Prices: Arsenal Costliest, ESPN News, October 18, 2012. 16 German Football Success: A League Apart, The Economist, May 16, 2012. 17 Bundesliga Annual Report 2012, p. 56. 12 KELLOGG SCHOOL OF MANAGEMENT 5 GERMANYS BUNDESLIGA KEL754 2013 Bundesliga season was 80,577.18 In addition, around 18 million peoplenearly a quarter of the countrytuned in to the Bundesliga matches on television each weekend.19 No other leisure time activity consistently generated that level of interest in Germany. FC Bayern München In the Bundesligas fifty-year history, FC Bayern München had been a perennial powerhouse; the club boasted twenty-one title victories and an aggregate advantage of nearly 500 points in the eternal league table. Conventional wisdom held that clubs with a higher market value were more likely to win championships because they could afford to pay the highest wages and transfer fees to attract the best talent. 
FC Bayern was the eighth-highest-paying sports team in the world, with an average salary of $5.9 million per player, according to ESPN in 2012.[20] The highest transfer fee ever paid in the Bundesliga occurred in the summer of 2012, when Bayern bought midfielder Javi Martinez from the Spanish club Athletic Bilbao for €40 million.[21] Bayern's appearance in the Champions League in eleven of the previous twelve years (including one first-place and two second-place finishes) raised the team to new heights on the international stage and increased its brand value; in 2012 it was the second most valuable football club brand in the world according to Brand Finance, a leading independent brand valuation consultancy (see Table 1).

Table 1: Bundesliga Club Brand Value and Average Player Salary

Club                 Number      2012 Brand   2012 Brand Value   Average Annual Salary per
                     of Titles   Rank         ($ in millions)    Player, 2011–2012 Season ($)
FC Bayern München        21          2              786                   5,907,652
FC Schalke 04             0         10              266                   4,187,722
Borussia Dortmund         5         11              227                   3,122,824
Hamburger SV              3         17              153                   2,579,904
VfB Stuttgart             3         28               71                   2,721,154
SV Werder Bremen          4         30               68                   2,734,924

Source: Brand Finance Football Brands 2012 and Jeff Gold, "Highest-Paying Teams in the World," ESPN, May 2, 2012.

Bayern was also the only Bundesliga club to appear on the Forbes magazine list of the fifty most valuable sports franchises worldwide. It was one of five football teams that consistently appeared alongside the National Football League teams that dominated the list; from 2010 to 2012, the club's ranking climbed from 27 to 14. In 2012 the magazine estimated that Bayern had the fourth-highest revenue of any football team in the world and valued the club at $1.23 billion.[22]

Despite Bayern's privileged position, competition in the league remained strong. All eighteen of the 1. Bundesliga teams ranked among the top 200 highest-paying sports teams in the world, with average salaries above $1.3 million per year for the 2011–2012 season.[23] The Bundesliga's depth kept seasons interesting: since 2000, five different teams had won the title and two more had been Herbstmeister (mid-season leader; see Exhibit 3).

Seeking Correlation

Dully flipped off the television and went to the kitchen to get some food. The match had ended in a 1-1 draw, leaving the country in suspense over whether Bayern would run away from the pack in the league table or whether Dortmund could catch up. The phone rang again.

"Have you proven me right yet?" Vogel asked above the din.

"No," said Dully. "I'm averse to promoting financial doping."

"You always were an idealist," Vogel observed. "Or a purist or something."

"I'm the complement to your cynicism."

"Ah yes, that must be why we get along so well. I'd like to see your analysis, though, when you actually come up with some."

"Funny you should ask for that," Dully said. "I'll get back to you. Maybe."

After a few more minutes of banter followed by well-intentioned plans for catching up someday soon, the friends hung up. Dully returned to the living room and flopped on the couch.
The analyst wondered about the future of a Bundesliga with one team that was much wealthier than the rest. Would it remain competitive and exciting or, as Vogel said, would money shoot goals and give those rich Bayern the German Cup year after year? Dully returned to the spreadsheet he had started during the match, looking for a statistical correlation between money and Bundesliga success.

Exhibit 1: Comparison of Sporting League Attendance Worldwide, 2010–2011 Season

League                           Average Attendance per Game
U.S. National Football League            66,960
German Bundesliga                        42,690
Australian A-League                      38,243
British Premier League                   35,283
U.S. Major League Baseball               30,066
Spanish La Liga                          29,128
Mexican Liga MX                          27,178
Italian Serie A                          24,031
French Ligue 1                           19,912
Dutch Eredivisie                         19,116

Source: ESPN Soccer Zone, WorldFootball.net, and Bundesliga Annual Report 2012, p. 56.

Exhibit 2: Bundesliga Revenue

1. BUNDESLIGA REVENUE

Sector              Revenue (€ in thousands)   % of Revenue
Match earnings              411,164                21.17
Advertisement               522,699                26.92
Media management            519,629                26.76
Transfers                   195,498                10.07
Merchandising                79,326                 4.08
Other                       213,665                11.00
Total                     1,941,980               100

TOTAL REVENUE FOR 1. AND 2. BUNDESLIGA

Sector              Revenue (€ in thousands)   % of Revenue
Match earnings              469,510                20.41
Advertisement               634,010                27.57
Media management            629,079                27.35
Transfers                   215,110                 9.35
Merchandising                89,493                 3.89
Other                       262,779                11.43
Total                     2,299,980               100

Source: Bundesliga Report 2012: The Economic State of German Professional Football, January 23, 2012.

Exhibit 3: Bundesliga Mid-Season Leaders and Champions

Season       Mid-Season Leader            Champion
2012–2013    FC Bayern München            (season still in progress)
2011–2012    FC Bayern München            Borussia Dortmund
2010–2011    Borussia Dortmund            Borussia Dortmund
2009–2010    Bayer 04 Leverkusen          FC Bayern München
2008–2009    1899 Hoffenheim              VfL Wolfsburg
2007–2008    FC Bayern München            FC Bayern München
2006–2007    SV Werder Bremen             VfB Stuttgart
2005–2006    FC Bayern München            FC Bayern München
2004–2005    FC Bayern München            FC Bayern München
2003–2004    SV Werder Bremen             SV Werder Bremen
2002–2003    FC Bayern München            FC Bayern München
2001–2002    Bayer 04 Leverkusen          Borussia Dortmund
2000–2001    FC Schalke 04                FC Bayern München
1999–2000    FC Bayern München            FC Bayern München
1998–1999    FC Bayern München            FC Bayern München
1997–1998    1.FC Kaiserslautern          1.FC Kaiserslautern
1996–1997    FC Bayern München            FC Bayern München
1995–1996    Borussia Dortmund            Borussia Dortmund
1994–1995    Borussia Dortmund            Borussia Dortmund
1993–1994    Eintracht Frankfurt          FC Bayern München
1992–1993    FC Bayern München            SV Werder Bremen
1991–1992    Eintracht Frankfurt          VfB Stuttgart
1990–1991    SV Werder Bremen             1.FC Kaiserslautern
1989–1990    FC Bayern München            FC Bayern München
1988–1989    FC Bayern München            FC Bayern München
1987–1988    SV Werder Bremen             SV Werder Bremen
1986–1987    Hamburger SV                 FC Bayern München
1985–1986    SV Werder Bremen             FC Bayern München
1984–1985    FC Bayern München            FC Bayern München
1983–1984    VfB Stuttgart                VfB Stuttgart
1982–1983    Hamburger SV                 Hamburger SV
1981–1982    1.FC Köln                    Hamburger SV
1980–1981    Hamburger SV                 FC Bayern München
1979–1980    FC Bayern München            FC Bayern München
1978–1979    1.FC Kaiserslautern          Hamburger SV
1977–1978    1.FC Köln                    1.FC Köln
1976–1977    Borussia Mönchengladbach     Borussia Mönchengladbach
1975–1976    Borussia Mönchengladbach     Borussia Mönchengladbach
1974–1975    Borussia Mönchengladbach     Borussia Mönchengladbach
1973–1974    FC Bayern München            FC Bayern München
1972–1973    FC Bayern München            FC Bayern München
1971–1972    FC Schalke 04                FC Bayern München
1970–1971    FC Bayern München            Borussia Mönchengladbach
1969–1970    Borussia Mönchengladbach     Borussia Mönchengladbach
1968–1969    FC Bayern München            FC Bayern München
1967–1968    1.FC Nürnberg                1.FC Nürnberg
1966–1967    Eintracht Braunschweig       Eintracht Braunschweig
1965–1966    TSV 1860 München             TSV 1860 München
1964–1965    SV Werder Bremen             SV Werder Bremen
1963–1964    1.FC Köln                    1.FC Köln

Source: Bundesliga, "History Stats," http://www.bundesliga.com/en/stats/history (accessed January 4, 2013).

Notes

6. FIFA, "1954 World Cup Switzerland," http://www.fifa.com/worldcup/archive/edition=9/overview.html.
7. Deloitte Annual Review of Football Finance, May 31, 2012.
8. "La Liga Debt Crisis Casts a Shadow Over On-Pitch Domination," Daily Mail, April 19, 2012.
9. Bundesliga Annual Report 2012, p. 50.
10. Jeff Gold, "Highest-Paying Teams in the World," ESPN, May 2, 2012.
11. Tom McGowan, "A FIFA First: Football's Transfer Figures Released," CNN, March 6, 2012.
12. Mark Chaplin, "Financial Fair Play's Positive Effects," UEFA News, August 31, 2012.
13. "PSG Complete Record-Breaking Pastore Transfer," UEFA News, August 6, 2011.
14. "Financial Fair Play Regulations Are Approved," UEFA News, May 27, 2010.
15. "Ticket Prices: Arsenal Costliest," ESPN News, October 18, 2012.
16. "German Football Success: A League Apart," The Economist, May 16, 2012.
17. Bundesliga Annual Report 2012, p. 56.
18. "Europe's Getting to Know Dortmund," Bundesliga News, December 26, 2012.
19. "Sky Strikes Bundesliga Deal with Deutsche Telekom," Reuters, January 4, 2013.
20. Gold, "Highest-Paying Teams in the World."
21. "Javi Martinez Joins Bayern Munich," ESPN News, August 29, 2012.
22. Kurt Badenhausen, "Manchester United Tops the World's 50 Most Valuable Sports Teams," Forbes, July 16, 2012.
23. Gold, "Highest-Paying Teams in the World."

Questions

PART I

1. What were the smallest, average, and largest market values of football teams in the Bundesliga in the 2011–2012 season?

2. Develop a regression model that predicts the number of points a team earns in a season based on its market value. Write down the estimated regression equation. (A Python sketch of the computations for Parts I and II follows question 14.)

3. Are the regression coefficients statistically significant? Explain.

4. Carefully interpret the slope coefficient of your regression in the context of the case.

5. Conventional wisdom among football traditionalists states that the aggregate number of points at the end of a Bundesliga season closely correlates with the market value of a club. Simply put, money scores goals, which in turn lead to wins and points. Comment on this wisdom in light of your regression equation.

6. Some of the (estimated) market values at the beginning of the 2012–2013 season were as follows:

   SC Freiburg           €46,650,000
   1.FSV Mainz 05        €46,000,000
   Eintracht Frankfurt   €49,400,000

   Provide a point estimate for the difference between the number of points Eintracht Frankfurt and 1.FSV Mainz 05 will earn in the 2012–2013 season.

7. Provide a point estimate and a 95% interval for the number of points SC Freiburg will earn in the 2012–2013 season.

PART II

The first half of a Bundesliga season ends in mid-December. After a break for the holiday season and potentially bad winter weather (which could lead to the cancellation of games), the league resumes play in late January.

8. Develop a regression model that predicts the number of points a team earns at the end of a season based on its market value and the number of points it earned during the first half of the season. Write down the estimated regression equation.

9. Carefully interpret the two slope coefficients of your regression in the context of the case.

10. Compare your regression equation to the simple linear regression you obtained in Part I. How did the coefficient of the variable Marketvalue_2011_Mio (market value in € millions) change? Provide an explanation for the difference.

11. Drop all insignificant variables (use α = 0.05). Write down the final regression equation.

12. At the beginning of the 2012–2013 season, the market value of Borussia Mönchengladbach was estimated to be €88,350,000; the market value of 1.FC Nürnberg was estimated at €41,500,000. During the first half of the 2012–2013 season, Borussia Mönchengladbach earned 25 points and 1.FC Nürnberg earned 20 points. Provide a point estimate and an 80% interval for the number of points Borussia Mönchengladbach will earn in the 2012–2013 season.

13. Provide a point estimate for the difference between the number of points Borussia Mönchengladbach and 1.FC Nürnberg will earn in the 2012–2013 season.

14. An intuitive claim may be that, on average, a team earns twice as many points in an entire season as it earns in the first half of the season. Put differently, on average, a team's total number of points should be just two times its number of points at mid-season. Can you reject this claim based on your regression model (at a significance level of α = 0.05)?
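For readers checking their KStat results against another tool, the following is a minimal Python sketch of the Part I and Part II regressions. It is illustrative only: the file name bundesliga.csv and the column names market_value_mio, points_mid, and points_total are hypothetical stand-ins for the case spreadsheet, whose exact layout is not reproduced here.

import pandas as pd
import statsmodels.api as sm

# Hypothetical data layout: one row per team-season, with the team's
# market value in € millions, its mid-season points, and its final points.
df = pd.read_csv("bundesliga.csv")  # assumed file name

# Part I: simple regression of final points on market value
X1 = sm.add_constant(df["market_value_mio"])
model1 = sm.OLS(df["points_total"], X1).fit()
print(model1.summary())  # coefficients, t-ratios, p-values

# Part II: add mid-season points as a second regressor
X2 = sm.add_constant(df[["market_value_mio", "points_mid"]])
model2 = sm.OLS(df["points_total"], X2).fit()
print(model2.summary())

# Question 7: point estimate and 95% interval for a single team,
# e.g., SC Freiburg with a market value of €46.65 million. The
# "obs_ci" columns give the interval for an individual team's points.
new = pd.DataFrame({"const": [1.0], "market_value_mio": [46.65]})
pred = model1.get_prediction(new)
print(pred.summary_frame(alpha=0.05))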
KARL SCHMEDDERS AND MARKUS SCHULZE

5-215-250

Solid as Steel: Production Planning at ThyssenKrupp

On Monday, March 31, 2014, production manager Markus Schulze received a call from Reinhardt Täger, senior vice president of ThyssenKrupp Steel Europe's production operations in Bochum, Germany. Täger was preparing to meet with the company's chief operating officer and was eager to learn why the current figures of one of Bochum's main production lines were far behind schedule. Schulze explained that the line had had three major breakdowns in early March and therefore would miss the planned utilization rate for that month. Consequently, the scheduled production volume could not be achieved. Schulze knew that a shortfall in capacity utilization would lead to unfulfilled orders at the end of the planning period. In a rough steel market with fierce competition, however, delivery performance was an important differentiating factor for ThyssenKrupp. Täger wanted a chance to review the historical data, so he and Schulze agreed to meet later that week to continue their discussion.

After looking over the production figures from the past ten years, Täger was shocked. When he met with Schulze later that week, he expressed his frustration. "Look at the historic data!" Täger said. "All but one of the annual deviations from planned production are negative. We never achieved the production volumes we promised in the planning meetings. We need to change that!"

"I agree," Schulze replied. "Our capacity planning is based on forecast figures that are not met in reality, which means we can't fulfill all customers' orders in time. And the product cost calculations are affected, too."

"You're right," Täger said. "We need appropriate planning figures to meet the agreed delivery times in the contracts with our customers. What do you think would be necessary for that?"

"Hm, I guess we need a broad analysis of data to identify the root causes," Schulze answered. "It'll take some time to build queries for the databases and aggregate data. And—"

"Stop!" Täger interrupted him. "We need data for the next planning period. The planning meeting for May is in two weeks."

©2015 by the Kellogg School of Management at Northwestern University. This case was prepared by Markus Schulze (Kellogg-WHU '16) under the supervision of Professor Karl Schmedders. It is based on Markus Schulze's EMBA master's thesis. Cases are developed solely as the basis for class discussion. Cases are not intended to serve as endorsements, sources of primary data, or illustrations of effective or ineffective management. To order copies or request permission to reproduce materials, call 847.491.5400 or e-mail [email protected]. No part of this publication may be reproduced, stored in a retrieval system, used in a spreadsheet, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise) without the permission of Kellogg Case Publishing.

ThyssenKrupp Steel Europe

ThyssenKrupp Steel Europe, a major European steel company, was formed in a 1999 merger between the historic German steel makers Thyssen and Krupp, both of which had been founded in the nineteenth century.
ThyssenKrupp Steel Europe produced up to 12 million metric tons of steel annually with its 26,000 employees. In fiscal year 2013–2014, the company accounted for €9 billion of sales, roughly a quarter of the group sales of its parent company, ThyssenKrupp AG, which traded on the DAX 30 (an index of the top thirty blue-chip German companies). Its main drivers of success were customer orientation and reliability in terms of product quality and delivery time.

Bochum Production Lines

The production lines at ThyssenKrupp Steel's Bochum site were supplied with interim products delivered from the steel mills in Duisburg, 40 kilometers west of Bochum. Usually, slabs[1] were brought to Bochum by train and then processed in the hot rolling mill (see Figure 1). The outcome of this production step was coiled hot strip[2] (see Figure 2) with mill scale[3] on its surface. Whether the steel would undergo further processing in the cold rolling mill or would be sold directly as "pickled hot strip," the mill scale needed to be removed from the surface.

The production line in which Täger and Schulze were interested, a so-called push pickling line (PPL), was designed to remove the mill scale generated by the upstream hot rolling process. To remove the scale, the hot strip was uncoiled in the line and the head of the strip was pushed through the line. The processing part of the line held pickling containers filled with hot hydrochloric acid, which removed the scale from the surface. Following this pickling, the strip was pushed through a rinsing section to remove any residual acid from the surface. After oiling for corrosion protection, the strip was coiled again. The product of this step, pickled hot strip, could be sold to B2B customers, mainly in the automotive industry.

Other types of pickling lines were operated as continuous lines, in which the head of a new strip was welded to the tail of the one that preceded it. The differentiating factor of a PPL was its batching process, which involved pushing in each strip individually. Production downtimes due to push-in problems did not occur at continuous lines, but with PPLs they remained a concern.

Figure 1. [Photograph of the hot rolling mill.] Source: ThyssenKrupp AG, http://www.thyssenkrupp.com/en/presse/bilder.html&photo_id=898.

Figure 2. [Photograph of coiled hot strip.] Source: ThyssenKrupp AG, http://www.thyssenkrupp.com/en/presse/bilder.html&photo_id=891.

Notes

1. Slabs are solid blocks of steel formed in a continuous casting process and then cut into lengths of about 20 meters.
2. A coiled hot strip is an intermediate product in steel production. Slabs are rolled at temperatures above 1,000°C. As they thin out, they become longer; the result is a flat strip that needs to be coiled.
3. Mill scale is an iron oxide layer on the hot strip's surface that is created just after hot rolling, when the steel is exposed to air (which contains oxygen). Mill scale protects the steel to a certain extent, but it is unwanted in further processes such as stamping or cold rolling.

Nevertheless, ThyssenKrupp chose to build a PPL in 2000 because increasing demand for high-strength steel made it profitable to invest in such a production line. At that time, high-strength steel grades could not be welded to one another with existing machines, and their dimensions (at a thickness of more than 7.0 millimeters) could not be processed on continuous lines. The material produced on the PPL was not simply a commodity called steel.
Rather, it was a portfolio of different steel grades, that is, different metallurgical compositions with specific mechanical properties. (For purposes of this case, the top five steel grades in terms of annual production volume have been randomly assigned the numbers 1 to 5.) Within these top five grades were two high-strength steel grades. These high-strength grades were rapidly cooled after the hot rolling process, from around 1,000°C down to below 100°C. Removing the mill scale generated during this rapid cooling process required a different process speed in the pickling line. Only one of the five grades could be processed without limitations in speed and without expected downtimes.

Performance Indicators

At ThyssenKrupp, managers responsible for production lines needed to report regularly on the performance of the lines and the fulfillment of individual objectives. The output, or throughput, of the production lines had always been an important metric. Even with the industry coping with overcapacity and customers' increasing demands concerning product quality, line throughput remained part of the set of key performance indicators. These indicators were taken into account for internal benchmarking against comparable production lines at other sites. The line-specific variable production cost was calculated as cost over throughput and was expressed in euros per metric ton. Capacity planning was based on these figures, eventually resulting in delivery time performance.

In the steel industry, production reports contained performance indicators at different levels of aggregation. A very important metric was throughput (tons[4] produced) per time unit.[5] The performance indicator run time ratio[6] (RTR) was the portion of time used for production (run time) relative to the operating time of a production line:

Operating time = Calendar time - (legal holidays, shortages,[7] all scheduled maintenance)
Run time = Operating time - (breakdowns, maintenance downtime in excess of the scheduled amount, set-up time)

Both figures were reported not only on a daily basis (i.e., a 24-hour production period) but also monthly and per fiscal year. Deviations from planned figures were typically noted in automated reports containing database queries. Thus, every plant manager received an overview of past periods. Comparable production lines at different sites were benchmarked internally.

Notes

4. Throughout this case, the term "ton" refers to a metric ton.
5. Tons produced are usually reported by shift (eight hours), by month, and eventually by fiscal year.
6. The metric run time ratio is calculated as run time over operating time (e.g., 8 hours of operating time, or 480 minutes, with 48 minutes of downtime yields an RTR of 90%).
7. Shortages can refer to material shortages, lack of orders, labor disputes, or energy/fuel shortages (external).
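As a quick illustration of these definitions, the following Python sketch reproduces the arithmetic from note 6.

# RTR = run time / operating time, where run time is operating time
# minus breakdowns, excess maintenance downtime, and set-up time.
def run_time_ratio(operating_minutes: float, downtime_minutes: float) -> float:
    run_time = operating_minutes - downtime_minutes
    return run_time / operating_minutes

# Note 6's example: an 8-hour (480-minute) operating period with
# 48 minutes of downtime yields an RTR of 0.90, i.e., 90%.
print(run_time_ratio(480, 48))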
Deviation from Planned Throughput

Steel production lines had typical characteristics and an average performance calculated on the basis of an average production portfolio, mostly determined empirically from historical figures. For planning purposes, a fixed number was usually used to place order volumes on the production lines and in this way "fill capacities." On a monthly basis, real orders were then placed up to a certain amount, which was capped by the line capacity. Each month's production figures had three possible outcomes.

The first possibility was that the planned throughput would be reached and at the end of the month there would be extra capacity. In this case, the extra capacity would be filled with orders from the next month if the intermediate product was already available for processing. Otherwise, the line would stand still without fulfilling orders. This mode was very expensive because idle capacity would be wasted while fixed costs were incurred anyway.

The second possibility was that the planned throughput would not be reached. This would mean that at the end of the month, orders would be left that could not be fulfilled. This mode was also very expensive because the planned capacity could not be used, and real production costs were higher than pre-calculated. Product calculations would result in prices that were too low, so contribution margins would be much lower than expected, or even negative.

In the third scenario, the exact planned throughput would be met (within ±100 tons per month, or ±1,200 tons per year, which was considered accurate). This was the ideal case, but it had occurred only once in the first ten years of the line's history (see the annual figures in Table 1).

Table 1: Annual Deviation from Planned Production in the First Ten Years of Line Operation

Year of Operation       Annual Deviation from Planned Production (tons)
 1                        -23,254
 2                        -22,691
 3                        + 1,115
 4                        -22,774
 5                        - 2,807
 6                        -20,363
 7 (financial crisis)     -66,810
 8                        -21,081
 9                        - 4,972
10                        - 9,486

Each month, production management had to explain the deviation from planned figures. Many reasonable explanations had been given in the past. Major breakdowns were a common explanation because downtimes directly influenced the RTR. The RTR theory (the lower the run time ratio, the higher the negative deviation from the plan) was often cited as the dominant force behind the PPL not achieving its planned throughput.

The production engineers' gut feeling was that a straightforward reason would explain patterns that ran "against the RTR theory," namely the material structure: the resulting throughput can be explained on the basis of whether the material structure is favorable or unfavorable. A specific metric of the structure was the ratio of meters per ton (MPT), a dimension indicator. The MPT theory reflected the fact that material with a low thickness and/or a low width carried a lower weight per meter. In other words, it took longer to put one ton of material through the production line if the process speed remained constant. According to the MPT theory, negative deviations in months with average or above-average RTR could be explained by this metric.
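The case does not spell out the MPT formula, but the logic can be illustrated with a small sketch. Assuming the weight of a strip per meter equals its cross-section (thickness times width) times the density of steel, roughly 7,850 kg per cubic meter, the thinnest, narrowest strip the line can handle carries far more meters per ton than the thickest, widest one.

STEEL_DENSITY_KG_PER_M3 = 7850.0  # assumed typical density of carbon steel

def meters_per_ton(thickness_mm: float, width_mm: float) -> float:
    # weight per meter of strip = thickness (m) x width (m) x 1 m x density
    kg_per_meter = (thickness_mm / 1000) * (width_mm / 1000) * STEEL_DENSITY_KG_PER_M3
    return 1000 / kg_per_meter  # meters of strip making up one metric ton

# The PPL's dimension limits (see the Data section below):
print(meters_per_ton(1.5, 800))    # thin, narrow strip: about 106 m per ton
print(meters_per_ton(12.5, 1650))  # thick, wide strip: about 6 m per ton

At constant line speed, the thin strip therefore takes roughly seventeen times as long per ton, which is exactly the unfavorable material structure the engineers had in mind.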
Data

Schulze realized he had to compile the data carefully in order to have any hope of finding possible explanations for the deviations from planned throughput. He decided to define aggregate clusters for material dimensions such as the width and the thickness of the strips. The technical data of the Bochum PPL relevant to the data collection were:

Width:               800 to 1,650 mm
Thickness:           1.5 to 12.5 mm
Maximum throughput:  80,000 tons per month

Then Schulze reviewed the available past production data, beginning with the night shift on October 1, 2013, up until the early shift on April 4, 2014. Unfortunately, he had to omit a few shifts during this six-month period because of missing or obviously erroneous data. Schulze's data set accompanies this case in a spreadsheet. The variables in the data set are as follows:

Shift: The day and time at the beginning of a shift.

Shift type: The production line operated 24/7 with three eight-hour shifts; the early shift ("E") started at 6 a.m., the late (or midday) shift ("M") started at 2 p.m., and the night shift ("N") started at 10 p.m.

Shift number: ThyssenKrupp Steel used a continuous rolling shift system with five different shift groups (shift group 1, shift group 2, etc.). The binary variables indicate whether shift group i worked a particular shift.

Weekday: The line operated Monday through Sunday, but engineers usually worked Monday to Friday on a day-shift basis (usually starting at 7 a.m.).

Throughput: The throughput (in tons) during a shift.

Delta throughput: The deviation (in tons) of actual throughput from planned throughput.

MPT: A dimension indicator (meters per ton).

Thickness clusters: Each cluster represented a certain range of material thickness in millimeters within the technically feasible range of the production line. Strips fell into one of three clusters. The variables "thickness 1," "thickness 2," and "thickness 3" denote the number of strips from the first, second, and third thickness clusters, respectively, that were processed during a shift.

Width clusters: Each cluster represented a certain range of material width in millimeters within the technically feasible range of the production line. Strips fell into one of three width clusters. The variables "width 1," "width 2," and "width 3" denote the number of strips from the first, second, and third width clusters, respectively, that were processed during a shift.

Steel grades: Strips of many different steel grades were processed on the line. Steel grades 1 to 5 are the grades with the largest share by volume. The variables "grade 1," "grade 2," "grade 3," "grade 4," and "grade 5" denote the proportion (in %) of steel of that grade that was processed during a given shift. The remaining strips were of other steel grades; their proportion is given by "grade rest."

RTR: The run time ratio (in %), calculated as run time divided by operating time.

Schulze quickly realized he had data on more variables than he could employ in his analysis. Obviously, the total number of strips in the three width clusters had to be the same as the total number of strips in the three thickness clusters. Similarly, the proportions of the six different steel grades always added up to 100%. Schulze also decided to omit the dimension indicator (MPT) from his own analysis, as he now had much more detailed and reliable information about the size of the strips.

After analyzing the aggregated and clustered data, Schulze looked at his prediction model for delta throughput. From his experience, he knew he had found the key drivers of deviations from the planned production volume. "Look at this equation," he said to the production engineer in charge of the PPL. "The model coefficients determine the outcome, which is the deviation from planning. If we had the forecast figures for May, I could predict the deviation based on this model. Please get the numbers of coils from the different clusters and the proportions of the different steel grades. For the RTR, I'm guessing 86% is an appropriate figure."
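A sketch of how such a model might be fit (here in Python with statsmodels rather than KStat) makes Schulze's point about redundant variables concrete. Because the three width counts and the three thickness counts share the same total, and the six grade shares sum to 100%, one count variable and the "grade rest" share must be omitted; the file and column names below are hypothetical stand-ins for the case spreadsheet.

import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("ppl_shifts.csv")  # hypothetical file name; the case data ship as a spreadsheet

# The six cluster counts satisfy one exact linear relation
# (thickness 1 + 2 + 3 = width 1 + 2 + 3 = total strips), so one of
# them -- here width 2 -- is dropped; likewise "grade rest" is dropped
# because the grade shares sum to 100%.
predictors = [
    "rtr",
    "thickness_1", "thickness_2", "thickness_3",
    "width_1", "width_3",
    "grade_1", "grade_2", "grade_3", "grade_4", "grade_5",
]
X = sm.add_constant(df[predictors])
model = sm.OLS(df["delta_throughput"], X).fit()
print(model.summary())             # coefficients, t-ratios, p-values
print(model.conf_int(alpha=0.10))  # 90% confidence intervals for the coefficients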
Assignment Questions

PART A: INITIAL ANALYSIS

First, obtain an initial overview of the data. Next, examine the two theories proposed by the production engineers.

Questions:

1. Perform a univariate analysis and answer the following questions:
   a. What is the average number of strips per shift?
   b. Strips of which thickness cluster are the most common, and strips of which thickness cluster are the least common?
   c. What are the minimum, average, and maximum values of delta throughput and RTR?
   d. Are there shifts during which the PPL processes strips of only steel grade 1, or of only steel grade 2, etc.?

2. Can the RTR theory adequately explain the deviations from the planned production figures? Explain why or why not.

3. Is the MPT theory sufficient to explain the deviations? Explain why or why not.

PART B: SCHULZE'S MODEL

Now interpret Schulze's model.

Questions:

4. Develop a sound regression model that can be used to predict delta throughput based on the characteristics of the strips scheduled for production. Include only explanatory variables whose coefficients are significant at the 10% level.

5. Interpret the coefficient of RTR for the PPL and provide a 90% confidence interval for the value of the coefficient (in the population).

6. A strip of thickness 1 and width 1 is replaced by a strip of thickness 3 and width 3. This change does not affect any other aspect of the production. Provide an estimate of the change in delta throughput.

PART C: PREDICTION OF MAY THROUGHPUT

Two weeks after the first phone call about the deviations of production figures from planned volumes, Schulze was happy to have a sound prediction model on hand. Now he was looking forward to applying the model to future planning periods. The planning meeting for May was scheduled for the next day, and the production engineers had provided the requested material-structure data that would serve as input for the model.

"Let's see what the prediction tells us," Schulze said to Täger. As usual, the initial plan assumed an average capacity of 750 tons per shift. "I'm pretty sure the initial estimate will yield a useful first benchmark, but we also need to look at the uncertainty in the forecast," Schulze continued, and he entered the data.

"All right," Täger replied. "I can see the model's predicted deviation from planned production for the next month. We should show this in the planning meeting tomorrow and adjust the line capacity for May."

The next day, the predicted outcome was included in the monthly planning for the very first time. A new era of production planning at ThyssenKrupp Steel Europe had begun.

Next, determine Schulze's forecast.

Questions:

7. The table below shows the data provided by the production engineers. Because of major upcoming maintenance on the PPL, only 84 shifts were planned for the month of May. Provide an estimate of the average delta throughput per shift in May based on these forecast figures. (The actual figures are, of course, still unknown.)

Table 2: Planned Production in May (units of all forecasts: numbers of strips)

Characteristic   Forecast
Thickness 1          996
Thickness 2        1,884
Thickness 3          434
Width 1            1,242
Width 2            1,191
Grade 1              109
Grade 2              709
Grade 3              167
Grade 4              243
Grade 5              121

8. Provide a 90% confidence interval for the average delta throughput per shift in May. (A sketch of how questions 7 and 8 might be computed follows question 9.)

9. An RTR of 86% for a production facility such as the Bochum PPL is considered a good value; a value of 90% would be considered world class. The effort required to increase production performance as measured by RTR by just one percentage point, from 86% to 87%, is assumed to be very costly. In light of your model, would you expect such a performance improvement to pay for itself?
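Continuing the earlier sketch, questions 7 and 8 could be approached by converting Table 2's monthly strip counts into per-shift averages over the 84 planned shifts and feeding them to the fitted model. Two caveats: the width 3 forecast is implied by the totals, since every strip falls into exactly one width cluster; and converting the grade strip counts into percentage shares of all strips is an assumption, because the case defines grade shares by volume.

import pandas as pd  # `model` is the fitted model from the earlier sketch

total_strips = 996 + 1884 + 434        # 3,314 strips planned for May
width_3 = total_strips - 1242 - 1191   # = 881 strips, implied by the totals
shifts = 84

new_shift = pd.DataFrame({
    "const": [1.0],
    "rtr": [86.0],                          # Schulze's guess for May
    "thickness_1": [996 / shifts],
    "thickness_2": [1884 / shifts],
    "thickness_3": [434 / shifts],
    "width_1": [1242 / shifts],
    "width_3": [width_3 / shifts],
    "grade_1": [100 * 109 / total_strips],  # assumed conversion of counts to %
    "grade_2": [100 * 709 / total_strips],
    "grade_3": [100 * 167 / total_strips],
    "grade_4": [100 * 243 / total_strips],
    "grade_5": [100 * 121 / total_strips],
})
pred = model.get_prediction(new_shift)
# The "mean" column is the point estimate per shift; the "mean_ci"
# columns give the 90% interval for the average shift in May.
print(pred.summary_frame(alpha=0.10))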
PART D: ADDITIONAL ANALYSIS

Schulze's prediction model led to an intensive discussion in the production-planning meeting, which gave him much food for thought. As a result, he decided to analyze whether including some human or timing factors could enhance his prediction model.

In the final part of the analysis, consider some enhancements to your model.

Questions:

10. Determine whether, for given production quantities, the performance of the PPL depends on the group working each shift. Can you detect any significantly over- or under-performing shift groups?

11. Tests and rework are regularly scheduled on early shifts during the week (but not on weekends). Both involve interruptions and slower process speeds, which are not recorded as downtimes and are therefore not reflected in the RTR. As a result, all else being equal, early shifts during the week should process less steel than other shifts. Can you show the presence of this effect?

12. Provide a final critical evaluation of your prediction model. What are the key insights with respect to production planning at the Bochum PPL? What are the weaknesses of your model?

KH19, Exercises

Exercises

QUESTION 1

Unoccupied seats on flights cause airlines to lose revenue. A large airline wants to estimate its average number of unoccupied seats per flight over the past year. To accomplish this, the records of 225 flights are randomly selected, and the number of unoccupied seats is noted for each flight in the sample. The sample mean is 14.5 seats and the sample standard deviation is s = 8.2 seats.

a) Provide a 95% confidence interval for the mean number of unoccupied seats per flight during the past year.
b) Provide an 80% confidence interval for the mean number of unoccupied seats per flight during the past year.
c) Can you prove, at a 2% level of significance, that the average number of unoccupied seats per flight during the last year was smaller than 15.5?

QUESTION 2

During the National Football League (NFL) season, Las Vegas odds-makers establish a point spread on each game for betting purposes. The final scores of NFL games were compared against the final spreads established by the odds-makers ahead of each game. The difference between the game outcome and the point spread is called the point-spread error. For example, before the 2003 Super Bowl the Oakland Raiders were established as 3-point favorites over the Tampa Bay Buccaneers. Tampa Bay won the game by 27 points, so the point-spread error was -30. (Had the Oakland Raiders won the game by 10 points, the point-spread error would have been +7.) In a sample of 240 NFL games the average point-spread error was -1.6. The sample standard deviation was s = 13.3. Can you reject the hypothesis that the true mean point-spread error for all NFL games is zero (significance level α = 0.05)?

QUESTION 3

In a random sample of 95 manufacturing firms, 67 respondents indicated that their company attained ISO certification within the last two years. Find a 99% confidence interval for the population proportion of companies that have been certified within the last two years.

QUESTION 4

Of a random sample of 361 owners of small businesses that had gone into bankruptcy, 105 reported conducting no marketing studies prior to opening the business. Can you reject the null hypothesis that at most 25% of all members of this population conducted no marketing studies before opening their businesses (significance level α = 0.05)?
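For orientation, here is how the interval and test mechanics behind questions 1 through 4 look in Python with scipy; KStat or the Excel functions reviewed in class would produce the same numbers.

import math
from scipy import stats

# Question 1a: 95% t-interval for the mean number of unoccupied seats
n, xbar, s = 225, 14.5, 8.2
se = s / math.sqrt(n)
t_crit = stats.t.ppf(0.975, df=n - 1)
print(xbar - t_crit * se, xbar + t_crit * se)

# Question 2: two-sided t-test of H0: mean point-spread error = 0
n, xbar, s = 240, -1.6, 13.3
t_stat = (xbar - 0) / (s / math.sqrt(n))
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 1)
print(t_stat, p_value)

# Question 3: 99% normal-approximation interval for a proportion
n, p_hat = 95, 67 / 95
z = stats.norm.ppf(0.995)
half = z * math.sqrt(p_hat * (1 - p_hat) / n)
print(p_hat - half, p_hat + half)

# Question 4: one-sided z-test of H0: p <= 0.25 against H1: p > 0.25
n, p_hat, p0 = 361, 105 / 361, 0.25
z_stat = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
print(z_stat, stats.norm.sf(z_stat))  # test statistic and one-sided p-value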
QUESTION 5

Hertz contracts with Uniroyal to provide tires for Hertz's rental car fleet. A clause in the contract states that the tires must have a life expectancy of at least 28,000 miles. Of the 10,000 cars in Hertz's fleet, 400 are based in Chicago. The Chicago garage tested the tires on 60 of its cars. The life spans of the 60 tire sets are listed in the file tires.xls. If Hertz wants to use a 1% level of significance, should Hertz seek relief from (i.e., sue) Uniroyal? That is, can Hertz prove that the tires did not meet the contractually agreed (average) life expectancy?

QUESTION 6

Tyler Realty would like to be able to predict the selling price of new homes. The firm has collected data on size ("sqfoot," in square feet) and selling price ("price," in thousands of dollars), which are stored in the file tyler.xls. Download this file from the course homepage and answer the following questions.

a) Develop a scatter diagram for these data with size on the horizontal axis using KStat. Display the best-fit line in the scatter diagram.
b) Develop an estimated regression equation. Report the KStat regression output.
c) Predict the selling price of a home that is 2,000 square feet.

QUESTION 7

The time between eruptions of the Old Faithful geyser in Yellowstone National Park is random but is related to the duration of the previous eruption. In order to investigate this relationship, you collect data on 21 eruptions. For each observed eruption, you write down its duration (call it DUR) and the waiting time until the next eruption (call it TIME). That is, your variables are:

DUR    Duration of the previous eruption (in minutes)
TIME   Time until the next eruption (in minutes)

You obtain the following regression output from KStat:

Regression: TIME
            Coefficient   std error of coef   t-ratio   p-value
Constant     31.01311       4.41658492        7.0220    0.0001%
DUR           9.79006898    1.29990618        7.5314    0.0000%

a) Write down the estimated regression equation, and verbally interpret the intercept and the slope coefficient (in terms of geysers and eruption times).
b) The most recent eruption lasted 3 minutes. What is your best estimate of the time until the next eruption?
c) Based on your regression, what is the difference between the average time until the next eruption after a 3.2-minute eruption and the average time until the next eruption after a 3-minute eruption?
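As a usage illustration (not a substitute for your own interpretation), the fitted equation from the KStat output above can be evaluated directly:

# Estimated geyser equation: TIME = 31.01311 + 9.79007 * DUR
def predicted_wait(duration_minutes: float) -> float:
    return 31.01311 + 9.79007 * duration_minutes

print(predicted_wait(3.0))                        # part (b): about 60.4 minutes
print(predicted_wait(3.2) - predicted_wait(3.0))  # part (c): about 2.0 minutes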