6.4 the hypergeometric probability distribution
M06_SULL8028_03_SE_C06CD.QXD 7/30/08 6:36 PM Page 6–1 Section 6.4 The Hypergeometric Probability Distribution 6–1

6.4 THE HYPERGEOMETRIC PROBABILITY DISTRIBUTION

Preparing for This Section
Before getting started, review the following:
• Classical Method (Section 5.1, pp. 263–266)
• Independence (Section 5.3, pp. 286–287)
• Multiplication Rule of Counting (Section 5.5, pp. 302–305)
• Multiplication Rule (Example 6, Section 5.4, p. 297)
• Combinations (Section 5.5, pp. 307–309)

Objectives
1 Determine whether a probability experiment is a hypergeometric experiment
2 Compute the probabilities of hypergeometric experiments
3 Compute the mean and standard deviation of a hypergeometric random variable

1 Determine Whether a Probability Experiment Is a Hypergeometric Experiment

In Section 6.2, we presented binomial experiments. Recall that the binomial probability distribution can be used to compute probabilities for experiments with a fixed number of trials, where each trial has two mutually exclusive outcomes and the probability of success is the same for every trial. In addition, the trials must be independent. Based on the results from Example 6 in Section 5.3, we learned that, when small samples are obtained from large finite populations, it is reasonable to assume independence of events. That is, when obtaining a sample of size n from a population whose size is N, we are willing to assume independence of the events provided that $n < 0.05N$ (the sample size is less than 5% of the population size).

What if the requirement of independence is not satisfied? Under these circumstances, the experiment is a hypergeometric experiment.

Criteria for a Hypergeometric Probability Experiment
A probability experiment is said to be a hypergeometric experiment provided:
1. The finite population to be sampled has N elements.
2. For each trial of the experiment, there are two possible outcomes, success or failure. There are exactly k successes in the population.
3.
A sample of size n is obtained from the population of size N without replacement.

If a probability experiment satisfies these three requirements, the random variable X, the number of successes in n trials of the experiment, follows the hypergeometric probability distribution. We now introduce the notation that we will use.

Notation Used in the Hypergeometric Probability Distribution
• The population is size N. The sample is size n.
• There are k successes in the population.
• Let the random variable X denote the number of successes in the sample of size n, so x must be greater than or equal to the larger of 0 or $n - (N - k)$, and x must be less than or equal to the smaller of n or k.

Copyright © 2010 Pearson Education, Inc.

Chapter 6 Discrete Probability Distributions

Historical Note
The name hypergeometric is attributed to Leonhard Euler. Euler was born in Basel, Switzerland, on April 15, 1707. His father was a minister and wanted Leonhard to study theology as well. However, after a discussion with Johann Bernoulli, a friend from college, Euler's father allowed him to study mathematics at the University of Basel. Euler completed his studies in 1726. Euler married Katharina Gsell on January 7, 1734. They had 13 children, only 5 of whom survived. Euler claimed to have made many of his greatest discoveries with a child in his arms and children crawling at his feet. In 1740, Euler lost sight in his right eye. One of his famous quotes on this loss is "Now I will have less distraction." He eventually lost sight in his other eye as well, but this did not slow him down. Euler died on September 18, 1783, in St. Petersburg.

EXAMPLE 1 A Hypergeometric Probability Experiment

Problem: Suppose that a researcher goes to a small college with 200 faculty, 12 of whom have blood type O-negative. She obtains a simple random sample of n = 20 of the faculty and finds that 3 of the faculty have blood type O-negative.
Is this experiment a hypergeometric probability experiment? List the possible values of the random variable X, the number of faculty that have blood type O-negative.

Approach: We need to determine if the three criteria for a hypergeometric experiment have been satisfied.

Solution: This is a hypergeometric probability experiment because
1. The population consists of N = 200 faculty.
2. Two outcomes are possible: the faculty member has blood type O-negative or the faculty member does not have blood type O-negative. There are k = 12 successes in the population; the researcher observed x = 3 successes in her sample.
3. The sample is size n = 20. Because it is a simple random sample, it is obtained without replacement.

The possible values of the random variable are x = 0, 1, 2, …, 12. The largest value of X is 12 because only 12 faculty in the population have blood type O-negative, so the sample cannot contain more than 12 successes.

Notice that we cannot use the binomial probability distribution to determine the likelihood of obtaining three successes in 20 trials in Example 1 because the sample size is large relative to the population size. That is, n = 20 is more than 5% of the population size, N = 200.

Now Work Problem 5

2 Compute the Probabilities of Hypergeometric Experiments

The basis for computing probabilities in a hypergeometric experiment lies in the fact that each sample of size n is equally likely to be chosen. Consider an urn that contains 8 white chips and 6 black chips for a total of N = 14 chips. If we decide to randomly select n = 3 chips, all possible combinations of chips are equally likely. That is, if we let $W_1, W_2, \ldots, W_8$ represent the 8 white chips and $B_1, B_2, \ldots, B_6$ represent the 6 black chips, selecting $W_1, W_2, B_3$ is just as likely as selecting $W_3, W_6, B_4$. Notice in both cases that we selected 2 white chips and 1 black chip. So, if X represents the number of black chips selected, we have x = 1 in both cases; however, the chips selected are different (so each represents a different sample).
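The counting argument in the urn illustration can be checked by brute force. The sketch below is not part of the original text: it assumes the chip labels W1 to W8 and B1 to B6 used above, lists every possible sample of 3 chips, and tallies how many samples give each value of X. The names `chips`, `samples`, and `black_count` are hypothetical, chosen only for this illustration.

```python
from itertools import combinations

# Label the chips as in the text: W1..W8 are white, B1..B6 are black.
chips = [f"W{i}" for i in range(1, 9)] + [f"B{i}" for i in range(1, 7)]

# Every unordered sample of n = 3 chips is one equally likely outcome.
samples = list(combinations(chips, 3))


def black_count(sample):
    """Number of black chips in one sample (the value x of X)."""
    return sum(chip.startswith("B") for chip in sample)


# Tally how many distinct samples produce each value of X.
counts = {x: 0 for x in range(4)}
for sample in samples:
    counts[black_count(sample)] += 1

total = len(samples)
for x, c in counts.items():
    # Classical Method: favorable samples divided by all samples.
    print(f"x = {x}: {c} of {total} samples, P(x) = {c / total:.4f}")
```

Both W1, W2, B3 and W3, W6, B4 appear in `samples` as separate outcomes, yet each adds one to the tally for x = 1, which is the point of the paragraph above.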
Hypergeometric Probability Distribution
The probability of obtaining x successes based on a random sample of size n from a population of size N is given by

$$P(x) = \frac{({}_{k}C_{x})({}_{N-k}C_{n-x})}{{}_{N}C_{n}} \qquad (1)$$

where k is the number of successes in the population.

The logic behind Formula (1) is based on the Classical Method given on page 263, along with the Multiplication Rule of Counting given on page 304. The Classical Method for computing probabilities states that the probability of an event is the number of ways the event can occur, divided by the total number of outcomes in the experiment. The denominator of Formula (1) represents the number of ways n objects can be selected from N objects. This represents the number of possible outcomes in the experiment. The numerator consists of two factors. The first factor, ${}_{k}C_{x}$, represents the number of ways we can select the x successes from the k successes in the population. The second factor, ${}_{N-k}C_{n-x}$, represents the number of ways we can select n - x failures from the N - k failures in the population. By the Multiplication Rule of Counting, the product of these two factors is the number of ways we could obtain x successes (and n - x failures) in n trials of the experiment.

EXAMPLE 2 Using the Hypergeometric Probability Distribution

Problem: Suppose a researcher goes to a small college of 200 faculty, 12 of whom have blood type O-negative. She obtains a simple random sample of n = 20 of the faculty. Let the random variable X represent the number of faculty in the sample of size n = 20 that have blood type O-negative.
(a) What is the probability that 3 of the faculty have blood type O-negative?
(b) What is the probability that at least one of the faculty has blood type O-negative?

Approach: This is a hypergeometric experiment with N = 200, n = 20, and k = 12.
The possible values of the random variable X are x = 0, 1, 2, …, 12. (Our sample cannot have more than k = 12 faculty with blood type O-negative.) We use Formula (1) to compute the probabilities.

Solution
(a) We are looking for the probability of obtaining 3 successes, so x = 3.

$$P(3) = \frac{({}_{12}C_{3})({}_{200-12}C_{20-3})}{{}_{200}C_{20}} = \frac{({}_{12}C_{3})({}_{188}C_{17})}{{}_{200}C_{20}} = 0.0833$$

There is a 0.0833 probability that, in a random sample of 20 faculty, exactly 3 have blood type O-negative. If we conducted this experiment 100 times, we would expect to select exactly 3 faculty that have blood type O-negative about 8 times.

(b) The phrase at least means greater than or equal to. The values of the random variable X that are greater than or equal to 1 are 1, 2, 3, …, 12. Computing probabilities for all these values of the random variable is time consuming. It is much easier to use the Complement Rule and compute $P(X \ge 1) = 1 - P(0)$.

$$P(X \ge 1) = 1 - P(0) = 1 - \frac{({}_{12}C_{0})({}_{200-12}C_{20-0})}{{}_{200}C_{20}} = 1 - \frac{({}_{12}C_{0})({}_{188}C_{20})}{{}_{200}C_{20}} = 0.7282$$

There is a 0.7282 probability that, in a random sample of 20 faculty, at least 1 has blood type O-negative. If we conducted this experiment 100 times, we would expect to select at least one of the faculty that have blood type O-negative about 73 times.

EXAMPLE 3 Using the Hypergeometric Probability Distribution

Problem: The hypergeometric probability distribution is used in acceptance sampling. Suppose that a machine shop orders 500 bolts from a supplier. To determine whether to accept the shipment of bolts, the manager of the facility randomly selects 12 bolts. If none of the 12 randomly selected bolts is found to be defective, he concludes that the shipment is acceptable.
(a) If 10% of the bolts in the population are defective, what is the probability that none of the selected bolts are defective?
(b) If 20% of the bolts in the population are defective, what is the probability that none of the selected bolts are defective?
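Before working through this problem, note that Formula (1) translates almost directly into code. The following is a minimal sketch rather than a prescribed method: it assumes Python's standard `math.comb` for the combinations, and the function name `hypergeom_pmf` is a hypothetical helper introduced only for illustration. It rechecks the two Example 2 results.

```python
from math import comb


def hypergeom_pmf(x, N, n, k):
    """P(X = x) from Formula (1): (kCx)(N-kCn-x) / (NCn)."""
    # x must lie between the larger of 0 and n - (N - k)
    # and the smaller of n and k; otherwise the probability is 0.
    lo = max(0, n - (N - k))
    hi = min(n, k)
    if not lo <= x <= hi:
        return 0.0
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)


# Example 2(a): exactly 3 of the 20 sampled faculty are O-negative.
p3 = hypergeom_pmf(3, N=200, n=20, k=12)

# Example 2(b): at least 1, via the Complement Rule.
p_at_least_1 = 1 - hypergeom_pmf(0, N=200, n=20, k=12)

print(f"P(3) = {p3:.4f}")
print(f"P(X >= 1) = {p_at_least_1:.4f}")
```

The printed values should agree with the 0.0833 and 0.7282 reported in Example 2. The same function can be reused with N = 500 and n = 12 for the acceptance-sampling problem.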
Approach: This is a hypergeometric experiment with N = 500 and n = 12. In part (a), we have that k = 0.1(500) = 50 defectives. The possible values of the random variable X are x = 0, 1, 2, …, 12 (you cannot have more successes than the sample size). Notice that a success means finding a defective bolt. In part (b), we have that k = 0.2(500) = 100 defectives. The possible values of the random variable X are x = 0, 1, 2, …, 12. We use Formula (1) to compute the probabilities.

Solution
(a) We are looking for the probability of obtaining 0 successes, so x = 0.

$$P(0) = \frac{({}_{50}C_{0})({}_{500-50}C_{12-0})}{{}_{500}C_{12}} = \frac{({}_{50}C_{0})({}_{450}C_{12})}{{}_{500}C_{12}} = 0.2783$$

There is a 0.2783 probability that, in a random sample of 12 bolts, none are defective (if 10% of the bolts in the population are defective). If we conducted this experiment 100 times, we would expect to observe no defective bolts about 28 times.

(b) We are looking for the probability of obtaining 0 successes, so x = 0.

$$P(0) = \frac{({}_{100}C_{0})({}_{500-100}C_{12-0})}{{}_{500}C_{12}} = \frac{({}_{100}C_{0})({}_{400}C_{12})}{{}_{500}C_{12}} = 0.0665$$

There is a 0.0665 probability that, in a random sample of 12 bolts, the manager will select none that are defective (if 20% of the bolts in the population actually are defective). If we conducted this experiment 100 times, we would expect to observe no defective bolts about 7 times.

Notice that, as the number of defective bolts increases, the probability of not selecting a single defective bolt decreases.

Now Work Problems 17(a)–(e)

EXAMPLE 4 Computing Hypergeometric Probabilities Using Technology

Problem: The hypergeometric probability distribution is used in acceptance sampling. Suppose that a machine shop orders 500 bolts from a supplier. To determine whether to accept the shipment of bolts, the manager of the facility randomly selects 12 bolts.
If none of the 12 randomly selected bolts are found to be defective, he concludes that the shipment is acceptable. If 10% of the bolts in the population are defective, what is the probability that none of the selected bolts are defective?

Approach: Statistical software or graphing calculators with advanced statistical features have the ability to determine hypergeometric probabilities. We use MINITAB to determine the probabilities. The steps for determining hypergeometric probabilities using MINITAB or Excel can be found in the Technology Step-by-Step on page 6–7.

Solution: We use MINITAB to determine the probability. Recall that N = 500, k = 50, and n = 12. See Figure 15.

Figure 15
Probability Density Function*
Hypergeometric with N = 500, M = 50, n = 12
x       P( X = x )
0.00    0.278250

*MINITAB's notation differs slightly from the notation that we use in this text. Instead of using k to represent the number of successes in the population, MINITAB uses M.

Interpretation: There is a 0.2783 probability that, in a random sample of 12 bolts, none are defective (if 10% of the bolts in the population are defective). If we conducted this experiment 100 times, we would expect to observe no defective bolts about 28 times.

3 Compute the Mean and Standard Deviation of a Hypergeometric Random Variable

We discussed finding the mean and standard deviation of a discrete random variable in Section 6.1. The formulas can be used to find the mean and standard deviation of a hypergeometric random variable as well. However, a simpler method exists.
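Before turning to the simpler method, it may help to see the general approach carried out once. The sketch below assumes the usual definitions for a discrete random variable, mean = Σ x·P(x) and standard deviation = √(Σ (x − mean)²·P(x)), and applies them to the faculty experiment with N = 200, n = 20, and k = 12. The variable names are illustrative only.

```python
from math import comb, sqrt

# Faculty experiment from Example 2: population, sample, successes.
N, n, k = 200, 20, 12

# Possible values of X run from max(0, n - (N - k)) to min(n, k).
lo, hi = max(0, n - (N - k)), min(n, k)

# Build the full probability distribution with Formula (1).
dist = {
    x: comb(k, x) * comb(N - k, n - x) / comb(N, n)
    for x in range(lo, hi + 1)
}

# General definitions for any discrete random variable.
total = sum(dist.values())                       # should be 1
mean = sum(x * p for x, p in dist.items())
sd = sqrt(sum((x - mean) ** 2 * p for x, p in dist.items()))

print(f"sum of probabilities = {total:.6f}")
print(f"mean = {mean:.4f}, standard deviation = {sd:.4f}")
```

This route requires all 13 probabilities, which is why a closed-form shortcut is attractive.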
Mean and Standard Deviation of a Hypergeometric Random Variable
A hypergeometric random variable X has mean and standard deviation given by the formulas

$$\mu_X = n \cdot \frac{k}{N} \quad \text{and} \quad \sigma_X = \sqrt{\left(\frac{N-n}{N-1}\right) \cdot n \cdot \frac{k}{N} \cdot \frac{N-k}{N}} \qquad (2)$$

where
n is the sample size
k is the number of successes in the population
N is the size of the population

The ratio $\frac{k}{N}$ is the proportion of successes in the population. If you look carefully at the formulas for the mean and standard deviation and replace $\frac{k}{N}$ with p (so that $\frac{N-k}{N} = 1 - p$), we almost have the formulas for the mean and standard deviation of a binomial random variable, $\mu_X = np$ and $\sigma_X = \sqrt{np(1-p)}$. (Note that $\frac{N-n}{N-1}$ is a finite population correction factor that approaches 1 as the population size increases, while n stays fixed and small relative to N. For this reason, we ignore its effect on the standard deviation of a binomial random variable.)

EXAMPLE 5 Computing the Mean and Standard Deviation of a Hypergeometric Random Variable

Problem: Suppose that a researcher goes to a small college of 200 faculty, 12 of whom have blood type O-negative. She obtains a simple random sample of n = 20 of the faculty. Determine the mean and standard deviation of the number of randomly selected faculty that will have blood type O-negative.

Approach: This is a hypergeometric probability experiment with N = 200, n = 20, and k = 12. We use Formula (2) to find the mean and the standard deviation, respectively.

Solution

$$\mu_X = n \cdot \frac{k}{N} = 20 \cdot \frac{12}{200} = 1.2$$

and

$$\sigma_X = \sqrt{\left(\frac{N-n}{N-1}\right) \cdot n \cdot \frac{k}{N} \cdot \frac{N-k}{N}} = \sqrt{\left(\frac{200-20}{200-1}\right) \cdot 20 \cdot \frac{12}{200} \cdot \frac{200-12}{200}} = 1.01$$

Interpretation: We expect that, in a random sample of 20 faculty members, 1.2 will have blood type O-negative. If we take many different samples of size 20 from this population, the mean number of faculty that have blood type O-negative will approach 1.2.

Now Work Problem 17(f)
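Formula (2) is short enough to check in a few lines of code. The sketch below is one possible way to write it; the function name `hypergeom_mean_sd` and the variable `fpc` (for the finite population correction factor) are hypothetical names used only here. It reworks Example 5 and, for comparison, computes the standard deviation that would result if the correction factor were ignored.

```python
from math import sqrt


def hypergeom_mean_sd(N, n, k):
    """Mean and standard deviation from Formula (2)."""
    p = k / N                    # proportion of successes in the population
    fpc = (N - n) / (N - 1)      # finite population correction factor
    mean = n * p
    sd = sqrt(fpc * n * p * (1 - p))
    return mean, sd


# Example 5: 200 faculty, sample of 20, 12 with blood type O-negative.
mean, sd = hypergeom_mean_sd(N=200, n=20, k=12)

# Same n and p without the correction factor (the binomial form).
p = 12 / 200
sd_without_fpc = sqrt(20 * p * (1 - p))

print(f"mean = {mean:.2f}, sd = {sd:.2f}")
print(f"sd without correction factor = {sd_without_fpc:.2f}")
```

Because the sample here is 10% of the population, the correction factor is noticeably below 1, so the hypergeometric standard deviation comes out smaller than the uncorrected value.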
6.4 ASSESS YOUR UNDERSTANDING

Concepts and Vocabulary
1. Explain the similarities and differences between the hypergeometric probability distribution and the binomial probability distribution.
2. What criteria must be satisfied for a random variable X to be a hypergeometric random variable?
3. When listing the possible values of the hypergeometric random variable X, it must be the case that x is less than or equal to the smaller of n or k. Why?
4. In your own words, explain the logic behind Formula (1).

Skill Building
In Problems 5–8, verify that the following probability experiments represent hypergeometric probability experiments. Then determine the values of N, n, and k, and list the possible values of the random variable X.
5. (NW) In Michigan's Winfall Lottery, a player must choose 6 numbers between 1 and 49, inclusive. Six balls numbered between 1 and 49 are then randomly selected from an urn. The random variable X represents the number of matching numbers.
6. In a neighborhood of 95 homes, 35 have pets. Suppose that 12 homes are selected at random. The random variable X represents the number of homes in the sample that have pets.
7. A manufacturer received an order of 250 computer chips. Unfortunately, 12 of the chips are defective. To test the shipment, the quality-control engineer randomly selects 20 chips from the box of 250 and tests them. The random variable X represents the number of defective chips in the sample of 20.
8. A baseball team has 25 players, 7 of whom bat left-handed. Suppose that the manager of this team is frustrated with the way the team is playing, so he decides to randomly select 9 players to play in the upcoming game. The random variable X is the number of left-handed batters in the game.

In Problems 9–12, a hypergeometric probability experiment is conducted with the given parameters. Compute the probability of obtaining x successes.
9. N = 150, n = 20, k = 30, x = 5
10. N = 60, n = 8, k = 25, x = 3
11. N = 230, n = 15, k = 200, x = 12
12. N = 150, n = 10, k = 10, x = 1

In Problems 13–16, compute the mean and standard deviation of the hypergeometric random variable X.
13. N = 150, n = 20, k = 30
14. N = 60, n = 8, k = 25
15. N = 230, n = 15, k = 200
16. N = 150, n = 10, k = 10

Applying the Concepts
17. (NW) Michigan's Classic Lotto 47 In Michigan's Classic Lotto 47 Lottery, a player must choose 6 numbers between 1 and 47, inclusive. Six balls numbered from 1 to 47 are then randomly selected from an urn. The random variable X represents the number of matching numbers.
(a) What is the probability of matching 3 numbers?
(b) What is the probability of matching 4 numbers?
(c) What is the probability of matching 5 numbers?
(d) What is the probability of matching 6 numbers?
(e) A winning ticket is one in which the player matches 3, 4, 5, or 6 numbers. What is the probability of purchasing a winning ticket? Would it be unusual to purchase a winning ticket?
(f) What are the mean and standard deviation of the random variable X? For a randomly selected ticket, how many numbers do you expect to match?
18. Got a Pet? In a neighborhood of 95 homes, 35 have pets. Suppose that 12 homes are selected at random. The random variable X represents the number of homes in the sample that have pets.
(a) What is the probability of obtaining 8 homes with a pet?
(b) What is the probability of obtaining 9 homes with a pet?
(c) What is the probability of obtaining 12 homes with a pet? Would it be unusual to select 12 homes that have a pet?
(d) What are the mean and standard deviation of the random variable X?
19. Acceptance Sampling A manufacturer received an order of 250 computer chips. Unfortunately, 12 of the chips are defective. To test the shipment, the quality-control engineer randomly selects 20 chips from the box of 250 and tests them.
The random variable X represents the number of defective chips in the sample.
(a) What is the probability of obtaining 4 defective chips?
(b) What is the probability of obtaining 3 defective chips?
(c) What is the probability that the quality-control engineer will not find any defective chips?
(d) What is the probability of obtaining 14 defective chips?
(e) How many defective chips would you expect to select?
20. Baseball Lineup A baseball team has 25 players, 7 of whom bat left-handed. Suppose that the manager of this team is frustrated with the way the team is playing, so he decides to randomly select 9 players to play in the upcoming game. The random variable X will be the number of left-handed batters in the game.
(a) What is the probability of creating a lineup with 2 lefties?
(b) What is the probability of creating a lineup with 1 lefty?
(c) What is the probability of creating a lineup with no lefties?
(d) What is the probability of creating a lineup with 8 lefties?
(e) How many lefties would you expect to find in the lineup?
21. Hung Jury A hung jury is one that is unable to come to a unanimous decision regarding the guilt of the defendant. Suppose that there is a pool of 30 potential jurors, but 2 of the 30 potential jurors would never be willing to convict, regardless of the evidence presented. What is the probability that the trial will result in a hung jury, regardless of the evidence, if the jury consists of 12 randomly selected jurors?
22. Messy Sock Drawer Suppose that you wake up for work in the dark and find that the lights don't work in your bedroom. In addition, your sock drawer is a mess and contains 12 black socks and 17 blue socks that otherwise look alike. What is the probability that you randomly select two black socks if you select exactly 2 socks?
23.
Acceptance Sampling Suppose that a concrete manufacturer has made 200 concrete cylinders that are supposed to withstand 4,000 pounds per square inch of pressure. As the quality-control manager, you decide to randomly test 4 of the cylinders to be sure they are manufactured to specification. You will only accept the shipment if all 4 cylinders pass the inspection. What is the probability that the shipment is accepted:
(a) If 10% of the 200 cylinders are defective?
(b) If 20% of the 200 cylinders are defective?
(c) If 40% of the 200 cylinders are defective?
(d) If 60% of the 200 cylinders are defective?
(e) If 80% of the 200 cylinders are defective?
(f) Draw a horizontal axis and label it Percent Defective. Draw a vertical axis and label it Probability Accept Shipment. Plot probability accept shipment against the percent defective and connect the points in a smooth curve. This curve is referred to as an operating characteristic curve.

TECHNOLOGY STEP-BY-STEP Computing Hypergeometric Probabilities Using Technology

TI-83/84 Plus
The TI-83/84 Plus graphing calculators do not have this feature.

Excel
Computing P(x)
1. If desired, enter the possible values of the random variable X whose probability you wish to compute in column A. For example, if we want the probability that x = 0, 1, 2, or 3 in Example 3(a), we enter 0, 1, 2, and 3 into column A.
2. With the cursor in cell B1, select the fx icon. Highlight Statistical in the Function category window. Highlight HYPGEOMDIST in the Function name window. Click OK.
3. Fill in the window as shown to obtain the probabilities from Example 3(a). Click OK.
4. Copy the contents in cell B1 to the remaining cells.

MINITAB
Computing P(x)
1. If desired, enter the possible values of the random variable X whose probability you wish to compute in C1. For example, if we want the probability that x = 0, 1, 2, or 3 in Example 3(a), we enter 0, 1, 2, and 3 into C1.
2. Select the Calc menu, highlight Probability Distributions, then highlight Hypergeometric . . .
3. Fill in the window as shown to obtain the probabilities from Example 3(a). Click OK. Note that, if we only want P(0), it is simplest to select the Input constant: radio button and enter 0 in the box.

Computing P(X ≤ x)
Follow the same steps as for computing P(x). In the window that comes up after selecting Hypergeometric . . . , select the radio button for Cumulative probability.

Z01_SULL8028_03_SE_ANS_C06CD.QXD 7/4/08 12:18 AM Page 1

6.4 Assess Your Understanding Answers

6.4 Assess Your Understanding (page 000)
5. N = 49, n = 6, k = 6, X = 0, 1, 2, …, 6
7. N = 250, n = 20, k = 12, X = 0, 1, 2, …, 12
9. 0.1856
11. 0.1939
13. $\mu_X = 4$, $\sigma_X = 1.67$
15. $\mu_X = 13.0$, $\sigma_X = 1.26$
17. (a) 0.01986 (b) 0.001146 (c) 0.0000229 (d) 0.000000093 (e) 0.02103 (f) 0.766; 0.772; 0.766
19. (a) 0.0087 (b) 0.0507 (c) 0.3590 (d) 0 (e) 0.96
21. 0.6483
23. (a) 0.6539 (b) 0.4065 (c) 0.1270 (d) 0.0245 (e) 0.0014 (f) [Figure: operating characteristic curve, with Percent Defective (0 to 100) on the horizontal axis and Probability Accept Shipment (0 to 0.7) on the vertical axis]