Two sample test
Pocket Stats: Quick Statistical Tools You Can Remember!
Andy Sleeper
Successful Statistics LLC
[email protected]
970-420-0243
(c) 2009 Successful Statistics LLC

Outline
Pocket stats: What? Why?
Two sample test:
  – Tukey’s end count test
One sample test:
  – Fisher’s sign test with pocket stats approximation
Sample sizes for pass-fail tests:
  – Pocket stats approximation and exact formula

Memory in Decision Making
When quick, reliable decisions are required, memorized tools can be very helpful.
For example, take a normal distribution:
  – What is the probability of observing a value within 2 standard deviations of the mean (µ−2σ to µ+2σ)? About 95%.
  – Within 3 standard deviations (µ−3σ to µ+3σ)? 99.73%.

“Pocket Stats” Tools
Objective is to make quick and correct decisions about data
Simple enough to remember
No computer required
  – Mental math only
  – Perhaps a calculator
Errors from approximation should be conservative, or at least reasonable

Three Pocket Stats Tools
Tukey’s end count test
  – For detecting a difference between two distributions based on two samples
Fisher’s one-sample sign test
  – For detecting a change in median based on one sample
  – Pocket stats version is an approximation of this established tool, which is available in statistical software
Sample size calculation for pass-fail tests
  – For calculating the minimum number of pass-fail tests required, assuming zero failures
  – Pocket stats version is a conservative approximation

Signals and Noise
Statistical tools allow us to identify significant effects (signals) without being confused by random variation (noise).
“95% confidence” means a <5% probability that random variation looks like a significant effect when the true effect is zero.
[Figure: random variation (noise)]

Example: Comparing Two Samples
Two measuring instruments, A and B, measured the same sample material 6 times each. Are the instruments different?
Here are the measurements for each sample, sorted in increasing order:
  – Sample A: 47, 49, 50, 52, 53, 55
  – Sample B: 42, 44, 45, 46, 50, 51
[Figure: dot plot of samples A and B on a scale from 40 to 55]

Tukey’s End Count Test
End count = count of As above all Bs + count of Bs below all As
A high end count provides confidence that the two distributions are different
Remember: end count of 7 for 95% confidence
John Tukey (1915 – 2000) developed and published this test in 1959

Tukey’s End Count Test (Cont.)
Count the number of measurements in group H that are > all the measurements in group L, plus the number of measurements in group L that are < all in group H.
This sum is the “end count” of the data.
[Figure: six example dot plots of groups H and L along a “Measured values” axis, with end counts 16 (8 + 8), 10 (5 + 5), 7 (3 + 4), 5 (3 + 2), 3, and NA]

Exercise
Compare the distribution of parts manufactured by two machines
Parts were measured in random order
In this table, the measurements are sorted in increasing order
Is there a difference between these two machines? How confident are you?
[Table: sorted measurements for the two machines]

Tukey’s End Count Test With Unequal Sample Sizes
The critical end counts of 7, 10, 13 are rules of thumb meant to be remembered. They are generally safe.
If sample sizes are unequal, the critical end count may be different
These tables list the critical end counts for 95% and 99% confidence

Minimum total end count for 95% confidence
Rows: number of parts in second group. Columns: number of parts in the first group. Blank cells are shown as "-".

        3    4    5    6    7    8    9   10   11   12
   3    -    -    7    7    8    9    9    9   10   10
   4    -    7    7    7    7    8    9    9    8    9
   5    7    7    7    7    7    7    8    8    8    8
   6    7    7    7    7    7    7    7    8    8    8
   7    8    7    7    7    7    7    7    7    8    8
   8    9    8    7    7    7    7    7    7    7    8
   9    9    9    8    7    7    7    7    7    7    7
  10    9    9    8    8    7    7    7    7    7    7
  11   10    8    8    8    8    7    7    7    7    7
  12   10    9    8    8    8    8    7    7    7    7

Minimum total end count for 99% confidence
Rows: number of parts in second group. Columns: number of parts in the first group. Blank cells are shown as "-".

        3    4    5    6    7    8    9   10   11   12
   3    -    -    -    -    -   11   11   12   13   13
   4    -    -    -    9   10   10   11   11   12   12
   5    -    -    9    9   10   10   10   11   11   12
   6    -    9    9    9    9   10   10   10   11   11
   7    -   10   10    9    9    9   10   10   10   10
   8   11   10   10   10    9    9    9   10   10   10
   9   11   11   10   10    9   10   10   10   10   10
  10   12   11   11   10   10   10   10   10   10   10
  11   13   12   11   11   10   10   10   10   10   10
  12   13   12   12   11   10   10   10   10   10   10

Comparing End Count to Other Two-Sample Tests

Test                               Tests for change in…             Assuming…
Tukey end count                    Distributions                    No distribution
Two-sample t test                  Population mean                  Normal distribution
F test                             Population standard deviation    Normal distribution
Mann Whitney (aka Wilcoxon) test   Distributions                    No distribution

Notes about Tukey end count test
  – Is not available when one sample has both highest and lowest values
  – Makes no assumption of any distribution family
  – Critical end counts do not change (much) as sample size increases
  – Critical end counts do not change (much) for uneven sample sizes
  – NOT an approximation of another test – this is a unique test procedure
  – Results of this test may not match other tests
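
The end count rule can be sketched in a few lines of Python, using the instrument A and B measurements from the earlier example. This is only an illustration: the function names are hypothetical, the slides do not say how ties between the two samples should be handled (strict comparisons are assumed here), and the mapping of the remembered counts 7, 10, 13 to 95%, 99%, and 99.9% confidence is an assumed reading of the rule of thumb.

```python
def end_count(a, b):
    """Return Tukey's end count for two samples, or None if the test is unavailable."""
    # The test does not apply when one sample holds both the highest
    # and the lowest value (or when the extremes are tied).
    if max(a) > max(b) and min(a) > min(b):
        high, low = a, b
    elif max(b) > max(a) and min(b) > min(a):
        high, low = b, a
    else:
        return None
    above = sum(1 for x in high if x > max(low))  # high-group values above all of low
    below = sum(1 for x in low if x < min(high))  # low-group values below all of high
    return above + below


def pocket_confidence(count):
    """Map an end count to the remembered thresholds (assumed: 7, 10, 13)."""
    if count is None:
        return "not available"
    if count >= 13:
        return "99.9%"
    if count >= 10:
        return "99%"
    if count >= 7:
        return "95%"
    return "not significant"


sample_a = [47, 49, 50, 52, 53, 55]
sample_b = [42, 44, 45, 46, 50, 51]
count = end_count(sample_a, sample_b)
print(count, pocket_confidence(count))
```

For the instrument data, three A values lie above every B value and four B values lie below every A value, so the end count is 7, which meets the remembered threshold for 95% confidence.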
Fisher One-Sample Sign Test
Question: Based on one sample, has the median shifted from a specified value?
Example:
  – A bank has a goal for median loan processing time to be 10 days or less
  – In 20 loans, only 5 were < 10 days
  – Is the median above 10?
[Figure: loan processing time (0 to 30 days) for loans 1 to 20]

Fisher One-Sample Sign Test: Pocket Stats Version
Let M0 be the hypothetical median value
Let n be the sample size
  – Subtract from n the count of values exactly equal to M0
Let s be the count of values < M0 or the count of values > M0, whichever is smaller
If $s \le \lfloor n/2 \rfloor - \lceil \sqrt{n} \rceil$, then median ≠ M0 with 95% confidence
For 99% confidence: $s \le \lfloor n/2 \rfloor - \lceil \sqrt{2n} \rceil$
For 99.9% confidence: $s \le \lfloor n/2 \rfloor - \lceil \sqrt{3n} \rceil$
$\lfloor\ \rfloor$ means “round down”; $\lceil\ \rceil$ means “round up”
The rounding rules both mean s must be smaller to be significant

Example: Control Chart
Does this control chart show any out of control conditions?
[Figure: control chart]

Example: Measurement Agreement
Is there a significant bias between supplier and customer measurements of these ten parts?

Part ID      1    2    3    4    5    6    7    8    9   10
Supplier:  220  216  221  215  224  213  219  223  221  224
Customer:  218  215  222  212  223  210  218  221  221  222

First, is this a two-sample or a one-sample problem?
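
One possible way to work the loan and measurement agreement examples is sketched below. It rests on several assumptions that are not stated in the slides: none of the 20 loans took exactly 10 days, the ten parts are treated as paired measurements so the sign test is applied to the differences (supplier minus customer) against M0 = 0, and the rounding note is read as rounding n/2 down and the square root up. The exact two-sided binomial p-value of the sign test is computed alongside for comparison. All function names are illustrative.

```python
import math


def pocket_sign_test(below, above):
    """Confidence levels at which the pocket rule declares median != M0.

    below, above: counts of values < M0 and > M0 (values equal to M0 dropped).
    """
    n = below + above
    s = min(below, above)
    # Assumed reading of the rounding note: floor(n/2) - ceil(sqrt(k * n))
    multipliers = {"95%": 1, "99%": 2, "99.9%": 3}
    return [
        level
        for level, k in multipliers.items()
        if s <= n // 2 - math.ceil(math.sqrt(k * n))
    ]


def exact_p_value(below, above):
    """Exact two-sided p-value: 2 * P(X <= s) for X ~ Binomial(n, 0.5)."""
    n = below + above
    s = min(below, above)
    return min(1.0, 2 * sum(math.comb(n, i) for i in range(s + 1)) / 2 ** n)


# Loan example: 5 of 20 loans under 10 days (assuming none took exactly 10 days)
loans = (5, 15)

# Measurement agreement, treated here as paired differences against M0 = 0
supplier = [220, 216, 221, 215, 224, 213, 219, 223, 221, 224]
customer = [218, 215, 222, 212, 223, 210, 218, 221, 221, 222]
diffs = [s - c for s, c in zip(supplier, customer)]
pairs = (sum(d < 0 for d in diffs), sum(d > 0 for d in diffs))  # zero differences dropped

for name, (below, above) in {"loans": loans, "measurements": pairs}.items():
    print(name, pocket_sign_test(below, above), round(exact_p_value(below, above), 4))
```

Under these assumptions both examples are significant at 95% but not at 99%: the loans give s = 5 against a threshold of 5, and the measurement differences give s = 1 with n = 9 (one tie dropped) against a threshold of 1.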
[Figure for the measurement agreement example: Tukey mean-difference plot]

Fisher One-Sample Sign Test: Exact Formulas
Ronald Fisher (1890-1962)
Let M0 be the hypothetical median value
Let n be the sample size
  – Subtract from n the count of values = M0
Let s be the count of values < M0 or the count of values > M0, whichever is smaller
P-value is $2^{1-n} \sum_{i=0}^{s} \binom{n}{i}$
  – In Excel: =2*BINOMDIST(s,n,0.5,TRUE)
Confidence is 1 – P-value
Critical value for C confidence in Excel: =CRITBINOM(n,0.5,((1-C)/2))-1

Comparing Pocket Stats to Exact Version of Fisher One-Sample Test
[Figure: “Confidence Level Comparison, Fisher 1-sample sign test”: actual confidence level (0.90 to 1.00) versus sample size for 95% Pocket Stats, 95% Exact, 99% Pocket Stats, and 99% Exact]
At 95%, this pocket stats tool is usually but not always a conservative approximation!

Sample Size for Pass-Fail Tests
How many units need to pass a pass-fail test, with zero failures, to prove that the probability of failing units is < p?
Example:
  – At the end of a Black Belt project, we want to verify that a problem has been fixed with a pass-fail test. We want to show that the problem affects less than 1% of the units
  – How many units need to pass the test with zero failures to prove < 1% failure rate with high confidence?

Sample Size for Pass-Fail Tests

Pocket Stats Approximation

Confidence   Sample Size
95%          3/p
99%          5/p
99.9%        7/p

p is the probability of defective units
If this many units pass a pass-fail test with zero failures, you have the specified confidence that the true failure rate is < p

Exact Formula
$n = \dfrac{\ln(1 - C)}{\ln(1 - p)}$
C is the confidence, expressed as a number between 0 and 1
p is the probability of defective units
If this many units pass a pass-fail test with zero failures, you have C×100% confidence that the true failure rate is < p

Examples
An electronics company is planning reliability testing on a remote control. One test is a 1-meter drop test onto concrete. How many units must be dropped without failure to prove that 99.5% would pass the same test with 95% confidence?
Because of resource limitations, long-term humidity testing can be performed on only 30 units. If all pass the test, what does this prove with 95% confidence?

Comparing Pocket Stats Sample Size to Exact Formula
[Figure: “Pass-Fail Sample Size Comparison”: sample size with zero failures (10 to 100000) versus probability of defective units (0.01% to 100%) for 95%, 99%, and 99.9% Pocket Stats and Exact]
Pocket stats sample size is always slightly larger than exact sample size
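
As a rough illustration, the two examples can be worked with the pocket rule and the exact formula side by side. The function names are hypothetical, sample sizes are assumed to be rounded up to whole units, and the last function simply solves the exact formula for p when n is fixed, giving p = 1 − (1 − C)^(1/n).

```python
import math

# Numerators of the pocket rule: 3/p, 5/p, 7/p
POCKET_NUMERATOR = {0.95: 3, 0.99: 5, 0.999: 7}


def pocket_sample_size(p, confidence):
    """Pocket stats sample size, rounded up to a whole unit."""
    # Round first so floating-point noise in the division cannot add a unit
    return math.ceil(round(POCKET_NUMERATOR[confidence] / p, 6))


def exact_sample_size(p, confidence):
    """Exact zero-failure sample size n = ln(1 - C) / ln(1 - p), rounded up."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - p))


def exact_failure_rate_bound(n, confidence):
    """Failure rate p demonstrated by n passes with zero failures."""
    return 1 - (1 - confidence) ** (1 / n)


# Drop test: show that 99.5% would pass (p = 0.005) with 95% confidence
print(pocket_sample_size(0.005, 0.95), exact_sample_size(0.005, 0.95))

# Humidity test: only 30 units available, all pass, 95% confidence
print(3 / 30, round(exact_failure_rate_bound(30, 0.95), 4))
```

With these assumptions the drop test needs 600 units by the pocket rule and 598 by the exact formula. Thirty passing units support a failure rate below 10% by the pocket rule (3/30), or about 9.5% by the exact formula, at 95% confidence.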

References
Sleeper, A. (2009) “Pocket Stats: Quick Significance Tests You Can Remember,” 3/23/2009, www.sixsigmaiq.com
Sleeper, A. (2009) “Pocket Stats: Quick Significance Tests You Can Remember, Part 2,” 4/13/2009, www.sixsigmaiq.com
Sleeper, A. (2009) “Pocket Stats, Part 3: Sample Size for Pass-Fail Tests,” 11/16/2009, www.sixsigmaiq.com
Sleeper, A. (2006) Design for Six Sigma Statistics: 59 Tools for Diagnosing and Solving Problems in DFSS Initiatives, McGraw-Hill – ONLY $89.95!
Tukey, J. W. (1959) “A Quick, Compact, Two-Sample Test to Duckworth’s Specifications,” Technometrics, Vol. 1, No. 1, Feb. 1959, pp. 21-38