Two sample test

Successful Statistics LLC
Pocket Stats:
Quick Statistical Tools You can
Remember!
Andy Sleeper
Successful Statistics LLC
[email protected]
970-420-0243
(c) 2009 Successful Statistics LLC
Outline
Pocket stats: What? Why?
Two sample test:
– Tukey’s end count test
One sample test:
– Fisher’s sign test with pocket stats
approximation
Sample sizes for pass-fail tests:
– Pocket stats approximation and exact formula
Memory in Decision Making
When quick, reliable decisions are required, memorized tools can be very helpful
For example, take a normal distribution
– What is the probability of observing a value within 2 standard deviations of the mean (µ−2σ to µ+2σ)? ~95%
– Within 3 standard deviations (µ−3σ to µ+3σ)? 99.73%
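The two memorized percentages can be checked with a few lines of Python. This is a minimal sketch using the standard normal relationship P(|X − µ| < kσ) = erf(k/√2); the function name `prob_within` is illustrative and not from the slides.

```python
import math

def prob_within(k):
    """Probability that a normal value falls within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

print(f"within 2 sigma: {prob_within(2):.4f}")
print(f"within 3 sigma: {prob_within(3):.4f}")
```

The 2-sigma value is closer to 95.45%, which is why the slide writes it as ~95%.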
“Pocket Stats” Tools
Objective is to make quick and correct
decisions about data
Simple enough to remember
No computer required
– Mental math only
– Perhaps a calculator
Errors from approximation should be
conservative, or at least reasonable
Three Pocket Stats Tools
Tukey’s end count test
– For detecting a difference between two
distributions based on two samples
Fisher’s one-sample sign test
– For detecting a change in median based on one
sample
– Pocket stats version is an approximation of this
established tool available in statistical software
Sample size calculation for pass-fail tests
– For calculating minimum number of pass-fail tests
required, assuming zero failures
– Pocket stats version is a conservative
approximation
Signals and Noise
Statistical tools allow us to identify
significant effects or signals without being
confused by random variation or noise
[Figure label: RANDOM VARIATION (NOISE)]
“95% confidence” means <5% probability that random variation looks like a significant effect, when the true effect is zero
Example: Comparing Two Samples
Two measuring instruments A and B
measured the same sample material 6
times each. Are the instruments different?
Here are the measurements for each
sample, sorted in increasing order:
– Sample A: 47, 49, 50, 52, 53, 55
– Sample B: 42, 44, 45, 46, 50, 51
[Figure: dot plot of samples A and B on a common axis from 40 to 55]
Tukey’s End Count Test
End count = count of As
above all Bs + count of
Bs below all As
High end count provides
confidence that the two
distributions are
different
Remember: end count
of 7 for 95% confidence
John Tukey (1915 – 2000) developed and published this test in 1959
Tukey’s End Count Test (Cont.)
Measured values
Count the number of measurements in group H that
are > all the measurements in group L plus the
number of measurements in group L that are < all in
group H
This sum is the “end count” of the data
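A minimal Python sketch of this counting rule, applied to the instrument data from the earlier example (Sample A and Sample B). The function name `end_count` and the choice to return `None` when the extremes are tied are assumptions made for illustration; the slides do not say how to handle ties.

```python
def end_count(a, b):
    """Tukey end count of two samples, or None when the test is not available."""
    # Identify which sample holds the overall maximum (group H) and which is group L.
    if max(a) == max(b):
        return None  # assumption: tied maxima are treated as "not available"
    high, low = (a, b) if max(a) > max(b) else (b, a)
    # The test does not apply when one sample holds both the highest and lowest values.
    if min(high) <= min(low):
        return None
    above = sum(1 for x in high if x > max(low))   # H values above all L values
    below = sum(1 for x in low if x < min(high))   # L values below all H values
    return above + below

sample_a = [47, 49, 50, 52, 53, 55]
sample_b = [42, 44, 45, 46, 50, 51]
print(end_count(sample_a, sample_b))
```

For the instrument data the end count is 3 + 4 = 7, which meets the remembered threshold of 7 for 95% confidence.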
[Figure: six example dot plots of groups H and L. End counts, with the two contributing counts where shown: 16 (8 + 8), 10 (5 + 5), 7 (3 + 4), 5 (3 + 2), 3, NA]
Exercise
Compare the
distribution of parts
manufactured by two
machines
Parts were measured in
random order
In this table, the
measurements are
sorted in increasing
order
Is there a difference
between these two
machines?
How confident are you?
Tukey’s End Count Test With Unequal Sample Sizes
The critical end counts of 7, 10, 13 (for 95%, 99%, and 99.9% confidence) are rules of thumb meant to be remembered. They are generally safe
If sample sizes are unequal, the critical end count may be different
These tables list the critical end counts for 95% and 99% confidence
Minimum total end count for 95% confidence
Rows: number of parts in second group. Columns: number of parts in the first group. "-" marks a cell with no listed value.
        3   4   5   6   7   8   9  10  11  12
   3    -   -   7   7   8   9   9   9  10  10
   4    -   7   7   7   7   8   9   9   8   9
   5    7   7   7   7   7   7   8   8   8   8
   6    7   7   7   7   7   7   7   8   8   8
   7    8   7   7   7   7   7   7   7   8   8
   8    9   8   7   7   7   7   7   7   7   8
   9    9   9   8   7   7   7   7   7   7   7
  10    9   9   8   8   7   7   7   7   7   7
  11   10   8   8   8   8   7   7   7   7   7
  12   10   9   8   8   8   8   7   7   7   7
Minimum total end count for 99% confidence
Rows: number of parts in second group. Columns: number of parts in the first group. "-" marks a cell with no listed value.
        3   4   5   6   7   8   9  10  11  12
   3    -   -   -   -   -  11  11  12  13  13
   4    -   -   -   9  10  10  11  11  12  12
   5    -   -   9   9  10  10  10  11  11  12
   6    -   9   9   9   9  10  10  10  11  11
   7    -  10  10   9   9   9   9  10  10  10
   8   11  10  10  10   9   9  10  10  10  10
   9   11  11  10  10  10   9  10  10  10  10
  10   12  11  11  10  10  10  10  10  10  10
  11   13  12  11  11  10  10  10  10  10  10
  12   13  12  12  11  10  10  10  10  10  10
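To see where a critical end count comes from, the sketch below enumerates every equally likely ordering of two samples, assuming both come from the same continuous distribution (so there are no ties), and reports the fraction of orderings whose end count reaches a given value. The function names are illustrative and the approach is not taken from the slides.

```python
from itertools import combinations

def end_count_from_labels(labels):
    """End count of a sorted sequence of 'A'/'B' labels; 0 when the test is not available."""
    if labels[0] == labels[-1]:
        return 0  # one sample holds both the lowest and the highest value
    # Length of the run of identical labels at the low end and at the high end.
    low_run = next(i for i, x in enumerate(labels) if x != labels[0])
    high_run = next(i for i, x in enumerate(reversed(labels)) if x != labels[-1])
    return low_run + high_run

def null_tail_probability(n1, n2, critical):
    """Fraction of all orderings with end count >= critical when there is no true difference."""
    n = n1 + n2
    total = hits = 0
    for positions in combinations(range(n), n1):
        chosen = set(positions)
        labels = ['A' if i in chosen else 'B' for i in range(n)]
        total += 1
        hits += end_count_from_labels(labels) >= critical
    return hits / total

print(null_tail_probability(6, 6, 7))
print(null_tail_probability(6, 6, 6))
```

For two samples of 6, an end count of 7 or more occurs in 32 of the 924 orderings (about 3.5%), while 6 or more occurs in 64 of 924 (about 6.9%), consistent with a critical end count of 7 at 95% confidence.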
Comparing End Count to Other Two-Sample Tests
Test                              Tests for change in…            Assuming…
Tukey end count                   Distributions                   No distribution
Two-sample t test                 Population mean                 Normal distribution
F test                            Population standard deviation   Normal distribution
Mann Whitney (aka Wilcoxon) test  Distributions                   No distribution
Notes about Tukey end count test
Is not available when one sample has both highest and lowest values
Makes no assumption of any distribution family
Critical end counts do not change (much) as sample size increases
Critical end counts do not change (much) for uneven sample sizes
NOT an approximation of another test – this is a unique test procedure
Results of this test may not match other tests
Fisher One Sample Sign Test
Question: Based on
one sample, has the
median shifted from
a specified value?
Example: A bank has
a goal for median
loan processing time
to be 10 days or less
In 20 loans, only 5
were < 10 days
Is the median above
10?
[Figure: Loan Processing Time, vertical axis 0 to 30, for loans 1 to 20]
Fisher One-Sample Sign Test
Pocket Stats Version
Let M0 be the hypothetical median value
Let n be the sample size
– Subtract from n the count of values exactly equal to M0
Let s be the count of values < M0 or the count of values
> M0, whichever is smaller
If $s \le \lfloor n/2 \rfloor - \lceil \sqrt{n} \rceil$ then median ≠ M0 with 95% confidence
For 99% confidence: $s \le \lfloor n/2 \rfloor - \lceil \sqrt{2n} \rceil$
For 99.9% confidence: $s \le \lfloor n/2 \rfloor - \lceil \sqrt{3n} \rceil$
$\lfloor\ \rfloor$ means “round down”
$\lceil\ \rceil$ means “round up”
The rounding rules both mean s must be smaller to be significant
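A minimal Python sketch of the pocket stats rule, assuming the three rules read as: round n/2 down, then subtract the rounded-up square root of n (95%), 2n (99%), or 3n (99.9%). The function names are illustrative. The bank example is used as input under the added assumption that no loan took exactly 10 days, so n = 20 and s = 5.

```python
import math

# Multiplier inside the square root for each confidence level (assumed reading of the slide).
ROOT_MULTIPLIER = {0.95: 1, 0.99: 2, 0.999: 3}

def ceil_sqrt(x):
    """Square root of a non-negative integer, rounded up, using exact integer arithmetic."""
    r = math.isqrt(x)
    return r if r * r == x else r + 1

def pocket_critical_count(n, confidence=0.95):
    """Largest s that is still significant under the pocket stats rule."""
    return n // 2 - ceil_sqrt(ROOT_MULTIPLIER[confidence] * n)

def pocket_sign_test(s, n, confidence=0.95):
    """True when the smaller sign count s is small enough to conclude median != M0."""
    return s <= pocket_critical_count(n, confidence)

# Bank example: 20 loans, 5 of them under 10 days.
print(pocket_critical_count(20), pocket_sign_test(5, 20))
```

With n = 20 the 95% critical count is 10 − 5 = 5, so s = 5 is just significant at 95%; it is not significant at 99%, where the critical count is 3.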
Successful Statistics LLC
Example: Control Chart
Does this control chart show any out of
control conditions?
Example: Measurement Agreement
Is there a significant bias between supplier and customer
measurements of these ten parts?
Part ID:    1    2    3    4    5    6    7    8    9   10
Supplier: 220  216  221  215  224  213  219  223  221  224
Customer: 218  215  222  212  223  210  218  221  221  222
First, is this a two-sample or a one-sample problem?
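One way to approach this question: because each part is measured by both parties, the ten supplier-minus-customer differences form a single sample whose median can be compared with 0 using the sign test. The sketch below is an illustration under that reading, not the presenter's worked answer; it applies the 95% pocket rule as "round n/2 down, subtract the rounded-up square root of n".

```python
import math

supplier = [220, 216, 221, 215, 224, 213, 219, 223, 221, 224]
customer = [218, 215, 222, 212, 223, 210, 218, 221, 221, 222]

# Paired differences; values exactly equal to the hypothetical median (0) are dropped.
diffs = [s - c for s, c in zip(supplier, customer)]
nonzero = [d for d in diffs if d != 0]

n = len(nonzero)
s = min(sum(d < 0 for d in nonzero), sum(d > 0 for d in nonzero))

# 95% pocket stats rule (assumed reading): s <= floor(n/2) - ceil(sqrt(n))
critical = n // 2 - math.ceil(math.sqrt(n))
significant = s <= critical
print(n, s, critical, significant)
```

Here one difference is zero, so n = 9; eight differences are positive and one is negative, so s = 1, which equals the 95% critical count of 4 − 3 = 1.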
[Figure: Tukey mean-difference plot]
Fisher One-Sample Sign Test
Exact Formulas
[Photo: Ronald Fisher, 1890-1962]
Let M0 be the hypothetical median value
Let n be the sample size
– Subtract from n the count of values = M0
Let s be the count of values < M0 or the count of values > M0, whichever is smaller
P-value is $2^{1-n} \sum_{i=0}^{s} \binom{n}{i}$
in Excel: =2*BINOMDIST(s,n,0.5,TRUE)
Confidence is 1 – P-value
Critical value for C confidence in Excel:
=CRITBINOM(n,0.5,((1-C)/2))-1
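The same exact calculation can be sketched in Python with the standard library. The function names are illustrative; capping the P-value at 1 is an addition not shown on the slide, needed only when s equals n/2.

```python
from math import comb

def sign_test_p_value(s, n):
    """Two-sided exact P-value: 2 * P(X <= s) for X ~ Binomial(n, 0.5), capped at 1."""
    tail = sum(comb(n, i) for i in range(s + 1)) / 2 ** n
    return min(1.0, 2 * tail)

def exact_critical_count(n, confidence=0.95):
    """Largest s whose P-value is <= 1 - confidence, or -1 if no s qualifies."""
    critical = -1
    for s in range(n // 2 + 1):
        if sign_test_p_value(s, n) <= 1 - confidence:
            critical = s
    return critical

# Bank example: n = 20 loans, s = 5 under 10 days (assuming none took exactly 10 days).
p = sign_test_p_value(5, 20)
print(f"P-value = {p:.4f}, confidence = {1 - p:.4f}")
print(exact_critical_count(20))
```

For the bank example the P-value is 43400/1048576, about 0.0414, so the confidence that the median differs from 10 is about 95.9%.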
Comparing Pocket Stats to Exact
Version of Fisher One-Sample Test
Confidence Level Comparison: Fisher 1-sample sign test
[Figure: actual confidence level (vertical axis, 0.9 to 1) versus sample size (horizontal axis) for four series: 95% pocket stats, 95% exact, 99% pocket stats, 99% exact]
At 95%, this pocket stats tool is usually but not always a conservative approximation!
Sample Size for Pass-Fail Tests
How many units need to pass a pass-fail
test, with zero failures, to prove that the
probability of failing units is < p?
Example:
– At the end of a Black Belt project, we want to
verify that a problem has been fixed with a
pass-fail test. We want to show that the
problem affects less than 1% of the units
– How many units need to pass the test with zero
failures to prove < 1% failure rate with high
confidence?
Sample Size for Pass-Fail Tests
Pocket Stats Approximation
Confidence    Sample Size
95%           3/p
99%           5/p
99.9%         7/p
p is the probability of defective units
If this many units pass a pass-fail test with zero failures, you have the specified confidence that the true failure rate is < p
Exact Formula
$n = \left\lceil \frac{\ln(1 - C)}{\ln(1 - p)} \right\rceil$
C is the confidence, expressed as a number between 0 and 1
p is the probability of defective units
If this many units pass a pass-fail test with zero failures, you have C×100% confidence that the true failure rate is < p
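A short derivation, offered as a sketch, of how the exact formula arises and why the numerators 3, 5, and 7 err on the safe side. It assumes independent units that each fail with probability p, which the slides imply but do not state.

```latex
\begin{align*}
\Pr(\text{all } n \text{ units pass}) = (1-p)^n &\le 1 - C \\
n &\ge \frac{\ln(1-C)}{\ln(1-p)} \\
\frac{\ln(1-C)}{\ln(1-p)} &< \frac{-\ln(1-C)}{p}
  \quad \text{because } -\ln(1-p) > p \text{ for } 0 < p < 1 \\
-\ln(0.05) \approx 2.996 < 3, \quad
-\ln(0.01) &\approx 4.605 < 5, \quad
-\ln(0.001) \approx 6.908 < 7
\end{align*}
```

Both steps push the pocket value upward, so 3/p, 5/p, and 7/p are never smaller than the exact requirement before rounding.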
Examples
An electronics company is planning
reliability testing on a remote control. One
test is a 1-meter drop test onto concrete.
How many units must be dropped without
failure to prove that 99.5% would pass the
same test with 95% confidence?
Because of resource limitations, long-term
humidity testing can be performed on only
30 units. If all pass the test, what does this
prove with 95% confidence?
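The two questions above can be worked through in Python as a sketch, using the pocket rule 3/p and the exact formula n = ln(1 − C) / ln(1 − p) rounded up. The function names, and solving the exact formula for p in the second question, are illustrative choices rather than the presenter's worked answers.

```python
import math

POCKET_NUMERATOR = {0.95: 3, 0.99: 5, 0.999: 7}

def pocket_sample_size(p, confidence=0.95):
    """Pocket stats rule: 3/p, 5/p or 7/p (left unrounded, as in mental math)."""
    return POCKET_NUMERATOR[confidence] / p

def exact_sample_size(p, confidence=0.95):
    """Smallest n with (1 - p)**n <= 1 - confidence."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - p))

def demonstrated_failure_rate(n, confidence=0.95):
    """Failure rate p ruled out by n passes with zero failures (exact formula solved for p)."""
    return 1 - (1 - confidence) ** (1 / n)

# Drop test: 99.5% of units should pass, so p = 0.005.
drop_pocket = pocket_sample_size(0.005)
drop_exact = exact_sample_size(0.005)
print(drop_pocket, drop_exact)

# Humidity test: 30 units, all pass.
humidity_exact = demonstrated_failure_rate(30)
humidity_pocket = 3 / 30
print(f"{humidity_exact:.4f} {humidity_pocket:.2f}")
```

Under these assumptions the drop test needs about 600 units by the pocket rule and 598 by the exact formula. Thirty passing units support a failure rate below about 9.5% (exact), which the pocket rule states as 10%, both at 95% confidence.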
Comparing Pocket Stats Sample Size
to Exact Formula
Pass-Fail Sample Size Comparison
[Figure: sample size with zero failures (vertical axis, 10 to 100000) versus probability of defective units (horizontal axis, 0.01% to 100%) for six series: 95%, 99%, and 99.9% pocket stats and 95%, 99%, and 99.9% exact]
Pocket stats sample size is always slightly larger than exact sample size
References
Sleeper, A (2009) “Pocket Stats: Quick Significance Tests You Can Remember” 3/23/2009 www.sixsigmaiq.com
Sleeper, A (2009) “Pocket Stats: Quick Significance Tests You Can Remember, Part 2” 4/13/2009 www.sixsigmaiq.com
Sleeper, A (2009) “Pocket Stats, Part 3: Sample Size for Pass-Fail Tests” 11/16/2009 www.sixsigmaiq.com
Sleeper, A (2006) Design for Six Sigma Statistics: 59 Tools for Diagnosing and Solving Problems in DFSS Initiatives, McGraw-Hill
Tukey, J. W. (1959) “A Quick, Compact, Two-Sample Test to Duckworth’s Specifications” Technometrics, Vol. 1, No. 1, Feb. 1959, pp 31-48