Basic Statistics for Proficiency testing

Basic Statistics for Laboratory
Proficiency Testing
PRAKIT BOONPORNPRASERT DVM, M.A.
Virology Laboratory
National Institute of Animal Health
OUTLINE
• 1) Type of PTA Proficiency Testing Programme
• 2) Composition of the proficiency panel and statistical design
• 3) Data Preparation
• 4) Summary statistics and example
• 5) Robust Z-scores and Outliers
• 6) Graphical Displays
• 7) Robust Z-scores interpretation
• 8) Youden diagram interpretation
Type of PTA Proficiency Testing
Programme
1. Testing Interlaboratory
Comparisons
2. Calibration Interlaboratory
Comparisons
Testing Interlaboratory Comparisons
• Testing interlaboratory comparisons
involve concurrent testing of
samples by two or more laboratories
and calculation of consensus values
from all participants’ results.
Calibration Interlaboratory Comparisons
• Calibration interlaboratory
comparisons are those in which one test item
is distributed sequentially among two
or more participating laboratories
and each laboratory’s results are
compared to reference values.
Assigned Value
• consensus value
- an assigned value obtained from the
results submitted by participants (e.g.
for most testing programs the median
is used as the assigned value)
Assigned Value
• reference value
- an assigned value which is provided by
a Reference Laboratory
Consensus Value
• The advantages of participant consensus include
low cost, because the assigned value does not require additional analytical work.
• No one member or group is accorded higher
status.
• Calculation of the value is usually
straightforward.
Composition of the proficiency panel
Statistical design
• Negative sample
• Strong positive Sample
• Weak positive Sample
Consensus Value
• The principal disadvantage of participant
consensus values is that they are not
independent of the participant results.
• If the majority of results are biased,
participants whose results are unbiased may
unfairly receive extreme z-scores.
Composition of the proficiency panel
Statistical design
• Between-laboratories variation
• Within-laboratory variation
• To estimate both, participants must perform
the same test more than once (e.g. twice, or
on paired samples).
Example of AI Type A PT Panel
Type of paired sample
1. Uniform pairs
identical blind duplicates
(where the results are expected
to be the same)
Type of paired sample
2. Split pairs
slightly different blind duplicates
(where the results should be slightly
different)
Analysis of paired sample
• The statistical analysis of the
results is the same for both
types of pairs (uniform or split),
but the interpretation is slightly
different.
Data Preparation
• Prior to commencing the statistical
analysis, a number of steps are
undertaken to ensure that the data
collected is accurate and appropriate
for analysis.
Continue
• It is during this checking phase
that gross errors and potential
problems with the data in general may
be identified.
• In some cases the results are then
transformed.
for example
• For microbiological count data the
statistical analysis is usually carried
out on the log 10 of the results,
rather than the raw counts.
• For the HI test the statistical analysis is
usually carried out on the log2 titer, e.g. a
dilution of 1/32 = 2^5 gives a log2 titer of 5.
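The two transformations mentioned above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample counts are hypothetical and do not come from the slides.

```python
import math

def log10_counts(counts):
    """Return log10 of each raw microbiological count (counts must be > 0)."""
    return [math.log10(c) for c in counts]

def log2_titer(reciprocal_dilution):
    """Return the log2 titer for a reciprocal dilution, e.g. 32 for 1/32."""
    return math.log2(reciprocal_dilution)

# Hypothetical raw counts, used only to show the transformation.
print(log10_counts([100, 1000, 25000]))

# HI end point at a dilution of 1/32 = 2^5, so the log2 titer is 5.
print(log2_titer(32))
```

The statistics described in the following slides would then be calculated on the transformed values rather than on the raw counts or dilutions.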
Quantitative Data
Qualitative Data
The Normal Distribution
Robust statistics
• Robust statistics are based on the
assumption that the data are a sample
from an essentially normal distribution
contaminated with heavy tails and a
small proportion of outliers.
Robust statistics
• Robust statistics are insensitive to the
presence of outliers and heavy tails, which
avoids undue influence from poor results.
This is why the median or a robust mean is
valuable.
Robust statistics
• The median, however, is more robust
when the frequency distribution is
strongly skewed.
Summary statistics
• No. of results
• Median
• Normalised IQR
• Robust CV
• Minimum
• Maximum
• Range
summary statistics
• The no. of results is simply the
total number of results received
for a particular test/sample, and
is denoted by N.
Example
• In AI Type A testing by real-time PCR, the
participating laboratories reported the following
Ct values: 36.97, 35.88, 36.04, 35.46,
35.74, 36.22, 36.64, 38.20, 37.20, 38.30
• N =
summary statistics
• The median is the middle value of
the group, i.e. half of the results
are higher than it and half are
lower. If N is an odd number the
median is the single central value,
i.e. X[(N+1)/2].
Continue
• If N is even, the median is the average
of the two central values, i.e. (X[N/2]
+X[(N/2)+1])/2.
• For example if N is 9 the median is
the 5th sorted value and if N is 10 the
median is the average of the 5th and 6th
values.
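The odd/even rule above can be checked with a short Python sketch. The Ct values are those of the AI Type A real-time PCR example used in these slides; the function name is an illustrative choice.

```python
def median(values):
    """Median following the rule on the slide (positions are 1-based there)."""
    x = sorted(values)
    n = len(x)
    mid = n // 2
    if n % 2 == 1:
        return x[mid]                 # X[(N+1)/2] for odd N
    return (x[mid - 1] + x[mid]) / 2  # average of X[N/2] and X[N/2 + 1]

ct = [36.97, 35.88, 36.04, 35.46, 35.74, 36.22, 36.64, 38.20, 37.20, 38.30]
print("N =", len(ct))
print("median =", round(median(ct), 2))
```

With N = 10 the median is the average of the 5th and 6th sorted values (36.22 and 36.64), which is about 36.43.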
Example
• In AI Type A testing by real-time PCR, the
participating laboratories reported the following
Ct values: 36.97, 35.88, 36.04, 35.46,
35.74, 36.22, 36.64, 38.20, 37.20, 38.30
• Median =
summary statistics
• The normalised IQR is a measure of
the variability of the results.
• It is equal to the interquartile
range (IQR) multiplied by a factor
(0.7413), which makes it
comparable to a standard
deviation.
Example
• In AI Type A testing by real-time PCR, the
participating laboratories reported the following
Ct values: 36.97, 35.88, 36.04, 35.46,
35.74, 36.22, 36.64, 38.20, 37.20, 38.30
• Interquartile range =
• Normalised IQR =
Continue
• In most cases Q1 and Q3 are obtained
by interpolating between the data
values.
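• The IQR = Q3 – Q1, where Q3 and Q1 are the
values at the following positions in the sorted results:
• position of Q3 = 3*(N+1)/4
• position of Q1 = (N+1)/4
• For the example (N = 10) the positions are 8.25 and
2.75; taking the nearest sorted values (the 8th and 3rd):
The IQR = (Q3 – Q1) = (37.2 - 35.88) = 1.32
The normalised IQR = IQR × 0.7413 = 1.32 × 0.7413 ≈ 0.98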
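The quartile positions and the 0.7413 factor can be sketched in Python as follows. The slides say that Q1 and Q3 are usually interpolated, while the worked example uses 37.2 and 35.88, which are the 8th and 3rd sorted values. The sketch therefore supports both readings; the `interpolate` flag and the function names are hypothetical.

```python
def quartile_value(sorted_x, pos, interpolate=True):
    """Return the value at 1-based position `pos` in an ascending list."""
    n = len(sorted_x)
    if pos <= 1:
        return sorted_x[0]
    if pos >= n:
        return sorted_x[-1]
    if not interpolate:
        return sorted_x[int(pos + 0.5) - 1]  # nearest position, halves round up
    lower = int(pos)    # whole part of the position
    frac = pos - lower  # fractional part, used to interpolate
    return sorted_x[lower - 1] + frac * (sorted_x[lower] - sorted_x[lower - 1])

def normalised_iqr(values, interpolate=True):
    """Normalised IQR = 0.7413 * (Q3 - Q1), positions (N+1)/4 and 3(N+1)/4."""
    x = sorted(values)
    n = len(x)
    q1 = quartile_value(x, (n + 1) / 4, interpolate)
    q3 = quartile_value(x, 3 * (n + 1) / 4, interpolate)
    return 0.7413 * (q3 - q1)

ct = [36.97, 35.88, 36.04, 35.46, 35.74, 36.22, 36.64, 38.20, 37.20, 38.30]
print(round(normalised_iqr(ct, interpolate=False), 2))  # nearest positions: about 0.98
print(round(normalised_iqr(ct, interpolate=True), 2))   # interpolated: about 1.19
```

The two conventions give noticeably different values on a data set this small, so a scheme should state which one it uses.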
summary statistics
• The robust CV is a coefficient of
variation (which allows for the
variability in different samples/tests to
be compared) and is equal to the
normalised IQR divided by the median,
expressed as a percentage - i.e. robust
CV = 100 × normalised IQR ÷ median.
Example
• In AI Type A testing by real-time PCR, the
participating laboratories reported the following
Ct values: 36.97, 35.88, 36.04, 35.46,
35.74, 36.22, 36.64, 38.20, 37.20, 38.30
• Robust CV =
summary statistics
• The minimum is the lowest value (i.e.
X[1]).
• The maximum is the highest value
(X[N]).
• The range is the difference between
them (X[N]–X[1]).
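As a rough illustration, the seven summary statistics can be collected in one Python function. This sketch uses `statistics.quantiles` with the "exclusive" method, which interpolates at positions (N+1)/4 and 3(N+1)/4, so its normalised IQR is the interpolated value rather than one based on the nearest sorted positions. The function name and dictionary keys are hypothetical.

```python
import statistics

def robust_summary(values):
    """Return the summary statistics listed on the slides as a dictionary."""
    x = sorted(values)
    med = statistics.median(x)
    # Quartiles interpolated at positions (N+1)/4 and 3(N+1)/4.
    q1, _, q3 = statistics.quantiles(x, n=4, method="exclusive")
    norm_iqr = 0.7413 * (q3 - q1)
    return {
        "N": len(x),
        "median": med,
        "normalised IQR": norm_iqr,
        "robust CV (%)": 100 * norm_iqr / med,  # assumes a non-zero median
        "minimum": x[0],
        "maximum": x[-1],
        "range": x[-1] - x[0],
    }

ct = [36.97, 35.88, 36.04, 35.46, 35.74, 36.22, 36.64, 38.20, 37.20, 38.30]
for name, value in robust_summary(ct).items():
    print(f"{name}: {value:.3f}")
```

For the example Ct values this gives a minimum of 35.46, a maximum of 38.30 and a range of about 2.84.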
Example
• In AI Type A testing by real-time PCR, the
participating laboratories reported the following
Ct values: 36.97, 35.88, 36.04, 35.46,
35.74, 36.22, 36.64, 38.20, 37.20, 38.30
• Minimum =
• Maximum =
• Range =
Robust Z-scores and Outliers
• Z-scores are based on robust summary
statistics (the median and normalised
IQR).
• Where pairs of results have been
obtained, two z-scores are calculated:
a between-laboratories z-score and
a within-laboratory z-score.
Robust Z-scores
• Suppose the pair of results are from
two samples called A and B.
• The median and normalised IQR of all
the sample A results are denoted by
median(A) and normIQR(A),
respectively (similarly for sample B).
Uniform pair sample
Z-scores
• The z-score is a measure of the
deviation of the result from the
assigned value.
• Z score= (X-Xbar)/SD
• X= result reported by participant
• Xbar= assigned value
• SD= standard deviation for proficiency
assessment
Robust Z-scores
A simple robust z-score (denoted by Z) for
a laboratory’s sample A result would then
be:
Z = {A-median(A)}/normIQR(A)
The standardised sum (denoted by S)
and standardised difference (D)
• The standardised sum (denoted by S)
and standardised difference (D) for the
pair of results are:
Between-laboratories z-score
• The between-laboratories z-score
(denoted by ZB) is then calculated as
the robust z-score for S
Within-laboratory z-score
• The within-laboratory z-score (ZW) is
the robust z-score for D
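The formulas for S and D did not survive in this transcript. The Python sketch below assumes the definitions commonly used in proficiency testing schemes of this kind: S = (A + B)/√2 and D = (A − B)/√2, with D taken in the direction of the sample medians (B − A if median(A) < median(B)). Treat these definitions, the function names and the six pairs of results as assumptions for illustration only.

```python
import math
import statistics

def robust_z(values):
    """Robust z-score of each value: (x - median) / normalised IQR."""
    med = statistics.median(values)
    q1, _, q3 = statistics.quantiles(values, n=4, method="exclusive")
    norm_iqr = 0.7413 * (q3 - q1)
    return [(v - med) / norm_iqr for v in values]

def paired_z_scores(a, b):
    """Return (ZB, ZW) for paired results a[i], b[i] from each laboratory."""
    # Assumed convention: the difference follows the direction of the medians.
    sign = 1 if statistics.median(a) >= statistics.median(b) else -1
    s = [(x + y) / math.sqrt(2) for x, y in zip(a, b)]         # standardised sum
    d = [sign * (x - y) / math.sqrt(2) for x, y in zip(a, b)]  # standardised difference
    return robust_z(s), robust_z(d)

# Hypothetical results for six laboratories; laboratory 6 is high on both samples.
sample_a = [36.0, 36.2, 35.8, 36.1, 35.9, 39.0]
sample_b = [36.1, 36.1, 35.9, 36.0, 36.0, 39.1]
zb, zw = paired_z_scores(sample_a, sample_b)
for lab, (b_score, w_score) in enumerate(zip(zb, zw), start=1):
    print(f"lab {lab}: ZB = {b_score:+.2f}, ZW = {w_score:+.2f}")
```

In this made-up data set, laboratory 6 has a large positive ZB, because both results are high, but an unremarkable ZW, because its two results agree with each other.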
Outlier
• An outlier is defined as any
result/pair of results with an
absolute z-score greater than or
equal to three, i.e. Z ≥ 3.0 or
Z ≤ -3.0.
• Outliers are identified in the table by a
marker (§) beside the z-score.
Outlier
• This outlier criterion, | Z | ≥ 3.0, has
a confidence level of about 99%
(related to the normal distribution).
• Similarly, a z-score cut-off of 2.0 has a
confidence level of approximately 95%, so
laboratories with a z-score in this region
(i.e. 2.0 < | Z | < 3.0) should review their results.
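The two cut-offs above can be expressed as a small Python helper. The thresholds come from the slides; the labels "questionable" and "satisfactory" are an assumption borrowed from common proficiency testing usage, and the function name is hypothetical.

```python
def classify_z(z):
    """Classify a z-score using the 2.0 and 3.0 cut-offs."""
    magnitude = abs(z)
    if magnitude >= 3.0:
        return "outlier"       # | Z | >= 3.0
    if magnitude > 2.0:
        return "questionable"  # 2.0 < | Z | < 3.0 (assumed label)
    return "satisfactory"      # | Z | <= 2.0 (assumed label)

for z in (0.4, -2.5, 3.0, -3.7):
    print(z, classify_z(z))
```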
Outlier of ZB uniform and split pairs
• A positive between-laboratories outlier
(i.e. ZB ≥ 3.0) indicates that both
results for that pair are too high.
• A negative between-laboratories
outlier (i.e. ZB ≤-3.0) indicates that the
results are too low.
Outlier of ZW uniform and split pairs
• For uniform pairs, where the results
are on identical samples, a within-laboratory outlier of either sign (i.e.
| ZW| ≥ 3.0) indicates that the
difference between the results is too
large.
Continue
• For split pairs, where the analyte is
at different levels in the two
samples, a positive within-laboratory
outlier (i.e. ZW ≥ 3.0) indicates that
the difference between the two results
is too large.
Continue
• For split pairs, a negative within-laboratory outlier (i.e. ZW ≤ -3.0)
indicates that the difference is too
small or in the ‘opposite direction’ to
the medians.
Z-score of single results
• For a single result on one sample (X),
a simple robust z-score is calculated
as
• Z = {X-median(X)}/normIQR(X)
Continue
• The sign of the z-score indicates
whether the result is too high (positive
z-score) or too low (negative z-score).
• But whether this is due to
between-laboratories or within-laboratory
variation, or both, is unknown.
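A minimal Python sketch of the single-result case, applied to the Ct values from the real-time PCR example in these slides. It assumes interpolated quartiles (`statistics.quantiles` with the "exclusive" method, positions (N+1)/4 and 3(N+1)/4); the function names are hypothetical.

```python
import statistics

def robust_z_scores(results):
    """Z = {X - median(X)} / normIQR(X) for every result X."""
    med = statistics.median(results)
    q1, _, q3 = statistics.quantiles(results, n=4, method="exclusive")
    norm_iqr = 0.7413 * (q3 - q1)
    return [(x - med) / norm_iqr for x in results]

def direction(z):
    """Describe the sign of a z-score relative to the median."""
    if z > 0:
        return "high"
    if z < 0:
        return "low"
    return "equal to median"

ct = [36.97, 35.88, 36.04, 35.46, 35.74, 36.22, 36.64, 38.20, 37.20, 38.30]
for x, z in zip(ct, robust_z_scores(ct)):
    print(f"{x:.2f}  Z = {z:+.2f}  ({direction(z)})")
```

Under these assumptions none of the ten results reaches | Z | ≥ 3.0, so no outlier would be flagged in this data set.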
Graphical Displays
• The z-score bar-chart
• The Youden diagram
• These displays are very useful to participants,
especially those with outliers,
because they can see how their results
differ from those submitted by other
laboratories.
The z-score bar-chart
• The advantages of these charts are that
each laboratory is identified and the
outliers are clearly indicated.
• They are not graphs of the actual
results.
The Youden diagram
• The advantages of these diagrams
are that they are plots of the
actual data.
• So the laboratories with results
outside the ellipse can see how their
results differ from the others.
The Youden diagram
As a guide to the interpretation of the
Youden diagrams:
• laboratories with significant systematic
error components (i.e. between-laboratories variation) will be outside the
ellipse in either the upper right hand
quadrant (as formed by the median lines)
or the lower left hand quadrant, i.e.
inordinately high or low results for both
samples;
As a guide to the interpretation of
the Youden diagrams:
• laboratories with random error
components (i.e. within-laboratory
variation) significantly greater than other
participants will be outside the ellipse
and (usually) in either the upper left or
lower right quadrants, i.e. an inordinately
high result for one sample and low for the
other.
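The quadrant reading described above can be sketched in Python. The sketch assumes that sample A is plotted on the horizontal axis and sample B on the vertical axis, and it only reports the quadrant formed by the median lines. It does not compute the ellipse, so the hint is meaningful only for points already known to lie outside it. All names and the sample values are hypothetical.

```python
def youden_quadrant(a, b, median_a, median_b):
    """Return the quadrant of the point (a, b) relative to the median lines."""
    high_a = a > median_a
    high_b = b > median_b
    if high_a and high_b:
        return "upper right"
    if not high_a and not high_b:
        return "lower left"
    if high_b:
        return "upper left"   # low for sample A, high for sample B
    return "lower right"      # high for sample A, low for sample B

def error_hint(quadrant):
    """Map a quadrant to the kind of error it usually suggests."""
    if quadrant in ("upper right", "lower left"):
        return "systematic error (between-laboratories variation)"
    return "random error (within-laboratory variation)"

# Hypothetical point: both results well above the medians.
q = youden_quadrant(39.0, 39.1, median_a=36.05, median_b=36.05)
print(q, "->", error_hint(q))
```

A result that falls exactly on a median line is treated here as not high, which is an arbitrary choice for the sketch.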
The Youden diagram
• It is important to note however that
Youden diagrams are an illustration of
the data only, and are not used to
assess the results (this is done by the z-scores).
Summary table
Between-laboratories z-score (ZB)
Within-laboratory z-score (ZW)
Youden diagram of Type A
• Statistically significant but
not diagnostically significant
Example of AI Type A PT Panel
[Figures: between-laboratories variation (ZB), within-laboratory variation (ZW) and Youden plot for each of five cases]
• Normal
• High level, equal
• Low level, equal
• High level, unequal
• Low level, unequal
THANK YOU
• 1. Paul Selleck (AAHL) for HI - PT testing
• 2. Brian Mehan (AAHL) for PCR - PT testing
• 3. Gemma Carlile (AAHL) for Statistics for PT
panel
• 4. Dr. Vimol Jirathanawat : Director of NIAH
• 5. Dr. Sujira Pachariyanon : Head of Virology
Laboratory