Basic Statistics for Proficiency testing
Basic Statistics for Laboratory Proficiency Testing
PRAKIT BOONPORNPRASERT, DVM, M.A.
Virology Laboratory, National Institute of Animal Health

OUTLINE
• 1) Types of PTA proficiency testing programme
• 2) Composition of the proficiency panel and statistical design
• 3) Data preparation
• 4) Summary statistics and examples
• 5) Robust z-scores and outliers
• 6) Graphical displays
• 7) Robust z-score interpretation
• 8) Youden diagram interpretation

Types of PTA Proficiency Testing Programme
1. Testing interlaboratory comparisons
2. Calibration interlaboratory comparisons

Testing Interlaboratory Comparisons
• Testing interlaboratory comparisons involve concurrent testing of samples by two or more laboratories and calculation of consensus values from all participants' results.

Calibration Interlaboratory Comparisons
• In calibration interlaboratory comparisons, one test item is distributed sequentially among two or more participating laboratories, and each laboratory's results are compared to reference values.

Assigned Value
• Consensus value: an assigned value obtained from the results submitted by participants (e.g. for most testing programmes the median is used as the assigned value).
• Reference value: an assigned value provided by a reference laboratory.

Consensus Value
• The advantages of participant consensus include low cost, because the assigned value does not require additional analytical work.
• No one member or group is accorded higher status.
• Calculation of the value is usually straightforward.

Composition of the Proficiency Panel: Statistical Design
• Negative sample
• Strong positive sample
• Weak positive sample

Consensus Value
• The principal disadvantages of participant consensus values are, first, that they are not independent of the participant results.
• Second, if the majority of results are biased, participants whose results are unbiased may unfairly receive extreme z-scores.

Composition of the Proficiency Panel: Statistical Design
• Between-laboratories variation
• Within-laboratory variation
• Participants must perform the same test more than once (e.g. twice, or on paired samples).

Example of AI Type A PT Panel (figure)

Types of Paired Sample
1. Uniform pairs: identical blind duplicates (where the results are expected to be the same).
2. Split pairs: slightly different blind duplicates (where the results should be slightly different).

Analysis of Paired Samples
• The statistical analysis of the results is the same for both types of pairs (uniform or split), but the interpretation is slightly different.

Data Preparation
• Prior to commencing the statistical analysis, a number of steps are undertaken to ensure that the data collected are accurate and appropriate for analysis.
• It is during this checking phase that gross errors and potential problems with the data in general may be identified.
• In some cases the results are then transformed. For example:
• For microbiological count data, the statistical analysis is usually carried out on the log10 of the results rather than the raw counts.
• For the HI test, the statistical analysis is usually carried out on the log2 titre, e.g. a dilution of 1/32 = 1/2^5 gives a log titre of 5.

Quantitative Data (figure)
Qualitative Data (figure)
The Normal Distribution (figure)

Robust Statistics
• Robust statistics are based on the assumption that the data are a sample from an essentially normal distribution contaminated with heavy tails and a small proportion of outliers.
• They are insensitive to the presence of outliers and heavy tails, which avoids undue influence from poor results; this is why the median or a robust mean is valuable.
• The median, however, is more robust when the frequency distribution is strongly skewed.

Summary Statistics
• No. of results
• Median
• Normalised IQR
• Robust CV
• Minimum
• Maximum
• Range

Summary Statistics
• The no. of results is simply the total number of results received for a particular test/sample, and is denoted by N.

Example
• Testing for AI Type A by real-time PCR gave the following Ct values from the participating laboratories: 36.97, 35.88, 36.04, 35.46, 35.74, 36.22, 36.64, 38.20, 37.20, 38.30
• N =

Summary Statistics
• The median is the middle value of the group, i.e. half of the results are higher than it and half are lower.
• If N is odd, the median is the single central value, i.e. X[(N+1)/2].
• If N is even, the median is the average of the two central values, i.e. (X[N/2] + X[(N/2)+1])/2.
• For example, if N is 9 the median is the 5th sorted value, and if N is 10 the median is the average of the 5th and 6th sorted values.

Example
• For the same Ct values: 36.97, 35.88, 36.04, 35.46, 35.74, 36.22, 36.64, 38.20, 37.20, 38.30
• Median =

Summary Statistics
• The normalised IQR is a measure of the variability of the results.
• It is equal to the interquartile range (IQR) multiplied by a factor (0.7413), which makes it comparable to a standard deviation.

Example
• For the same Ct values: 36.97, 35.88, 36.04, 35.46, 35.74, 36.22, 36.64, 38.20, 37.20, 38.30
• Interquartile range =
• Normalised IQR =
• In most cases Q1 and Q3 are obtained by interpolating between the data values.
• The IQR = Q3 - Q1, where
• Q3 is the value at position 3 × (N+1)/4 in the sorted results
• Q1 is the value at position (N+1)/4 in the sorted results
• For the Ct example, taking the nearest sorted values (the 8th and the 3rd): IQR = Q3 - Q1 = 37.20 - 35.88 = 1.32
• The normalised IQR = IQR × 0.7413

Summary Statistics
• The robust CV is a coefficient of variation (which allows the variability in different samples/tests to be compared). It is equal to the normalised IQR divided by the median, expressed as a percentage, i.e. robust CV = 100 × normalised IQR ÷ median.

Example
• For the same Ct values: 36.97, 35.88, 36.04, 35.46, 35.74, 36.22, 36.64, 38.20, 37.20, 38.30
• Robust CV =

Summary Statistics
• The minimum is the lowest value (i.e. X[1]).
• The maximum is the highest value (X[N]).
• The range is the difference between them (X[N] - X[1]).

Example
• For the same Ct values: 36.97, 35.88, 36.04, 35.46, 35.74, 36.22, 36.64, 38.20, 37.20, 38.30
• Minimum =
• Maximum =
• Range =

Robust Z-scores and Outliers
• Z-scores are based on robust summary statistics (the median and normalised IQR).
• Where pairs of results have been obtained, two z-scores are calculated: a between-laboratories z-score and a within-laboratory z-score.

Robust Z-scores
• Suppose the pair of results are from two samples called A and B.
• The median and normalised IQR of all the sample A results are denoted by median(A) and normIQR(A), respectively (similarly for sample B).

Uniform Pair Sample Z-scores
• The z-score is a measure of the deviation of the result from the assigned value.
• Z-score = (X - Xbar)/SD, where
• X = result reported by the participant
• Xbar = assigned value
• SD = standard deviation for proficiency assessment

Robust Z-scores
• A simple robust z-score (denoted by Z) for a laboratory's sample A result would then be: Z = {A - median(A)}/normIQR(A)

The Standardised Sum (S) and Standardised Difference (D)
• The standardised sum (denoted by S) and the standardised difference (D) are calculated from each laboratory's pair of results.

Between-laboratories Z-score
• The between-laboratories z-score (denoted by ZB) is then calculated as the robust z-score for S.

Within-laboratory Z-score
• The within-laboratory z-score (ZW) is the robust z-score for D.

Outlier
• An outlier is defined as any result/pair of results with an absolute z-score greater than or equal to three, i.e. Z ≥ 3.0 or Z ≤ -3.0.
• Outliers are identified in the table by a marker (§) beside the z-score.
• This outlier criterion, |Z| ≥ 3.0, has a confidence level of about 99% (related to the normal distribution).
• A z-score in the region 2.0 < |Z| < 3.0 lies outside the limits that correspond to a confidence level of approximately 95%.

Outlier of ZB: Uniform and Split Pairs
• A positive between-laboratories outlier (i.e. ZB ≥ 3.0) indicates that both results for that pair are too high.
• A negative between-laboratories outlier (i.e. ZB ≤ -3.0) indicates that both results are too low.

Outlier of ZW: Uniform and Split Pairs
• For uniform pairs, where the results are on identical samples, a within-laboratory outlier of either sign (i.e. |ZW| ≥ 3.0) indicates that the difference between the results is too large.
• For split pairs, where the analyte is at different levels in the two samples, a positive within-laboratory outlier (i.e. ZW ≥ 3.0) indicates that the difference between the two results is too large.
• For split pairs, a negative within-laboratory outlier (i.e. ZW ≤ -3.0) indicates that the difference is too small or in the 'opposite direction' to the medians.
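The paired z-score calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not reproduce the formulas for S and D, so the definitions used here, S = (A + B)/√2 and D = (A - B)/√2 (with the sign of D reversed when median(A) is not greater than median(B)), follow a convention commonly used in PTA-style programmes and are not confirmed by this transcript. The function names and the seven pairs of results are hypothetical.

```python
import math
import statistics

NORM_FACTOR = 0.7413  # factor quoted in the slides for the normalised IQR


def norm_iqr(values):
    """Normalised IQR = 0.7413 * (Q3 - Q1).

    method="exclusive" interpolates at positions (N+1)/4 and 3(N+1)/4,
    which are the quartile positions given in the slides.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="exclusive")
    return NORM_FACTOR * (q3 - q1)


def robust_z(values):
    """Robust z-score of every value: (x - median) / normalised IQR."""
    centre = statistics.median(values)
    spread = norm_iqr(values)
    return [(x - centre) / spread for x in values]


def paired_z_scores(a, b):
    """Return (ZB, ZW) for paired results a[i], b[i] from each laboratory.

    ASSUMPTION: S and D use the 1/sqrt(2) scaling described in the lead-in;
    the slides do not show these formulas.
    """
    sign = 1.0 if statistics.median(a) > statistics.median(b) else -1.0
    s = [(x + y) / math.sqrt(2) for x, y in zip(a, b)]
    d = [sign * (x - y) / math.sqrt(2) for x, y in zip(a, b)]
    return robust_z(s), robust_z(d)


def is_outlier(z):
    """Outlier criterion from the slides: |Z| >= 3.0."""
    return abs(z) >= 3.0


# Hypothetical paired results from seven laboratories (not real PT data).
sample_a = [20.0, 21.0, 22.0, 23.0, 24.0, 25.0, 40.0]
sample_b = [20.5, 20.5, 22.5, 22.5, 24.5, 24.5, 40.0]

zb, zw = paired_z_scores(sample_a, sample_b)
for lab, (b_score, w_score) in enumerate(zip(zb, zw), start=1):
    flag = "§" if is_outlier(b_score) or is_outlier(w_score) else ""
    print(f"Lab {lab}: ZB={b_score:6.2f}  ZW={w_score:6.2f} {flag}")
```

With these made-up values, laboratory 7 is flagged as a between-laboratories outlier (both results too high), while its within-laboratory z-score is unremarkable because its two results agree with each other.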
Z-score of Single Results
• For a single result on one sample (X), a simple robust z-score is calculated as
• Z = {X - median(X)}/normIQR(X)
• The sign of the z-score indicates whether the result is too high (positive z-score) or too low (negative z-score).
• But whether this is due to between-laboratories or within-laboratory variation, or both, is unknown.

Graphical Displays
• The z-score bar-chart
• The Youden diagram
• These are very useful to participants, especially those with outliers, because they can see how their results differ from those submitted by other laboratories.

The Z-score Bar-chart
• The advantages of these charts are that each laboratory is identified and the outliers are clearly indicated.
• They are not graphs of the actual results.

The Youden Diagram
• The advantage of these diagrams is that they are plots of the actual data.
• So the laboratories with results outside the ellipse can see how their results differ from the others.

The Youden Diagram
As a guide to the interpretation of the Youden diagrams:
• Laboratories with significant systematic error components (i.e. between-laboratories variation) will be outside the ellipse in either the upper right-hand quadrant (as formed by the median lines) or the lower left-hand quadrant, i.e. inordinately high or low results for both samples.
• Laboratories with random error components (i.e. within-laboratory variation) significantly greater than other participants will be outside the ellipse and (usually) in either the upper left or lower right quadrants, i.e. an inordinately high result for one sample and low for the other.

The Youden Diagram
• It is important to note, however, that Youden diagrams are an illustration of the data only, and are not used to assess the results (this is done by the z-scores).
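As a worked sketch of the summary statistics and the single-result z-score described earlier, the Ct values from the example slides can be run through Python's standard `statistics` module. The function and variable names are illustrative only. One assumption matters for the numbers: the quartiles here are interpolated at positions (N+1)/4 and 3(N+1)/4, so the IQR differs slightly from the 37.20 - 35.88 shown on the slide, which uses the nearest sorted values.

```python
import statistics

# Ct values from the AI Type A real-time PCR example in the slides.
ct_values = [36.97, 35.88, 36.04, 35.46, 35.74,
             36.22, 36.64, 38.20, 37.20, 38.30]


def summary_statistics(values):
    """Summary statistics listed in the slides, using interpolated quartiles."""
    # method="exclusive" interpolates at positions (N+1)/4 and 3(N+1)/4.
    q1, _, q3 = statistics.quantiles(values, n=4, method="exclusive")
    med = statistics.median(values)
    n_iqr = 0.7413 * (q3 - q1)          # normalised IQR
    return {
        "n": len(values),
        "median": med,
        "norm_iqr": n_iqr,
        "robust_cv": 100 * n_iqr / med,  # percent
        "minimum": min(values),
        "maximum": max(values),
        "range": max(values) - min(values),
    }


def robust_z(x, med, n_iqr):
    """Single-result robust z-score: Z = (X - median) / normIQR."""
    return (x - med) / n_iqr


stats = summary_statistics(ct_values)
z_scores = [robust_z(x, stats["median"], stats["norm_iqr"]) for x in ct_values]
# Outlier criterion from the slides: |Z| >= 3.0.
outliers = [x for x, z in zip(ct_values, z_scores) if abs(z) >= 3.0]

for name, value in stats.items():
    print(f"{name:>10}: {value:.4f}" if isinstance(value, float) else f"{name:>10}: {value}")
print("outliers:", outliers)
```

For these ten Ct values the sketch gives N = 10, a median of 36.43 and a range of 2.84, and no result reaches |Z| ≥ 3.0.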
Summary Table (figures)
• Between-laboratories z-score (ZB)
• Within-laboratory z-score (ZW)
• Youden diagram of Type A
• Statistically significant but not diagnostically significant

Example of AI Type A PT Panel (figures)
Each case is shown as a between-laboratories variation (ZB) chart, a within-laboratory variation (ZW) chart and a Youden plot:
• Normal
• High level, equal
• Low level, equal
• High level, unequal
• Low level, unequal

THANK YOU
• 1. Paul Selleck (AAHL) for HI - PT testing
• 2. Brian Mehan (AAHL) for PCR - PT testing
• 3. Gemma Carlile (AAHL) for Statistics for PT panel
• 4. Dr. Vimol Jirathanawat: Director of NIAH
• 5. Dr. Sujira Pachariyanon: Head of Virology Laboratory