Development and evaluation of a urine protein expert system
Clinical Chemistry 42:8 1214-1222 (1996)

MIROSLAV IVANDIĆ,* WALTER HOFMANN, and WALTER G. GUDER

Based on the quantitative determination of creatinine, total protein, albumin, α1-microglobulin, IgG, α2-macroglobulin, and N-acetyl-β,D-glucosaminidase in urine in combination with a test strip screening, the findings of hematuria, leukocyturia, and proteinuria can be assigned to prerenal, renal, or postrenal causes. Using this graded diagnostic strategy as a knowledge base, we developed a computer-based expert system for urine protein differentiation ("UPES") as a decision-supporting tool. The knowledge base was implemented as a combination of "if/then" rules and two-step bivariate distance classification of marker proteins. The knowledge for this form of pattern recognition was derived from the results for a set of 267 patients with clinically and histologically documented nephropathies. To determine the diagnostic value of UPES, we tested another set of data: results for 129 urine analyses from 94 patients. Using these data, the system reached 98% concordance with the clinical diagnoses for the patients and was superior to the diagnostic interpretations of four human experts. UPES has been successfully integrated into the laboratory routine process, including automated data import.

INDEXING TERMS: knowledge-based system • decision-supporting system • proteinuria • nephropathy • albumin • α1-microglobulin • α2-macroglobulin • kidney diseases • hematuria • leukocyturia

Institut für Klinische Chemie, Städt. Krankenhaus München-Bogenhausen, Englschalkinger Str. 77, D-81925 München, Germany.
*Author for correspondence. Fax +49 89 9270 2113; e-mail [email protected].
Dedicated to H. Keller of Zürich (Switzerland), on the occasion of his 70th birthday.
This paper contains part of the results of the doctoral thesis of M.I.
Received November 7, 1995; accepted April 1, 1996.
Nonstandard abbreviations: UPES, Urine Protein Expert System; β-NAG, N-acetyl-β,D-glucosaminidase; and GFR, glomerular filtration rate.

Continuously changing medical knowledge has resulted in increasing specialization in medicine. Providing optimal medical care requires experts who can keep up with the enormous information flow; however, such experts are not always available. To conserve the knowledge of a specialist and to widely distribute this knowledge, software tools called expert systems or, better, knowledge-based systems have been developed and are being used with increasing frequency. Laboratory medicine, given its high degree of specialization and its use of objective and quantitative findings, seems especially suited to benefit from these computer programs [1, 2].

Here we describe such a decision-supporting system, the Urine Protein Expert System (UPES), developed for the interpretation of urine protein differentiation. As with electrophoretic techniques [3-5], quantitative analysis of urine marker proteins has been successfully applied to detect and differentiate nephropathies [5-7]. The multivariate evaluation of the excretion pattern allows differentiation of prerenal from glomerular, tubular, and postrenal causes of proteinuria and hematuria [8-11]. Knowledge for describing and interpreting complex urine protein patterns has accumulated in recent years, a result of collaboration between nephrologists and clinical chemists. We have tried to implement this knowledge in the form of "if/then" rules in the knowledge base of UPES, a knowledge base that contains facts and strategies drawn from literature as well as from heuristics and empirical guidelines. The rules have been worked out in close collaboration with specialists in the field of urine protein differentiation. Because various nephropathies could not be sufficiently identified by interpretation of excretion patterns when based on rules alone, we have used another method of knowledge representation, geometric distance classification, to extract and apply the knowledge of this multivariate pattern recognition. Using this hybrid model of a knowledge base, UPES is able to process the laboratory results provided and to propose a medical report generated from 36 text elements. Twenty-four of those elements (all the ones used in this paper) are listed in the Appendix.

Materials and Methods

Analytical procedures. Test strip screening was performed with test strips from Behring (Marburg, Germany). Quantitative determinations of total protein, albumin, α1-microglobulin, IgG, α2-macroglobulin (turbidimetrically), N-acetyl-β,D-glucosaminidase (β-NAG) (kinetically and photometrically), and creatinine in urine as well as serum concentrations of creatinine and α1-microglobulin were performed as described elsewhere [12]. The reference values used are from previous publications [7, 9].

Hardware and software. The knowledge-based system for urine protein analysis was developed by using an IBM-compatible PC (80386 CPU) with 1 MB RAM and DOS. A Turbo C Compiler (V. 2.0; Borland, Munich, Germany) and a BGI Printer Toolkit (Ryle Design, Mt. Pleasant, MI) served as programming tools. The statistical software package SAS (V. 6.10; SAS Institute, Cary, NC) was used to perform discriminant analysis.

Geometric distance classification. Geometric distance classification is a method for describing and separating multidimensional pattern classes. Patterns are defined by complete quantitative or qualitative data. Patterns of distinguishable classes form distinct clusters in multidimensional spaces.
In geometric distance classification, groups of geometric figures such as spheroids and ellipsoids are used to represent these clusters (Fig. 1). The distance classifier GEODICLA [13] was developed separately to determine the position and size of such spheroids and ellipsoids automatically. The program selects random members from each class from a training set of typical examples and defines their geometric "region of influence" [14]. This is done by taking their coordinates as the centers of the figures and extending a user-defined minimal radius until reaching either a maximum radius or the "nearest" example of a different class. If an example is picked that is already covered by a figure belonging to the same class, then this example can be classified already and does not need its own region of influence. When every training example is covered, the training is stopped. Reclassifying the training set now always results in a 100% classification rate. The resulting geometric shapes can be adapted manually after this automatic "learning process."

Using the software tool GEODICLA, we have generalized the information contained in the urine protein patterns of two training sets and condensed them into two sets of figures: circles and ellipses. To classify an unknown urine pattern, we compare it with these representative sets in UPES: The geometric distance from this pattern to the centers of each of the circles/ellipses is calculated and compared with the radius of each circle/ellipse. If the pattern lies within a circle/ellipse, the associated class is stored. Comparison of the stored classes leads UPES to its diagnostic conclusion.

Training sets. Protein patterns of 503 second morning urines from 267 patients with clinically or histologically diagnosed nephropathies were used to train the distance classifier GEODICLA. The urines were collected from patients of the II. Medical Department of the Hospital München-Harlaching and of the III.
Medical Department and the Department of Neurosurgery of the Hospital München-Bogenhausen. Depending on their clinical diagnoses, patients were assigned to the following diagnostic groups:

primary glomerulopathy: different forms of glomerulonephritis, histologically documented diagnoses
secondary glomerulopathy: diabetic nephropathies, clinical diagnoses
interstitial nephropathy: e.g., acute tubulo-toxic nephropathy, chronic interstitial nephropathy, partly histologically documented diagnoses
renal dysfunction: protein excretion patterns ranging from normal values to as much as twice the upper reference limit from patients from any of these three diagnostic groups

Diagnoses that were not histologically documented were based on clinical criteria (e.g., anamnesis, clinical examination, laboratory results, medical imaging, clinical course) and made by the physician treating the patient. Table 1 summarizes the composition of the training set.

Validation set. To evaluate the diagnostic interpretation of urine protein patterns, we used data from 129 urine analyses. These test data were collected from 94 patients of the II. Medical Department of the Hospital München-Harlaching and the III. Medical Department of the Hospital München-Bogenhausen. As in the training set, the urines were assigned to the diagnostic groups primary glomerulopathy, secondary glomerulopathy, and interstitial nephropathy, according to their diagnoses (Table 1).

Discriminant analysis. To compare the diagnostic performance of the distance classifier with the performance of a statistical method, we performed classificatory linear and quadratic discriminant analysis. We used the training set to compute the parameters (coefficients and constants) of the linear and quadratic functions. Equal prior probabilities were assumed for all four diagnostic groups. The same validation data were used to evaluate the results of discriminant analysis as were used with geometric distance classification (Table 1).
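The region-of-influence training described under "Geometric distance classification" can be illustrated with a minimal Python sketch for the two-dimensional (circle) case. This is not GEODICLA itself, whose implementation is not given here: the function names, the parameters `r_min` and `r_max`, the strict "inside" test, and the choice to keep `r_min` when another class lies closer than that are all assumptions made for illustration.

```python
import math
import random

def train_regions(examples, r_min, r_max, seed=0):
    """Build circular regions of influence from labelled 2-D examples.

    examples: list of ((x, y), label) pairs.
    Returns a list of (center, radius, label) tuples.
    """
    rng = random.Random(seed)
    order = list(examples)
    rng.shuffle(order)  # pick training members in random order
    regions = []
    for point, label in order:
        # An example already covered by a same-class figure needs no region.
        if any(lab == label and math.dist(point, c) < r
               for c, r, lab in regions):
            continue
        # Grow from r_min until r_max or the nearest other-class example.
        nearest_other = min(
            (math.dist(point, q) for q, lab in examples if lab != label),
            default=math.inf,
        )
        radius = max(r_min, min(r_max, nearest_other))  # assumption: never below r_min
        regions.append((point, radius, label))
    return regions

def classify(point, regions):
    """Return the set of classes whose figures contain the point."""
    return {lab for c, r, lab in regions if math.dist(point, c) < r}

# Hypothetical training data in logarithmic coordinates.
examples = [((1.0, 1.0), "A"), ((1.2, 1.1), "A"),
            ((3.0, 3.0), "B"), ((3.1, 2.9), "B")]
regions = train_regions(examples, r_min=0.1, r_max=1.0)
```

When no other-class example lies closer than `r_min`, every training example falls only inside figures of its own class, which is the property behind the 100% reclassification rate mentioned in the text. A pattern outside all figures returns an empty set.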
Fig. 1. Use of circles to describe clusters of two different classes: (A) individual examples of two different classes forming two distinct clusters; (B) an optimal characterization of the clusters by using six circles (GEODICLA; see text); (C) the resulting six representatives of the two classes.

Table 1. Composition of the training and the validation collective.

                            Training collective          Validation collective
                            Urine samples   Patients     Urine samples   Patients
Diagnostic groups           n      %        n     %      n      %        n     %
Primary glomerulopathy      285    57       117   44     46     36       27    29
Secondary glomerulopathy    123    24       97    36     76     59       62    66
Interstitial nephropathy    66     13       33    12     7      5        5     5
Renal dysfunction           29     6        20    7      -      -        -     -
Total                       503             267          129             94

Results

DATA INPUT

For interpretation of a urine protein pattern, UPES requires at least the data for urine creatinine, total protein, albumin, and α1-microglobulin. For differential diagnosis during the decision process (see next section), the system asks for data on IgG, α2-macroglobulin, and β-NAG if necessary. The program refers all quantitative measurements to the urine creatinine content to take into account the concentration of the urine sample [7]. These quantitative data are processed together with the results of the urine test strips for assessing leukocytes (granulocyte esterase), hemoglobin (pseudoperoxidase), protein, and glucose. As an option, the glomerular filtration rate (GFR) can be considered by providing the data for serum creatinine and serum α1-microglobulin. Apart from the results of the serum and urine analysis, additional data concerning the patient and the request of the urine protein differentiation can be entered into UPES by using an input screen or can be imported automatically by retrieving a file.

KNOWLEDGE BASE

The knowledge base of UPES is divided into five modules (Plausibility and consistency check, Hematuria, Leukocyturia, Proteinuria, and GFR), which are considered if necessary. The implemented strategy is represented as if/then rules. The geometric distance classification is used only in the Proteinuria module to interpret the marker protein patterns.

Plausibility and consistency check. All data are checked for plausibility during the input or import process; formats and thresholds are used to exclude values that exceed medical and analytical ranges. For analytical validation, this module considers the values for total protein, albumin, test strip protein, and the two serum measurements (creatinine and α1-microglobulin). A warning appears on the screen ("Discrepancy between test strip, albumin, and total protein!") if the comparison of the test strip result and the quantitative measurements fulfills one of the following conditions:

protein test strip positive and total protein <200 mg/L
protein test strip negative and albumin >300 mg/L
albumin > total protein and albumin >50 mg/L

These rules take into account that the detection limit of the test strip is approximately 300 mg/L albumin and thus detect false-positive and false-negative test strip results. If the value for urine protein excretion is normal and one of the serum values indicates a decreased GFR, the user is asked to check the input data.

Medical assessment of GFR. The GFR module is considered only if the concentrations of the optional serum analytes creatinine and α1-microglobulin are provided. α1-Microglobulin partially fills the diagnostic gap associated with creatinine, by sometimes detecting a decrease of GFR earlier than creatinine does [15-17]. A major restriction of the GFR is unlikely if both serum analytes are within their reference ranges (text element 1; see Appendix). The GFR is assumed to be decreased if concentrations of both analytes are increased (text element 2). In combination with a normal urine excretion pattern, this is interpreted as a loss of functioning nephrons that is completely compensated by the remaining nephrons (text element 3). An increase of only α1-microglobulin in serum indicates a possible restriction in glomerular clearance (diagnostic gap of creatinine). In this case, determination of creatinine clearance is recommended to confirm or to exclude this suspicion (text element 4). If only creatinine is increased, this more likely indicates the presence of pseudocreatinines or increased muscle mass (text element 5), given the greater diagnostic sensitivity of α1-microglobulin.

Medical assessment of hematuria. Whenever the test strip result for blood is positive, the Hematuria module is considered, to distinguish prerenal from glomerular, tubular, and postrenal causes. Prerenal causes of the test strip result are assumed if the criteria for prerenal proteinuria are met (i.e., a "protein gap"; see text element 6) [18]. If albumin excretion is <100 mg/L, differentiation of renal and postrenal hematuria by urine protein analysis is not possible [9]. In such cases, UPES suggests using phase-contrast microscopy to look for dysmorphic erythrocytes [19, 20] (text element 7).
At higher albumin concentrations, the system considers the ratios of albumin with α2-macroglobulin, IgG, and α1-microglobulin to assign the hematuria to a renal (glomerular or tubulo-interstitial) or postrenal bleeding [9, 10]. If α2-macroglobulin and IgG results have not yet been provided, UPES asks for their manual input. Because of their molecular size, only small amounts of α2-macroglobulin (250 kDa) and IgG (125 kDa) usually pass the glomerular filter, and those are reabsorbed in the tubule. When albumin ratios with these proteins in urine are similar to those in plasma, therefore, a postrenal lesion is indicated (α2-macroglobulin/albumin >0.02 and IgG/albumin >0.2). In this case, the system proposes that the clinician repeat the urine protein differentiation to exclude additional renal hematuria after postrenal hematuria has ceased (text element 8). In renal hematuria (α2-macroglobulin/albumin <0.02), glomerular and tubulo-interstitial causes can be distinguished by the concentrations of IgG: In tubular hematuria, even small amounts of filtered IgG cannot be reabsorbed (IgG/albumin >0.2). Increased excretion of the tubular marker α1-microglobulin is taken as additional confirmation of the tubulo-interstitial lesion (text element 9).

Medical assessment of leukocyturia. The Leukocyturia module is considered whenever the leukocyte esterase test strip shows a positive result. An isolated leukocyturia in combination with a normal urine protein pattern indicates either a contamination of the urine sample or an inflammation of the lower urinary tract (text element 10). Leukocyturia with a slight glomerular proteinuria (total protein <150 mg/g creatinine, albumin <100 mg/g creatinine, α1-microglobulin <14 mg/g creatinine) can have both renal and postrenal causes (text element 11), whereas substantial glomerular involvement or tubular proteinuria indicates renal involvement in an inflammatory process (text element 12).

Medical assessment of proteinuria. In contrast to the previous two modules, the Proteinuria module is used in all cases to interpret the various urine protein ratios. Active renal disease can be excluded if test strip results are negative and the concentrations of urine total protein, albumin, and α1-microglobulin are within their reference ranges (text element 13). Normal excretion of both marker proteins but increased IgG in urine may indicate (e.g.) monoclonal gammopathies (text element 14). If total protein excretion is >300 mg/L and the sum of albumin, α1-microglobulin, and IgG is <30% of the total protein excretion, prerenal causes such as Bence Jones proteinuria might account for this disproportion [18]. Immunofixation is suggested for further confirmation. This finding initiates a temporary report (text element 15), and the decision process is stopped.

Renal proteinuria can be quantitatively and qualitatively described and assigned to different kinds of nephropathies by analysis of the excretion pattern of albumin, α1-microglobulin, and IgG. Using albumin as a glomerular marker and α1-microglobulin as a tubular marker, UPES describes the extent of glomerular and tubular proteinuria as borderline, slight, significant, distinct, and nephrotic, according to the thresholds given in Table 2. The IgG/albumin ratio helps to distinguish "selective" (<0.03) from "nonselective" (>0.03) proteinuria in glomerulopathies with albuminuria >500 mg/g creatinine. An example of a description of a renal proteinuria is text element 16 (albumin 1100 mg/g creatinine, α1-microglobulin 25 mg/g creatinine, IgG 15 mg/g creatinine).

Table 2. Description of tubular and glomerular proteinuria according to the excretion of the marker proteins albumin and α1-microglobulin.

Description    Albumin (mg/g creatinine)    α1-Microglobulin (mg/g creatinine)
Borderline     20-30                        14-20
Slight         30-100                       20-50
Significant    100-1000                     50-100
Distinct       1000-3000                    >100
Nephrotic      >3000                        -

Apart from this quantitative and qualitative description, a renal proteinuria can also be assigned to different diagnostic classes of renal diseases. The training sets of patients with clinically or histologically documented diagnoses show that tubulo-interstitial nephropathies and primary and secondary glomerulopathies are each characterized by a specific urine protein pattern. The clusters of these disease groups can be defined and separated in logarithmic coordinates, with the marker proteins albumin and α1-microglobulin making up the x- and y-axes, respectively (Fig. 2, top). A renal proteinuria with albumin <40 mg/g creatinine and α1-microglobulin <28 mg/g creatinine cannot be clearly assigned to only one of the three disease classes because of the overlapping zones of the clusters. Such a slight proteinuria can be interpreted as "renal dysfunction," which can have renal and extrarenal causes: e.g., metabolic disorder, fever, intense physical exercise (see text element 17). In the overlapping zone between primary and secondary glomerulopathies as well as interstitial nephropathies, further diagnostic information might be achieved by taking IgG into consideration (Fig. 2, bottom). An excretion pattern from an unknown patient can be assigned to any of the diagnostic groups by comparing it with the position of the different clusters in both coordinates.

To implement this visual classification in the knowledge-based system UPES, we used the geometric distance classifier GEODICLA [13]. After five fictitious patterns had been added to the training samples to detect implausible marker constellations (Fig. 2, top), and the learning and abstracting process of GEODICLA had been performed, the information contained in these 508 single urine protein patterns of the training sets was specifically condensed into some representative examples: The clusters of the different classes were now described with 60 circles (albumin-α1-microglobulin patterns) and 15 ellipses (albumin-IgG patterns).
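The if/then rules of the Hematuria module described above can be read as a short decision cascade. The Python sketch below is illustrative only: the function name, argument names, and return strings are hypothetical, the thresholds are the ones quoted in the text, and the final branch covers ratio combinations that the text does not address.

```python
def assess_hematuria(albumin, a2_macroglobulin, igg, protein_gap=False):
    """Assign a positive blood test strip to a likely source.

    albumin, a2_macroglobulin and igg are urine concentrations in mg/L.
    protein_gap is True when the criteria for prerenal proteinuria are met.
    """
    if protein_gap:
        return "prerenal"
    if albumin < 100:
        # Protein analysis cannot separate renal from postrenal bleeding here.
        return "undetermined: phase-contrast microscopy suggested"
    a2m_ratio = a2_macroglobulin / albumin
    igg_ratio = igg / albumin
    if a2m_ratio > 0.02 and igg_ratio > 0.2:
        # Ratios resemble plasma, pointing to bleeding below the kidney.
        return "postrenal"
    if a2m_ratio < 0.02:
        if igg_ratio > 0.2:
            return "renal, tubulo-interstitial"
        return "renal, glomerular"
    # Assumption: combinations outside the quoted thresholds stay unassigned.
    return "not covered by the quoted thresholds"
```

For example, `assess_hematuria(1000, 50, 300)` falls in the postrenal branch because both ratios exceed their thresholds. In UPES the tubulo-interstitial result is additionally checked against α1-microglobulin excretion, which this sketch leaves out.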
For diagnostic interpretation of urine findings of an unknown patient, UPES calculates the logarithm of the patient's concentrations of albumin and α1-microglobulin. The geometric distance of this pattern to the centers of each of the 60 circles is calculated and compared with their radii. If the pattern lies within a circle, the corresponding class is stored. According to the classes stored after this first classification, the system identifies the excretion pattern as belonging with one of the following diagnostic groups: renal dysfunction (text element 17), primary glomerulopathy (text element 18), secondary glomerulopathy (element 18), primary or secondary glomerulopathy (text element 19), tubulo-interstitial nephropathy (element 18), glomerulopathy with interstitial involvement or interstitial nephropathy with secondary glomerulopathy (text element 20), or implausible marker protein pattern.

Fig. 2. Albumin-α1-microglobulin patterns (top) and albumin-IgG patterns (bottom) of the diagnostic classes of the training collective: primary and secondary glomerulopathies, tubulo-interstitial nephropathies, the dysfunctional collective, and the plausibility collective. Logarithmic axes: albumin (mg/g creatinine) on the x-axis; α1-microglobulin (top) or IgG (bottom) on the y-axis.

Only if this first classification based on the α1-microglobulin/albumin ratio reveals an ambiguous diagnosis (elements 19 and 20) does UPES consider IgG in a second step: One of a pair of diagnoses in an ambiguous diagnosis is more probable if the albumin/IgG pattern of the patient is covered by ellipses of only one class (unambiguous classification; text element 21).
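The two-step classification just described can be sketched in Python as follows. This is a hedged illustration, not the UPES code: the data layout, the function names, the use of base-10 logarithms, axis-parallel ellipses, and the rule that the second step only counts when its single class was already a first-step candidate are all assumptions, and the circles and ellipses are invented values rather than the 60 circles and 15 ellipses of the knowledge base.

```python
import math

def in_circle(p, circle):
    (cx, cy), r = circle
    return math.hypot(p[0] - cx, p[1] - cy) < r

def in_ellipse(p, ellipse):
    # Axis-parallel ellipse with semi-axes a and b (assumption).
    (cx, cy), (a, b) = ellipse
    return ((p[0] - cx) / a) ** 2 + ((p[1] - cy) / b) ** 2 < 1.0

def two_step(albumin, a1m, igg, circles, ellipses):
    """Return the set of classes for one pattern (values in mg/g creatinine)."""
    # Step 1: albumin/alpha1-microglobulin pattern against the circles.
    p1 = (math.log10(albumin), math.log10(a1m))
    step1 = {label for label, c in circles if in_circle(p1, c)}
    if len(step1) <= 1:
        return step1
    # Step 2: only for ambiguous results, albumin/IgG pattern against ellipses.
    p2 = (math.log10(albumin), math.log10(igg))
    covered = {label for label, e in ellipses if in_ellipse(p2, e)}
    if len(covered) == 1 and covered <= step1:
        return covered
    return step1

# Hypothetical representatives in log10 coordinates.
circles = [("primary GP", ((3.0, 1.0), 0.5)),
           ("secondary GP", ((3.2, 1.2), 0.5)),
           ("TIN", ((1.5, 2.0), 0.4))]
ellipses = [("primary GP", ((3.0, 2.0), (0.4, 0.3))),
            ("secondary GP", ((3.0, 1.0), (0.4, 0.3)))]
```

With these invented figures, a pattern that falls inside both glomerulopathy circles is resolved by the IgG ellipses only when exactly one class covers it; otherwise the ambiguous first-step result is kept.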
This two-step pattern identification in UPES reflects that the diagnostic discriminatory power of IgG is less than that of α1-microglobulin.

Depending on the result of the two-step classification, additional rules are considered. If the diagnostic pattern classification reveals a glomerulopathy, the system takes into account the possibility that the tubular component of a proteinuria might result from tubular overload caused by an excessive glomerular proteinuria. In nephrotic proteinuria (albumin >3 g/g creatinine), therefore, the extent of the tubular share is corrected by using the following equation, derived from urine findings in selected patients with glomerulonephritis whose renal interstitial space was devoid of major histopathological findings [21]:

α1-microglobulin (corr.) = [equation not recoverable from the extracted text; the legible terms are the constant 4.7 and an exponential that appears to read e^(0.00022 · albumin)]

This equation approximately describes the lower margin of the cluster of primary glomerulopathies (▲) in Fig. 2 and allows estimation of the amount of tubulo-interstitial involvement in glomerular diseases: Tubular proteinuria is assumed to result from tubular overload if the corrected value of α1-microglobulin is <14 mg/g creatinine (text element 22). α1-Microglobulin concentrations >14 mg/g creatinine are interpreted as showing an involvement of the renal interstitial space in glomerulonephritis (text element 23).

To differentiate acute from chronic tubular disorders in interstitial nephropathies, UPES requests data for the catalytic concentration of the tubular enzyme β-NAG. In acute lesions (e.g., caused by nephrotoxic antibiotics), the excretion of β-NAG usually exceeds 20 U/L if α1-microglobulin excretion is >40 mg/g creatinine (text element 24) [12]. Chronic tubulo-interstitial diseases are described by increased α1-microglobulin excretion without a major increase of β-NAG.

OUTPUT (FINAL REPORT)

UPES composes the final report from the selected text items after the urine and serum protein findings have been medically assessed.
EVALUATING THE KNOWLEDGE BASE WITH THE VALIDATION SET

To compare the medical interpretation of proteinuria by UPES, statistical methods, and human expertise, we assessed the results of urine protein differentiation of the validation set (129 urines from 94 patients) as classified by UPES, linear and quadratic discriminant functions, and four experts in our laboratory who were familiar with this method of urine analysis. The results of these evaluations are given in Table 3; misclassifications are summarized as "others." Because there are no gold standards for evaluating urinary protein patterns, it was difficult to define correct and false interpretations. Patients with a documented diabetic nephropathy, for example, showed many different patterns of protein excretion. The patterns were described by human experts and by the system as reflecting glomerular and (or) tubular dysfunction, secondary glomerulopathy, primary or secondary glomerulopathy, or mixed (glomerular and tubular) nephropathy, and all of these diagnostic groups were assumed to be a correct interpretation. Only the description "primary glomerulopathy" would be judged a clear misclassification of these patients.

Table 3. Diagnostic interpretation of urine protein differentiations of 46 primary glomerulopathies, 76 secondary glomerulopathies, and 7 interstitial nephropathies by UPES, linear and quadratic discriminant functions, and four experts.

Clinical diagnosis: primary GP (correct(a): Prim.GP, GP, GP/TP, Dys)
Expertise   Prim.GP   GP   GP/TP   Dys   Others
UPES        9         31   1       3     2(b)
Expert 1    2         41   1       2     0
Expert 2    19        18   6       3     0
Expert 3    0         38   5       3     0
Expert 4    22        8    9       5     2
LDF         22        12   1       5     6
QDF         22        12   0       2     10

Clinical diagnosis: secondary GP (correct(a): Sec.GP, GP, GP/TP, Dys)
Expertise   Sec.GP   GP   GP/TP   Dys   Others
UPES        0        46   1       27    2(c)
Expert 1    0        43   0       32    1
Expert 2    0        32   0       34    10
Expert 3    11       30   1       32    2
Expert 4    5        17   9       39    6
LDF         11       15   0       33    17
QDF         13       11   0       35    17

Clinical diagnosis: TP (correct(a): TP, GP/TP, Dys)
Expertise   TP   GP/TP   Dys   Others
UPES        6    0       1     0
Expert 1    5    0       1     1
Expert 2    5    0       0     2
Expert 3    6    0       1     1
Expert 4    5    2       0     0
LDF         7    0       0     0
QDF         5    2       0     0

(a) Correct diagnoses are listed as (prim./sec.) GP = (primary/secondary) glomerulopathy, TP = tubulo-interstitial nephropathy, or Dys = renal dysfunction, whereas patterns interpreted as implausible constellations and urines not classified or misclassified are summarized as Others.
(b) 2 patterns not classified by UPES.
(c) 2 patterns classified by UPES as a primary glomerulopathy.
LDF, linear discriminant function; QDF, quadratic discriminant function.

UPES. Of 46 urines from patients with glomerulonephritis, UPES identified 9 primary glomerulopathies (20%) by first-step classification. The correct but more global diagnosis "primary or secondary glomerulopathy" was chosen in the majority of cases (31 of 46 urines, 67%) because of the overlapping zones of the albumin/α1-microglobulin patterns. Using the IgG excretion in a second-stage pattern classification correctly assigned 6 of these 31 ambiguous cases to the primary glomerulopathy group. Two patients with glomerulonephritis and albuminuria >10 g/g creatinine could not be interpreted by UPES.

Only 2 of 76 urines (3%) with secondary glomerulopathies were misclassified as primary glomerulopathies by UPES. Both of these urines showed substantial albuminuria (844 and 552 mg/g creatinine) and IgG excretion (63 and 59 mg/g creatinine) but no significant tubular proteinuria. Again, UPES assigned most (46, or 61%) of the 76 urines to the diagnostic group "primary or secondary glomerulopathy." The excretion ratio for IgG/albumin misled the system in 3 of these 46 decisions to favor the primary type of glomerulopathy. The remaining 27 urines (36%) were classified as "renal dysfunction" because of the low quantities of marker proteins excreted. Finally, UPES interpreted the urine patterns of all 7 interstitial nephropathies correctly.

Discriminant functions. As an alternative classification method, we used the discriminant functions estimated from the albumin, α1-microglobulin, and IgG patterns of the training set (no implausible constellations were included). Each protein pattern was classified to the diagnostic group having the highest group probability, as computed with linear and quadratic discriminant functions. Resubstitution of the training set resulted in a reclassification rate of 75% by linear discriminant functions and 79% by quadratic discriminant functions. To allow consideration of an ambiguous classification, as in UPES, we took into account the difference between the two highest group probabilities. If this difference was <0.3, the pattern was assigned to both classes (ambiguous classification). Linear discriminant functions described 38 samples (29%) of the validation set as caused by "renal dysfunction"; 68 patterns (53%) were interpreted correctly as belonging to other diagnostic groups matching the known diagnosis. By quadratic discriminant functions, 37 cases (29%) were classified as "renal dysfunction," whereas other, correct diagnostic classes were chosen for 65 samples (50%). In total, there were 23 (18%) vs 27 (21%) misclassifications by linear and quadratic discriminant functions, respectively (Table 3).

Human experts. The quality of the human expertise varied greatly, depending on the experience of each expert with urine protein differentiation. Generally, the humans interpreted more proteinurias as being "renal dysfunction" than did UPES. Two experts more often decided on an unambiguous diagnosis, at the risk of increasing their misclassifications; the other two experts preferred the more general diagnosis "primary or secondary glomerulopathy," to be on the safe side. Notably, one expert relied on a positive glucose test strip result to classify a glomerulopathy as the secondary type. In contrast to UPES, he and two other experts failed to identify the primary glomerulopathy in a 32-year-old woman with IgA nephropathy and familial glucosuria.
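The ambiguity rule applied to the discriminant functions above (assign a pattern to both classes when the two highest group probabilities differ by less than 0.3) can be sketched in Python. The function name, the dictionary input, and the example probabilities are hypothetical; in the study the group probabilities came from SAS discriminant analysis, which is not reproduced here.

```python
def assign_classes(group_probabilities, margin=0.3):
    """Return one class, or the two top classes if they are too close.

    group_probabilities: dict mapping class name to group probability.
    margin: minimum difference between the two highest probabilities
            for an unambiguous classification (0.3 in the text).
    """
    ranked = sorted(group_probabilities.items(),
                    key=lambda item: item[1], reverse=True)
    (best, p_best), (second, p_second) = ranked[0], ranked[1]
    if p_best - p_second < margin:
        return (best, second)  # ambiguous classification
    return (best,)

# Hypothetical probabilities for one urine pattern.
example = {"primary GP": 0.50, "secondary GP": 0.40,
           "TP": 0.05, "Dys": 0.05}
```

Here `assign_classes(example)` yields both glomerulopathy classes, which corresponds to the "primary or secondary glomerulopathy" column (GP) of Table 3.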
Discussion

Evaluation with the validation data set showed that noninvasive urine protein differentiation may be a useful diagnostic strategy in nephrology. The knowledge-based system UPES performed well in diagnostic interpretation of urine protein patterns, correctly distinguishing all interstitial nephropathies from glomerulopathies. It misclassified only 2 of 129 urines (2%), incorrectly concluding that patterns of significant glomerular proteinuria indicated a primary glomerulopathy. Discriminant functions were not able to deal properly with the overlapping zones of all clinical classes. The four human experts also had problems correctly classifying primary and secondary glomerulopathies, which are difficult to distinguish by clinical chemistry means.

After the evaluation, we adjusted the knowledge base of UPES to improve the medical assessment. We added one circle to the secondary glomerulopathy class so that this diagnostic group would be considered in cases of significant glomerular proteinuria. Another circle was also added to the primary glomerulopathy class to ascertain the identification of cases of excessive proteinuria. The addition of these two circles will help prevent misclassification in similar cases.

Knowledge-based systems, as means of rationalization, accelerate the time-consuming process of medical assessment and increase the economic efficiency of a clinical laboratory. Such programs make possible consistent and standardized medical assessment of constant and high quality, especially when dealing with the highly complex data produced in increasingly specialized areas [22-24]. Apart from learning effects, transparent data interpretation rather than simple "data intoxication" [25] may provide clinical physicians with useful additional information [26].

The knowledge-based system we designed provides for the first time a concise decision-supporting system to exclude and differentiate proteinuria, hematuria, and leukocyturia.
Working with the complex excretion pattern of different marker proteins, UPES can distinguish prerenal, glomerular, tubulo-interstitial, and postrenal causes of pathological urine findings. By using two different methods for knowledge representation, we essentially implemented the strategy and experience of a specialist in urine protein differentiation as a knowledge base.

Modelling the framework of the knowledge base with if/then rules makes it possible to integrate the heuristics that guide a human expert in the diagnostic decision process. Rules allow the design of a modular knowledge base that maintains a clear structure and facilitates regular updates. Furthermore, the user can easily retrace the decisions formulated by the system. Diagnostic pattern classification in urine protein differentiation can be implemented in a rule base by using constant thresholds to describe the different clusters by squares, like a mosaic. However, good resolution for sufficient representation of the clusters is obtained only by using a large number of thresholds. Thus, the quality of classification is limited by the number of rules needed to compare the pattern of the patient with the margins of all clusters. Although this is easily done in a two-dimensional pattern recognition, more dimensions increase the number and complexity of rules exponentially.

Because rules, therefore, did not appear to be the optimal solution, we looked for alternative ways of knowledge representation. Classificatory discriminant analysis [27], for example, can designate and separate the different diagnostic groups in a statistical way, but several assumptions are necessary that are not always met (e.g., multivariate normal distribution of data). Moreover, a larger set of examples of all classes is necessary to find discriminating functions that are generally valid, and every change in this collective (e.g., adding a new patient not yet correctly classified) requires a complete recalculation.
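The exponential growth of a threshold mosaic can be illustrated with a small sketch. It assumes, purely for illustration, that every marker axis is cut by the same set of constant thresholds and that each resulting cell needs its own rule; the function names and threshold values are hypothetical and are not taken from UPES.

```python
from bisect import bisect_right
from math import prod

def cells_in_mosaic(thresholds_per_axis):
    """Number of rectangular cells (one rule each) in a threshold grid."""
    return prod(len(t) + 1 for t in thresholds_per_axis)

def cell_of(pattern, thresholds_per_axis):
    """Return the grid cell (tuple of interval indices) that contains a pattern."""
    return tuple(bisect_right(t, x) for x, t in zip(pattern, thresholds_per_axis))

# Hypothetical thresholds (arbitrary units), reused on every marker axis.
axis = [10.0, 20.0, 50.0, 100.0]

# Four thresholds give five intervals per axis, so the cell count is 5 ** dims.
for dims in (2, 3, 4, 6):
    print(dims, "markers ->", cells_in_mosaic([axis] * dims), "cells")

# Locating one two-marker pattern in the mosaic.
print(cell_of((15.0, 120.0), [axis, axis]))
```

With only four thresholds per axis, two markers already give 25 cells and six markers give 15625, which is the practical limit the text refers to.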
Nevertheless, we used the training set of urine protein patterns to estimate associated linear and quadratic discriminant functions. Using these functions to classify the validation set, however, revealed major difficulties in dealing with overlapping zones of the diagnostic groups.

Another flexible tool used successfully in laboratory medicine for robust pattern recognition is neural networks [28-30]. These models for knowledge processing and representation are able to deal with complex, uncertain, and even incomplete data. In a self-organizing process they use the information contained in training data to build up and adjust their "knowledge." After this dynamic learning process, the adapted network structure itself incorporates the knowledge base [31]. Quasi-parallel processing of data enables fast classification in neural networks but also makes it difficult for users to influence and understand their behavior. The successful training of these "black boxes" depends on their architecture (i.e., number of neurons and layers). The lack of general rules for construction means that finding the right configuration of a neural network requires much empirical testing.

In designing UPES, we chose another way to simulate the diagnostic identification of marker patterns as an important part of the expert's considerations. Geometric distance classification allows the system to recognize and separate quantitative multidimensional patterns [14]. Implemented in the flexible software tool GEODICLA [13], this classification method can describe and separate complex clusters in terms of spheroids and ellipsoids with straight and oblique axes. Information contained in a training set is specifically integrated and generalized in a rapid automated learning process. In contrast to most neural networks and statistical classification methods, the resulting representatives always guarantee a 100% reclassification rate when classifying the training collective.
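The idea of classifying a pattern by geometric representatives can be sketched as follows. This is a deliberately simplified stand-in and not the GEODICLA algorithm: it assumes two-dimensional patterns, axis-parallel ellipses only, and a hypothetical learning rule that adds one circle around every training pattern that is not yet classified correctly. All names, labels, and coordinates are illustrative assumptions.

```python
from dataclasses import dataclass
from math import dist, sqrt

@dataclass
class Ellipse:
    center: tuple   # position in marker space
    radii: tuple    # semi-axes per dimension (equal radii give a circle)
    label: str      # diagnostic class represented by this figure

def normalized_distance(pattern, e):
    """Distance from the centre in units of the semi-axes (<= 1 means inside)."""
    return sqrt(sum(((p - c) / r) ** 2
                    for p, c, r in zip(pattern, e.center, e.radii)))

def classify(pattern, representatives):
    """Label of the closest covering figure, or None if no figure covers the pattern."""
    scored = [(normalized_distance(pattern, e), e.label) for e in representatives]
    covering = [(d, label) for d, label in scored if d <= 1.0]
    return min(covering)[1] if covering else None

def learn(training, default_radius=1.0):
    """Hypothetical learning pass: add a circle for each pattern not yet correct.

    The radius is half the distance to the nearest training pattern of another
    class, so a new circle never covers a differently labelled training pattern.
    Assumes patterns with different labels never coincide.
    """
    reps = []
    for pattern, label in training:
        if classify(pattern, reps) == label:
            continue
        others = [dist(pattern, q) for q, other in training if other != label]
        r = 0.5 * min(others) if others else default_radius
        reps.append(Ellipse(pattern, (r, r), label))
    return reps

# Hypothetical training patterns (two marker values each, arbitrary units).
training = [((1.0, 1.0), "class A"), ((1.2, 0.9), "class A"),
            ((3.0, 3.0), "class B"), ((3.1, 2.8), "class B")]
reps = learn(training)
print(len(reps), "representatives")
print(classify((1.1, 1.0), reps), classify((10.0, 10.0), reps))
```

In this sketch every training pattern is reclassified correctly by construction, and a pattern outside all figures stays unclassified. Appending a single further `Ellipse` to `reps` corresponds to the kind of local extension of a geometric knowledge base that the text describes.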
Multiple features such as mathematical preprocessing, several learning modes, and different ways of distance calculation help influence the self-organizing process and optimize its results. The geometric figures and their parameters can easily be adjusted and extended. A simple local modification of the geometric knowledge base, e.g., if a pattern is not yet correctly classified or is not classified at all, adds new knowledge to a classification system. Updates of the knowledge base are thus facilitated. Geometric distance classification enables UPES to make robust and nonparametric pattern recognitions. Further, the diagnostic classification can be elucidated by showing the circles/ellipses on the screen together with a symbol representing the patient's pattern. Thus, a user does not have to accept the UPES interpretation of the marker proteins as if it were a Greek oracle.

The quality of a decision-supporting system for daily routine assessment such as UPES depends on easy and comfortable use of the system as well as on the integrated knowledge being highly accurate and sufficiently extensive. Because widespread use of the system depends on its acceptance by users, considerations of comfort and safety have played a major role in its development. The complete integration of the knowledge-based system in the computer network structure of our laboratory, as well as the automated data import and export, minimizes errors during data transfer and contributes greatly to the comfortable and problem-free use of UPES in daily routine. Use of the programming language C keeps the medical evaluation of the data fast: UPES takes 2 s to compose the reports from a file containing data for 30 patients (the estimated average number of daily requests), using a Model 486 PC (33 MHz). In practice, >90% of the reports created by UPES are not modified. In the remaining cases, additional clinical information (e.g., known renal transplant) is considered.
Apart from these practical aspects, the credibility and reliability of decisions made by implemented knowledge are essential conditions for the widespread use of an expert system, especially in medical fields. Evaluation with the validation set confirmed that interpretation of urine protein differentiation is a complex and difficult task sufficiently solved by UPES. Moreover, the evaluation results provided evidence that even experts can learn from a continually growing knowledge base of an expert system. Given that gold standards have yet to be defined for many of the observed protein patterns (e.g., "dysfunction"), future prospective studies may help improve the predictive qualities of the system. Consideration of additional clinical information, implementation of other urine results (e.g., microscopy), and extension to previous urine protein patterns are currently under development.

We conclude that urine protein differentiation in its present form is superior to traditional urine analysis as a mirror of renal function [32] and is a valuable addition to the morphological information provided by histopathology and medical imaging. Use of the decision-supporting system UPES for medical assessment of urine protein differentiation provides a standard of high and constant quality. A graduated and transparent decision process is implemented in a hybrid knowledge base that uses both production rules and geometric distance classification as complementary methods of knowledge representation. In the hands of a responsible physician, UPES can be a useful tool for increasing the efficiency and quality of a laboratory.

References

1. Spackman KA, Connelly DP. Knowledge-based systems in laboratory medicine and pathology. Arch Pathol Lab Med 1987;111:116-9.
2. Winkel P. The application of expert systems in the clinical laboratory. Clin Chem 1989;35:1595-600.
3. Pesce AJ, Boreisha I, Pollak VE. Rapid differentiation of glomerular and tubular proteinuria by sodium dodecyl sulfate polyacrylamide gel electrophoresis. Clin Chim Acta 1972;40:27-34.
4. Boesken WH, Kopf K, Schollmeyer P. Differentiation of proteinuric diseases by disc electrophoretic molecular weight analysis of urinary proteins. Clin Nephrol 1973;1:311-6.
5. Peterson PA, Evrin PE, Berggård I. Differentiation of glomerular, tubular and normal proteinuria: determinations of urinary excretion of β2-microglobulin, albumin and total protein. J Clin Invest 1969;48:1189-98.
6. Cameron JS, Blandford G. The simple assessment of selectivity in high proteinuria. Lancet 1966;i:242.
7. Hofmann W, Guder WG. A diagnostic programme for quantitative analysis of proteinuria. J Clin Chem Clin Biochem 1989;27:589-600.
8. Hofmann W, Rossmüller B, Guder WG, Edel HH. A new strategy for characterizing proteinuria and haematuria from a single pattern of defined proteins in urine. Eur J Clin Chem Clin Biochem 1992;30:707-12.
9. Hofmann W, Schmidt D, Guder WG, Edel H. Differentiation of hematuria by quantitative determination of urinary marker proteins. Klin Wochenschr 1991;69:68-75.
10. Guder WG, Hofmann W. Differentiation of proteinuria and haematuria by single protein analysis in urine. Clin Biochem 1993;26:277-82.
11. Hofmann W, Sedlmeir-Hofmann C, Ivandić M, Schmidt D, Guder WG, Edel H. Assessment of urinary protein patterns on the basis of clinically characterized patients. Typical examples with reports. Lab Med 1993;17:502-12.
12. Schmidt D, Hofmann W, Guder WG. Adaptation of the diagnostic strategy of urine protein differentiation to the Hitachi 911 analyzer. Lab Med 1995;19:153-61.
13. Ivandić M. Entwicklung und Evaluierung eines wissensbasierten Befundungssystems zur Urineiweißdifferenzierung [Dissertation]. München: Ludwig-Maximilians-Universität, 1995.
14. Barschdorff D, Bothe A. Signal classification using a new self-organising and fast converging neural network. Noise Vibration 1991;9:11-9.
15. Itoh Y, Enomoto H, Takagi K, Kawai T. Clinical usefulness of serum α1-microglobulin as a sensitive indicator for renal insufficiency. Nephron 1983;33:69-70.
16. Weber MH, Verwiebe R. α1-Microglobulin (protein HC): features of a promising indicator of proximal tubular dysfunction. Eur J Clin Chem Clin Biochem 1992;30:683-91.
17. Jung M, Jung K. Low-molecular-mass proteins in serum as markers of glomerular filtration rate: cystatin C, α1-microglobulin and β2-microglobulin. Lab Med 1994;18:461-5.
18. Boege F, Koehler B, Liebermann F. Identification and quantification of Bence-Jones proteinuria by automated nephelometric screening. J Clin Chem Clin Biochem 1990;28:37-42.
19. Birch DF, Fairley KF. Haematuria: glomerular or non-glomerular? Lancet 1979;ii:845-6.
20. Köhler H, Wandel E, Brunck B. Acanthocyturia: a characteristic marker for glomerular bleeding. Kidney Int 1991;40:115-20.
21. Hofmann W. A mathematical equation to discriminate overload proteinuria from tubulo-interstitial involvement in glomerular diseases. Clin Nephrol 1995;44:28-31.
22. Keller H, Trendelenburg C, eds. Clinical biochemistry data presentation and interpretation. Berlin: de Gruyter, 1989.
23. Bepperling C, Hehrmann R, Haas H, Hotz G, Olbricht T, Schmidt R, et al. A knowledge-based system for the interpretation of thyroid hormone measurements: evaluation and optimisation of the system Pro.M.D.-SD in five clinical laboratories. Lab Med 1994;18:564-71.
24. Trendelenburg C, Pohl B. Pro.M.D. expert system and its application in laboratory medicine. Ann Biol Clin 1993;51:226-7.
25. O'Moore RR. Decision support based on laboratory data. Methods Inform Med 1988;27:187-90.
26. Winkel P. Interpreting the results is the expertise of the laboratory. Clin Chim Acta 1994;224:S9-S51.
27. Solberg HE. Discriminant analysis. Crit Rev Clin Lab Sci 1978;9:209-42.
28. Ivandić BT, Kratzer MAA, Fateh-Moghadam A. Analysis of serum electrophoresis pattern by artificial neural networks. Lab Med 1992;16:128-33.
29. Furlong JW, Dupuy ME, Heinsimer JA. Neural network analysis of serial cardiac enzyme data. A clinical application of artificial machine intelligence. Am J Clin Pathol 1991;96:134-41.
30. Reibnegger G, Weiss G, Wachter H. Self-organizing neural networks as a means of cluster analysis in clinical chemistry. Eur J Clin Chem Clin Biochem 1993;31:311-7.
31. McClelland JL, Rumelhart DE. Explorations in parallel distributed processing. Cambridge, MA: MIT Press, 1989.
32. Hofmann W, Regenbogen C, Edel H, Guder WG. Diagnostic strategies in urinalysis. Kidney Int 1994;46(47, Suppl):111S-4S.

Appendix: Text Elements Selected by UPES to Compose a Laboratory Report

1. Based on the serum findings, a major decrease of glomerular filtration rate is not likely.
2. The glomerular filtration rate is reduced.
3. An isolated increase of both serum values in combination with normal urine protein excretion might reflect a loss of functioning nephrons. The protein reabsorption is fully compensated by the remaining nephrons. An active renal disease is unlikely.
4. To exclude the possibility of a reduced GFR, glomerular clearance should be investigated.
5. The isolated increase of serum creatinine might indicate so-called pseudocreatinines. Alternatively, increased muscle mass or a meat diet may be involved.
6. Discrepancy between the sum of albumin, IgG, and α1-microglobulin and the concentration of total protein (single proteins/total protein <0.3) in combination with a positive test-strip result for blood indicates a prerenal hematuria. Definite report follows after additional tests to exclude myoglobinuria or hemoglobinuria.
7. Differentiation of renal and postrenal hematuria by protein analysis is impossible at albumin concentrations <100 mg/L. Phase-contrast microscopy of a fresh morning urine may allow the differentiation of renal and postrenal causes of hematuria (acanthocytes?).
8. Most likely, postrenal hematuria is present. Because additional renal excretion of proteins cannot be excluded, a control after disappearance of hematuria is suggested.
9. Most likely, renal (glomerular/tubulo-interstitial) hematuria is present. A slight additional postrenal source of erythrocytes cannot be excluded.
10. The detection of leukocyte esterase may indicate a possible postrenal inflammation or contamination of the urine with leukocytes.
11. The urine protein differentiation should be repeated after leukocyturia has stopped, because inflammations in the lower urinary tract can also cause a slight proteinuria.
12. The detection of leukocyte esterase may indicate inflammation with renal involvement, if there was no leukocyte contamination during sampling.
13. Analyses of the marker proteins in the urine do not indicate any dysfunction of glomerular protein filtration and tubular reabsorption. No signs of hematuria or granulocyturia are present.
14. An isolated increase of IgG may indicate, e.g., monoclonal gammopathies.
15. Discrepancy between the sum of albumin, IgG, and α1-microglobulin and the concentration of total protein (single proteins/total protein <0.3) indicates a prerenal proteinuria. Immunofixation will be performed to exclude Bence Jones proteinuria. The definitive report will follow after this investigation.
16. A distinct selective glomerular proteinuria with simultaneous slight tubular proteinuria is found.
17. The findings are consistent with impaired glomerular permeability or a tubulo-interstitial dysfunction (or both). A slight increase of the marker proteins does not necessarily indicate a renal disease. If clinical cues are missing, a control measurement made under standardized conditions (no intense physical stress before investigation, optimal metabolic and hypertonic equilibrium of diabetic and hypertonic patients) is recommended.
18. The findings are consistent with a tubulo-interstitial nephropathy/primary glomerulopathy/secondary glomerulopathy.
19. The findings are consistent with a primary or secondary glomerulopathy (e.g., diabetes mellitus, hypertension).
20. The findings are consistent with either (a) a glomerulopathy with impaired tubulo-interstitial reabsorption or (b) an interstitial nephropathy with secondary glomerulopathy.
21. The IgG excretion indicates a primary glomerulopathy/secondary glomerulopathy/tubulo-interstitial nephropathy.
22. The increased excretion of the tubular marker α1-microglobulin is the result of a tubular overload (exhaustion of the tubular reabsorptive capacity).
23. The extent of interstitial fibrosis correlates with the excretion of the tubular marker protein α1-microglobulin.
24. The increased excretion of the tubular enzyme β-NAG indicates a possible acute disorder of proximal tubular cells.
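As an illustration of how if/then rules might select such text elements, the following sketch encodes only the two thresholds stated in elements 6, 7, and 15 (a single proteins/total protein ratio <0.3 and an albumin concentration <100 mg/L). The function names, the order in which the conditions are tested, and the abbreviated texts are assumptions made for this example; they are not the actual UPES rule base.

```python
# Abbreviated wording of three appendix elements (full texts are in the Appendix).
TEXT_ELEMENTS = {
    6: "Single proteins/total protein <0.3 with blood on the test strip: "
       "prerenal hematuria.",
    7: "Albumin <100 mg/L: renal and postrenal hematuria cannot be "
       "differentiated by protein analysis.",
    15: "Single proteins/total protein <0.3: prerenal proteinuria.",
}

def select_elements(albumin, igg, a1m, total_protein, blood_positive):
    """Return the numbers of the text elements triggered by one urine result.

    All concentrations are in mg/L. The branching order is an assumption.
    """
    if total_protein <= 0:
        raise ValueError("total protein must be positive")
    ratio = (albumin + igg + a1m) / total_protein
    selected = []
    if ratio < 0.3:
        # Most of the total protein is not explained by the marker proteins.
        selected.append(6 if blood_positive else 15)
    elif blood_positive and albumin < 100:
        selected.append(7)
    return selected

def compose_report(numbers):
    """Join the selected text elements into one report string."""
    return " ".join(TEXT_ELEMENTS[n] for n in numbers)

# Hypothetical result: marker proteins sum to 30 mg/L of 200 mg/L total protein.
print(compose_report(select_elements(20, 5, 5, 200, blood_positive=True)))
```

Keeping the texts in a table separate from the selection logic mirrors the modular structure that the paper attributes to rule-based knowledge bases.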