The Impact of Selecting a High Hemoglobin
Transcription
The Impact of Selecting a High Hemoglobin
REVIEW ARTICLE The Impact of Selecting a High Hemoglobin Target Level on Health-Related Quality of Life for Patients With Chronic Kidney Disease A Systematic Review and Meta-analysis Fiona M. Clement, PhD; Scott Klarenbach, MD, MSc; Marcello Tonelli, MD, SM; Jeffrey A. Johnson, PhD; Braden J. Manns, MD, MSc Background: Treatment of anemia in chronic kidney disease (CKD) with erythropoietin-stimulating agents (ESAs) is commonplace. The optimal hemoglobin treatment target has not been established. A clearer understanding of the health-related quality of life (HQOL) impact of hemoglobin target levels is needed. We systematically reviewed the randomized controlled trial (RCT) data on HQOL for patients treated with low to intermediate (9.0-12.0 g/dL) and high hemoglobin target levels (⬎12.0 g/dL) and performed a meta-analysis of all available 36-item short-form (SF-36) RCT data. Methods: We conducted a search to identify all RCTs of ESA therapy in patients with anemia associated with CKD (1966–December 2006). Inclusion criteria were (1) 30 or more participants, (2) anemic adults with CKD, (3) epoetin (alfa and beta) or darbepoetin used, (4) a control arm, and (5) reported HQOL using a validated measure. All available SF-36 data underwent meta-analysis using the weighted mean difference. Results: Of 231 full texts screened, 11 eligible studies were identified. The SF-36 was used in 9 trials. Reporting of these data was generally incomplete. Data from each domain of the SF-36 were summarized. Statistically significant changes were noted in the physical function (weighted mean difference [WMD], 2.9; 95% confidence interval [CI], 1.3 to 4.5), general health (WMD, 2.7; 95% CI, 1.3 to 4.2), social function (WMD, 1.3; 95% CI, −0.8 to 3.4), and mental health (WMD, 0.4; 95% CI, 0.1 to 0.8) domains. None of the changes would be considered clinically significant. Conclusions: Our study suggests that targeting hemoglobin levels in excess of 12.0 g/dL leads to small and not clinically meaningful improvements in HQOL. This, in addition to significant safety concerns, suggests that targeting treatment to hemoglobin levels that are in the range of 9.0 to 12.0 g/dL is preferred. Arch Intern Med. 2009;169(12):1104-1112 A Author Affiliations: Departments of Community Health Sciences and Medicine and Centre for Health and Policy Studies, University of Calgary, Calgary, Alberta, Canada (Drs Clement and Manns); Department of Medicine, Division of Nephrology, University of Alberta, Edmonton (Drs Klarenbach and Tonelli); and Institute for Health Economics, Edmonton, Alberta (Dr Johnson). NEMIA IS A COMMON COMplication of chronic kidney disease (CKD) and is associated with adverse clinical outcomes and poor health-related quality of life (HQOL).1-3 Treatment of anemia before the advent of erythropoietin-stimulating agents (ESAs) relied on routine blood transfusions. Although the main advantage of treatment of anemia in CKD with ESAs once focused on preventing the need for blood transfusions, over time, it has shifted to encompass other clinical considerations. Specifically, although the labeling for ESAs in most countries refers to their ability to improve anemia and reduce the need for transfusions, 4 many studies have addressed the use of ESAs at varying doses and their effect on survival, cardiovascular morbidity, or HQOL.5 (REPRINTED) ARCH INTERN MED/ VOL 169 (NO. 12), JUNE 22, 2009 1104 The use of ESAs has become commonplace in most developed countries, and the debate over their use now principally focuses on the optimal target hemoglobin level.1,2,5 This debate has intensified over the past 15 years, with the publication of several large randomized controlled trials (RCTs) testing the effect of using ESAs to For editorial comment see page 1100 achieve various hemoglobin targets.3,6 Many of these trials compared low to intermediate (hereinafter, low/intermediate) hemoglobin target levels, typically in the 95 to 11.5 g/dL range, with high hemoglobin target levels, typically higher than 13.0 g/dL. Perhaps unexpectedly, many of these RCTs have raised important safety concerns with high hemoglo- WWW.ARCHINTERNMED.COM ©2009 American Medical Association. All rights reserved. Downloaded From: https://jamanetwork.com/ on 09/09/2014 bin target levels, with findings of higher rates of adverse events, including mortality and vascular access thrombosis. 7-11 These concerns have led the US Food and Drug Administration, among other regulatory agencies, to change the labeling for ESAs to target hemoglobin levels lower than 12.0 g/dL.12 (To convert hemoglobin to grams per liter, multiply by 10.0.) Given high baseline mortality rates and the difficulty in finding therapies that improve survival in patients with end-stage renal disease (ESRD),13 and the findings from the Hemodialysis Study14 and Dialysis Clinical Outcomes Revisited Study,15 HQOL improvement has become a focus of treatment of CKD. Broadly, HQOL is defined as the way a patient feels or functions, aspects of which can improve after successful treatment of a condition.16 Minimally, measurement of HQOL includes assessment of functional status, mental health or emotional well-being, social engagement, and symptom states.16 Although there are many different scales that have been used to measure HQOL, the 36item short-form instrument (SF36) is among the most commonly applied instrument to measure HQOL. It comprises 36 questions that assess 8 domains of HQOL: physical function, physical role, pain, general health, vitality, social function, emotional role, and mental health.17 Despite the safety concerns that have been associated with normalization of hemoglobin,3,13 knowing that patients with ESRD who are treated with hemodialysis have very poor baseline QOL has led some to suggest that targeting hemoglobin levels higher than 12.0 g/dL might still be reasonable in some patients with ESRD in an effort to improve QOL.2,5 A clearer understanding of the impact of higher-dose ESA on HQOL will enable rational comparison of putative HQOL benefits with potential safety issues. In this study, we aimed to summarize the available data from RCTs on HQOL for patients treated to different hemoglobin target levels and determine the magnitude of HQOL differences. Given that most studies that have measured HQOL in this area have used the SF-36, we also conducted a meta-analysis of SF-36 data for all RCTs comparing low/ intermediate and high hemoglobin target levels. METHODS SEARCH STRATEGY We conducted a comprehensive and exhaustive search to identify all RCTs of ESA therapy in adults with anemia associated with CKD. No language restrictions were applied. MEDLINE (1966 through December 12, 2006), EMBASE (1988 through December 12, 2006), all evidence-based medicine reviews, and a variety of other gray literature sources (n = 48) were searched using different search terms for ESA combined with CKD search terms (eTable 1, http://www .archinternmed.com, provides search strategy used for MEDLINE; strategy adapted for other search engines). Each citation or abstract was independently screened by a subject specialist and 1 other reviewer. Any trial considered relevant by at least 1 reviewer was retrieved for further review. The reference lists of included trials and relevant reviews were also scanned for pertinent trials. In addition, we contacted ESA manufacturers in Canada and the authors of included studies for information about further studies. STUDY SELECTION The full text of each potentially relevant article was independently assessed by 2 reviewers for inclusion in the review using predetermined eligibility criteria. Studies were eligible for inclusion if they met the following 5 criteria: (1) it was a parallel RCT design with at least 30 participants in each treatment group, (2) the population was limited to anemic adults (age ⱖ 18 years) with CKD (including hemodialysisdependent and nonhemodialysis dependent), (3) epoetin (alfa and beta) or darbepoetin used, (4) the control arm was either a different agent or hemoglobin target, or no active treatment (eg, placebo), and (5) the study reported HQOL using a validated measure defined as one in which its reliability, internal consistency, and responsiveness had been published in the peer-reviewed literature. Because small studies are unlikely to publish long-term outcomes or HQOL, studies with less than 30 participants were excluded to improve the efficiency of the literature search. Initial dis- (REPRINTED) ARCH INTERN MED/ VOL 169 (NO. 12), JUNE 22, 2009 1105 agreements on study selection were resolved through consensus. DATA EXTRACTION From all included studies, data were extracted on trial characteristics (country, design, sample size, duration of follow-up, objective, score according to the criteria of Jadad et al18 [hereinafter, Jadad score]), participants (age, sex, diabetic status, cardiovascular status, previous hematopoietic hormone use), illness severity (hemoglobin levels, renal function, dialytic modality), therapeutic regimens (type, dose, schedule, route of administration, target hemoglobin level), control regimens and cointerventions, and HQOL outcomes. Trial quality was assessed using the Jadad score.18 Given that the RCTs were conducted over an 18-year period, several different HQOL measures were used by the different studies. Moreover, some studies used disease-specific measures, and some used generic HQOL measures. As such, we first report qualitatively the results of our systematic review. STATISTICAL ANALYSIS We planned a priori to combine the results of HQOL measures for trials that compared the use of ESAs targeting either a low/intermediate hemoglobin target level (ie, a hemoglobin level of 9.0-12.0 g/dL) or high hemoglobin target level (ie, a hemoglobin level ⬎12.0 g/dL). The a priori analysis plan was to combine all available HQOL data. The primary outcome was the change from baseline HQOL. Because most HQOL instruments report continuous data thus, a priori, the weighted mean difference (WMD) was to be used for each instrument separately. However, owing to the lack of standardized reporting and the fact that few studies used similar HQOL scales aside from the SF-36, only the SF-36 data were suitable for meta-analysis. For this analysis, each domain of the SF-36 was summarized separately. When combining SF-36 domains for base case analyses, we included information from all studies in which any information for any SF-36 domain was reported. Given that several studies selectively reported information for only 1 domain, we completed 2 sensitivity analyses as follows: (1) we combined data only for studies reporting all domains of the SF-36, and (2) we WWW.ARCHINTERNMED.COM ©2009 American Medical Association. All rights reserved. Downloaded From: https://jamanetwork.com/ on 09/09/2014 from between-study variation.19 Where possible, changes in HQOL for each outcome measure were also compared with the minimal clinically important difference (MCID). For the SF-36, we conservatively considered 5 points to be the MCID.20,21 differences expected between trials, we decided a priori to combine results using a random-effects model. Statistical heterogeneity was quantified using the I2 statistic. The I2 statistic approximates the percentage of total variation (within- and between-study) resulting included all studies in which any SF-36 data were reported; for domains that were either not reported or reported as “P⬎.05,” we assumed no change over time. If there were multiple time points reported per outcome, we included only the last time point. Owing to the RESULTS 2289 Citations identified from electronic search and broad screened INCLUDED STUDIES 2062 Abstracts excluded: 2038 Not relevant 14 Not retrievable 10 No translator available 4 Citations identified from other sources Figure 1 shows the flowchart for trial selection.Fromatotalof2289citations, 231 potentially relevant articles were retrieved for review. Of these, 220 did notmeettheselectioncriteriaandwere excluded. In particular, 23 excluded articles did not report HQOL, and 3 wereexcludedbecausetheydidnotuse a validated instrument.7,9,10 There was substantial initial agreement for study inclusion (=0.86). A total of 11 primary articles met all 5 of the inclusion criteria.3,6,8,11,13,22-27 Of these, 10 compared a low/intermediate target level with a high hemoglobin target level, and 1 compared placebo with a target level of 9.5 to 11.0 g/dL and a target level of 11.5 to 13.0 g/dL.22 Although these target levels overlap our definition of low/intermediate (9.0-12.0 g/dL) and high (⬎12.0 g/dL), we classified the 9.5 to 11.0 g/dL level as low/ intermediate and the 11.5 to 13.0 g/dL level as high. 231 Potentially relevant studies retrieved 194 Studies excluded: 36 Not original human research 51 Not an RCT 12 Crossover 1 Children 3 No CKD 11 No relevant comparison 38 Sample size <30 25 Multiple publications 11 No relevant clinical outcome 6 No usable data reported 26 Eligible RCTs excluded: 23 No HQOL reported 3 No validated instrument used 11 Relevant studies Figure 1. Flowchart of included studies. CKD indicates chronic kidney disease; HQOL, health-related quality of life; RCT, randomized controlled trial. Table 1. Included Study Characteristics High Target Level Group Low to Intermediate Target Level Group RI Follow-up Jadad Score a Target Hb Level, g/dL No. Mean Age, y Male, % Target Hb Level, g/dL No. Mean Age, y Male, % HD EPO vs EPO vs placebo 6 mo 5 11.5-13.0 38 43 26 9.5-11.0 40 44 19 HD HD PD/HD Predialysis Predialysis HD Predialysis Predialysis Predialysis Predialysis EPO vs EPO EPO vs EPO EPO vs EPO EPO vs EPO EPO vs EPO EPO vs EPO EPO vs EPO EPO vs EPO EPO vs EPO EPO vs EPO 29 mo 48 wk 48 wk 24 mo 24 mo 96 wk 16 mo 24 mo 9 mo 15 mo 4 4 3 4 5 5 4 4 4 3 13.0-15.0 13.0-14.0 13.5-16.0 12.0-13.0 12.0-14.0 13.5-14.5 13.5 13.0-15.0 13.0-15.0 13.0-15.0 618 70 129 75 85 293 715 301 195 88 65 62 63 53 57 52 66 59 59 58 50 47 67 51 55 60 44 57 58 51 9.0-10.0 9.5-10.5 9.0-12.0 9.0-10.0 9.0-10.5 9.5-11.5 11.3 10.5-11.5 11.0-12.0 10.5-11.5 615 72 124 80 87 294 717 302 194 82 64 60 63 54 57 49 66 59 58 57 48 44 63 42 52 60 46 51 61 50 Study Population Canadian Erythropoietin Study Group22 b Besarab et al13 c Foley et al23 Furuland et al8 d Roger et al27 Levin et al24 Parfrey et al25 Singh et al3 e Drüeke et al6 Rossert et al11 f Ritz et al26 Abbreviations: EPO, erythropoietin; Hb, hemoglobin; HD, hemodialysis; PD, peritoneal dialysis; RI, randomized intervention. SI conversion factor: To convert hemoglobin to grams per liter, multiply by 10.0. a The scoring system developed by Jadad et al.18 b The Canadian Erythropoietin Study Group22 also included a placebo arm with 40 patients; mean age, 48 years; 25% were male. c Terminated early (intended to follow for 3 years): continuation was felt to be unlikely to show benefit and the higher mortality in the high target level group was not statistically significant compared with the mortality in the low to intermediate target level group. d Although protocol included predialysis CKD patients, quality of life was only measured in patients undergoing hemodialysis. e Terminated early (intended to follow for 3 years): there was only a 5% chance that continuation would show benefit; adverse events and quality of life data were considered. f Terminated early (intended to follow for 3 years); there were safety concerns surrounding the occurrence of pure red cell aplasia. (REPRINTED) ARCH INTERN MED/ VOL 169 (NO. 12), JUNE 22, 2009 1106 WWW.ARCHINTERNMED.COM ©2009 American Medical Association. All rights reserved. Downloaded From: https://jamanetwork.com/ on 09/09/2014 Table 2. Health-Related Quality-of-Life (HQOL) Outcomes Time of Assessment Study Canadian Erythropoietin Study Group22 Baseline 2 mo 6 mo Besarab et al13 Baseline and every 6 mo Foley et al23 Baseline 24 wk 48 wk Furuland et al8 Baseline 48 wk 1y Roger et al27 Baseline 24 mo Levin et al24 Baseline Every 6 mo Reported HQOL Instrument Overall mean change (1) KDQ in HQOL instruments (2) SIP from baseline to 6 mo (3) Time trade-off Results Subscale Low High Notes Physical Fatigue Relationships Depression Frustration Global 1.6 1.4 (PLA: 0.4) Statistically significant 0.9 1.1 (PLA: 0.1) differences between 0.6 0.6 (PLA: 0.1) placebo and low treatment 0.4 0.7 (PLA: 0.1) arms in physical, fatigue, 0.0 0.4 (PLA: 0.0) relationships, and −5.3 −7.8 (PLA: −2.9) depression 0.02 0.06 (PLA: 0.0) Overall change per SF-36 Physical function domain increase of 0.6 Not reported by treatment percentage point arm. increase in hematocrit Other 7 domains of SF-36 not reported Overall mean change (1) KDQ Physical 1.17 1.10 SF-36 and HUI scores were in KDQ from baseline (2) SF-36 Fatique −0.01 0.25 reported as “unchanged to 48 wk (3) HUI (mark 2) Relationships 0.20 −0.07 over time” (not reported Depression 0.20 0.05 numerically or by treatment Frustration 0.15 −0.07 arm) Mean change from KDQ Physical 0.25 0.66 Statistically significant baseline to 48 wk Fatigue −0.33 0.16 difference between Relationships 0.0 0.0 treatment groups for Depression −0.40 0.0 physical symptoms and Frustration 0.0 0.0 depression subscale Mean change from (1) Renal QOL profile Global 5 7 No statistically significant baseline to 24 mo (2) SF-36 Physical function −7.3 −4.3 differences between Physical role 15.8 2.7 treatment arms Pain −9.2 −8.2 General health −4.1 0.9 Vitality −4.6 −1.0 Social function −10.9 −5.6 Emotional role −5.9 5.9 Mental health −1.0 −0.6 Mean change from SF-36 Physical function −9.3 −2.5 No statistically significant baseline to 24 mo Physical role −4.1 −3.9 differences between Pain −4.2 −1.0 treatment arms General health −4.9 0.3 Vitality −5.1 −2.7 Social function −2.5 −4.7 Emotional role −12.3 −3.7 Mental health −0.5 1.8 (continued) TRIAL CHARACTERISTICS INSTRUMENTS Table 1 summarizes the character- A total of 10 instruments were used to report HQOL in the 11 included trials. An overview of the measures used and the minimal clinically important difference cited for each measure is provided in eTable 2 (http: //www.archinternmed.com). The most commonly used instrument was the SF-36, administered in 9 of the 11 trials. Reporting of these data was generally poor. There was a wide variety of reporting formats, with some trials reporting baseline data and others reporting only change from baseline. Only 5 trials reported data on all domains and scales measured.3,8,11,22,27 istics of included trials. The trials ranged in size from 78 to 1432 patients. Five trials considered patients with CKD undergoing hemodialysis, whereas 6 considered those with CKD who were not. Health-related quality of life was the specified primary end pointin2studies.8,22 Otherprimaryend pointsconsideredweretimetofirstcardiovascular event or death,3,6,13 change inleftventricularfunction,23-27 andrate of glomerular filtration rate decline.11 The mean Jadad score was 4, indicating generally high-quality studies. Of note, 3 of the 12 trials were terminated early owing to futility (ie, the perception that use of the high hemoglobin target level was unlikely to have led to better outcomes if the trial were completed). Patients were blinded to the allocated hemoglobin target level in only 2 of 11 studies.24,25 SUMMARY OF TRIAL FINDINGS The findings of each trial are summarized in Table 2. Most trials (REPRINTED) ARCH INTERN MED/ VOL 169 (NO. 12), JUNE 22, 2009 1107 compared the mean change from baseline in the measured HQOL instrument or domain between the treatment groups. For studies comparing low/intermediate and high target level groups, the differences reported between groups were generally small, were noted on a minority of measured domains, and were consistently below the cited minimal clinically important differences (eTable 2). In particular, the magnitude of the between-group differences rarely exceeded the 5-point threshold commonly considered clinically important for the SF-36.20,21 META-ANALYSIS OF SF-36 DOMAINS Given that the SF-36 was the most commonly used instrument, we undertook a WMD meta-analysis of each domain, comparing low/intermediate and high target level hemoglobin WWW.ARCHINTERNMED.COM ©2009 American Medical Association. All rights reserved. Downloaded From: https://jamanetwork.com/ on 09/09/2014 Table 2. Health-Related Quality-of-Life Outcomes (Continued) Study Time of Assessment Results HQOL Instrument Subscale Low High Notes Parfrey et al25 Baseline, 24, 36, 48, 60, 72, 84, and 96 wk Mean change from baseline to mean follow-up value (1) Functional Assessment of Chronic Illness Therapy (2) KDQOL (3) SF-36 Global Quality of social interaction Vitality −3.21 0.03 −2.31 −1.0 0.8 1.21 Singh et al3 Baseline 36 mo Mean change from baseline to 36 mo (1) KDQ Linear Analogue Self-assessment (2) SF-36 Drüeke et al6 Baseline 1y 2y Mean change from baseline to 1 y SF-36 Rossert et al11 Baseline 9 mo Study termination Percentage with score of full health Mean score at 9 mo (1) Katz Index of ADLs (2) SF-36 Baseline 15 mo Mean change from baseline to 15 mo Global 1.1 Energy 15.5 Activity 13.3 Overall QOL 11.9 Physical function 2.1 Physical role 7.5 Pain 2.4 General health 1.8 Vitality 8.2 Social function 3.5 Emotional role 5.9 Mental health 2.4 Physical function −2.1 Physical role −6.6 Pain −1.2 General health −0.7 Vitality −3.0 Social function −2.1 Emotional role Mental health 95% of patients had a score of 6 (no limitation in ADLs) at baseline and 9 mo Physical function 65 Physical role 58 Pain 66 General health 53 Vitality 53 Social function 81 Emotional role 71 Mental health 75 Physical function −0.3 Physical role Pain General health Vitality Social function Emotional role Mental health Statistically significantly higher SF-36 vitality scores in the high treatment arm at 24, 36, 48, 60, and 72 wk No other domains reported. No significant differences in KDQOL and FACIT-F Statistically significant change in SF-36 emotional role in high treatment arm. No statistically significant change in any other measure Ritz et al26 Reported SF-36 1.6 16.6 15.0 11.2 3.2 6.4 0.4 3.0 10.0 1.3 0.8 1.7 3.5 2.4 3.0 3.9 1.8 2.7 72 68 71 58 59 82 76 76 5.3 Statistically significant 6 domains of SF-36 reported in high treatment arm Difference maintained at 2 y for general health and vitality domains 2 SF-36 domains not reported Statistically significantly higher SF-36 vitality domain for high treatment arm Statistically significant change in SF-36 general health domain in the high treatment arm Other 7 domains not reported Abbreviations: ADLs, activities of daily living FACIT-F, Functional Assessment of Chronic Illness Therapy–Fatigue subscale; HUI, Health Utilities Index; KDQ, kidney disease questionnaire; KDQOL, Kidney Disease Quality of Life; PLA, placebo; QOL, quality of life; SF-36, 36-item short-form instrument; SIP, sickness impact profile. groups. To calculate a WMD, either both baseline and final scores or an overall mean change is required. As such, only 4 of the 9 trials that measured SF-36 reported useable data (Table 3). However, only the Correction of Hemoglobin and Outcomes in Renal Insufficiency (CHOIR) study group3 reported all 8 domains; the Cardiovascular Risk Reduction in Early Anemia Treatment With Epoetin Beta (CREATE) study group6 published 6 domains, whereas the Anemia Correction in Diabetes (ACORD) study group26 and Parfrey et al25 reported only 1 SF-36 domain. All authors were contacted to obtain additional data; 2 responded and pro- vided unpublished data24,27 (Table 3). Thus, 6 studies, including 3 with complete data on all SF-36 domains, were included in the meta-analysis. Figure 2 shows the forest plot of all reported SF-36 domain data. The WMDsrangedfrom−3.0(favoringthe low intermediate target level arm) in the emotional role domain to 3.2 (favoring the high target level arm) in the vitality domain. A change that would beconsideredclinicallysignificantwas not found in any of the domains (Table 4). Significant heterogeneity was found in the physical role (I2 =68.6%; P=.02), social function (I2 =72.6%; P=.01), and mental health (I2 =70.5%; P=.02) domains. (REPRINTED) ARCH INTERN MED/ VOL 169 (NO. 12), JUNE 22, 2009 1108 SENSITIVITY ANALYSES When only studies reporting all domains of the SF-36 were included, smaller differences between arms were noted, with the low/intermediate target level arm being favored in more domains. When an assumption of no difference in the unreported domains is made, the mean change in domain scores was similar to the base case. COMMENT Although we found small improvements in HQOL in 4 of 8 SF-36 do- WWW.ARCHINTERNMED.COM ©2009 American Medical Association. All rights reserved. Downloaded From: https://jamanetwork.com/ on 09/09/2014 Table 3. Mean SF-36 Domain Scores by Treatment Arm for Studies That Measured SF-36 Scores and Reported Any Data for Any SF-36 Domain Low to Intermediate Target Level Arm Source Physical Function Roger et al27 Baseline Final Change Levin et al24 Baseline Final Change Parfrey et al25 Baseline Final Change Singh et al3 Baseline Final Change Drüeke et al6 Baseline Final Change Ritz et al26 Baseline Final Change (n=80) 72.4 65.1 −7.3 (n=82) 66.6 57.7 −9.3 (n=294) NR NR NR (n=717) 42.4 44.5 2.1 (n=302) NR NR −2.1 (n=82) NR NR NR Physical Role Pain General Health Vitality Social Function Emotional Role Mental Health 39.2 55.0 15.8 80.7 71.5 −9.2 49.0 44.9 −4.1 52.8 48.2 −4.6 82.2 71.3 −10.9 25.4 31.3 −5.9 72.1 71.1 −1.0 57.5 51.9 −4.1 65.6 62.1 −4.2 52.5 47.8 −4.9 50.2 45.5 −5.1 78.3 76.3 −2.5 83.1 71.9 −12.2 70.8 70.6 −0.5 NR NR NR NR NR NR NR NR NR 54.7 52.4 −2.3 NR NR NR NR NR NR NR NR NR 32.5 40.0 7.5 58.0 60.4 2.4 42.6 44.4 1.8 36.6 44.8 8.2 63.7 67.2 3.5 57.4 63.3 5.9 70.2 72.6 2.4 NR NR −6.6 NR NR NR NR NR −1.2 NR NR −0.7 NR NR −3.0 NR NR NR NR NR −2.1 NR NR NR NR NR NR NR NR −0.33 NR NR NR NR NR NR NR NR NR NR NR NR High Target Level Arm Source Physical Function Roger et al27 Baseline Final Change Levin et al24 Baseline Final Change Parfrey et al25 Baseline Final Change Singh et al3 Baseline Final Change Drüeke et al6 Baseline Final Change Ritz et al26 Baseline Final Change (n=750) 63.3 59.0 −4.3 (n=80) 70.3 68.6 −2.5 (n=293) NR NR NR (n=715) 41.9 45.1 3.2 (n=301) NR NR 3.5 (n=88) NR NR NR Physical Role Pain General Health Vitality Social Function Emotional Role Mental Health 46.6 49.3 2.7 71.1 62.9 −8.2 44.1 45.0 0.9 47.2 46.2 −1.0 72.3 66.7 −5.6 31.1 37.0 5.9 73.5 72.9 −0.6 55.4 50.0 −3.9 69.3 68.5 −1.0 54.5 54.6 0.3 54.6 51.7 −2.7 80.7 75.0 −4.7 80.0 74.3 −3.7 69.1 71.2 1.8 NR NR NR NR NR NR NR NR NR 54.1 55.3 1.2 NR NR NR NR NR NR NR NR NR 31.9 38.3 6.4 57.8 59.2 0.4 41.3 43.3 3.0 35.2 45.2 10.0 63.7 65.0 1.3 57.2 58.6 0.8 69.6 71.3 1.7 NR NR 2.4 NR NR NR NR NR 3.0 NR NR 3.9 NR NR 1.8 NR NR NR NR NR 2.7 NR NR NR NR NR NR NR NR 5.33 NR NR NR NR NR NR NR NR NR NR NR NR Abbreviations: NR, nonreported data; SF-36, 36-item short-form instrument. mains, the changes were well below the threshold for a minimal clinically important difference of a 5-point change(allwere⬍3.2pointsona100point scale).21 It is often argued that targeting a high hemoglobin target level would be most likely to have an impact on vitality scores, the domain that captures the sentiment of “having more energy.” However, vitality (REPRINTED) ARCH INTERN MED/ VOL 169 (NO. 12), JUNE 22, 2009 1109 scores were only 3.2 (95% confidence interval [CI], 1.87-4.44) points higher inthehightargetlevelgroupcompared with the low/intermediate target level group. Higher scores were observed WWW.ARCHINTERNMED.COM ©2009 American Medical Association. All rights reserved. Downloaded From: https://jamanetwork.com/ on 09/09/2014 Source WMD (95% CI) Physical Function Roger et al27 % Weight 3.00 (0.02 to 5.98) 6.80 (–0.27 to 13.87) 1.10 (–1.35 to 3.55) 5.54 (2.04 to 9.04) 2.91 (1.29 to 4.53) 29.63 5.24 43.69 21.44 100 Physical Role Roger et al27 Levin et al24 Singh et al3 Drüeke et al6 Subtotal (I 2 = 68.6%, P = .02) –13.10 (–26.10 to 0.10) 0.20 (–12.81 to 13.21) –1.10 (–5.45 to 3.25) 8.97 (1.43 to 16.50) 0.28 (–3.20 to 3.77) 7.19 7.18 64.25 21.38 100 Pain Roger et al27 Levin et al24 Singh et al3 Subtotal (I 2 = 52.6%, P = .12) 1.00 (0.01 to 1.99) 3.20 (–4.59 to 10.99) –2.00 (–4.84 to 0.84) 0.71 (–0.22 to 1.64) 87.85 1.42 10.72 General Health Roger et al27 Levin et al24 Singh et al3 Drüeke et al6 Ritz et al26 Subtotal (I 2 = 30.2%, P = .22) 5.00 (0.04 to 9.96) 5.22 (–0.82 to 11.26) 1.20 (–0.72 to 3.12) 4.21 (1.22 to 7.21) 5.66 (–0.44 to 11.76) 2.71 (1.26 to 4.15) 8.50 5.73 56.84 23.31 Vitality Roger et al27 Parfrey et al25 Levin et al24 Singh et al3 Drüeke et al6 Subtotal (I 2 = 0.0%, P = .66) 3.60 (0.03 to 7.17) 3.52 (1.31 to 5.73) 2.40 (–4.38 to 9.18) 1.80 (–0.51 to 4.11) 4.53 (1.67 to 7.39) 3.17 (1.89 to 4.44) 12.74 33.32 3.54 30.57 19.84 100 Social Function Roger et al27 Levin et al24 Singh et al3 Drüeke et al6 Subtotal (I 2 = 72.6%, P = .01) 5.30 (0.04 to 10.56) –2.20 (–10.21 to 5.81) –2.20 (–5.41 to 1.01) 4.85 (1.08 to 8.62) 1.30 (–0.84 to 3.44) 16.51 7.11 44.31 32.07 100 0.00 (–12.99 to 12.99) 8.50 (–4.73 to 21.73) –5.10 (–10.09 to 0.11) –3.01 (–7.41 to 1.38) 11.44 11.04 77.51 100 0.40 (0.00 to 0.80) 2.30 (–1.75 to 6.35) –0.70 (–2.60 to 1.20) 4.80 (1.77 to 7.82) 0.44 (0.06 to 0.83) 93.43 0.90 4.07 1.60 100 Levin et al24 Singh et al3 Drüeke et al6 Subtotal (I 2 = 44.8%, P = .14) Emotional Role Roger et al27 Levin et al24 Singh et al3 Subtotal (I 2 = 47.2%, P = .15) Mental Health Roger et al27 Levin et al24 Singh et al3 Drüeke et al6 Subtotal (I 2 = 70.5%, P = .02) –26.1 0 Favors Low Hemoglobin 100 5.62 100 26.1 Favors High Hemoglobin Figure 2. Forest plot of all reported 36-item short-form instrument domain data. CI indicates confidence interval; WMD, weighted mean difference. inthehighhemoglobintargetlevelarm in 3 domains (vitality, physical function, and general health) across all studies. However, again, the improvements were very small and would not be regarded as a clinically meaningful improvement (2.9 for physical function, 2.7 for general health, and 3.2 for vitality). Although improvements in HQOL are commonly cited as a rea- son to target a higher hemoglobin level, numerous limitations of the existing literature should be noted. Health-related quality of life was not reported in 26 of 37 identified RCTs, and among the 11 RCTs that did measure HQOL, 9 used the SF-36, although only 1 study3 reported full data on the SF-36 (although 2 additional authors provided full SF-36 data24,27). Given that patient blind- (REPRINTED) ARCH INTERN MED/ VOL 169 (NO. 12), JUNE 22, 2009 1110 ing is potentially difficult in this type of study, and that 9 of 11 studies were open label, it is possible that partial blinding of patients to their assigned group might have contributed to the small changes noted in these measures. Our findings highlight the importance of distinguishing statistical significance from clinical significance. The minimal clinically WWW.ARCHINTERNMED.COM ©2009 American Medical Association. All rights reserved. Downloaded From: https://jamanetwork.com/ on 09/09/2014 Table 4. Results of Meta-analysis and Sensitivity Analyses Scenario Base Case: Using All Available Data Domain Physical function Physical role Pain General health Vitality Social function Emotional role Mental health Only Using Studies That Reported All SF-36 Domains Assuming No Change in Unreported SF-36 Domains Studies, No. WMD (95% CI) Studies, No. WMD (95% CI) Studies, No. WMD (95% CI) 4 4 3 5 5 4 3 4 2.9 (1.3 to 4.5) 0.3 (−3.2 to 3.8) 0.7 (−0.2 to 1.6) 2.7 (1.3 to 4.2) 3.2 (1.9 to 4.4) 1.3 (−0.8 to 3.4) −3.0 (−7.4 to 1.4) 0.4 (0.1 to 0.8) 3 3 3 3 3 3 3 3 2.2 (0.4 to 4.0) −2.1 (−6.0 to 1.9) 0.7 (−0.2 to 1.6) 2.0 (0.3 to 3.7) 2.3 (0.5 to 4.2) −0.4 (−3.0 to 2.2) −3.0 (−7.4 to 1.4) 0.4 (−0.0 to 0.8) 6 6 6 6 6 6 6 6 1.8 (0.5 to 3.1) 0.1 (−1.7 to 1.9) 0.5 (−0.7 to 1.4) 1.9 (0.7 to 3.1) 3.0 (1.8 to 4.3) 0.6 (−0.9 to 2.1) −0.4 (−2.0 to 1.2) 0.4 (0.1 to 0.8) Abbreviations: CI, confidence interval; SF-36, 36-item short-form instrument; WMD, weighted mean difference. important difference is defined as the smallest amount of benefit that patients can recognize and value.28 Thus, a clinically important change is relevant and important to patients, whereas a statistical change is simply a detectable change that may or not be noticeable to patients. Given that most of these trials were designed to measure differences in clinical outcomes, including survival, these RCTs may have been powered to detect statistically significant changes in measurements that are of no clinical relevance to patients. If multiple time points were measured, our analysis considered the latest time point reported. For some domains of the SF-36, particularly vitality, a short-term improvement may be observed at the other time points, such as 6 or 12 months. However, the clinical relevance of short-term small improvements that are not sustained over longer periods of follow-up is unknown. The use of ESAs has considerably altered the treatment of patients with anemia related to CKD. Although the initial use of ESAs was aimed at preventing the requirement for transfusion, much of the subsequent research around them has focused on determining whether using them to achieve a high hemoglobin target level is associated with clinical benefit. Despite concern regarding the safety of ESAs when targeted to a high hemoglobin level, some guidelines continue to advocate that hemoglobin target levels in excess of 12.0 g/dL may still be reasonable for selected patients; this seems to be the result of expected improvements in HQOL.2,5 Our analyses do not substantiate the basis for these guidelines. Our results continue to be very relevant because recent reports confirm that nearly 30% of American patients who undergo hemodialysis have a hemoglobin level in excess of 12.0 g/dL.29 There were several strengths and limitations of our study. We systematically reviewed the literature, including contacting manufacturers, to ensure that we identified all relevant RCTs. Despite this, it should be noted that many of the RCTs did not measure HQOL, and among those that did, selective reporting occurred, resulting in an incomplete data set with which to rigorously assess HQOL. To account for this, we conducted 2 sensitivity analyses, the results of which suggest that our base case analysis may overestimate the actual treatment effect. Our findings and the strength of our conclusions are limited by the available evidence. Given that HQOL is one of the major theoretical benefits of ESA, the selective reporting of certain domains, the variation in instruments used to measure HQOL, and the lack of directly measured data on utility are major weaknesses of the available literature. However, we were able to include some unpublished data supplied by the authors of the original studies, thereby minimizing some of the reporting bias that may be present. Last, there are methodological limitations with meta-analyses. However, we conducted the review according to a prespecified protocol, (REPRINTED) ARCH INTERN MED/ VOL 169 (NO. 12), JUNE 22, 2009 1111 used a well-defined comprehensive literature search strategy designed by an expert librarian, performed quality assessment and data extraction with duplicate reviewers, and used rigorous statistical methodology that would be expected to reduce the extent of any bias. Erythropoietin-stimulating agents are an important aspect of treatment for patients with CKD, especially for those undergoing hemodialysis or who are at risk of requiring blood transfusion without treatment. The cited goal of treatment to a higher hemoglobin target level is often to improve HQOL. However, our systematic review revealed a weak evidence base in support of that strategy, with poorly reported data and only 2 RCTs that examined HQOL as the primary study end point. Our study suggests that targeting a hemoglobin level in excess of 12.0 g/dL leads to small and not clinically meaningful improvements in HQOL. This, in addition to considerable safety concerns, would suggest that targeting treatment to hemoglobin levels that are in the range of 9.0 to 12.0 g/dL is reasonable. Accepted for Publication: February 7, 2009. Correspondence: Braden J. Manns, MD,MSc,DepartmentofNephrology, Foothills Medical Centre, 1403 29th St NW, Calgary, AB T2N 2T9, Canada (braden.manns@calgaryhealthregion .ca). Author Contributions: Study concept and design: Clement, Klarenbach, Tonelli, Johnson, and Manns. WWW.ARCHINTERNMED.COM ©2009 American Medical Association. All rights reserved. Downloaded From: https://jamanetwork.com/ on 09/09/2014 Analysis and interpretation of data: Clement, Klarenbach, Tonelli, and Manns. Drafting of the manuscript: Clement and Manns. Critical revision of the manuscript for important intellectual content: Clement, Klarenbach, Tonelli, Johnson, and Manns. Statistical analysis: Clement and Tonelli. Obtained funding: Manns. Administrative, technical, and material support: Clement and Johnson. Study supervision: Klarenbach, Tonelli, and Manns. Methodologic HQOL expertise: Johnson. Financial Disclosure: None reported. Funding/Support: Dr Clement is supported by a postdoctoral fellowship award from the Canadian Health Services Research Foundation and from the Alberta Heritage Foundation for Medical Research (AHFMR). Dr Johnson holds a Canada Research Chair in Diabetes Health Outcomes and holds a Health Scholar Award from the AHFMR. Drs Manns and Tonelli are supported by a Canadian Institutes for Health Research (CIHR) New Investigator Award. Drs Tonelli and Klarenbach are supported by Population Health Investigator awards from the AHFMR. Dr Klarenbach is supported by a Scholarship Award from the Kidney Foundation of Canada. Additional Information: Two eTables are available at http://www .archinternmed.com. Additional Contributions: Tamara Durec, MLIS, designed the comprehensive literature search strategy. We gratefully acknowledge the researchers who provided unpublished SF-36 data. REFERENCES 1. Barrett BJ, Fenton SS, Ferguson B, et al; Canadian Society of Nephrology. Clinical practice guidelines for the management of anemia coexistent with chronic renal failure. J Am Soc Nephrol. 1999; 10(suppl 13):S292-S296. 2. KDOQI; National Kidney Foundation. KDOQI clinical practice guidelines and clinical practice recommendations for anemia in chronic kidney disease. Am J Kidney Dis. 2006;47(5)(suppl 3):S11-S145. 3. Singh AK, Szczech L, Tang KL, et al; CHOIR Investigators. Correction of anemia with epoetin alfa in chronic kidney disease. N Engl J Med. 2006; 355(20):2085-2098. 4. US Food and Drug Administration. Procrit label, Epogen label. 2007. http://www.fda.gov/cder/foi /label/2007/103234s5122lbl.pdf. Accessed August 14, 2008. 5. KDOQI. KDOQI clinical practice guideline and clinical practice recommendations for anemia in chronic kidney disease: 2007 update of hemoglobin target. Am J Kidney Dis. 2007;50(3):471-530. 6. Drüeke TB, Locatelli F, Clyne N, et al; CREATE Investigators. Normalization of hemoglobin level in patients with chronic kidney disease and anemia. N Engl J Med. 2006;355(20):2071-2084. 7. Bahlmann J, Schoter KH, Scigalla P, et al. Morbidity and mortality in hemodialysis patients with and without erythropoietin treatment: a controlled study. Contrib Nephrol. 1991;88:90-106. 8. Furuland H, Linde T, Ahlme´n J, Christensson AJ, Strömbom A, Danielson U. A randomized controlled trial of haemoglobin normalization with epoetin alfa in pre-dialysis and dialysis patients. Nephrol Dial Transplant. 2003;18(2):353-361. 9. Gouva C, Nikolopoulos P, Ioannidis JPA, Siamopoulos KC. Treating anemia early in renal failure patients slows the decline of renal function: a randomized controlled trial. Kidney Int. 2004;66 (2):753-760. 10. Revicki DA, Brown RE, Feeny DH, et al. Healthrelated quality of life associated with recombinant human erythropoietin therapy for predialysis chronic renal disease patients. Am J Kidney Dis. 1995;25(4):548-554. 11. Rossert J, Levin A, Roger SD, et al. Effect of early correction of anemia on the progression of CKD. Am J Kidney Dis. 2006;47(5):738-750. 12. Ongoing safety review erythropoiesis-stimulating agents (ESAs) epoetin alfa (marketed as Procrit, Epogen) [and] darbepoetin alfa (marketed as Aranesp): update. US Food and Drug Administration. http://www.fda.gov/cder/drug/infopage/RHE /default.htm. Accessed August 7, 2008. 13. Besarab A, Bolton WK, Browne JK, et al. The effects of normal as compared with low hematocrit values in patients with cardiac disease who are receiving hemodialysis and epoetin. N Engl J Med. 1998;339(9):584-590. 14. Eknoyan G, Beck GJ, Cheung AK, et al; Hemodialysis (HEMO) Study Group. Effect of dialysis dose and membrane flux in maintenance hemodialysis. N Engl J Med. 2002;347(25):2010-2019. 15. Suki WN, Zabaneh R, Cangiano JL, et al. Effects of sevelamer and calcium-based phosphate binders on mortality in hemodialysis patients. Kidney Int. 2007;72(9):1130-1137. 16. Jaeschke R, Singer J, Guyatt G. Measurement of health status: ascertaining the minimal clinically (REPRINTED) ARCH INTERN MED/ VOL 169 (NO. 12), JUNE 22, 2009 1112 17. 18. 19. 20. 21. 22. 23. 24. 25. 26. 27. 28. 29. WWW.ARCHINTERNMED.COM ©2009 American Medical Association. All rights reserved. Downloaded From: https://jamanetwork.com/ on 09/09/2014 important difference. Control Clin Trials. 1989; 10(4):407-415. Ware JE Jr, Sherbourne CD. The MOS 36-item short-form health survey (SF-36), I: conceptual framework and item selection. Med Care. 1992; 30(6):473-483. Jadad AR, Moore RA, Carroll D, et al. Assessing the quality of reports of randomized clinical trials: is blinding necessary? Control Clin Trials. 1996; 17(1):1-12. Higgins JPT, Thompson SG. Quantifying heterogeneity in a meta-analysis. Stat Med. 2002;21 (11):1538-1558. Wyrwich KW, Spertus JA, Kroenke K, Tierney WM, Babu AN, Wolinsky FD; Heart Disease Expert Panel. Clinically important differences in health status for patients with heart disease: an expert consensus panel report. Am Heart J. 2004;147(4):615-622. Wyrwich KW, Tierney WM, Babu AN, Kroenke K, Wolinsky FD. A comparison of clinically important differences in health-related quality of life for patients with chronic lung disease, asthma, or heart disease. Health Serv Res. 2005;40(2):577-591. Canadian Erythropoietin Study Group. Association between recombinant human erythropoietin and quality of life and exercise capacity of patients receiving haemodialysis. BMJ. 1990;300 (6724):573-578. Foley RN, Parfrey PS, Morgan J, et al. Effect of hemoglobin levels in hemodialysis patients with asymptomatic cardiomyopathy. Kidney Int. 2000; 58(3):1325-1335. Levin A, Djurdjev O, Thompson C, et al. Canadian randomized trial of hemoglobin maintenance to prevent or delay left ventricular mass growth in patients with CKD. Am J Kidney Dis. 2005;46(5):799-811. Parfrey PS, Foley RN, Wittreich BH, Sullivan DJ, Zagari MJ, Frei D. Double-blind comparison of full and partial anemia correction in incident hemodialysis patients without symptomatic heart disease. J Am Soc Nephrol. 2005;16(7):2180-2189. Ritz E, Laville M, Bilous RW, et al; Anemia Correction in Diabetes Study Investigators. Target level for hemoglobin correction in patients with diabetes and CKD: primary results of the Anemia Correction in Diabetes (ACORD) Study. Am J Kidney Dis. 2007; 49(2):194-207. Roger SD, McMahon LP, Clarkson A, et al. Effects of early and late intervention with epoetin alpha on left ventricular mass among patients with chronic kidney disease (stage 3 or 4): results of a randomized clinical trial. J Am Soc Nephrol. 2004; 15(1):148-156. Barrett B, Brown D, Mundt M, Brown R. Sufficiently important difference: expanding the framework of clinical significance. Med Decis Making. 2005;25(3):250-261. Collins AJ, Brenner RM, Ofman JJ, et al. Epoetin alfa use in patients with ESRD: an analysis of recent US prescribing patterns and hemoglobin outcomes. Am J Kidney Dis. 2005;46(3):481-488.