Small-sample bias and corrections for conditional maximum-likelihood odds-ratio estimators SANDER GREENLAND
Biostatistics (2000), 1, 1, pp. 113–122
Printed in Great Britain

Department of Epidemiology, UCLA School of Public Health, Los Angeles, CA 90095-1772, USA

SUMMARY

A number of small-sample corrections have been proposed for the conditional maximum-likelihood estimator of the odds ratio for matched pairs with a dichotomous exposure. I here contrast the rationale and performance of several corrections, specifically those that generalize easily to multiple conditional logistic regression. These corrections or Bayesian analyses with informative priors may serve as diagnostics for small-sample problems. Points are illustrated with a small exact performance comparison and with an example from a study of electrical wiring and childhood leukemia. The former comparison suggests that small-sample bias may be more prevalent than commonly realized.

Keywords: Bias; Case-control studies; Conditional logistic regression; Cox model; Epidemiologic methods; Likelihood analysis; Logistic models; Matching; Odds ratio; Proportional hazards; Relative risk; Risk assessment.

1. INTRODUCTION

The conditional maximum-likelihood (CML) estimator of a common odds ratio for matched pairs was introduced by Kraus (1960) and has since become a mainstay of epidemiologic analysis (Breslow and Day, 1980; Clayton and Hills, 1993; Kelsey et al., 1996; Rothman and Greenland, 1998). Jewell (1984), however, described the severe small-sample bias that can arise in the estimator, and derived and compared some bias corrections. Since then other corrections and comparisons have appeared. The present note contrasts several corrections that have an obvious Bayesian rationale or a straightforward extension to conditional-logistic regression. A new estimator is introduced that is a minor adaptation of formulas for ordinary logistic regression.
Estimators are illustrated in an exact performance comparison, and in a matched-pair study of power lines and childhood leukemia (Ebi et al., 1999). The former comparison suggests that bias may be a frequent problem in small or overmatched studies.

The CML odds-ratio estimators have positive probability of being infinite and so have infinite exact expectations, even though they are unbiased to first order. Following earlier literature (Jewell, 1984, 1986; Liu, 1989), for ease of writing I will use the term 'bias' to refer to bias of higher order. One can also view the bias problem as one in which estimates far above the true parameter value occur with unacceptably high probability.

© Oxford University Press (2000). Downloaded from http://biostatistics.oxfordjournals.org/ by guest on October 6, 2014

2. APPROXIMATE BIAS CORRECTIONS

Several approximate corrections have been proposed and evaluated for matched odds-ratio estimates with a dichotomous exposure and for 2 × 2-table (unmatched) odds-ratio estimators (Jewell, 1984, 1986; Becker, 1989; Liu, 1989; Walter and Cook, 1991). These corrections are of two forms: those that correct for bias on the logarithmic scale, and those that correct for bias on the odds-ratio (arithmetic) scale.

2.1. Logarithmic corrections

It is possible to adapt a well-known bias correction for unconditional ML estimators (Byth and McLachlan, 1978; Anderson and Richardson, 1979; Schaefer, 1983; Cordeiro and McCullagh, 1991) to matched-pair CML estimators. The contribution of a matched pair with case regressor vector $x_1$ and control regressor vector $x_0$ to the conditional logistic likelihood (Breslow and Day, 1980; Clayton and Hills, 1993) simplifies to $\mathrm{expit}(d'\beta)$, where β is the vector of logistic coefficients, $d = x_1 - x_0$, and $\mathrm{expit}(\eta) = (1 + e^{-\eta})^{-1}$ is the logistic transform. The full conditional likelihood thus can be written in the form of a no-intercept unconditional logistic likelihood for binomial observations defined by $n(x_1, x_0)$ 'successes' out of $n(x_1, x_0) + n(x_0, x_1)$ trials, where $n(x_1, x_0)$ is the number of pairs with case regressor $x_1$ and control regressor $x_0$. The distribution of $n(x_1, x_0)$ is binomial given $n(x_1, x_0) + n(x_0, x_1)$; if $x_1 = x_0$, the distribution does not depend on β and hence concordant pairs do not contribute to the likelihood.

Let i index the pairs, let D be the matrix whose rows are the observed pair differences $d_i'$, p the vector of conditional probabilities $p_i = \mathrm{expit}(d_i'\beta)$ for the pairs, W the diagonal matrix $\mathrm{diag}[p_i(1 - p_i)]$, and $H = D'WD$. The second-order approximation to the bias in the CML estimator is then $b = H^{-1}D'Wr$, where $r_i = d_i'H^{-1}d_i(p_i - \tfrac{1}{2})$; see Cordeiro and McCullagh (1991). A bias correction is obtained by using the CMLE $\hat\beta$ to compute b, then subtracting the result $\hat b$ from $\hat\beta$ (Anderson and Richardson, 1979; Schaefer, 1983); a corresponding variance estimate for $\hat\beta - \hat b$ may be computed by the delta method (Bishop et al., 1975, Chapter 14).

For discrete x there is another logarithmic-scale correction, due to Haldane, which adds 1/2 to each cell (here, pair count) and then applies ML to the augmented counts (Bishop et al., 1975; Good, 1983; Jewell, 1984). For a binary x, the resulting 'augmented likelihood' for β is identical to the posterior distribution for β under a Jeffreys prior (Leonard and Hsu, 1994). The augmented counts are sometimes multiplied by a constant to restore the sample total to its original value (Bishop et al., 1975), which affects only the variance estimates.

2.2. Arithmetic corrections

For any nonconstant estimator $\hat\theta$, $E(\hat\theta) = \theta$ implies $E(e^{\hat\theta}) > e^{\theta}$; consequently, the above estimators undercorrect for bias on the arithmetic scale (Jewell, 1984). This raises the issue of whether one should examine the odds ratios or the log odds ratios. A common presumption is that one should focus on the log odds ratios because of the extreme asymmetry of the distribution of the odds ratios. I and others maintain that this presumption is an example of ignoring context to suit the statistics. In a well-designed study, the odds ratios, not their logs, are proportional to disease rates (Rothman and Greenland, 1998). These rates, in turn, are proportional to the overall costs of disease (Morgenstern and Greenland, 1990). The magnitudes of these costs are a primary target of interest for public health and subsequent policy debates, and hence the relevant estimation errors are proportional to arithmetic, not logarithmic, errors in relative-risk estimates.

Several approximate bias corrections for the arithmetic scale have been proposed for discrete x (Bishop et al., 1975; Good, 1983; Jewell, 1984). These turn out to be very close to the Laplace estimator obtained by adding 1 rather than 1/2 to each cell (Bishop et al., 1975; Good, 1983). Unlike the others, this Laplace correction is invariant under exposure recoding and has a simple Bayesian derivation using a uniform prior on expit(β) (Bishop et al., 1975; Good, 1983); this prior is equivalent to the mean-zero logistic prior on β with c.d.f. expit(β).

2.3. Bayes estimators

If one believes that random error is a major contributor to the results, it would be natural to pursue estimators with even lower expected squared error (ESE) than the corrected estimators, such as a Bayes estimator based on a prior that is (hopefully) more concentrated near the true coefficient vector than the priors implicit in the above procedures. This leaves the task of specifying the prior.

For ease of illustration, consider a matched-pair study of a dichotomous exposure, with no covariates. In this case β is the pair-specific log odds ratio. Many epidemiologic controversies about harmful effects revolve around whether the true relative risk (which the odds ratio is supposed to approximate) is 1 versus 1.5 or 1 versus 2, with virtually no prior probability given to values above 3 or 4 by anyone, largely because most estimates are below 2. The electric power–cancer literature is an example (Portier and Wolfe, 1998; Greenland et al., 2000b). Other examples can be found in the nutrition and diet literature, such as in the coffee–heart disease controversy (Greenland, 1993a). In these contexts, the upper prior percentiles derived from normal(µ, τ²) distributions for β with µ close to zero and τ² between 1/2 and 1 more closely correspond to meta-analysis results and to the spectrum of expert opinions than do percentiles derived from the priors implicit in the Haldane or Laplace estimators. For example, the upper 90th prior percentiles for the odds ratio under normal(0,1) and normal(0, 1/2) priors for β are 3.6 and 1.9, whereas the upper 90th odds-ratio percentiles under the Haldane and Laplace priors are 40 and 9.

3. EXACT RESULTS FOR DICHOTOMOUS MATCHED-PAIR STUDIES

In the case of a matched-pair study of a dichotomous exposure with no covariates, it is easy to compute the bias of odds-ratio estimates directly from the exact conditional distribution of the discordant pairs (Jewell, 1984); there is no need for approximate or simulation studies. Let u and v be the numbers of discordant pairs with the case exposed and with the control exposed. The conditional distribution of u is binomial with probability expit(β) and total N = u + v; the CML, Haldane, and Laplace odds-ratio estimates are u/v, (u + 1/2)/(v + 1/2), and (u + 1)/(v + 1) (Breslow and Day, 1980; Good, 1983; Jewell, 1984; Clayton and Hills, 1993); the Mantel–Haenszel and CML estimators are identical in this case. One must use an ad hoc redefinition of the CML estimator at v = 0 to give it a finite mean; following Jewell (1984), I equated it to the Haldane estimator when a zero occurred. The log-scale CML bias correction simplifies to

$$\hat b = \frac{(u - v)^2}{2uvN}, \tag{1}$$

so the corrected CML estimate is $(u/v)e^{-\hat b}$ (again, with ad hoc replacement by the Haldane estimator when a zero occurred). The posterior mode of the log odds ratio β under a normal(0, τ²) prior is the solution $\tilde\beta$ of

$$\beta = [u - N \cdot \mathrm{expit}(\beta)]\,\tau^2. \tag{2}$$

I here evaluate $e^{\tilde\beta}$ with τ² = 1 as a 'Bayes point estimator' of the odds ratio, because it is a special case of logistic penalized-likelihood estimators studied elsewhere (Breslow and Clayton, 1993; Greenland, 1997; Breslow et al., 1998).

Table 1 presents the exact expectations of the above estimators under various scenarios. When the true odds ratio was small (2 or less), the Bayes-normal(0,1) estimator appeared least biased, though little different from the Laplace estimator, but was severely overcorrected (biased downward) for odds ratios of 4 or more. Excepting the rather extreme case of N = 8, ω = 8, the Laplace estimator was nearly unbiased in all cases examined. As expected, the uncorrected CML estimator had considerable bias even when the true odds ratio was 1, and the corrected-CML and Haldane estimators were arithmetically undercorrected.
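The exact expectations and tail probabilities reported in Table 1 can be recomputed by enumerating the binomial distribution of u. The following is a minimal sketch under stated assumptions, not the author's program: the function names (`cml`, `cml_corrected`, `bayes_mode`, `exact_summary`) are illustrative, the Haldane value is substituted whenever either discordant count is zero, the corrected CML uses equation (1), and equation (2) is solved by bisection, a solver chosen here for simplicity.

```python
from math import comb, exp


def expit(x):
    """Logistic transform (1 + e^-x)^-1."""
    return 1.0 / (1.0 + exp(-x))


def haldane(u, v):
    return (u + 0.5) / (v + 0.5)


def laplace(u, v):
    return (u + 1.0) / (v + 1.0)


def cml(u, v):
    # Ad hoc redefinition: use the Haldane value when a zero cell occurs.
    return haldane(u, v) if u == 0 or v == 0 else u / v


def cml_corrected(u, v):
    # Corrected CML estimate (u/v) * exp(-b_hat), with b_hat from equation (1).
    if u == 0 or v == 0:
        return haldane(u, v)
    b_hat = (u - v) ** 2 / (2.0 * u * v * (u + v))
    return (u / v) * exp(-b_hat)


def bayes_mode(u, v, tau2=1.0):
    # Posterior-mode odds ratio under a normal(0, tau2) prior: solves
    # equation (2). g(beta) = u - N*expit(beta) - beta/tau2 is strictly
    # decreasing, so bisection on a bracketing interval is sufficient.
    n = u + v
    lo, hi = -n * tau2 - 1.0, n * tau2 + 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if u - n * expit(mid) - mid / tau2 > 0:
            lo = mid
        else:
            hi = mid
    return exp(0.5 * (lo + hi))


def exact_summary(true_or, n, estimator):
    """Exact mean and % probability of an estimate >= 2 * true_or."""
    p = true_or / (1.0 + true_or)
    mean, tail = 0.0, 0.0
    for u in range(n + 1):
        prob = comb(n, u) * p**u * (1.0 - p) ** (n - u)
        est = estimator(u, n - u)
        mean += prob * est
        if est >= 2.0 * true_or:
            tail += prob
    return mean, 100.0 * tail


if __name__ == "__main__":
    estimators = [cml, cml_corrected, haldane, laplace, bayes_mode]
    for est in estimators:
        mean, tail = exact_summary(1.0, 8, est)
        print(f"{est.__name__:14s} mean={mean:.2f}  tail%={tail:.1f}")
```

For a true odds ratio of 1 and N = 8 the five means should come out close to 1.4, 1.3, 1.3, 1.2 and 1.1, in line with the first row of Table 1; other rows follow by changing `true_or` and `n`.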
Table 1. Expected values and percent probabilities of twice truth or more for odds-ratio estimators in a matched-pair study of dichotomous exposure.* OR = odds ratio, N = number of discordant pairs, CMLC = CML with ML bias correction, Bayes = Bayes estimator using normal(0,1) prior for β (see text)

                   Expected value                          % Probability ≥ 2 × true OR
True OR   N     CML    CMLC   Haldane  Laplace  Bayes     CML   CMLC  Haldane  Laplace  Bayes
1         8     1.4    1.3    1.3      1.2      1.1       14    14    14       14       4
1         16    1.2    1.1    1.1      1.1      1.1       11    11    11       11       4
1         24    1.1    1.1    1.1      1.1      1.1       8     3     3        3        3
1.2       8     1.8    1.6    1.6      1.4      1.3       21    21    21       6        6
1.2       16    1.4    1.4    1.4      1.3      1.3       8     8     8        8        3
1.2       24    1.3    1.3    1.3      1.3      1.3       8     3     3        3        3
1.5       8     2.3    2.1    2.0      1.7      1.5       32    11    11       11       2
1.5       16    1.8    1.8    1.7      1.6      1.5       17    7     7        7        2
1.5       24    1.7    1.7    1.6      1.6      1.5       10    4     4        4        1
2         8     3.2    2.9    2.8      2.2      1.8       20    20    20       20       4
2         16    2.6    2.4    2.4      2.2      1.9       17    6     6        6        1
2         24    2.3    2.3    2.2      2.1      2.0       6     6     6        6        2
4         8     6.4    5.7    5.6      3.8      2.6       17    17    17       17       0
4         16    6.2    5.3    5.2      4.2      3.1       14    14    14       14       0
4         24    5.4    4.8    4.7      4.2      3.4       11    11    11       3        0
8         8     9.9    9.2    9.1      5.5      3.4       39    39    39       0        0
8         16    12.5   10.7   10.7     7.2      4.4       15    15    15       15       0
8         24    12.6   10.4   10.5     7.9      5.1       24    6     6        6        0

*CML and CMLC set equal to Haldane when zero cell occurs.

The bias results are in good accord with those in Jewell (1984). As mentioned earlier, however, not everyone is comfortable with bias as a criterion for evaluating ratio estimators. Therefore, the table also presents the probabilities that the estimates will equal or exceed twice the true odds ratio. For the CML estimator, these upper-tail probabilities can remain appreciable even with a substantial number of discordant pairs, and only the Bayes estimator does consistently better by this criterion. Evaluations were also made using arithmetic and logarithmic expected-squared error as performance criteria; in both, CML was worst and Laplace was best over all the cases shown.

I also computed exact coverages of the approximate 95% Wald-type intervals centered on log odds-ratio estimators, as well as exact and score intervals, for the situations in Table 1. These results are not shown because all exhibited over 95% coverage in almost all the situations examined, although the Laplace correction produced by far the narrowest average width and closest to nominal coverage, with score intervals also doing well. Studies of intervals for binomial proportions have found that score intervals exhibit better performance than CML, Wald, likelihood-ratio, and even exact intervals; see Agresti and Coull (1998) for references. Interestingly, the latter authors observed that adding two to each cell count produced Wald intervals for p = expit(β) that performed nearly as well as the score intervals; this corresponds to using an approximate posterior interval for p derived from a beta(2,2) prior.

4. AN EXAMPLE

A case-specular study involves case-control pairs in which the 'case' is a case house and the 'control' is a reflection of the case house across the street (Zaffanella et al., 1998); under certain assumptions, ordinary matched-pair likelihoods can be used to analyze such data (Greenland, 1999). Table 2 gives data from a case-specular study of electrical wiring and childhood leukemia (Ebi et al., 1999). Of the 259 pairs available for this example, only 56 were discordant, and only one of these pairs had a case with no back-yard power line. Represent line type by two indicators, $t_1$ for 3-phase line (1 = yes, 0 = no) and $t_2$ for secondary line, and let $\omega(t_1, t_2)$ be the ratio of leukemia odds at exposure $(t_1, t_2)$ versus (0,0) within matching strata. The usual conditional-logistic model for the regression of leukemia risk on $(t_1, t_2)$ is equivalent to the conditional (stratum-specific) odds-ratio model

$$\omega(t_1, t_2) = \exp(\beta_1 t_1 + \beta_2 t_2).$$

Table 2. Case-specular pairs from analysis of back-yard electrical lines and childhood leukemia

                 Specular back-yard lines
Case:            3-phase    Secondary    None
3-phase          15         24           11
Secondary        11         107          9
None             0          1            81

Row 1 of Table 3 gives the CML odds-ratio estimates (with 95% Wald confidence limits) from fitting this model to the example data. The intervals fall above the range of estimates obtained from other studies of wiring and leukemia, and both point estimates are at least ten times what one would expect based on all the evidence to date (including twenty or so other epidemiologic studies, most of them larger than this one) (Portier and Wolfe, 1998; Greenland et al., 2000b). While epidemiologic validity problems may have contributed to the apparent exaggeration of the estimates, the data are uninformative about those problems. We can, however, examine the extent to which this appearance depends on the analysis method.

Table 3. Odds-ratio estimates for 3-phase and secondary back-yard power-line exposure, from case-specular analysis of childhood leukemia. CML = conditional maximum likelihood

Method                        3-phase             Secondary
1. CML                        32 (4.0, 253)       14 (1.8, 107)
2. Exact*                     30 (4.5, 1328)      14 (2.1, 507)
3. CML moving one pair        16 (3.4, 72)        6.8 (1.5, 30)
4. CML bias corrected†        19 (3.6, 105)       8.7 (1.7, 45)
5. Haldane‡                   16 (3.5, 78)        7.4 (1.6, 34)
6. Laplace§                   11 (2.9, 43)        5.2 (1.4, 19)
7. Bayes β ∼ N(0,1)           12 (3.5, 40)        4.9 (1.7, 14)
8. Bayes β ∼ N(0,1/2)         8.6 (3.0, 25)       3.6 (1.6, 8.5)
9. Pairing ignored            2.4 (1.4, 4.1)      1.2 (0.81, 1.7)

*Modified CML point estimates and exact limits from LogXact
†Using approximate bias correction for ML estimates
‡Add 1/2 to each cell and renormalize
§Add 1 to each cell and renormalize

Row 2 of Table 3 provides the results from an exact logistic-regression software program. The point estimates are hardly different from the CML estimates because they are in fact only slightly modified CML estimates (LogXact, 1993). More disturbing is the fact that the exact limits appear even more exaggerated than the CML Wald limits. The exact limits are known to cover at or above the nominal rate if there are no epidemiologic biases (Breslow and Day, 1980), and so suggest no exaggeration in the CML intervals. Nonetheless, the results are extraordinarily unstable. Row 3 of Table 3 shows the impact on the CML results of reclassifying as unexposed just one of the eleven cases in the secondary/3-phase cell. This minor change puts one pair in the empty cell in Table 2, and halves the estimates. Conversely, reclassifying as exposed the single unexposed case in a discordant pair makes the CML estimates infinite.

Rows 4–6 of Table 3 give the ML-bias corrected, Haldane, and Laplace estimates. The two logarithmic corrections reduce the estimates by about half while the Laplace correction reduces the estimates by about two-thirds. Nonetheless, the estimates still appear implausibly large relative to previous studies.

Row 7 of Table 3 gives approximate posterior medians and 95% intervals derived from the second derivative of the log posterior density, based on a bivariate normal prior for $\beta_2$, $\beta_1 - \beta_2$ in the odds-ratio model above, with prior means of zero, prior variances of 1, and a prior correlation of 0.5; $\beta_2$ and $\beta_1 - \beta_2$ are the log odds ratios comparing secondary to no line and 3-phase to secondary. The prior variance for $\beta_1$ is 1 + 1 + 2(0.5) = 3, which yields an upper 90th prior percentile for the odds ratio $e^{\beta_1}$ comparing 3-phase to no line of 9.2. The results resemble the Laplace estimates, but with narrower intervals; this narrowing is as expected, given the lighter tails of the normal prior in comparison to the Laplace prior. Row 8 is derived using the same prior means and correlation, but with prior variances of 1/2.
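The CML point estimates in row 1 of Table 3, and a rough counterpart of the normal-prior estimates in row 7, can be approximated from the six discordant cells of Table 2 alone. The sketch below is an illustration under stated assumptions rather than the analysis actually used: `PAIRS`, `prior_precision` and `fit` are hypothetical names, the fit is a plain Newton-Raphson iteration on the pair likelihood with terms expit(d'β), and the Bayesian variant returns a posterior mode, whereas the paper reports approximate posterior medians, so only approximate agreement should be expected.

```python
from math import exp

# Discordant pairs from Table 2 as (d1, d2, count), where d = x_case - x_specular
# for the indicators (t1, t2) = (3-phase, secondary). Concordant pairs drop out.
PAIRS = [
    (1, -1, 24),  # case 3-phase,   specular secondary
    (-1, 1, 11),  # case secondary, specular 3-phase
    (1, 0, 11),   # case 3-phase,   specular none
    (-1, 0, 0),   # case none,      specular 3-phase (the empty cell)
    (0, 1, 9),    # case secondary, specular none
    (0, -1, 1),   # case none,      specular secondary
]


def expit(x):
    return 1.0 / (1.0 + exp(-x))


def prior_precision(tau2, rho):
    """Precision matrix for (beta1, beta2) implied by a bivariate normal prior
    on (beta2, beta1 - beta2) with common variance tau2 and correlation rho."""
    s = 1.0 / (tau2 * (1.0 - rho * rho))
    return ((s, -s * (1.0 + rho)), (-s * (1.0 + rho), s * (2.0 + 2.0 * rho)))


def fit(pairs, precision=None, iters=50):
    """Maximise sum n*log(expit(d'beta)) - 0.5*beta'P*beta by Newton-Raphson.
    With precision=None the penalty is zero and the result is the CML fit."""
    P = precision or ((0.0, 0.0), (0.0, 0.0))
    b1 = b2 = 0.0
    for _ in range(iters):
        # Gradient and negative Hessian start from the prior contribution.
        g1 = -(P[0][0] * b1 + P[0][1] * b2)
        g2 = -(P[1][0] * b1 + P[1][1] * b2)
        h11, h12, h22 = P[0][0], P[0][1], P[1][1]
        for d1, d2, n in pairs:
            p = expit(d1 * b1 + d2 * b2)
            g1 += n * d1 * (1.0 - p)
            g2 += n * d2 * (1.0 - p)
            w = n * p * (1.0 - p)
            h11 += w * d1 * d1
            h12 += w * d1 * d2
            h22 += w * d2 * d2
        det = h11 * h22 - h12 * h12
        b1 += (h22 * g1 - h12 * g2) / det
        b2 += (h11 * g2 - h12 * g1) / det
    return b1, b2


if __name__ == "__main__":
    for label, prec in [("CML", None), ("normal prior", prior_precision(1.0, 0.5))]:
        b1, b2 = fit(PAIRS, prec)
        print(f"{label:12s} 3-phase OR={exp(b1):.1f}  secondary OR={exp(b2):.1f}")
```

Inverting the matrix returned by `prior_precision(1.0, 0.5)` gives a prior variance of 3 for β1, matching the value stated in the text. Moving one pair from the second cell to the empty fourth cell of `PAIRS` shows how strongly the CML fit depends on a single observation.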
The change to prior variances of 1/2 in row 8 implies a prior variance of 1.5 for $\beta_1$ and an upper 90th prior percentile for $e^{\beta_1}$ of 3; although the results are still implausibly large, their magnitude is easily attributable to random error and (not unlikely) other sources of bias.

A referee suggested examining the estimates obtained by breaking the pairing and using the crude unmatched data. These are presented in row 9 of Table 3. Because of the strong positive association of the pair exposures, collapsing across pairs produces estimates that are less than a tenth of the CML estimates; the results are also much more precise and consistent with the literature. The latter consistency may largely reflect a fortuitous cancellation of biases, for the crude (collapsed) odds ratio is known to be biased toward the null when the pair exposures are positively correlated (Siegel and Greenhouse, 1973). Nonetheless, the crude odds ratio also has lower variance, which has led some authors to suggest averaging the stratified and crude estimators to minimize expected squared error (Liang and Zeger, 1988; Kalish, 1990; Greenland, 1991). In the present example, the tremendous drop in the odds ratio upon collapsing is just what one should expect given the extremely high correlation of the exposure (line type) with the main matching factors (neighborhood and housing type) implicit in the use of specular controls. For a more detailed discussion of this example and similar bias in a conventional matched case-control study, see Greenland et al. (2000a).

5. DISCUSSION

The present paper has focused on situations in which there are too few pairs to support CML estimation of even one parameter. The problems can become more acute in multiple logistic regression. These problems, formalized as sparse-data inconsistency, have long been recognized in unconditional ML estimators, and in fact CML estimators were developed to address these problems (Breslow and Day, 1980; Breslow, 1981). Unfortunately, an analogous problem occurs in CML estimators when pair-counts are sparse. The formal equivalence of the matched-pair conditional likelihood to an unconditional likelihood allows one to map results for the latter to the former. As an example, consider a matched-pair study of an indicator x in which the investigator wishes to control an unmatched nominal covariate z whose number of levels increases at the same rate as the total number of pairs M. Entering this covariate into the conditional logistic model as a series of indicators (dummy variables) will then produce a conditional likelihood with O(M) nuisance parameters (the z indicator coefficients), from which it follows by arguments parallel to those in Breslow (1981) that the CML estimator $\hat\beta$ of the x coefficient will be inconsistent. Because of the formal equivalence of conditional logistic and Cox-model partial likelihoods, the same type of problem can afflict proportional-hazards analyses, although the bias would not be as severe because each failure (case) would be matched to many nonfailures at each failure time.

The example in Table 2 may seem extreme, but studies reporting similarly large odds ratios based on sparse matched or stratified data are not uncommon, especially in analyses in which many covariates are entered in the conditional logistic model or in which the data are divided into small subgroups (for examples, see Daling et al., 1994; Witte et al., 1994; Abenhaim et al., 1996; Feychting et al., 1998; Schwartzbaum et al., 1998). Such large reported estimates should call attention to potential bias problems. Of perhaps greater concern, however, is the possibility of unnoticed small-sample bias in modest, plausible results. Uncontrolled study biases, like selection bias, misclassification, and residual confounding, can easily make the odds-ratio parameter $e^{\beta}$ equal to 1.2 or even 1.5 when no underlying causal effect is present (Kelsey et al., 1996; Rothman and Greenland, 1998). As apparent from Table 1, small-sample bias can then operate on this biased parameter to generate CMLEs of 2 or more, which seem less plausibly explained by study biases. The contribution of such synergistic bias effects to the generation of controversial results may be considerable when most studies have few exposed cases.

Another potential for harmful synergy can arise from unnecessary matching. If the matching factor is related only to the exposure, such overmatching increases the variance of the CML estimator of the odds ratio by reducing the number of discordant matched sets available for analysis (Miettinen, 1970; Thomas and Greenland, 1983). An additional consequence of this reduction is an increase in the small-sample bias of the odds-ratio estimator. Some older writings on the impact of matching (e.g., Chase, 1968) did not encounter these problems because they focused on tests of the null hypothesis under random matching (which does not increase concordance) or focused on the difference in proportions, whose variance decreases as the pairwise correlation (and hence concordancy) increases, and which is exactly unbiased for the average pairwise difference in response probabilities. In case-control studies, however, the 'response' is exposure status, and so the response difference is of no direct interest.

The present paper concerns the poor behavior of CML odds-ratio estimators under conditions common in epidemiology (studies with few discordant matched pairs). This behavior does not have a simple relation to the behavior of tests of the null hypothesis. Consider the behavior of the Wald test for univariate β, treating $W = \hat\beta/\widehat{SE}(\hat\beta)$ as a standard normal statistic for testing β = 0. W exhibits quite different pathologies from $\hat\beta$. It has long been known that W can eventually decline as $|\hat\beta|$ gets larger given a fixed sample size, due to the fact that $\widehat{SE}(\hat\beta)$ can increase more rapidly than $\hat\beta$ as the latter increases. For example, with matched pairs, the CML odds ratio is u/v and $\widehat{SE}(\hat\beta) = (1/u + 1/v)^{1/2}$, so N = 24 discordant pairs and u = 22 yield $e^{\hat\beta} = 22/2 = 11$ and $W = \ln(11)/(1/22 + 1/2)^{1/2} = 3.25$, whereas N = 24 and u = 23 yield $e^{\hat\beta} = 23$ and $W = \ln(23)/(1/23 + 1/1)^{1/2} = 3.07$. Thus, W declines as $e^{\hat\beta}$ explodes. This type of behavior can result in the power of the Wald test dropping as |β| → ∞ given fixed N (Hauck and Donner, 1977; Vaeth, 1985).

6. RECOMMENDATIONS

The simplest diagnostic for small-sample or sparse-data problems is close tabular examination of basic data. In the above example, the possibility of small-sample artifacts did not occur to the co-investigator who first presented the CML odds ratios to the research team, simply because the total number of pairs (259) seemed quite large. Even a more sophisticated summary, noting there are fifty-six discordant pairs (twenty-eight 'informative' pairs per parameter), would not have signaled problems. Only the full pair table (Table 2) shows the pair sparsity. Full tabulation may seem impractical or unreliable when multiple covariates (some perhaps continuous) are entered in the model. A crude rule of thumb, adapted from an oft-cited rule for unconditional logistic regression (Peduzzi et al., 1996), would require at least ten discordant matched sets per estimated parameter. This rule, however, fails dramatically in the above example.

I therefore suggest that a Bayesian or (more generally, when applicable) an hierarchical Bayes analysis can serve as a diagnostic, in the following sense: if the results change dramatically between a CML analysis and a Bayesian analysis with scientifically reasonable priors, one at least has a warning of severe data limitations. This use of a Bayesian analysis need entail no commitment to the Bayesian results over the CML results, but may serve to temper reliance on the CML results in formulating conclusions. For this purpose, simple approximate fitting methods may suffice (Greenland, 1993b; Witte and Greenland, 1996; Greenland, 1997; Breslow et al., 1998), although even these are not invulnerable to sparse-data bias (Neuhaus and Segal, 1997).

A fully Bayesian analysis with scientifically sensible priors is of course the Bayesian solution to the sample-size problem, provided one uses a fitting method appropriate for small samples. Frequentists might argue instead in favor of formal bias corrections or exact analysis. Formal bias corrections for the multiple-regression case are currently only available for the coefficients, which, as argued above, are not the final parameters of interest for public-health purposes. Exact analysis has more serious shortcomings. Exact intervals are constructed to ensure at least nominal coverage of the true parameter value. This assurance extends to all parameter values, no matter how absurdly large. The cost is that exact intervals tend to expand to values beyond the CML intervals, driving them even further from the Bayesian posterior intervals than the CML intervals. They thus can be even more misleading than the CML intervals when (as seems inevitable) they are interpreted as posterior intervals by the consumer. On the practical side, the capacity of exact programs remains limited, despite remarkable computing advances.

Regardless of how one chooses to deal with it, the potential for small-sample bias in results from asymptotic procedures needs to be checked more routinely than is current practice. The development of easily programmed sample-size diagnostics for commercial software would be of particular value; formal bias corrections might serve well in this role.

ACKNOWLEDGEMENTS

The author thanks Kris Ebi, David Savitz, and Luciano Zaffanella for use of the example data, and the referees for helpful comments.

REFERENCES

ABENHAIM, L., MORIDE, Y., BRENOT, F., RICH, S., BENICHOU, J., KURZ, X., HIGENBOTTAM, T., OAKLEY, C., WOUTERS, E., AUBIER, M., et al. (1996). Appetite-suppressant drugs and the risk of primary pulmonary hypertension. New England Journal of Medicine 335, 609–616, Table 3.
AGRESTI, A. AND COULL, B. A. (1998). Approximate is better than 'exact' for interval estimation of binomial proportions. American Statistician 52, 119–126.
ANDERSON, J. A. AND RICHARDSON, S. C. (1979). Logistic discrimination and bias correction in maximum likelihood estimation. Technometrics 21, 71–78.
BECKER, S. (1989). A comparison of maximum likelihood and Jewell's estimators of the odds ratio and relative risk in single 2 × 2 tables. Statistics in Medicine 8, 987–996.
BISHOP, Y. M. M., FIENBERG, S. E. AND HOLLAND, P. W. (1975). Discrete Multivariate Analysis: Theory and Practice. Cambridge, MA: MIT Press.
B RESLOW, N., L EROUX , B. AND P LATT, R. (1998). Approximate hierarchical modelling of discrete data in epidemiology. Statistical Methods in Medical Research 7, 49–62. B RESLOW, N. E. (1981). Odds ratio estimators when the data are sparse. Biometrika 68, 73–84. B RESLOW, N. E. AND C LAYTON, D. G. (1993). Approximate inference in generalized linear mixed models. Journal of the American Statistical Association 88, 9–25. B RESLOW, N. E. AND DAY, N. E. (1980). Statistical Methods in Cancer Epidemiology. I. The Analysis of CaseControl Studies. Lyon: IARC. B YTH , K. AND M C L ACHLAN, G. T. (1978). The biases associated with maximum likelihood methods of estimation of the multivariate logistic risk function. Communications in Statistics A7, 877–890. C HASE, G. R. (1968). On the efficiency of matched pairs in Bernoulli trials. Biometrika 55, 365–369. C LAYTON , D. AND H ILLS, M. (1993). Statistical Models in Epidemiology. New York: Oxford University Press. C ORDEIRO , G. M. AND M C C ULLAGH, P. (1991). Bias correction in generalized linear models. Journal of the Royal Statistical Society B 53, 629–643. DALING, J. R., M ALONE, K. E., VOIGT, L. F., W HITE , E. AND W EISS, N. S. (1994). Risk of breast cancer among young women: relationship to induced abortion. Journal of the National Cancer Institute 86, 1584–1592. E BI, K. L., Z AFFANELLA , L. E. AND G REENLAND, S. (1999). Application of the case-specular method to two studies of wire codes and childhood cancers. Epidemiology 10, 398–404. F EYCHTING, M., F ORSSEN, U., RUTQUIST, L. E. AND A HLBOHM, A. (1998). Magnetic fields and breast cancer in Swedish adults residing near high-voltage power lines. Epidemiology 9, 392–397. G OOD, I. J. (1983). Some history of the hierarchical Bayesian methodology. In Good Thinking ed. Good, I.J. Chapter 9, 95–105. Minneapolis, MN: University of Minnesota Press. G REENLAND, S. (1991). Reducing mean squared error in the analysis of stratified epidemiologic studies. Biometrics 47, 773–775. 
GREENLAND, S. (1993a). A meta-analysis of coffee, myocardial infarction, and sudden coronary death. Epidemiology 4, 366–374.
GREENLAND, S. (1993b). Methods for epidemiologic analyses of multiple exposures: A review and a comparative study of maximum-likelihood, preliminary testing, and empirical-Bayes regression. Statistics in Medicine 12, 717–736.
GREENLAND, S. (1997). Second-stage least squares versus penalized quasi-likelihood for fitting hierarchical models in epidemiologic analysis. Statistics in Medicine 16, 515–526.
GREENLAND, S. (1999). A unified approach to the analysis of case-distribution (case-only) studies. Statistics in Medicine 18, 1–15.
GREENLAND, S., SCHWARTZBAUM, J. A. AND FINKLE, W. D. (2000a). Problems due to small samples and sparse data in conditional logistic regression analysis. American Journal of Epidemiology 151, in press.
GREENLAND, S., SHEPPARD, A. S., KAUNE, W. T., POOLE, C. AND KELSH, M. A. (2000b). A pooled analysis of magnetic fields, wire codes, and childhood leukemia. Epidemiology 11, in press.
HAUCK, W. W. AND DONNER, A. (1977). Wald’s test as applied to hypotheses in logit analysis. Journal of the American Statistical Association 72, 851–853.
JEWELL, N. P. (1984). Small-sample bias of point estimators of the odds ratio from matched sets. Biometrics 40, 421–435.
JEWELL, N. P. (1986). On the bias of commonly used measures of association for 2 × 2 tables. Biometrics 42, 351–358.
KALISH, L. A. (1990). Reducing mean-squared error in the analysis of pair-matched case-control studies. Biometrics 46, 493–499.
KELSEY, J. L., WHITTEMORE, A. S., EVANS, A. S. AND THOMPSON, W. D. (1996). Methods in Observational Epidemiology. New York: Oxford University Press.
KRAUS, A. S. (1960). Comparison of a group with disease and a control group from the same families, in search of possible etiologic factors. American Journal of Public Health 50, 303–311.
LEONARD, T. AND HSU, J. S. J. (1994). The Bayesian analysis of categorical data: a selective review. In Aspects of Uncertainty, ed. Freeman, P. R. and Smith, A. F. M. Chapter 18, 283–310. New York: Wiley.
LIANG, K.-Y. AND ZEGER, S. L. (1988). On the use of concordant pairs in matched case-control studies. Biometrics 44, 1145–1156.
LIU, K.-J. (1989). A note on the estimate of the relative risk when sample sizes are small (letter). Biometrics 45, 1030–1031.
LOGXACT (1993). Cambridge, MA: Cytel.
MIETTINEN, O. S. (1970). Matching and design efficiency in retrospective studies. American Journal of Epidemiology 91, 111–118.
MORGENSTERN, H. AND GREENLAND, S. (1990). Graphing ratio measures of effect. Journal of Clinical Epidemiology 43, 539–542.
NEUHAUS, J. M. AND SEGAL, M. R. (1997). An assessment of approximate maximum likelihood estimators in generalized linear models. In Modelling Longitudinal and Spatially Correlated Data: Methods, Applications, and Future Directions, ed. Gregoire, T. G., Brillinger, D. R., Diggle, P. J., Russek-Cohen, E., Warren, W. G. and Wolfinger, R. D. 11–22. New York: Springer.
PEDUZZI, P., CONCATO, J., KEMPER, E., HOLFORD, T. R. AND FEINSTEIN, A. R. (1996). A simulation study of the number of events per variable in logistic regression analysis. Journal of Clinical Epidemiology 49, 1373–1379.
PORTIER, C. J. AND WOLFE, M. S. (1998). Assessment of Health Effects from Exposure to Power-line Frequency Electric and Magnetic Fields. Research Triangle Park, NC: National Institute of Environmental Health Sciences.
ROTHMAN, K. J. AND GREENLAND, S. (1998). Modern Epidemiology (2nd edn). Philadelphia: Lippincott-Raven.
SCHAEFER, R. L. (1983). Bias correction in maximum-likelihood logistic regression. Statistics in Medicine 2, 71–78.
SCHWARTZBAUM, J. A., FISHER, J. L. AND CORNWELL, D. G. (1998). Role of dietary energy and cured meat consumption in adult glioma risk (abstract). American Journal of Epidemiology 147, S7.
SIEGEL, D. G. AND GREENHOUSE, S. W. (1973). Validity in estimating relative risk in case-control studies. Journal of Chronic Diseases 42, 687–688.
THOMAS, D. C. AND GREENLAND, S. (1983). The relative efficiencies of matched and independent sample designs for case-control studies. Journal of Chronic Diseases 36, 685–697.
VAETH, M. (1985). On the use of Wald’s test in exponential families. International Statistical Review 53, 199–214.
WALTER, S. D. AND COOK, R. J. (1991). A comparison of several point estimators of the odds ratio in a single 2 × 2 contingency table. Biometrics 47, 795–811.
WITTE, J. S. AND GREENLAND, S. (1996). Simulation study of hierarchical regression. Statistics in Medicine 15, 1161–1170.
WITTE, J. S., GREENLAND, S., HAILE, R. W. AND BIRD, C. L. (1994). Hierarchical regression analysis applied to a study of multiple dietary exposures and breast cancer. Epidemiology 5, 612–621.
ZAFFANELLA, L. E., SAVITZ, D. A., GREENLAND, S. AND EBI, K. L. (1998). The residential case-specular method to study wire codes, magnetic fields, and disease. Epidemiology 9, 16–20.
[Received June 28, 1999. Revised October 25, 1999]