Statistical Methods Highlights for Health Economists, 2015
Lecture notes
Randy Ellis, March 16, 2015

Problems common to Health Economics
• Spending is highly skewed, with many zeros
• Randomized controlled trials are rare, so observational studies are the norm
• Policies are often implemented on non-random sets of enrollees (e.g., a diabetes intervention)
• Many outcomes of interest are discrete
• Often collaborating with biostatisticians/epidemiologists, who use other approaches
• Policies are implemented geographically or over time, so their effects are hard to distinguish from other possible causes
• Huge samples?
• Many fixed effects => unobserved covariates
• Many endogenous variables

Design: when to use, advantages, disadvantages

Randomization
– When to use: Whenever feasible; when there is variation at the individual or community level
– Advantages: Gold standard; most powerful
– Disadvantages: Not always feasible; not always ethical

Randomized Encouragement Design
– When to use: When an intervention is universally implemented
– Advantages: Provides exogenous variation for a subset of beneficiaries
– Disadvantages: Only looks at a subgroup of the sample; power of the encouragement design is only known ex post

Regression Discontinuity
– When to use: If an intervention has a clear, sharp assignment rule
– Advantages: Project beneficiaries often must qualify through established criteria
– Disadvantages: Only looks at a subgroup of the sample; the assignment rule is often not implemented strictly in practice

Difference-in-Differences
– When to use: If two groups are growing at similar rates; baseline and follow-up data are available
– Advantages: Eliminates fixed differences not related to treatment
– Disadvantages: Can be biased if trends change; ideally have 2 pre-intervention periods of data

Propensity Score Matching
– When to use: When other methods are not possible
– Advantages: Overcomes observed differences between treatment and comparison
– Disadvantages: Assumes no unobserved differences (often implausible)

Propensity Score Cites
• Very useful link to sources on using propensity scores in different software packages (Johns Hopkins University School of Public Health)
• http://www.biostat.jhsph.edu/~estuart/propensityscoresoftware.html

Stata
• psmatch2 http://ideas.repec.org/c/boc/bocode/s432001.html
– Leuven, E. and Sianesi, B. (2003). psmatch2. Stata module to perform full Mahalanobis and propensity score matching, common support graphing, and covariate imbalance testing.
– Allows k:1 matching, kernel weighting, Mahalanobis matching
– Includes built-in diagnostics
– Includes procedures for estimating ATT or ATE
• pscore http://www.lrz-muenchen.de/~sobecker/pscore.html
– Becker, S.O. and Ichino, A. (2002). Estimation of average treatment effects based on propensity scores. The Stata Journal 2(4): 358-377.
– k:1 matching, radius (caliper) matching, and stratification (subclassification)
– For estimating the ATT
• match http://www.economics.harvard.edu/faculty/imbens/software_imbens
– Abadie, A., Drukker, D., Herr, J. L., and Imbens, G. W. (2004). Implementing matching estimators for average treatment effects in Stata. The Stata Journal 4(3): 290-311.
– Primarily k:1 matching (with replacement)
– Allows estimation of ATT or ATE, including robust variance estimators
• cem http://gking.harvard.edu/cem/
– Iacus, S.M., King, G., and Porro, G. (2008). Matching for Causal Inference Without Balance Checking.
– Implements coarsened exact matching

SAS
• SAS usage note: http://support.sas.com/kb/30/971.html
• Local and global optimal propensity score matching
– Coca-Perraillon, M. (2007). Local and global optimal propensity score matching. In SAS Global Forum 2007. Paper 185-2007.
– Variety of matching methods. No built-in diagnostics. Assumes propensity score already estimated.
• Greedy matching (1:1 nearest neighbor)
– Parsons, L. S. (2001). Reducing bias in a propensity score matched-pair sample using greedy matching techniques. In SAS SUGI 26, Paper 214-26.
– Parsons, L.S. (2005). Using SAS software to perform a case-control match on propensity score in an observational study. In SAS SUGI 30, Paper 225-25.
– Kosanke, J., and Bergstralh, E. (2004). gmatch: Match 1 or more controls to cases using the GREEDY algorithm. http://www.mayo.edu/research/departments-divisions/department-health-sciences-research/division-biomedical-statistics-informatics/software/locally-written-sas-macros
• 1:1 Mahalanobis matching within propensity score calipers
– Feng, W.W., Jun, Y., and Xu, R. (2005). A method/macro based on propensity score and Mahalanobis distance to reduce bias in treatment comparison in observational study. www.lexjansen.com/pharmasug/2006/publichealthresearch/pr05.pdf
• Weighting
– Leslie, S. and Thiebaud, P. (2006). Using propensity scores to adjust for treatment selection bias. http://www.lexjansen.com/wuss/2006/Analytics/ANL-Leslie.pdf
• Variable ratio matching, optimal matching algorithm
– Kosanke, J., and Bergstralh, E. (2004). Match cases to controls using variable optimal matching. http://www.mayo.edu/research/departments-divisions/department-health-sciences-research/division-biomedical-statistics-informatics/software/locally-written-sas-macros

R
• MatchIt http://gking.harvard.edu/matchit
– Ho, D.E., Imai, K., King, G., and Stuart, E.A. (2011). MatchIt: Nonparametric preprocessing for parametric causal inference. Journal of Statistical Software 42(8). http://www.jstatsoft.org/v42/i08
– Two-step process: does matching, then user does outcome analysis (integrated with Zelig package for R)
– Wide array of estimation procedures and matching methods available: nearest neighbor, Mahalanobis, caliper, exact, full, optimal, subclassification
– Built-in numeric and graphical diagnostics
• Matching http://sekhon.berkeley.edu/matching
– Sekhon, J. S. (2011). Multivariate and propensity score matching software with automated balance optimization: The Matching package for R. Journal of Statistical Software 42(7). http://www.jstatsoft.org/v42/i07
– Uses automated procedure to select matches, based on univariate and multivariate balance diagnostics
– Primarily 1:M matching (where M is a positive integer), allows matching with or without replacement, caliper, exact
– Includes built-in effect and variance estimation procedures
• PSAgraphics http://cran.r-project.org/web/packages/PSAgraphics/index.html
– Helmreich, J.E. and Pruzek, R.M. (2009). PSAgraphics: An R Package to Support Propensity Score Analysis. Journal of Statistical Software 29(6).
– From webpage: "A collection of functions that primarily produce graphics to aid in a Propensity Score Analysis (PSA). Functions include: cat.psa and box.psa to test balance within strata of categorical and quantitative covariates, circ.psa for a representation of the estimated effect size by stratum, loess.psa that provides a graphic and loess based effect size estimate, and various balance functions that provide measures of the balance achieved via a PSA in a categorical covariate."
• optmatch http://cran.r-project.org/web/packages/optmatch/index.html
– Hansen, B.B., and Fredrickson, M. (2009). optmatch: Functions for optimal matching.
– Variable ratio, optimal, and full matching
– Can also be implemented through MatchIt
• twang http://cran.r-project.org/web/packages/twang/index.html
– Ridgeway, G., McCaffrey, D., and Morral, A. (2006). twang: Toolkit for weighting and analysis of nonequivalent groups.
– Functions for propensity score estimating and weighting, nonresponse weighting, and diagnosis of the weights
– Primarily uses generalized boosted regression to estimate the propensity scores
• cem http://gking.harvard.edu/cem/
– Iacus, S.M., King, G., and Porro, G. (2008). Matching for Causal Inference Without Balance Checking.
– Implements coarsened exact matching
– Can also be implemented through MatchIt
• Synth
– Abadie, A., Diamond, A., and Hainmueller, H. (2011). Synth: An R Package for Synthetic Control Methods in Comparative Case Studies. Journal of Statistical Software 42(13). http://www.jstatsoft.org/v42/i13
– Implements weighting approach to creating synthetic control groups
– Useful when there is a single treated unit, such as a state or country. Main idea is to form a weighted average of comparison units that, when weighted, looks like the treated unit.

Matching Methods
Selected slides posted by an unidentified professor at UC Berkeley.

Propensity-Score Matching (PSM)
Propensity score matching: match treated and untreated observations on the estimated probability of being treated (propensity score). Most commonly used.
• Match on the basis of the propensity score P(X) = Pr(D=1|X)
– D indicates participation in the project
– Instead of attempting to create a match for each participant with exactly the same value of X, we can instead match on the probability of participation.

PSM: Key Assumptions
1. No unobserved variables affecting outcomes
• Participation is independent of outcomes conditional on Xi
– This fails if there are unobserved variables affecting both participation and outcomes
• Enables matching not just at the mean but balances the distribution of observed characteristics across treatment and control
2. Common support
• Non-zero probability of being in the treatment or control group for all observations, conditional on X
3. For diff-in-diff models also need parallel trends, which is related to Assumption 1.

Common support is key
[Figure: density of scores for participants and density of scores for nonparticipants, with the region of common support marked. Horizontal axis: propensity score, from 0 to 1 (high probability of participating given X); vertical axis: density.]

Steps in Score Matching
1. Need representative and comparable data for both treatment and comparison groups
2. Use a logit (or other discrete choice model) to estimate program participation as a function of observable characteristics
3. Use predicted values from the logit to generate a propensity score p(xi) for all treatment and comparison group members

Calculating Impact using PSM
4. Match pairs:
• Restrict sample to common support (as in Figure)
• Need to determine a tolerance limit: how different can control individuals or villages be and still be a match?
• Nearest neighbors, nonlinear matching, multiple matches
5. Once matches are made, we can calculate impact by comparing the means of outcomes across participants and their matched pairs

PSM vs Randomization
• Randomization does not require the untestable assumption of independence conditional on observables
• PSM requires large samples and good data:
1. Ideally, the same data source is used for participants and non-participants
2. Participants and non-participants have access to similar institutions and markets, and
3. The data include X variables capable of identifying program participation and outcomes.

Lessons on Matching Methods
• Typically used when randomization, RD, or other quasi-experimental options are not possible
– Case 1: no baseline. Can do ex-post matching
– Dangers of ex-post matching:
  • Matching on variables that change due to participation (i.e., endogenous)
  • What are some variables that won't change?
• Matching helps control only for OBSERVABLE differences, not unobservable differences

More Lessons on Matching Methods
• Matching becomes much better in combination with other techniques, such as:
– Exploiting baseline data for matching and using a difference-in-difference strategy
– If an assignment rule exists for the project, can match on this rule
• Need good quality data
– Common support can be a problem if two groups are very different
• What to match on? Levels? Trends? Variance?
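
A minimal sketch of the score-matching steps listed earlier (fit a logit, predict scores, restrict to common support, match nearest neighbors within a tolerance, compare mean outcomes). Everything here is an illustrative assumption rather than part of the slides: the data are simulated with a true treatment effect of 2, the function names fit_logit and psm_att are hypothetical, the caliper of 0.05 is arbitrary, and matching is 1:1 nearest neighbor with replacement. It uses plain Python instead of any of the packages cited above.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logit(X, d, steps=500, lr=0.5):
    """Step 2: fit Pr(D=1|X) by gradient ascent on the logit log-likelihood."""
    k = len(X[0]) + 1                      # +1 for the intercept
    beta = [0.0] * k
    for _ in range(steps):
        grad = [0.0] * k
        for xi, di in zip(X, d):
            row = [1.0] + list(xi)
            resid = di - sigmoid(sum(b * v for b, v in zip(beta, row)))
            for j in range(k):
                grad[j] += resid * row[j]
        beta = [b + lr * g / len(X) for b, g in zip(beta, grad)]
    return beta

def psm_att(X, d, y, caliper=0.05):
    """Steps 3-5: scores, common support, caliper matching, mean difference."""
    beta = fit_logit(X, d)
    # Step 3: predicted propensity score for every unit
    ps = [sigmoid(sum(b * v for b, v in zip(beta, [1.0] + list(xi)))) for xi in X]
    treated = [i for i, di in enumerate(d) if di == 1]
    controls = [i for i, di in enumerate(d) if di == 0]
    # Step 4a: common support = overlap of the two groups' score ranges
    lo = max(min(ps[i] for i in treated), min(ps[i] for i in controls))
    hi = min(max(ps[i] for i in treated), max(ps[i] for i in controls))
    diffs = []
    for i in treated:
        if not (lo <= ps[i] <= hi):
            continue
        # Step 4b: nearest control on the score (with replacement)
        j = min(controls, key=lambda c: abs(ps[c] - ps[i]))
        # Step 4c: tolerance limit (caliper) on the score distance
        if abs(ps[j] - ps[i]) <= caliper:
            diffs.append(y[i] - y[j])
    # Step 5: mean outcome difference across matched pairs (ATT estimate)
    return sum(diffs) / len(diffs), len(diffs)

# Simulated data (assumption): x drives both participation and the outcome,
# so the naive comparison of means is confounded.
random.seed(0)
n = 600
X, d, y = [], [], []
for _ in range(n):
    x = random.gauss(0.0, 1.0)
    di = 1 if random.random() < sigmoid(0.8 * x) else 0
    X.append([x])
    d.append(di)
    y.append(1.0 + 2.0 * di + 1.5 * x + random.gauss(0.0, 1.0))

naive = (sum(yi for yi, di in zip(y, d) if di) / sum(d)
         - sum(yi for yi, di in zip(y, d) if not di) / (n - sum(d)))
att, n_matched = psm_att(X, d, y)
print(f"naive difference in means: {naive:.2f}")
print(f"matched ATT estimate: {att:.2f} from {n_matched} matched pairs")
```

In this simulation the only confounder is observed, so matching on the score should pull the estimate toward the true effect. With an unobserved confounder, the same procedure would remain biased, which is the point of Key Assumption 1.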
LINK BETWEEN PAY FOR PERFORMANCE INCENTIVES AND PHYSICIAN PAYMENT MECHANISMS: EVIDENCE FROM THE DIABETES MANAGEMENT INCENTIVE IN ONTARIO
Jasmin Kantarevic (a) and Boris Kralj (b)
(a) Ontario Medical Association, Canada; (b) University of Toronto, Canada
Health Economics (Health Econ.) 22: 1417–1439 (2013)

Analytical framework
Social problem can then be written as [equation not preserved in the transcription]
Solving yields [equation not preserved in the transcription]

Alternative uses of Propensity Score matching
• Weighting by inverse probabilities (no longer favored)
• Nearest neighbor matching
– With or without replacement?
• Caliper matching
• Conventional kernel estimator
• Local linear kernel estimator (Their preferred specification