Ignorability and stability assumptions in neighborhood effects research

Tyler J. VanderWeele
Department of Health Studies, University of Chicago
5841 South Maryland Avenue, MC 2007, Chicago, IL 60637, USA
Abstract
Two central assumptions concerning causal inference in the potential outcomes framework for estimating neighborhood effects are examined. The stable unit treatment value assumption in the context of neighborhood effects requires that an individual’s outcome does not depend on the treatment assigned to neighborhoods other than the individual’s own neighborhood. The assumption is important in that it makes estimation feasible, although some progress can be made even when the assumption is relaxed. Some discussion is given concerning the contexts in which the neighborhood-level stable unit treatment value assumption is likely to hold. The ignorability assumption allows the researcher to move from conclusions about association to conclusions about causation. In the context of neighborhood-wide interventions, the ignorability assumption for the individual-level potential outcomes framework can be easily adapted for neighborhood effects.
Keywords: Causal inference; ignorability; multilevel models; neighborhood effects; potential outcomes.
1. Introduction
Several different types of inquiry arise when considering questions of "neighborhood effects." We might be interested in how health or income outcomes would differ if individual $i$ were moved from neighborhood $j$ to neighborhood $k$. Alternatively, we might be interested in how individual or population outcomes would likely change under a particular neighborhood-level intervention. We could consider, for example, the effect of building a new hospital, of increasing the police force in a particular area, or of fixing local roads and sidewalks. In some studies, researchers are able to randomize interventions to neighborhoods (e.g. [1]) or sometimes are even able to randomize individuals to a change in neighborhood (e.g. [2]). Studies of the former type are generally referred to as randomized community trials; such studies are well understood and have been successfully implemented [3]. With the Moving to Opportunity (MTO) study sponsored by the U.S. Department of Housing and Urban Development, there has been more recent interest in the interpretation of studies which randomize individuals to a change in neighborhood [2, 4, 5]. In many studies in which the aim is to estimate neighborhood effects, however, observational data are used [6-9]. This raises questions about the assumptions that are required to move from associational to causal conclusions in such observational settings. The potential outcomes framework has been routinely employed to articulate these assumptions in the context of non-clustered cohorts [10, 11]. However, the manner in which the concepts from the potential outcomes framework extend to the study of neighborhood effects, in which individuals are clustered within neighborhoods, is not well documented or understood. In this paper our focus will be the effects of a neighborhood-level intervention rather than the effects of moving an individual from one neighborhood to another. We will consider both randomized community trials and observational data settings and we will examine the technical assumptions needed to draw causal conclusions about neighborhood effects from such data.
The studies we consider in this paper are those in which treatment is administered
at the neighborhood level. Such studies need to be distinguished from studies
in which treatment occurs at the individual-level (so that not all individuals in a
particular neighborhood receive treatment) but in which only aggregate level data is
available. Within the epidemiologic literature these studies are known as "ecological
studies." In such cases, neighborhood-level outcomes are modeled as functions of
neighborhood-level characteristics. It has long been recognized [12, 13] that in
ecological studies, associations at the neighborhood level could differ considerably, or even be in the opposite direction, from the corresponding associations at the individual level. Greenland and Morgenstern [14] documented how, even in the absence of bias within groups due to confounding, selection, misclassification, etc., neighborhood associations may not reflect individual associations if either (i) the outcome in the untreated group varies between groups or (ii) the treatment effect is not additive between groups [cf. 15]. Furthermore, variables may be confounders even if they are unassociated with treatment in every group, especially if they are associated with treatment across groups [14, 16]. Moreover, control for such variables may not reduce the bias produced by these confounding variables [16, 17]. Greenland [18] has also noted that if both a neighborhood characteristic and its individual-level equivalent have an effect on the outcome then, even if neither condition (i) nor condition (ii) above applies, estimates from ecologic studies are not purely neighborhood effects but represent a combination of neighborhood and individual-level effects. Interpretation becomes even more difficult in the case of non-linear models. Haneuse and Wakefield [19, 20] have done some work on inference in ecological studies in which the ecologic data are supplemented with a sample of individual-level case-control data. Our focus in this paper, however, will be studies in which treatment is in fact administered at the neighborhood level.
The remainder of this paper is organized as follows. Sections 2 and 3 review
the potential outcomes framework and multilevel models respectively. Section 4
discusses the technical assumptions needed to draw causal inferences in studies in
which treatment is administered at the neighborhood level. Section 5 provides some
concluding discussion.
2. Potential Outcomes
Before we begin our discussion of neighborhood effects we give a brief overview of the potential outcomes framework [10, 11] for non-clustered individuals. We will use the notation $A \perp\!\!\!\perp B \mid C$ to denote that $A$ is independent of $B$ given $C$. We will let $T_i$ denote the treatment actually received by individual $i$. In the case of binary treatments, we will let $T_i = 1$ denote a particular treatment and $T_i = 0$ denote the absence of that treatment. Let $Y_i$ denote some post-treatment outcome for individual $i$. We let $Y_i(t)$ denote the counterfactual value of $Y$ for individual $i$ if treatment $T$ were set, possibly contrary to fact, to the value $t$. Each individual $i$ has a counterfactual $Y_i(t)$ for each possible value of $t$; these counterfactual variables $Y_i(t)$ are often referred to as "potential outcomes." In causal inference we are generally interested in expressions such as $E[Y(1)] - E[Y(0)]$, i.e. what the average outcome would have been if the entire population had received treatment as compared to what the average outcome in the population would have been if the entire population had not received treatment. The potential outcomes framework also imposes a "consistency" assumption or axiom that $Y_i(T_i) = Y_i$, i.e. that the value of $Y$ which would have been observed if $T$ had been set to what it in fact was for individual $i$ is equal to the value of $Y$ which was in fact observed. Thus the only potential outcome for individual $i$ that is observed is the potential outcome $Y_i(T_i)$, the value of $Y$ which would have been observed if $T$ were set to what it in fact was.
Treatment assignment $T$ is said to be ignorable (or weakly ignorable) given covariates $X$ if $Y(t) \perp\!\!\!\perp T \mid X$ for all $t$ and if $0 < P(T = t \mid X = x) < 1$ for all $t$ and $x$. This condition is sometimes also referred to as the assumption of no unmeasured confounders. It is easily verified that if treatment assignment $T$ is ignorable given covariates $X$ then for all $t$,
$$E[Y(t)] = \sum_{x} E[Y \mid T = t, X = x] \, P(X = x).$$
The right hand side of the equation, unlike the left hand side, is given entirely in observable quantities. Ignorable treatment assignment given covariates $X$ means that within strata of $X$, treatment gives no information on the distribution of counterfactual outcomes. In practice, a researcher trying to draw causal inferences from observational data attempts to collect a sufficiently large set of variables so that any pretreatment differences between the treatment groups are attributable to factors in the set of measured variables $X$, so that within strata of $X$ the groups with different levels of the treatment variable $T$ are comparable except with regard to the treatment that they received. Under the assumption of conditionally ignorable treatment assignment, the researcher can use the observed outcomes within strata of $X$, weighted by the probability of $X$, as a valid estimate of average potential outcomes. In this paper we will generalize this potential outcomes framework to settings in which outcomes are recorded at the individual level but in which treatment is administered at the neighborhood level. Before moving on we note that some authors have developed methodology for causal inference without employing the idea of a counterfactual [21, 22].
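The standardization formula for $E[Y(t)]$ given in this section can be illustrated with a short Python sketch. The function name `standardized_mean` and the toy data below are hypothetical and serve only as an illustration; the outcome is constructed without noise so that the treatment effect is exactly 1 within each stratum of a single binary covariate, while treatment is more common in the stratum with higher outcomes.

```python
import numpy as np

def standardized_mean(y, t, x, t_value):
    """Estimate sum_x E[Y | T=t_value, X=x] * P(X=x) for a discrete covariate X."""
    y, t, x = np.asarray(y), np.asarray(t), np.asarray(x)
    total = 0.0
    for x_value in np.unique(x):
        in_stratum = x == x_value
        cell = in_stratum & (t == t_value)
        if not cell.any():
            # Positivity fails: the stratum has no units with this treatment level.
            raise ValueError(f"no units with T={t_value} in stratum X={x_value}")
        total += y[cell].mean() * in_stratum.mean()
    return total

# Hypothetical toy data: Y = 1 + T + 2X, with treatment more likely when X = 1.
x = np.array([0, 0, 0, 0, 1, 1, 1, 1])
t = np.array([0, 0, 0, 1, 0, 1, 1, 1])
y = 1.0 + t + 2.0 * x

standardized_effect = standardized_mean(y, t, x, 1) - standardized_mean(y, t, x, 0)
naive_difference = y[t == 1].mean() - y[t == 0].mean()

print("standardized effect:", standardized_effect)
print("naive difference in means:", naive_difference)
```

In this constructed example the naive difference in means is larger than the standardized effect because the covariate is associated with both treatment and outcome; under ignorability given $X$, only the standardized quantity has a causal interpretation.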
3. Multilevel Models and Neighborhood E¤ects
Multilevel modeling (also called hierarchical modeling) is often employed in the estimation of the causal effects of neighborhood-level interventions or characteristics. Let $Y_{ik}$ denote the outcome for individual $i$ in neighborhood $k$. Let $X_{ik}$ denote individual-level characteristics for individual $i$ in neighborhood $k$. Let $Z_k$ denote neighborhood-level characteristics for neighborhood $k$. Let $T_k$ denote the neighborhood-wide treatment or intervention under consideration. The typical model employed in neighborhood effects research concerning a neighborhood-level intervention will have the following form:
$$Y_{ik} = \alpha + \beta_1 X_{ik} + \beta_2 Z_k + \gamma T_k + u_k + \varepsilon_{ik} \qquad (1)$$
where $\alpha$ denotes an intercept, $u_k$ denotes a neighborhood-level random error and $\varepsilon_{ik}$ denotes an individual-level random error. The $u_k$ and $\varepsilon_{ik}$ are usually assumed to be independent and identically distributed with mean zero. That the $\varepsilon_{ik}$ are independent of one another implies that the correlation between deviations from the mean that arises from subjects within the same neighborhood is wholly attributable to the neighborhood random term $u_k$. Model (1) is also often written in hierarchical form with level 1 given by the equation
$$Y_{ik} = \alpha_k + \beta_1 X_{ik} + \beta_2 Z_k + \gamma T_k + \varepsilon_{ik} \qquad (2)$$
and level 2 given by the equation
$$\alpha_k = \alpha + u_k. \qquad (3)$$
Substituting equation (3) into equation (2) gives equation (1). Equation (1) could of course also be generalized to include interaction terms. Estimation of the coefficients of (1), or equivalently of (2) and (3), is straightforward using standard statistical software. Several textbooks cover such estimation procedures; one such textbook [23] provides some discussion of causal inference with the multilevel model.
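To make the role of the neighborhood random term concrete, the following Python sketch simulates data from a model of the form (1) and checks that the within-neighborhood correlation of the deviations from the mean is close to $\sigma_u^2 / (\sigma_u^2 + \sigma_\varepsilon^2)$. All coefficient values, variances and sample sizes are hypothetical choices made for illustration, and normal errors are assumed only for convenience.

```python
import numpy as np

rng = np.random.default_rng(0)

n_neighborhoods, n_per_neighborhood = 20000, 2
alpha, beta1, beta2, gamma = 1.0, 0.5, -0.3, 2.0   # hypothetical coefficients
sigma_u, sigma_eps = 1.0, 1.0                      # hypothetical error scales

# Individual-level covariate, neighborhood-level covariate and treatment.
X = rng.normal(size=(n_neighborhoods, n_per_neighborhood))
Z = rng.normal(size=(n_neighborhoods, 1))
T = rng.integers(0, 2, size=(n_neighborhoods, 1))

# Neighborhood-level and individual-level random errors.
u = rng.normal(scale=sigma_u, size=(n_neighborhoods, 1))
eps = rng.normal(scale=sigma_eps, size=(n_neighborhoods, n_per_neighborhood))

# Outcome generated according to the form of model (1).
Y = alpha + beta1 * X + beta2 * Z + gamma * T + u + eps

# Deviations from the conditional mean equal u_k + eps_ik.
deviation = Y - (alpha + beta1 * X + beta2 * Z + gamma * T)

# Correlation between the two individuals sharing a neighborhood.
within_corr = np.corrcoef(deviation[:, 0], deviation[:, 1])[0, 1]
theoretical_icc = sigma_u**2 / (sigma_u**2 + sigma_eps**2)

print("within-neighborhood correlation:", within_corr)
print("sigma_u^2 / (sigma_u^2 + sigma_eps^2):", theoretical_icc)
```

Setting `sigma_u` to zero in this sketch removes the within-neighborhood correlation, which is the sense in which that correlation is attributable to $u_k$ alone.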
For purposes of illustration we will consider the effect of the construction of new hospitals, a neighborhood-level intervention, on health outcomes. Let $T_k$ denote the construction of a new hospital in neighborhood $k$ in year $r$. Let $Y_{ik}$ denote a continuous measure of health status for individual $i$ in neighborhood $k$ in year $r + 1$. Let $X_{ik}$ denote a vector of individual-level characteristics for individual $i$ in neighborhood $k$ including age, sex, race, socioeconomic status, and prior health status, all measured at the beginning of year $r$. Let $Z_k$ denote neighborhood-level characteristics for neighborhood $k$ including number of existing hospitals, neighborhood safety and neighborhood mean income, also measured at the beginning of year $r$.
Before proceeding we note that not all neighborhood-level interventions or randomized community trials can be classified as "neighborhood effects research." Consider, for example, a randomized community trial in which all households in the neighborhoods selected for "treatment" are mailed a nine month supply of vitamin supplements. Such a study may not be considered "neighborhood research" because the intervention is such that the neighborhood itself is not in any way integrally altered. The line between what types of interventions do and do not constitute a "neighborhood effect" may not always be entirely clear; one might imagine a scenario in which the vitamin supplement mailings lead to greater community discussion of nutrition and consequently a more "health-conscious community." Nevertheless, the vitamin supplement mailings are considerably less concerned with altering the neighborhood than an intervention that intentionally tries to create a more health-conscious community by mailing health information, promoting community discussion, making use of billboards and local newspapers, etc.
4. Ignorability and Stability Assumptions
In this section we consider the assumptions necessary to provide a causal interpretation of the coefficient $\gamma$. We first consider the neighborhood-level stable unit treatment value assumption (NL-SUTVA). NL-SUTVA requires that an individual’s outcome does not depend on the treatment assigned to neighborhoods other than the individual’s own neighborhood. Let $Y_{ik}(t)$ denote the outcome individual $i$ in neighborhood $k$ would have obtained if the neighborhood-level treatment $T_k$ in neighborhood $k$ were set to $t$. The potential outcome $Y_{ik}(t)$ is only well defined under NL-SUTVA. If NL-SUTVA does not hold then the individual’s outcome may depend on the treatments assigned to neighborhoods other than the individual’s own neighborhood and so the quantity we wish to define by $Y_{ik}(t)$ may take on many different values depending upon the treatment status of neighborhoods other than neighborhood $k$. Under NL-SUTVA, $Y_{ik}(t)$ is uniquely defined. Note that NL-SUTVA does not require that there be no treatment interaction between two individuals in the same neighborhood; rather, NL-SUTVA requires that there be no treatment interaction between individuals in different neighborhoods or, more generally, that the treatment received by one neighborhood does not affect outcomes for individuals in other neighborhoods. The condition may be violated either because of spill-over effects or because of interaction. An example of a spill-over effect leading to a violation of NL-SUTVA would be easier access to health care from the construction of a new hospital that is not in one’s own neighborhood but is in a nearby neighborhood. An example of interaction leading to a violation of NL-SUTVA would be a context in which, when two adjacent neighborhoods both have hospitals, each of the hospitals is able to attract more competent physicians. More competent physicians in turn would lead to better individual-level health outcomes. Thus if a study were conducted in which adjacent neighborhoods were included, NL-SUTVA would not be satisfied because the health outcomes in one neighborhood would depend on whether a hospital had been built in adjoining neighborhoods. Clearly then, in certain contexts, NL-SUTVA is likely not to hold.
However, clarifying the stable unit treatment value assumption for the neighborhood effects context may have important implications for study design. Although the researcher has no control over treatment assignment in observational studies, in both observational studies and randomized trials the researcher will likely have considerable choice in which experimental units to include in the study. As discussed above, NL-SUTVA will be violated if the outcome of individual $i$ in neighborhood $k$ is affected by treatment in neighborhood $j \neq k$. Thus, if an individual’s health outcomes are affected by easier access to health care from the construction of a new hospital in a nearby neighborhood, NL-SUTVA will be violated. A researcher designing a randomized trial might thus choose the experimental units, the neighborhoods, in such a way that there would be considerable distance between the neighborhoods so as to prevent the construction of a new hospital in neighborhood $j$ from affecting individuals in any other neighborhood $k \neq j$, i.e. by ensuring that neighborhoods near neighborhood $j$ are not in the study (cf. [24]). If in an observational context NL-SUTVA is violated then it will be necessary to use techniques which relax this assumption and allow for the modeling of spillover effects [25, 26]; see [5] for the relaxation of SUTVA conditions in a setting in which individuals are randomized to a change of neighborhoods. This will of course be a necessity if spillover effects (i.e. between-group interactions) are in fact of primary interest.
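The sense in which a potential outcome fails to be uniquely defined without NL-SUTVA can be shown with a small Python sketch. The outcome function, its baseline, and the effect sizes below are entirely hypothetical; the sketch considers one individual and a single adjacent neighborhood, and simply enumerates the values the outcome takes for a fixed own-neighborhood treatment as the adjacent neighborhood's treatment varies.

```python
def outcome(own_treatment, neighbor_treatment, spillover):
    """Hypothetical health outcome for one individual.

    own_treatment: treatment in the individual's own neighborhood (0 or 1)
    neighbor_treatment: treatment in an adjacent neighborhood (0 or 1)
    spillover: effect of the adjacent neighborhood's treatment on this individual
    """
    baseline, direct_effect = 10.0, 2.0   # assumed values, for illustration only
    return baseline + direct_effect * own_treatment + spillover * neighbor_treatment

def potential_outcome_values(t, spillover):
    """All values the outcome can take when own treatment is set to t."""
    return {outcome(t, neighbor_t, spillover) for neighbor_t in (0, 1)}

# No spillover: the outcome under own treatment t = 1 is a single value.
values_under_nl_sutva = potential_outcome_values(1, spillover=0.0)

# Spillover present: the outcome under t = 1 depends on the other neighborhood.
values_with_spillover = potential_outcome_values(1, spillover=1.5)

print("values without spillover:", values_under_nl_sutva)
print("values with spillover:", values_with_spillover)
```

When the spillover term is zero the set contains one value, so writing the potential outcome as a function of own-neighborhood treatment alone is unambiguous; otherwise the notation would need to be indexed by the treatments of other neighborhoods as well.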
We next turn to the second central assumption needed to draw causal conclusions
from statistical models like that given in equation (1). The ignorability assumption
makes certain conditional independence claims about the counterfactual outcomes
and the treatment and thereby allows the researcher to move from conclusions about
association to conclusions about causation. Intuitively, the ignorability assumption
is often understood as indicating that within strata of certain measured, potentially
confounding variables, the treatment is essentially as though it were randomized.
More formally, we will say that neighborhood treatment assignment is ignorable given the covariates $X_{ik}$ and $Z_k$ if for $t = 0, 1$
$$Y_{ik}(t) \perp\!\!\!\perp T_k \mid X_{ik}, Z_k \qquad (4)$$
and if $0 < P(T_k = t \mid X_{ik} = x, Z_k = z) < 1$ for all $t$, $x$ and $z$. The ignorability assumption states that conditional on the covariates $X_{ik}, Z_k$ the potential outcomes $Y_{ik}(0)$ and $Y_{ik}(1)$ are independent of treatment $T_k$. The ignorability assumption makes statements about the distributions of the potential outcomes $Y_{ik}(0)$ and $Y_{ik}(1)$ and, since some of these potential outcomes are unobserved, the ignorability assumption cannot be tested with data; background knowledge must be used to assess whether the assumption is reasonable. The ignorability condition would hold if treatment is randomized within strata of $X_{ik}$ and $Z_k$. Condition (4) is in fact somewhat weaker than treatment randomization within strata of $X_{ik}$ and $Z_k$. Treatment randomization within strata of $X_{ik}$ and $Z_k$ implies that treatment $T_k$ is conditionally independent not only of the potential outcomes $Y_{ik}(0)$ and $Y_{ik}(1)$ given $X_{ik}$ and $Z_k$ but also of all other variables. Condition (4) requires only that treatment $T_k$ be conditionally independent of the potential outcomes given $X_{ik}$ and $Z_k$. Intuitively, the ignorability assumption (4) is often interpreted as requiring that for every neighborhood $k$ and for each individual $i$ the covariates $X_{ik}, Z_k$ include all relevant variables that affect both the neighborhood-level treatment $T_k$ and the potential outcomes $Y_{ik}(0)$ and $Y_{ik}(1)$, i.e. that $X_{ik}$ and $Z_k$ contain all the individual and neighborhood-level confounding variables. Thus the covariate vector $Z_k$ would likely need to include the number of previously existing hospitals in the neighborhood, the presence of hospitals in adjacent neighborhoods, along with perhaps the mean income for the neighborhood and any other neighborhood-level characteristics that affect both the construction of a new hospital and individual health outcomes. The situation with the individual-level covariates is more subtle. An individual-level covariate needs to be included in the individual-level covariate vector $X_{ik}$ only if it affects neighborhood-level treatment $T_k$ in some way different than the neighborhood-level covariate equivalent in $Z_k$. For example, if the income distribution of the neighborhood were used in making decisions about the construction of a new hospital and if an individual’s income affected health outcomes then the covariate vector $X_{ik}$ would have to include the income for individual $i$ in neighborhood $k$ because each individual’s income affects both the individual’s health outcomes and the overall income distribution which affects decisions about hospital construction. However, if only mean income were used in making decisions about the construction of a new hospital, rather than the income distribution, then to ensure ignorability condition (4) it would suffice to include mean income in the covariate vector $Z_k$; no individual-level income would need to be included among the covariates. Nevertheless, the individual-level income data may be included in the statistical model in order to improve efficiency, though this inclusion can also sometimes induce bias, as discussed below.
In practice, finding covariates $X_{ik}$ and $Z_k$ such that the treatment ignorability condition (4) will hold may be challenging. Many variables that may affect both treatment and the outcome, such as political resourcefulness, trust and community solidarity, will often be very difficult to measure. Condition (4) will at best hold only approximately and a central challenge of the study’s design will be collecting sufficiently rich data so that the approximation will be a reasonable one. Of course, as indicated above, in a community randomized trial the ignorability condition holds by design.
We need one final assumption or axiom, the "consistency assumption" for neighborhood-level treatments. The consistency assumption states that when treatment $T_k = t$ the counterfactual outcome under an intervention setting treatment $T_k$ to $t$ is equal to the observed outcome, i.e. $T_k = t \implies Y_{ik}(t) = Y_{ik}$. The assumption is necessary mathematically but is intuitively obvious: the outcome that would have been observed if treatment were set to what it in fact was is equal to the outcome that was in fact observed. With NL-SUTVA and the consistency assumption, the neighborhood treatment ignorability assumption allows us to derive estimates of the average treatment effect from observational data as stated in the following proposition.
Proposition. If the neighborhood treatment assignment is ignorable given $X_{ik}$ and $Z_k$ then
$$E[Y_{ik}(t)] = \sum_{x,z} E[Y_{ik} \mid T_k = t, X_{ik} = x, Z_k = z] \, P(X_{ik} = x, Z_k = z) \qquad (5)$$
where the expectation is taken over all individuals and neighborhoods. Furthermore, under model (1) we have that $E[Y_{ik}(1)] - E[Y_{ik}(0)] = \gamma$.
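Proof. By the law of iterated expectations we have that
$$\begin{aligned}
E[Y_{ik}(t)] &= \sum_{x,z} E[Y_{ik}(t) \mid X_{ik} = x, Z_k = z] \, P(X_{ik} = x, Z_k = z) \\
&= \sum_{x,z} E[Y_{ik}(t) \mid T_k = t, X_{ik} = x, Z_k = z] \, P(X_{ik} = x, Z_k = z) \quad \text{by ignorability} \\
&= \sum_{x,z} E[Y_{ik} \mid T_k = t, X_{ik} = x, Z_k = z] \, P(X_{ik} = x, Z_k = z) \quad \text{by consistency.}
\end{aligned}$$
Furthermore, if model (1) holds then
$$\begin{aligned}
E[Y_{ik}(1)] - E[Y_{ik}(0)] &= \sum_{x,z} \left\{ E[Y_{ik} \mid T_k = 1, X_{ik} = x, Z_k = z] - E[Y_{ik} \mid T_k = 0, X_{ik} = x, Z_k = z] \right\} P(X_{ik} = x, Z_k = z) \\
&= \sum_{x,z} \gamma \, P(X_{ik} = x, Z_k = z) = \gamma.
\end{aligned}$$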
Thus, in our example, provided there are no interactions in the statistical model, $\gamma$ represents the average causal effect of building a new hospital on the health of individuals in the neighborhood in which the hospital is built. Even in the presence of interactions, equation (5) can be used to estimate causal effects. The left hand side of equation (5) is a causal quantity. The expression $E[Y_{ik}(t)]$ denotes the expected value of the potential outcome for an individual living in a neighborhood in which there is an intervention to set treatment $T_k$ to $t$. The right hand side of equation (5) can be estimated from the data. The expression $E[Y_{ik} \mid T_k = t, X_{ik} = x, Z_k = z]$ is in fact the conditional expectation that is estimated in the multilevel model in equation (1). Of course, for the estimates obtained from the multilevel model to be valid, the functional form of the model must be correctly specified. Thus quadratic terms, interaction terms and cross-level interaction terms between covariates in $X_{ik}$ and covariates in $Z_k$ may need to be included in the model. If there is an interaction between $T_k$ and $Z_k$ in the model then the effect of treatment will vary by neighborhood. For example, the average treatment effect of building a new hospital on health outcomes may vary by the quality and connectivity of the roads in a neighborhood since the efficacy of emergency medical care will in part depend upon how quickly an individual can reach the hospital. Even with treatment-covariate interactions, the average treatment effect overall will still be equal to the weighted average given in (5). Appropriate multilevel modeling techniques must be used to estimate the variance of the $E[Y_{ik}(t) \mid X_{ik} = x, Z_k = z]$ estimates. If the effect of treatment on the study sample is of interest then $P(X_{ik} = x, Z_k = z)$ can be taken to be fixed weights corresponding to the relative frequency of the covariates $X_{ik}$ and $Z_k$ in the study population. If the study sample consists of a simple random sample of subjects from a specific study population then the variances of the estimates of the effect of treatment on the study population can be computed by bootstrapping techniques. If the researcher is interested in the effect of treatment on the treated, i.e. on the sub-population of individuals in neighborhoods that actually received treatment, then the weights $P(X_{ik} = x, Z_k = z)$ may be replaced by $P(X_{ik} = x, Z_k = z \mid T_k = 1)$.
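The weighting in equation (5), and its variant for the effect of treatment on the treated, can be sketched in Python as follows. The function `standardized_mean`, the cell counts and the outcome model are hypothetical and chosen only for illustration: the outcome is noise-free with a treatment effect of $1 + 2z$, so that the effect differs between the two levels of a binary neighborhood covariate, and treatment is more common where $z = 1$. The sketch produces point estimates only and ignores the clustering that a variance calculation would need to respect.

```python
import numpy as np

def standardized_mean(y, t, x, z, t_value, target=None):
    """Sum over strata (x, z) of E[Y | T=t_value, x, z] times the stratum weight.

    target: boolean mask defining the population whose covariate distribution
    supplies the weights (everyone by default; the treated for the effect of
    treatment on the treated).
    """
    if target is None:
        target = np.ones(len(y), dtype=bool)
    total = 0.0
    for x_val, z_val in np.unique(np.column_stack([x[target], z[target]]), axis=0):
        in_stratum = (x == x_val) & (z == z_val)
        cell = in_stratum & (t == t_value)
        if not cell.any():
            raise ValueError("positivity violated in stratum (%s, %s)" % (x_val, z_val))
        weight = (in_stratum & target).sum() / target.sum()
        total += y[cell].mean() * weight
    return total

# Hypothetical cells: (x, z, t, number of individuals).
cells = [(0, 0, 0, 3), (0, 0, 1, 1), (1, 0, 0, 3), (1, 0, 1, 1),
         (0, 1, 0, 1), (0, 1, 1, 3), (1, 1, 0, 1), (1, 1, 1, 3)]
x = np.concatenate([np.full(n, xv) for xv, zv, tv, n in cells])
z = np.concatenate([np.full(n, zv) for xv, zv, tv, n in cells])
t = np.concatenate([np.full(n, tv) for xv, zv, tv, n in cells])

# Assumed noise-free outcome with a treatment-by-Z interaction.
y = 1.0 + x + 2.0 * z + t * (1.0 + 2.0 * z)

average_effect = standardized_mean(y, t, x, z, 1) - standardized_mean(y, t, x, z, 0)
treated = t == 1
effect_on_treated = (standardized_mean(y, t, x, z, 1, target=treated)
                     - standardized_mean(y, t, x, z, 0, target=treated))

print("average treatment effect:", average_effect)
print("effect of treatment on the treated:", effect_on_treated)
```

Because treated individuals are concentrated in strata where the effect is larger, the effect on the treated exceeds the overall average effect in this constructed example.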
5. Discussion
In this final section we provide some discussion concerning the estimation and causal interpretation of multilevel model coefficients in neighborhood effects research along with some discussion on the limitations of and possible extensions to these methods. As noted above, for the estimate of the coefficient $\gamma$ to be interpreted causally the covariate vectors $X_{ik}$ and $Z_k$ must be chosen so that the ignorability condition (4) holds, i.e. $X_{ik}$ and $Z_k$ must include all variables that affect both the treatment and the outcome. If some variable in $X_{ik}$ or $Z_k$ affects only the outcome and not the treatment then including the variable in model (1) may increase the efficiency of the coefficient estimates. Provided such variables are measured prior to treatment the ignorability condition (4) will in general be unaffected by their inclusion in the covariate vectors $X_{ik}$ and $Z_k$. Exceptions can, however, occur, for example, when the pretreatment covariate is a common effect of two other variables, one of which is a cause of the outcome and the other of which is a cause of treatment, and neither of these other two variables is controlled for [27, 28]. Graphical models can be useful in assessing the conditional independence assumption required by treatment ignorability [27, 29, 30]. Glymour provides some discussion specifically on the use of graphical models to assess common problems in social epidemiology [31] which may be of interest to those doing neighborhood effects research.
Some comments are also in order concerning the modeling assumptions in the estimation of the multilevel model given in equation (1). Equation (1) assumes a linear relationship between the covariates in the model and the outcome. This assumption will no doubt at best be an approximation to the actual functional relationship between the outcome and the covariates. However, in order to assess whether such a linearity assumption is reasonable and in order to genuinely separate the treatment effect from the effect of the covariates, it is important that there is considerable overlap between the distribution of the individual and neighborhood-level covariate values in the neighborhoods receiving treatment ($T_k = 1$) and the neighborhoods which are controls ($T_k = 0$). Otherwise, significant and potentially scientifically meaningful estimates of the treatment effect may in fact be an artifact of the linearity assumption. Propensity score methods can be of considerable use in assessing these issues concerning the comparability of the treatment and control groups [32-34]. Oakes [35] has pointed out how these comparability assumptions can be particularly problematic in neighborhood effects research which attempts to estimate the effects of neighborhood-level socioeconomic status on, say, health status. This is because the effect of neighborhood-level socioeconomic status is confounded by individual-level socioeconomic status and, although individual-level income data can be controlled for in a model, there will often be insufficient overlap in the distribution of individual-level income between rich and poor neighborhoods. That is to say, the wealthy tend to live in wealthy neighborhoods and the poor in poor neighborhoods and so it is difficult to separate the effects of individual socioeconomic status from neighborhood-level socioeconomic status. Such considerations can sometimes be partially remedied by appropriate study design in which neighborhoods of only modest difference in socioeconomic status are compared.
The discussion in this paper is easily generalized to non-continuous outcomes through the use of generalized linear multilevel models and also to cases with categorical or continuous, rather than binary, treatments. The stable unit treatment value assumption discussed in Section 4 remains the same under these more general settings. With continuous treatment, the ignorability assumption given in (4) must be modified so as to apply not only to $t = 0, 1$ but to any value of $t$ in the support of $T_k$. For generalized linear multilevel models, equation (1) or equivalently equations (2) and (3) must be modified and appropriate multilevel estimation techniques employed but the causal assumptions required for the valid estimation of neighborhood effects remain essentially the same. In estimating average causal effects from generalized linear models, equation (5) can be used but care must be taken to transform estimates using the inverse link function so that the expression $E[Y_{ik} \mid T_k = t, X_{ik} = x, Z_k = z]$, rather than its transform, is weighted. The odds ratio, for example, is not collapsible [36] and thus if a logistic multilevel model is used, the inverse logit transformation must be applied before standardization takes place.
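The point about applying the inverse link before standardizing can be illustrated with a Python sketch for a logistic model. The coefficient values below are hypothetical and are named after the intercept, treatment coefficient and neighborhood-covariate coefficient of a model of the form (1) on the logit scale; the neighborhood covariate is assumed binary, equally likely to take either value, and independent of treatment, so no confounding is involved.

```python
import numpy as np

def expit(v):
    """Inverse logit transformation."""
    return 1.0 / (1.0 + np.exp(-v))

def odds(p):
    return p / (1.0 - p)

# Hypothetical logistic model: logit P(Y = 1 | T, Z) = alpha + gamma*T + beta2*Z.
alpha, gamma, beta2 = 0.0, 1.0, 2.0
z_values = np.array([0.0, 1.0])
z_probs = np.array([0.5, 0.5])      # assumed distribution of Z

def standardized_risk(t):
    """Apply the inverse logit within each stratum, then weight by P(Z = z)."""
    return float(np.sum(expit(alpha + gamma * t + beta2 * z_values) * z_probs))

p1, p0 = standardized_risk(1), standardized_risk(0)
risk_difference = p1 - p0
marginal_odds_ratio = odds(p1) / odds(p0)
conditional_odds_ratio = float(np.exp(gamma))

# Incorrect order: weighting the linear predictor first and transforming afterwards.
p1_wrong_order = float(expit(np.sum((alpha + gamma * 1 + beta2 * z_values) * z_probs)))

print("standardized risks:", p0, p1)
print("marginal vs conditional odds ratio:", marginal_odds_ratio, conditional_odds_ratio)
print("risk under T = 1 with the wrong order of operations:", p1_wrong_order)
```

Even without confounding, the marginal odds ratio obtained from the standardized risks differs from the conditional odds ratio $e^{\gamma}$, and averaging on the logit scale gives a different standardized risk than averaging on the probability scale.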
In our discussion we have seen that multilevel methods do hold some promise for neighborhood effects research. The conclusions drawn from observational data are always tentative and some of the assumptions required may be difficult to meet in practice; however, a firm understanding of the assumptions required for a causal interpretation of estimates from statistical models allows the researcher to critically assess the likely validity of study conclusions.
References
[1] Wagenaar AC, Murray DM, Wolfson M, Forster JL, Finnegan JR. Communities
mobilizing for change on alcohol: Design of a randomized community trial. Journal
of Community Psychology CSAP, Special Issue, 1994; 79-101.
[2] Del Conte A, Kling J. A synthesis of MTO research on self-sufficiency, safety and
health, and behavior and delinquency. Poverty Research News, 2001; 5:3-6.
[3] Hannan PJ. Experimental social epidemiology - Controlled community trials. In:
J.M. Oakes and J.S. Kaufman (eds.), Methods in Social Epidemiology. Jossey-Bass:
San Francisco, CA, 2006.
[4] Kling JR, Liebman JB, Katz LF. Experimental analysis of neighborhood effects.
Econometrica, in press.
[5] Sobel ME. What do randomized studies of housing mobility demonstrate?: Causal
inference in the face of interference. Journal of the American Statistical Association,
2006; 101:1398-1407.
[6] Diez Roux A, Nieto F, Muntaner C. Neighborhood environments and coronary
heart disease: A multilevel analysis. American Journal of Epidemiology, 1997;
146:48-63.
[7] Sampson RJ, Raudenbush SW, Earls F. Neighborhoods and violent crime: a
multilevel study of collective efficacy. Science, 1997; 277:918-923.
[8] Subramanian SV, Kim DJ, Kawachi I. Social trust and self-rated health in US
communities: a multilevel analysis. Journal of Urban Health. 2002; 79:S21-34.
[9] Browning CR, Wallace D, Feinberg S, Cagney KA. Neighborhood social processes
and disaster-related mortality: The case of the 1995 Chicago heat wave. American
Sociological Review, 2006; 71:665-682.
[10] Rubin DB. Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology, 1974; 66:688-701.
[11] Rubin DB. Bayesian inference for causal effects: The role of randomization.
Annals of Statistics, 1978; 6:34-58.
[12] Robinson WS. Ecological correlations and the behavior of individuals. American
Sociological Review, 1950; 15:351-357.
[13] Duncan OD, Cuzzort RP, Duncan B. Statistical geography: problems in analyzing
areal data. Greenwood Press: Westport, CT, 1961.
[14] Greenland S, Morgenstern H. Ecological bias, confounding, and effect modification. International Journal of Epidemiology, 1989; 18:269-274.
[15] Rothman KJ, Greenland S. Modern Epidemiology. Lippincott-Raven: Philadelphia, PA, 1998.
[16] Greenland S. Divergent biases in ecologic and individual-level studies. Statistics
in Medicine, 1992; 11:1209-1223.
[17] Greenland S, Robins JM. Invited commentary: ecologic studies - biases, misconceptions, and counterexamples. American Journal of Epidemiology, 1994; 139:747-760.
[18] Greenland S. A review of multilevel theory for ecologic analyses. Statistics in
Medicine, 2002; 21:389-395.
[19] Haneuse S, Wakefield J. Hierarchical models for combining ecological and case-control data. Biometrics, 2007; 63:128-136.
[20] Haneuse S, Wakefield J. The combination of ecological and case-control data.
Berkeley Electronic Press, UW Biostatistics Working Paper Series. Working Paper
292. http://www.bepress.com/uwbiostat/paper292.
[21] Spirtes P, Glymour C, Scheines R. Causation, Prediction and Search.
Springer-Verlag: New York, 1993.
[22] Dawid AP. Causal inference without counterfactuals. Journal of the
American Statistical Association, 2000; 95:407-424.
[23] Gelman A, Hill J. Data Analysis Using Regression and Multilevel/Hierarchical
Models. Cambridge University Press: Cambridge, 2007.
[24] Murray D. Design and analysis of group-randomized trials. Oxford University
Press: New York, 1998.
[25] Hong G, Raudenbush SW. Evaluating kindergarten retention policy: A case
study of causal inference for multilevel observational data. Journal of the American
Statistical Association, 2006; 101:901-910.
[26] Verbitsky N, Raudenbush SW. Causal inference in spatial settings: A case study
of community policing program in Chicago. Working paper.
[27] Greenland S, Pearl J, Robins JM. Causal diagrams for epidemiologic research.
Epidemiology, 1999; 10:37-48.
[28] Greenland S. Quantifying biases in causal models: classical confounding vs
collider-stratification bias. Epidemiology, 2003; 14:300-306.
[29] Pearl J. Causal diagrams for empirical research. Biometrika, 1995; 82:669-688.
[30] Pearl J. Causality: Models, Reasoning, and Inference. Cambridge University
Press: Cambridge, 2000.
[31] Glymour MM. Using causal diagrams to understand common problems in social
epidemiology. In: J.M. Oakes and J.S. Kaufman (eds.), Methods in Social Epidemiology. Jossey-Bass: San Francisco, CA, 2006.
[32] Rosenbaum PR, Rubin DB. The central role of the propensity score in observational studies for causal effects. Biometrika, 1983; 70:41-55.
[33] Rubin DB. Estimating causal effects from large data sets using propensity scores.
Annals of Internal Medicine, 1997; 127:757-763.
[34] Oakes JM, Johnson PJ. Propensity score matching for social epidemiology. In:
J.M. Oakes and J.S. Kaufman (eds.), Methods in Social Epidemiology. Jossey-Bass:
San Francisco, CA, 2006.
[35] Oakes JM. The (mis)estimation of neighborhood effects: causal inference for a
practicable social epidemiology. Social Science & Medicine, 2004; 58:1929-1952.
[36] Greenland S, Robins JM, Pearl J. Confounding and collapsibility in causal inference. Statistical Science, 1999; 14:29-46.