Contextual determinants of health and healthcare utilisation

UNIVERSITÉ PARIS VI – PIERRE ET MARIE CURIE
DOCTORAL SCHOOL: PUBLIC HEALTH AND BIOMEDICAL
INFORMATION SCIENCES
THESIS
submitted for the degree of
DOCTOR OF THE UNIVERSITÉ PARIS VI
Presented by:
Basile CHAIX
Modelling the effects of context
on health and healthcare utilisation
Thesis defended on 1 December 2004 before a jury composed of:
Reviewers:
Mr Marcel GOLDBERG, Professor
Mr Denis HEMON, Research Director
Examiners:
Mr Alain-Jacques VALLERON, Professor
Mr Thierry LANG, Professor
Thesis supervisor:
Mr Pierre CHAUVIN, Research Scientist (Chargé de Recherche) - HDR
Acknowledgements
First of all, I thank Alain-Jacques Valleron for welcoming me into INSERM Unit 444. I am
especially grateful to Pierre Chauvin for supervising my thesis work over these three years;
his scientific guidance, wise advice, patience, and unfailing support were precious to me
throughout the thesis and remain an irreplaceable foundation for the future. I thank Juan
Merlo of Malmö University Hospital in Sweden for giving me a place in his ambitious research
programme. His expertise in contextual analysis has been invaluable to me, and the mutual
trust built over a year and a half of joint work bodes well for a lasting collaboration.
Finally, I warmly thank Mr Goldberg, Mr Hémon, and Mr Lang for agreeing to sit on my thesis
jury.
Summary of the thesis
For nearly ten years, social epidemiology has been concerned with the impact that
characteristics of the residential context may have on individuals’ health, over and above the
effects attributable to their personal socioeconomic characteristics. In this thesis, we
sought to advance knowledge of the contextual determinants of health and healthcare
utilisation, which have received markedly less attention in France than in Northern Europe,
England, or the United States.
Our main objective was to reflect on the tools needed to describe and explain spatial
variations in health and healthcare utilisation, and to develop new analytical approaches
that fill the gaps left by the methods currently used in this field of social epidemiology.
We first sought to show how useful multilevel models can be in contextual analysis. Departing
from the analytical practices followed by many authors, we stressed the value of quantifying
and modelling between-area variations when assessing the importance of context for health and
healthcare utilisation. Since the aim was to obtain indicators that express the magnitude of
between-area variations, we compared the various indicators available for the logistic model,
which is frequently used in social epidemiology.
We finally come to question the relevance of the multilevel approach, which is used almost
hegemonically in the contextual analysis literature. By fragmenting the territory into a
multitude of administrative areas and neglecting the spatial connections between these areas,
the multilevel approach often provides only incomplete information on the spatial distribution
of health phenomena. Moreover, because it measures the explanatory factors of the residential
context at the level of arbitrary administrative areas, it often proves unable to capture
contextual effects on health adequately. Using applied studies based on French and Swedish
data, we showed that an analytical approach that treats space in its intrinsic continuity
describes and explains spatial variations in health and healthcare utilisation better.
Thesis summary
Over the past decade, social epidemiologists have investigated the effects that the
characteristics of the context of residence have on individual health, beyond the impact
associated with the characteristics of the individuals. In our thesis, we aimed to investigate
contextual determinants of health and healthcare utilisation, which have received far less
attention in France than in Northern Europe, in the United Kingdom, or in the United States.
Our main objective was to compare different analytical tools to be used to describe and
explain spatial variations of health phenomena and healthcare utilisation, and to develop new
approaches to overcome the limitations of the methods currently used in this specific social
epidemiological field.
We first highlighted the value of using multilevel models in contextual analysis. Taking
a different perspective from many authors in the literature, we emphasized that
quantifying and modelling variations of outcomes between areas is useful to assess the
importance that the context has for health and healthcare utilisation. In particular, we
sought to compare the different indexes available in the multilevel logistic model to measure
the magnitude of variations between areas.
We finally aimed to show that the multilevel analytic approach, used in most of the analyses
of contextual effects on health, has several important limitations. Indeed, fragmenting space
into arbitrary administrative areas and neglecting spatial connections between areas, the
multilevel analytic approach often only provides incomplete information on the spatial
distribution of health outcomes. Moreover, measuring the characteristics of the context of
residence in arbitrary administrative areas, this approach may often be unable to adequately
describe contextual effects on health. Conducting applied investigations based on French and
Swedish data, we showed that an analytic approach based on a continuous notion of space
allowed us to better describe and explain spatial variations in health or healthcare utilisation.
Table of contents
Introduction
1) Usefulness of contextual analysis in public health
A – Describing geographical variations in health phenomena
B – Understanding the mechanisms behind geographical health disparities
2) Assessing the importance of contextual effects on health: the importance of the methodological question
A – The historical roots of multilevel analysis
B – Different approaches to using multilevel models
C – Comparing the multilevel approach with a spatial analysis perspective in the study of contextual effects
3) Outline of the document
Chapter I – Usefulness of the multilevel approach in social epidemiology
1) The use of multilevel models in the contextual analysis literature
2) The value of measures of variation as independent sources of information on the impact of context on health
Chapter II – Preliminary examples of contextual analysis
1) Analysis of the effects of the residential context on various health-related behaviours
2) Analysis of the effects of the household of residence on patterns of healthcare utilisation
Chapter III – Multilevel perspective and spatial perspective in contextual analysis
1) Describing the spatial distribution of phenomena
2) Measuring contextual factors in a continuous space centred on individuals’ place of residence
General conclusion and perspectives
Research perspectives
List of publications
Bibliography
Introduction
1) Usefulness of contextual analysis in public health
As various authors have pointed out, epidemiology long operated within the paradigm of
methodological individualism, which holds that the factors influencing people’s health belong
to the register of individual characteristics.1, 2 Following this line of analysis, one could
supposedly grasp all the processes acting on individuals’ health by taking into account their
demographic, social, psychological, anatomical, biological, and other characteristics. From
this standpoint, collective factors (such as those measured at the level of people’s area of
residence) are considered only when the corresponding information is missing at the individual
level. The contextual dimension of these collective factors is then entirely neglected: they
serve merely as proxies for information that cannot be obtained at the individual level.3
By contrast, many studies from the social sciences have sought to demonstrate the influence
that individuals’ living context can have on health.2, 4 The idea thus gradually took shape in
social epidemiology that the social determinants of health are by nature structured in levels
(multilevel), belonging to the individual level but also to the level of the household, the
place of residence, or the place of work or study.5, 6, 7, 8, 9 Consequently, it is now widely
recognised that an important route for social epidemiology to advance knowledge of the
mechanisms behind social disparities in health is to examine the effects of context, and in
particular those of the residential context.5, 10
Beyond the aims of knowledge, it is important from a public health standpoint to take into
account the relationships between individuals’ living context and their health. Contextual
analysis studies open new perspectives in public health, on the one hand by describing
geographical variations in health phenomena, and on the other by refining our understanding of
the mechanisms behind health disparities.
A – Describing geographical variations in health phenomena
Taking the contextual dimension of health phenomena into account first means examining whether
they vary across the study territory. If no spatial variation can be identified at any of the
scales of analysis considered, one would conclude that the phenomenon under study has no
contextual dimension and that its variability is attributable to factors measurable at the
individual level.11, 12 Conversely, if the methods applied indicate substantial geographical
variability, the phenomenon becomes an object of interest for contextual analysis, which then
seeks to describe and explain its spatial distribution.13
For research purposes, simply mapping geographical variations in health phenomena and
describing these variations quantitatively with regression models provide important
information for generating hypotheses about the factors that influence these phenomena.14 From
a public health standpoint, quantifying contextual variations indicates whether information or
intervention programmes should incorporate this contextual dimension, or whether they can be
implemented uniformly across the territory.13 Mapping disparities in health or in
health-related behaviours also helps to identify priority areas for intervention and to
allocate resources across the territory according to needs that differ from place to place.
B – Understanding the mechanisms behind geographical health disparities
Beyond the mere description of territorial health disparities, the aim is to advance
understanding of the mechanisms that produce them. Contextual analysis developed as a critique
of the ecological approach, which relates explanatory variables to health data aggregated at
the level of more or less fine-grained administrative areas.15, 16, 17 If one observes, for
example, a positive association between the municipal unemployment rate and the municipal
mortality rate, it is difficult to draw precise lessons that can be used in public health.
Transferring such an association to the individual level, to conclude that unemployed
individuals have a higher mortality risk, amounts to committing the ecological fallacy widely
described in the literature:11, 18, 19 the ecological association does not establish that it is
the unemployed, rather than other individuals in municipalities where the percentage of
unemployed is high, who have a higher mortality risk. Moreover, and crucially for contextual
analysis, this ecological association does not allow one to conclude that unemployment has a
collective or contextual effect on all residents of municipalities with high unemployment
either, since it does not distinguish between possible effects of unemployment at the
individual and collective levels.20, 21
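The ambiguity of an ecological association can be made concrete with a toy example. The Python sketch below uses invented counts for three hypothetical municipalities (none of these figures come from the thesis). In one scenario, mortality depends only on the municipality; in the other, only on individual unemployment. Both yield exactly the same aggregated rates.

```python
# Each municipality: (unemployed, deaths among unemployed, employed, deaths among employed).
# All counts are invented for illustration.
contextual_only = {   # risk depends on the municipality, not on employment status
    "A": (100, 1, 900, 9),
    "B": (200, 4, 800, 16),
    "C": (300, 9, 700, 21),
}
individual_only = {   # risk depends on employment status, not on the municipality
    "A": (100, 10, 900, 0),
    "B": (200, 20, 800, 0),
    "C": (300, 30, 700, 0),
}


def ecological_view(scenario):
    """What aggregated data show: (unemployment rate, mortality rate) per municipality."""
    view = {}
    for name, (n_u, d_u, n_e, d_e) in scenario.items():
        total = n_u + n_e
        view[name] = (n_u / total, (d_u + d_e) / total)
    return view


def individual_view(scenario):
    """What individual data show: (risk if unemployed, risk if employed) per municipality."""
    return {
        name: (d_u / n_u, d_e / n_e)
        for name, (n_u, d_u, n_e, d_e) in scenario.items()
    }


if __name__ == "__main__":
    # Same aggregated picture in both scenarios...
    print(ecological_view(contextual_only) == ecological_view(individual_only))
    # ...but very different individual-level risks.
    print(individual_view(contextual_only))
    print(individual_view(individual_only))
```

Aggregated data alone cannot tell the two scenarios apart, which is why individual-level data are needed to separate individual from collective effects.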
The contextual approach therefore developed from the observation that data collected at the
individual level are needed to advance understanding of the social determinants of health.22
The aim of such analyses is to examine whether the geographical variations identified are
entirely due to the varying composition of the areas in terms of individual characteristics,
or whether they also result from genuinely contextual effects that cannot be captured at the
individual level.3, 23 By measuring the same social factor at both the individual level and
the level of the residential context, multivariable regression techniques make it possible to
separate social processes that were conflated within the ecological association.24, 25, 26, 27, 28
Because individuals’ demographic, social, and economic characteristics are often correlated
with contextual factors, individual factors must be taken into account when seeking to
identify truly contextual effects. There is an important debate in the literature on this
necessary adjustment of models: for the most optimistic, it should be carried out with caution
and circumspection;29 for the most pessimistic, it definitively compromises any possibility of
identifying truly contextual effects.30 To mention only two of the difficulties involved: on
the one hand, it is always conceivable that the contextual effects identified in multivariable
analysis actually result from insufficient adjustment at the individual level, and are thus
due to residual compositional effects.31, 32, 33, 34 On the other hand, in contrast to this
problem of under-adjustment, one may also fear including too many individual factors in the
models, thereby stripping the contextual factor of the part of its effect that operates
through the intermediate individual variables used as adjustment factors.1, 15, 35, 36, 37 Thus,
as in many other situations in epidemiology, the selection of adjustment variables cannot be
carried out mechanically and depends on judgements that lie outside the field of statistics.
Beyond this, the very distinction on which contextual analysis is founded, between individual
effects and contextual effects, must be treated with circumspection. More fundamentally, to
affect health, contextual effects must “get inside the body”, which necessarily happens
through processes that can be captured at the individual level.3 Thus, rather than a clear-cut
difference between causal processes operating in the real world, the distinction between
individual and contextual effects can be seen as a conceptual tool for organising the analysis
and generating working hypotheses, but one that should also be handled warily, lest it lead to
overly crude interpretations.
As many authors have noted, it is useful in public health to examine whether factors of the
residential context are associated with health problems after taking demographic and social
factors at the individual level into account.15 The point is to see whether intervention
programmes can simply be targeted on the basis of individuals’ characteristics, or whether the
characteristics of areas of residence must also be considered. The argument is that, if
contextual characteristics have direct effects on individuals’ health, the target population
of public health programmes would miss a substantial number of at-risk individuals if it were
defined solely on the basis of individual risk factors.
Beyond the distinction between compositional and contextual effects, the aim is to examine
which dimensions of the residential context affect individuals’ health.38, 39 Authors have
proposed different categorisations of contextual factors, based either on the type of effect
involved (physical environment, available infrastructure and services, social functioning40)
or on how the variables are constructed. In the latter case, a distinction is generally drawn
between aggregated contextual variables, which result from aggregating the characteristics of
individuals in each area, and integral contextual variables, which are measured directly at
the level of the areas of residence.35 The most common aggregated contextual variables seek to
reflect the socioeconomic level of the residential environment using averages of residents’
socioeconomic characteristics.41, 42, 43, 44, 45 By contrast, variables referring to the
infrastructure of areas belong, for example, to the category of integral variables. Whether
aggregated or integral, contextual variables usually cannot be measured at the individual
level, and as such are likely to capture effects clearly distinct from those captured by
individual variables.1, 7
From a public health standpoint, the aim is to identify the contextual factors that actually
give rise to health disparities, so that intervention programmes can be adapted as closely as
possible to the causal mechanisms identified. Take, for example, participation in sports,
which is known to be related to individuals’ socioeconomic level:46, 47 various studies have
shown significant variations from one neighbourhood of residence to another.48 The authors
sought to determine whether these spatial variations were simply due to the varying
composition of areas in terms of individual socioeconomic characteristics. Beyond that, they
found that these variations were partly attributable to the socioeconomic level of the
neighbourhood of residence, measured by aggregating individuals’ characteristics.45, 48 Such an
effect could arise because the values and behavioural habits of a given social group tend to
prevail in places where it is in the majority, thereby affecting all residents, even those who
do not belong to that social group. Finally, the authors also considered integral contextual
variables, and were able to show that the presence of sports facilities and of places where
walking or running can be done safely influenced participation in sports.49
Quantifying contextual variations in phenomena, seeking to explain them by distinguishing
compositional effects from genuinely contextual effects, and advancing knowledge of the
various processes through which context influences health are therefore all of interest for
public health.
2) Assessing the importance of contextual effects on health: the importance of the
methodological question
Because the social determinants of health belong to different levels, the variability of
health phenomena has a hierarchical structure: beyond the variability between individuals of
the same group, part of the variation occurs from one contextual unit to another, with the
individual and the context constituting distinct and hierarchically organised sources of
variability.50 As regards methods of analysis, approaches that ignore this complex structure
of variability may prove partly inefficient. To describe and explain variations in phenomena
that operate at different levels, the social epidemiology literature now relies on multilevel
models (which include random effects in addition to fixed effects50, 51, 52, 53) or, to a
lesser extent, on models based on generalised estimating equations.54, 55, 56, 57
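As a point of reference, a two-level random-intercept logistic model of the kind alluded to here can be written as follows. The notation is an assumption introduced for illustration and is not taken from the thesis: $p_{ij}$ is the probability of the outcome for individual $i$ in area $j$, $x_{ij}$ an individual characteristic, $z_j$ a characteristic of the area, and $u_j$ the random effect of area $j$.

```latex
\operatorname{logit}(p_{ij}) = \beta_0 + \beta_1 x_{ij} + \beta_2 z_j + u_j,
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2)
```

The coefficients $\beta_1$ and $\beta_2$ are the fixed effects, while the variance $\sigma_u^2$ summarises how much areas differ from one another once $x_{ij}$ and $z_j$ are accounted for.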
A – The historical roots of multilevel analysis
As Searle and colleagues report,58 the first formulation of a random-effects model in the
literature dates from 1861 (although the model was not given that name at the time). The value
of distinguishing variance components when units are grouped within clusters was explicitly
stated from the 1930s onwards. The distinction between “fixed effect” and “random effect” and
the notion of a “mixed model” first appeared in 1947. The 1950s and 1960s brought major
developments in the methods used to estimate variance components. Around the 1970s, the
weaknesses of the Analysis of Variance (ANOVA) estimation method began to be widely
recognised, and the estimation approach based on maximum likelihood developed.58 As for the
multilevel approach practised today in social epidemiology, Snijders and Bosker consider that
it took shape during the 1980s through the merging of the contextual analysis tradition with
the statistical tradition of mixed models.50 In the earlier period, contextual analysis simply
used classical regression models to identify contextual variables with a potential influence
on phenomena. From 1980 onwards, different teams developed the algorithms for estimating
regression models with nested random coefficients, as well as the software to do so.59, 60 By
1986, the foundations of multilevel analysis, including both the statistical tools and the
methodology for using them, had been laid.
The multilevel approach was first used in educational research, where the data concern pupils
grouped within classes, themselves grouped within schools, and where the hierarchical
structure seems inescapable.60 In social epidemiology, the approach was not used to study the
effects of the residential context on health until the 1990s, and only really became
established in the middle of that decade.2, 61, 62, 63 Methodological articles have been
published since 1998,1 while the first literature review dates from 2001.4 However, as we now
discuss, several distinct tendencies are emerging in the way multilevel models are used.
B – Different approaches to using multilevel models
In contextual analysis, then, the aim is to describe the effects of context on individuals’
health. Multilevel models provide different tools for doing so, whose relative importance is
assessed differently by authors in the literature. A first approach, followed by the pioneers
of the contextual analysis literature as well as in many more recent studies, focuses
exclusively on measures of association between contextual factors and individual response
variables.4, 64, 65 In this case, the value of multilevel models is that they take the
hierarchical structure of the data (individuals grouped within areas of residence) into
account when estimating the parameters, and thus yield standard errors for the measures of
association that account for the within-area correlation of the response variable.
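Why this correction matters can be seen from a standard result for clustered data, recalled here as background rather than as a formula taken from the thesis. Assuming, for simplicity, areas of equal size $n$ and an intraclass correlation $\rho$ of the response within areas, the variance of an estimated mean is inflated, relative to independent observations, by the design effect:

```latex
\mathrm{Deff} = 1 + (n - 1)\,\rho
```

With $n = 50$ residents per area, even $\rho = 0.02$ gives $\mathrm{Deff} = 1.98$, so standard errors computed as if observations were independent would be too small by a factor of about $\sqrt{1.98} \approx 1.41$.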
Such a use of multilevel models is in fact restrictive. In a literature review published in
the Revue d’Epidémiologie et de Santé Publique,66 we showed that taking the random effects of
the models into account provides information that is useful for interpreting associations
between contextual factors and health phenomena.3, 6, 23, 34, 67 However, using random effects
merely as an aid to interpreting associations between explanatory factors and health
phenomena may still seem limited. The aim of an editorial project led by Juan Merlo of Malmö
University Hospital in Sweden is to stress that such random effects in themselves provide
important public health information on geographical variations in phenomena, in the form of
indicators called “measures of variation”, as opposed to the classical “measures of
association”.13, 68, 69 We believe that the usefulness of such “measures of variation” (such as
the intraclass correlation coefficient or variance partition coefficient) has been
underestimated in the literature. One possible explanation is that contextual studies often
yield between-area variances of the random effects that are extremely small and not
significantly different from zero, which indicates that context is of little importance for
the phenomena studied. Rather than dwelling on the negative information conveyed by this
parameter, many authors seem to strive to find associations between contextual factors and
health phenomena (whose magnitude is also small) in order to conclude that context has an
impact on health.
It is therefore not unimportant from a public health standpoint to clarify the respective
value of measures of association (derived from the fixed effects of the multilevel model) and
measures of variation (derived from the random effects), which provide complementary
information for judging the real importance of context for health.13, 68, 69
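For a binary outcome analysed with a multilevel logistic model, measures of variation can be computed from the estimated between-area variance. The Python sketch below uses two formulas that are common in this literature: the intraclass correlation under the latent-variable formulation, in which the individual-level variance is fixed at π²/3, and the median odds ratio. The function names and the example variance are illustrative assumptions, and the median odds ratio is shown as a commonly reported index, not as one this section names.

```python
import math
from statistics import NormalDist


def icc_latent(area_variance):
    """Intraclass correlation on the latent (logit) scale.

    The individual-level variance of the standard logistic distribution
    is pi^2 / 3, so the ICC is the area share of the total variance.
    """
    return area_variance / (area_variance + math.pi ** 2 / 3)


def median_odds_ratio(area_variance):
    """Median odds ratio between two randomly chosen areas.

    Computed as exp(sqrt(2 * variance) * z), where z is the 75th percentile
    of the standard normal distribution.
    """
    z75 = NormalDist().inv_cdf(0.75)
    return math.exp(math.sqrt(2 * area_variance) * z75)


if __name__ == "__main__":
    sigma2_u = 0.2  # hypothetical between-area variance on the logit scale
    print(f"ICC = {icc_latent(sigma2_u):.3f}")
    print(f"MOR = {median_odds_ratio(sigma2_u):.2f}")
```

With the hypothetical variance of 0.2, the intraclass correlation is about 0.057 and the median odds ratio about 1.53. A variance of zero gives an ICC of 0 and a median odds ratio of 1, meaning no variation between areas.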
C – Comparing the multilevel approach with a spatial analysis perspective in the study of
contextual effects
It is, however, in the critique of the multilevel approach, whose status is almost hegemonic
in the contextual analysis literature in social epidemiology,4 that our thesis work finds its
main thrust. Our general aim is to show that the multilevel approach, because of the way it
conceives of space, does not provide optimal information on the spatial variability of health
phenomena. The multilevel approach treats space as fragmented into distinct areas, most often
defined by administrative boundaries. The geographical literature on the “modifiable areal
unit problem” has long shown that the results of analyses relying on an administrative zoning
of the territory depend heavily on the zoning used.70, 71, 72, 73 The aggregation effect
involved is due, first, to scale, since areas can be defined at a more or less local level
(“scale effect”). Second, at a given scale, the boundaries considered can group individuals in
a multitude of different ways. Consequently, both the indicators that quantify variations from
one area to another and the measures of contextual effects depend on the zoning used, and
substantial differences in results may be observed if other partitions of the territory are
used.74
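Both aspects of the modifiable areal unit problem can be reproduced on a deliberately small, invented data set. In the Python sketch below, twelve hypothetical individuals are lined up along a street with a binary outcome, and the same individuals are grouped under three zonings: a fine one, a coarser one (scale effect), and a fine one with shifted boundaries. The data and the zoning functions are assumptions made for illustration only.

```python
# Binary outcome of 12 hypothetical individuals ordered by position along a line.
outcomes = [1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0]


def between_zone_variance(outcomes, zone_of):
    """Variance of zone means, where zone_of maps a position to a zone id."""
    zones = {}
    for position, y in enumerate(outcomes):
        zones.setdefault(zone_of(position), []).append(y)
    means = [sum(values) / len(values) for values in zones.values()]
    grand_mean = sum(means) / len(means)
    return sum((m - grand_mean) ** 2 for m in means) / len(means)


# Three ways of cutting the same line into zones.
fine = between_zone_variance(outcomes, lambda p: p // 3)           # 4 zones of 3
coarse = between_zone_variance(outcomes, lambda p: p // 6)         # 2 zones of 6
shifted = between_zone_variance(outcomes, lambda p: (p + 1) // 3)  # boundaries moved by 1

if __name__ == "__main__":
    print(f"fine zoning:    {fine:.3f}")
    print(f"coarse zoning:  {coarse:.3f}")
    print(f"shifted zoning: {shifted:.3f}")
```

The individuals never change, yet the between-zone variance is 0.25 with the fine zoning, 0 with the coarse one, and about 0.116 when the boundaries are shifted by one position.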
Beyond this dependence of the indicators on the zoning used, a more important limitation of
multilevel models is that they ignore the spatial relationships between areas and assume that
individuals from different areas are completely independent, even when these areas are
adjacent or close. By neglecting this possible correlation between neighbouring areas,
multilevel models cannot provide optimal information on the spatial distribution of phenomena:
they indicate only the strength of the correlation of health phenomena within areas, not the
range of this correlation in space.
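The contrast between strength and range of correlation can be written down explicitly. In a random-intercept multilevel model, the area effects of two individuals living in areas $j$ and $j'$ are either identical or independent. A spatially continuous alternative lets the correlation decay with the distance $d(s, s')$ between two locations $s$ and $s'$. The exponential form below is only one common choice from geostatistics, shown as an assumption for illustration; $\sigma^2$ and $\phi$ are notation introduced here.

```latex
\text{Multilevel:}\quad
\operatorname{Corr}(u_j, u_{j'}) =
\begin{cases} 1 & \text{if } j = j' \\ 0 & \text{if } j \neq j' \end{cases}
\qquad
\text{Spatial:}\quad
\operatorname{Cov}\bigl(u(s), u(s')\bigr) = \sigma^2 \exp\!\left(-\frac{d(s, s')}{\phi}\right)
```

Here $\sigma^2$ expresses the strength of the spatial variation and $\phi$ governs its range: the larger $\phi$, the farther apart two places can be while still resembling each other.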
Beyond the inadequacy of the indicators describing spatial variations in phenomena, another
limitation of the multilevel approach is that it systematically defines contextual factors at
the level of individuals’ administrative areas of residence. Yet nothing guarantees a priori
that the various contextual effects actually operate at the level of the administrative areas
considered.38 In some cases, individuals could also be affected by characteristics of the
context beyond the administrative boundaries of their area of residence, since their daily
activities probably lead them to move within this wider space. Conversely, in other cases, the
administrative areas considered could be far too large to capture a contextual influence
likely to operate at a more local level.
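The boundary problem can be illustrated with a toy example. The Python sketch below places ten hypothetical residents along a line crossing two invented zones, "west" and "east", with different income levels. It compares the usual zone-level measure with a measure averaged over everyone living within a given radius of the resident. The data, the radius, and the function names are assumptions for illustration; this is one simple way of centring the measure on the place of residence, not necessarily the method developed in the thesis.

```python
import math

# Hypothetical residents: (x, y, zone, income). The zone boundary lies between x = 4 and x = 5.
residents = [
    (float(x), 0.0, "west" if x < 5 else "east", 10.0 if x < 5 else 30.0)
    for x in range(10)
]


def zone_mean(residents, index):
    """Mean income of the administrative zone in which the resident lives."""
    zone = residents[index][2]
    values = [r[3] for r in residents if r[2] == zone]
    return sum(values) / len(values)


def buffer_mean(residents, index, radius):
    """Mean income of all residents (self included) within `radius` of the resident."""
    x0, y0 = residents[index][0], residents[index][1]
    values = [
        r[3] for r in residents
        if math.hypot(r[0] - x0, r[1] - y0) <= radius
    ]
    return sum(values) / len(values)


if __name__ == "__main__":
    for index in (0, 4):  # a resident far from the boundary, and one right next to it
        print(index, zone_mean(residents, index), buffer_mean(residents, index, radius=2.0))
```

For the resident at the western edge (index 0), both measures give 10. For the resident next to the boundary (index 4), the zone-level measure is still 10, whereas the radius-based measure is 18, because it takes the nearby eastern neighbours into account.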
These various limitations of the multilevel approach stem from its conception of space as
fragmented into arbitrary administrative areas disconnected from one another. Because of this
definition of space, both the measures of variation discussed above and the measures of
association between contextual factors and health phenomena prove partly inefficient at
describing the spatial distribution of health phenomena. As part of international
collaborations, we conducted two studies, one based on French data and the other on Swedish
data, in which we used a spatial approach to analysing contextual effects that differs from
the commonly used multilevel approach. The basis of this spatial perspective is to rely on a
continuous conception of space when analysing variations in health phenomena. A first aspect
of this approach is to rely on spatial regression models which, although they differ from one
another, share the feature of not fragmenting space into areas disconnected from one
another.75, 76, 77, 78, 79, 80, 81 By applying the notion of “measure of variation” defined for
the multilevel model, one of our aims is to stress that spatial regression models yield
indicators that provide more information on the spatial distribution of phenomena than those
obtained from multilevel models.
The second benefit of a spatial approach to contextual analysis is that it frees the
definition of explanatory contextual factors from administrative boundaries.82, 83, 84 We
developed methods for measuring exposure to contextual characteristics that take account of
contextual information in a continuous space centred on individuals’ place of residence. One
advantage of these approaches is that, for individuals who live on the margins of
administrative areas, they most likely capture the effects of the surrounding context better
than measures made at the level of those areas.14 One of our studies on this topic used Swedish
data from the Population Registers, in which we were able to geolocate individuals very
precisely. These data allowed us to make progress in developing methods for measuring
contextual factors in continuous space.
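As an illustration only, the idea of measuring a contextual factor in a continuous space centred on the residence can be sketched with a simple circular buffer. The source does not specify the measure actually used; the function name, the radius, the planar coordinates, and the toy data below are all hypothetical.

```python
import math

def buffer_mean(residence, neighbours, radius):
    """Mean of a contextual characteristic over all neighbours located
    within `radius` of `residence` (planar coordinates, same unit).

    residence  -- (x, y) of the individual's home
    neighbours -- iterable of (x, y, value) for surrounding inhabitants
    Returns None when nobody lives inside the buffer.
    """
    x0, y0 = residence
    values = [v for x, y, v in neighbours
              if math.hypot(x - x0, y - y0) <= radius]
    return sum(values) / len(values) if values else None

# Hypothetical data: (x, y, 1 if low income else 0), coordinates in metres
people = [(0, 0, 1), (100, 0, 0), (0, 200, 1), (900, 900, 0)]
share_low_income = buffer_mean((0, 0), people, radius=500)
```

Because the buffer follows the individual rather than the administrative boundary, two people living on either side of a boundary receive nearly identical exposure values in this sketch.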
The objective of our two studies was to compare the multilevel approach commonly used in
the literature with this spatial perspective. We sought to determine whether taking account of
the intrinsic continuity of space yields information on the spatial distribution of phenomena,
from both classical measures of association and measures of variation, that would remain
inaccessible under the multilevel approach, which fragments the territory into administrative
areas disconnected from one another.
3) Outline of the document
In the remainder of this document, the first chapter deals with the usefulness of the
multilevel approach in contextual analysis. We describe step by step the functions of
multilevel models that are of interest for this kind of analysis. In this part, we first report
the article written at the beginning of the thesis and published in the Revue d’Epidémiologie
et de Santé Publique,66 which describes how multilevel models are used in the literature. One
criticism that can be levelled at a large number of studies is that they do not make enough use
of the information provided by the random effects of the models. In the rest of the first
chapter, we describe our participation in a project led by Juan Merlo (Department of Community
Medicine, Malmö University Hospital, Sweden), which aims to highlight the value of measures of
variation (based on the random effects of the models) in contextual analysis. This project is
supported by the Journal of Epidemiology and Community Health, which commissioned a series of
didactic articles from Juan Merlo. I contributed as second or third author to the first three
articles of the series and am first author of the fourth. The first two articles have already
been accepted for publication by the journal,13, 68 and the next two are under revision.
Whereas the first three articles of the series deal with the simple linear multilevel model
(suited to continuous dependent variables), the fourth is devoted to the logistic model. In the
rest of the first chapter, we first report a research letter that we published in the American
Journal of Epidemiology on measures of variation (or “clustering”) suited to the logistic
model.85 We then report the fourth article of the series, after summarising the content of the
preceding articles.
In a second chapter, we briefly summarise the content of four articles published or
accepted in European and English journals (European Journal of Epidemiology, European Journal
of Public Health, Public Health).86, 87, 88, 89 These are preliminary applications of
multilevel models to the contextual analysis of health-related behaviours (tobacco and alcohol
consumption, physical inactivity, patterns of healthcare utilisation). These studies were based
on data from the INPES Baromètre Santé 2000.90, 91 They have important limitations, notably
because we had no geographical location information more precise than individuals’ département
of residence.
In the third chapter, we present the main work of our thesis, which aims to compare the
multilevel approach conventionally used in the literature with a spatial perspective that
studies variations in health phenomena in continuous space. We first report an article that
applies spatial analysis techniques to the study of patterns of healthcare utilisation in
France. This study was based on French data from the IRDES SPS survey.92 We submitted a second
version of the article to the Journal of Epidemiology and Community Health, which is currently
examining the corrections we made to our work. We then report a second study based on Swedish
data from the Population Registers, in which we apply the latest methodological advances in
spatial analysis to the study of geographical variations in mental and behavioural disorders
due to psychoactive substance use. This work was carried out in close collaboration with Juan
Merlo of the Department of Community Medicine at Malmö University Hospital.
Chapter I – Usefulness of the multilevel approach in
social epidemiology
1) The use of multilevel models in the contextual analysis literature
At the beginning of the thesis, we first sought to establish the methodological state of
the art in contextual analysis.66 The finding at the time, which still holds today, was the
overwhelming dominance of the multilevel approach. However, although almost all authors claim
to follow this approach, the use they make of it varies, and the value of multilevel models in
contextual analysis is assessed in different ways.
In the most restrictive usage, a large number of authors use models that take account of
the hierarchical structure of the data solely to allow for the non-independence of individuals
within areas when estimating the standard errors of the fixed effects.4, 20, 64, 65 Indeed,
classical regression models (which include no random effects) often overstate the statistical
significance of contextual effects (by underestimating the standard errors of these
parameters). By taking account of the hierarchical structure of the data, multilevel models
yield less biased estimates of the standard errors of the strengths of association. In this
type of use of multilevel models, authors therefore pay attention only to the strengths of
association (the fixed effects of the model) and most often do not report the random effects,
which they in any case refrain from interpreting.
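To give a rough sense of how much a classical model can understate standard errors, the sketch below uses the standard Kish design-effect approximation, 1 + (m - 1) × ICC, which assumes areas of equal size m and applies most directly to area-level quantities. This approximation is not taken from the source, and the numbers are hypothetical.

```python
import math

def design_effect(cluster_size, icc):
    """Kish design effect for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1.0 + (cluster_size - 1) * icc

def se_inflation(cluster_size, icc):
    """Factor by which a naive (independence-based) standard error
    would need to be multiplied under this approximation."""
    return math.sqrt(design_effect(cluster_size, icc))

# Hypothetical setting: 100 individuals per area, within-area correlation of 5%
deff = design_effect(100, 0.05)
factor = se_inflation(100, 0.05)
```

Under these assumed values the variance is understated by a factor of about six, so even a small within-area correlation can matter when areas are large.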
However, a number of authors have pointed out that the random effects of multilevel models
can also provide useful information.3, 6, 23, 34, 67 Indeed, random effects help in
interpreting the associations between explanatory factors and the phenomena studied. By
distinguishing between-area variance from individual-level variance, they indicate the
magnitude of the variations to be explained at each level by the factors included in the
analyses.93, 94 Authors are mainly interested in how the residual between-area variance of the
phenomenon changes when individual and then contextual factors are introduced into the model.
Quantifying the reduction in between-area variance as the various explanatory factors are
successively introduced makes it possible to assess the weight of each of these variables in
producing the geographical disparities of the phenomenon. By introducing individuals’
characteristics, one can thus quantify the weight of compositional effects, that is, the share
of between-area variability that is due to the varying composition of areas in terms of
individual characteristics.3, 23 Authors then examine whether significant variations persist
between areas after adjustment for individual factors, and put forward hypotheses on the
possible existence of genuinely contextual effects. Finally, they seek to quantify the
contribution of the various contextual effects to between-area variability, and to see whether
the set of contextual factors considered explains this variability.3, 21
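The reduction in between-area variance described above can be expressed as a proportional change between two nested models. The following is a minimal sketch; the function name and the three variance values are hypothetical and are not results from the thesis.

```python
def proportional_change_in_variance(var_reference, var_adjusted):
    """Share of the reference between-area variance that disappears
    once additional covariates are introduced."""
    if var_reference <= 0:
        raise ValueError("reference variance must be positive")
    return (var_reference - var_adjusted) / var_reference

# Hypothetical between-area variances from three nested models:
# empty model, + individual factors, + contextual factors
v_empty, v_individual, v_contextual = 0.40, 0.30, 0.12

compositional_share = proportional_change_in_variance(v_empty, v_individual)
contextual_share = proportional_change_in_variance(v_individual, v_contextual)
```

In this toy example a quarter of the initial between-area variance would be attributed to composition, and the contextual factors would account for 60% of what remains.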
A first article published in the Revue d’Epidémiologie et de Santé Publique allowed us to
describe these different ways of using multilevel models in the contextual analysis
literature.66 One of the findings of this work is that the study of contextual effects on
health has developed considerably in Northern Europe, England, and the United States over the
last decade, but has not seen a similar expansion in France, where it remains marginal within
social epidemiology.
2) The value of measures of variation as independent sources of information on
the impact of context on health
The literature review published in the Revue d’Epidémiologie et de Santé Publique thus
allowed us to give an overview of the ways in which multilevel models are used in the
contextual analysis literature in social epidemiology.66 In that work, we took a particular
interest in applications that sought to interpret the random effects of multilevel models. Our
subsequent methodological thinking led us to look more closely still at the usefulness of
modelling the variance of health phenomena, beyond the vector of expectations.
This thinking was partly carried out in collaboration with a Swedish researcher, Juan
Merlo, one of whose aims is to make researchers in social epidemiology aware of the value of
modelling between-area variances (or within-area correlations) in order to assess the
importance of context for health.12, 69, 95 His approach leads to a distinction between
“measures of variation” (obtained in particular from the random effects of multilevel models)
and the more classical “measures of association” between explanatory factors and health
variables.69
The collaboration with Juan Merlo was structured in particular around the writing of a
series of didactic articles on multilevel models, which an editor of the Journal of
Epidemiology and Community Health had commissioned from him. By the time we began this work,
several accounts of the value of multilevel models in social epidemiology had already been
published in the literature.2, 3 However, our angle is original. Indeed, this work led by Juan
Merlo seeks to support researchers in social epidemiology who have little training in
statistics, by enabling them to understand intuitively the value of multilevel models in
contextual analysis. Beyond that, and more originally, our objective is to highlight the
usefulness, from a public health standpoint, of modelling the geographical variance of health
phenomena over and above their associations with contextual factors. The series of articles is
thus organised around the distinction between “measures of association” and “measures of
variation”.13, 68 While noting that the latter indicators have been under-used in the
literature, the articles aim to show in a didactic way the value of these “measures of
variation” when assessing the real impact of context on individuals’ health.
The first three articles of the series are devoted to the simple linear multilevel model,
which is used to model between-area variations in continuous variables. A first objective is to
highlight the value of the intraclass correlation coefficient, which expresses the share of the
total variation in the phenomenon that occurs at the level of the areas of residence.50, 85, 96
One recommendation of the article is that this information should never be neglected in
contextual analysis studies. Too many studies in which the within-area correlation coefficient
is close to zero conclude that the residential context has an impact on health on the basis of
weak associations between contextual factors and health phenomena (odds ratios of around 1.5).
Before any individual or contextual characteristics are introduced into the models, the
intraclass correlation coefficient indicates whether it is important to take the context into
account to explain variations in the phenomenon, or whether the context can be neglected and
the analyses conducted with individual factors alone.13, 69
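For a linear two-level model, the intraclass correlation coefficient is the between-area variance divided by the sum of the between-area and individual-level variances. A minimal sketch follows, with hypothetical variance values and a function name chosen for illustration.

```python
def intraclass_correlation(var_between, var_within):
    """Share of total variance located at the area level
    in a two-level linear model (empty or adjusted)."""
    total = var_between + var_within
    if total <= 0:
        raise ValueError("total variance must be positive")
    return var_between / total

# Hypothetical variance components from an empty model
icc = intraclass_correlation(var_between=2.0, var_within=38.0)
```

With these assumed components, 5% of the total variation would lie between areas.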
Naturally, the reference values chosen to judge the importance of the within-area
correlation are lower than those used for the correlation of behaviours within a household, or
for the correlation of measurements made within the human body. An overview of the literature
suggests that a within-area correlation below 1% expresses a very low level of similarity
between individuals in the same area, and therefore indicates that the context has no impact on
the health phenomenon studied. A within-area correlation of around 3% indicates that the
phenomenon is somewhat sensitive to the residential context, and a correlation of 5% or more is
a sign that the context plays an important role. These values may seem very low compared with
the reference values usually used to judge the importance of a correlation, and the 5% of
variation that occurs at the area level might seem unimportant compared with the remaining 95%
that occurs at the individual level (individual variation within areas). However, experience
shows that one generally manages to explain only a very small part of the variation at the
individual level, whereas a large share of the between-area variation can often be explained
with a small number of contextual factors. As a result, even if between-area variation makes up
only 5% of the total variance of the phenomenon, it generally carries considerably more weight
when one considers only the part of the variance that has been explained by individual and
contextual factors.
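The argument can be made concrete with a small worked example. The explained fractions below (60% of the between-area variance, 3% of the individual-level variance) are purely hypothetical and are chosen only to illustrate the reasoning.

```python
def between_area_share_of_explained(var_between, var_within,
                                    explained_between, explained_within):
    """Share of the explained variance that lies at the area level.

    explained_between / explained_within are the fractions (0..1) of each
    variance component accounted for by the covariates in the model.
    """
    eb = var_between * explained_between
    ew = var_within * explained_within
    if eb + ew == 0:
        raise ValueError("no variance explained at either level")
    return eb / (eb + ew)

# 5% of total variance between areas, 95% between individuals
share = between_area_share_of_explained(5.0, 95.0, 0.60, 0.03)
```

Under these assumptions, area-level variation represents only 5% of the total variance but slightly more than half of the variance actually explained.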
The first three articles of the series also gave a thorough presentation of how the
variance partition coefficient can be used,96 emphasising the value of the information it
provides for public health.5, 68 This indicator, which had not been presented in such detail in
previous methodological accounts, is a generalisation of the intraclass correlation coefficient
to the case where the within-area correlation depends in a complex way on the individual
characteristics included in the model.95 On the one hand, the between-area variance can be
modelled as a function of individual characteristics. This shows that the between-area variance
(or importance of the context for the phenomenon) varies from one group of individuals to
another. On the other hand, the individual-level variance (the variance between individuals
within each area) may also differ in magnitude from one type of individual to another.2 Thus, a
group of individuals for whom the health variable takes high values on average could show
greater variability than another group with lower average values. In a recent study published
by Juan Merlo in the American Journal of Epidemiology,95 it was shown, for example, using data
on individuals from different countries (the MONICA study), that blood pressure levels varied
considerably from one country to another. However, a more thorough analysis indicated that the
between-country variations in blood pressure were markedly larger for overweight individuals.
Furthermore, at the individual level (that is, between individuals from the same country),
blood pressure also proved to be more variable among overweight individuals.
In such an example, both the between-area variation and the individual-level variation
depend on an individual factor, body mass index. Consequently, the variance partition
coefficient, which is calculated from these two components of variance, is not constant but
becomes a complex function of this individual factor. By indicating that the within-country
correlation tends to be higher among overweight individuals than among others, the variance
partition coefficient provides information that is relevant from a public health standpoint,
since it identifies a population subgroup that is more sensitive to context.
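A minimal sketch of a variance partition coefficient that depends on an individual covariate, assuming that each variance component is modelled with a random intercept and a random coefficient for that covariate (so each variance is a quadratic function of the covariate). All parameter values are hypothetical and do not come from the MONICA analysis.

```python
def quadratic_variance(x, var_intercept, cov_intercept_slope, var_slope):
    """Variance implied at covariate value x by a random intercept and a
    random coefficient: var0 + 2 * cov01 * x + var1 * x**2."""
    return var_intercept + 2.0 * cov_intercept_slope * x + var_slope * x * x

def variance_partition_coefficient(x, between_params, within_params):
    """Share of the variance at covariate value x that lies between areas."""
    between = quadratic_variance(x, *between_params)
    within = quadratic_variance(x, *within_params)
    return between / (between + within)

# Hypothetical (var0, cov01, var1) at each level; x = 1 if overweight, else 0
between_params = (10.0, 2.0, 4.0)
within_params = (190.0, 10.0, 20.0)

vpc_not_overweight = variance_partition_coefficient(0, between_params, within_params)
vpc_overweight = variance_partition_coefficient(1, between_params, within_params)
```

Here both variance components are larger for overweight individuals, and the coefficient rises from 5% to about 7%, which is the kind of subgroup difference the text describes.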
The fourth article of the series, on which I focused more particularly, addresses the
problem of measuring between-area variation when the response variable is binary.85, 96 This
question is particularly acute in epidemiology, and in social epidemiology, where the phenomena
studied (occurrence or not of a disorder, adoption or not of a behaviour, etc.) can often be
included in analyses only as binary variables. While following the didactic approach of the
preceding articles, a first objective of this fourth article is to explain why the intraclass
correlation coefficient used with the simple linear model is not suited to the multilevel
logistic model. The distinction between individual within-area variance and between-area
variance that exists in the linear model is much less clear in the logistic model.50, 96
Indeed, knowing the mean value of a continuous variable in each area of a territory, it would
be impossible to predict the values of that variable for each of the individuals in those
areas, since the individual variance within areas may be small or large. In such a case, the
intraclass correlation coefficient quantifies the relative weight of these two components of
variance. By contrast, for a binary variable, knowing the proportion of positive cases in each
area and the number of individuals per area, one could immediately determine the set of
individual values in each area (how many 0s and how many 1s), and hence the within-area
variances. Indeed, in accordance with the binomial distribution, the individual-level variance
of the variable is tied to the mean. As a result, the meaning of the intraclass correlation
coefficient is unclear in the logistic model.
Various definitions of the intraclass correlation coefficient have nevertheless been
proposed in the literature for the logistic model.50, 96 Since the intraclass correlation
coefficient is calculated from the individual-level variance and the between-area variance, and
the individual-level variance is a function of the prevalence of the phenomenon, the intraclass
correlation coefficient will itself be a function of prevalence. For two phenomena with
different prevalences, the intraclass correlation coefficient will therefore take different
values even if the two phenomena have between-area variations of identical magnitude. Thus, in
addition to being difficult to interpret in the logistic model, this indicator appears to be
biased by the prevalence of the phenomenon, and therefore seems ill-suited to assessing the
importance of context for a binary phenomenon.
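The dependence on prevalence can be illustrated with one possible definition, a first-order linearisation in which the between-area variance on the logit scale is converted to the probability scale and compared with the binomial variance p(1 - p). This is an assumption made for illustration; the source does not say which definitions were examined, and the values are hypothetical.

```python
def icc_linearised(var_between_logit, prevalence):
    """Approximate ICC for a two-level logistic model.

    Between-area variance on the probability scale is approximated by
    var_between_logit * (p * (1 - p))**2; the individual-level variance
    is the binomial variance p * (1 - p).
    """
    p = prevalence
    between = var_between_logit * (p * (1.0 - p)) ** 2
    within = p * (1.0 - p)
    return between / (between + within)

# Same between-area variance on the logit scale, two different prevalences
icc_common = icc_linearised(1.0, 0.50)
icc_rare = icc_linearised(1.0, 0.05)
```

With an identical between-area variance, the coefficient is 0.20 for the common outcome and under 0.05 for the rare one, which is the distortion described in the text.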
Consequently, other options have been proposed in the literature for measuring the
tendency of binary phenomena to cluster. For the multilevel logistic model, Klaus Larsen97 and
then Klaus Larsen and Juan Merlo98 defined a median odds ratio (MOR). For generalised
estimating equations, an indicator called the “pairwise odds ratio” has also been
proposed.56, 57, 99, 100 Although they differ from each other, these indicators offer a
solution to the difficulties raised by the intraclass correlation coefficient, and have the
advantage of quantifying clustering on the odds ratio scale commonly used in epidemiology. In a
letter published in the American Journal of Epidemiology, we called for reflection on the
meaning and validity of the various indicators of between-area variation in the logistic model,
and began comparing the respective advantages and drawbacks of these options. We report this
letter, together with the reply from the authors to whom it was addressed, at the end of this
chapter.85
The fourth article of the series provided an opportunity to take this reflection further.
We first set out different definitions of the intraclass correlation coefficient applied to the
logistic model and sought to identify their main weaknesses. Beyond that, as a solution, we
gave a didactic presentation of the median odds ratio recently developed by Klaus Larsen, an
indicator that had gone completely unnoticed in the social epidemiology literature.97 Unlike
the intraclass correlation coefficient of the logistic model, this indicator is independent of
the prevalence of the phenomena. It thus makes it possible to compare the magnitude of
between-area variations for phenomena with different prevalences. Moreover, this indicator
quantifies between-area variation on the odds ratio scale, and so makes it possible to compare
the importance of this variation with the strengths of association between individual or
contextual factors and health variables (which are themselves usually expressed as odds
ratios).
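A minimal sketch of the median odds ratio, using the commonly cited closed form MOR = exp(sqrt(2 × V) × z), where V is the between-area variance on the logit scale and z is the 75th percentile of the standard normal distribution (about 0.6745). The input variance is hypothetical.

```python
import math
from statistics import NormalDist

def median_odds_ratio(var_between_logit):
    """Median odds ratio for a random-intercept logistic model.

    The difference between two independent area effects has variance
    2 * V; the MOR is the exponential of the median absolute difference.
    """
    z75 = NormalDist().inv_cdf(0.75)
    return math.exp(math.sqrt(2.0 * var_between_logit) * z75)

# Hypothetical between-area variance on the logit scale
mor = median_odds_ratio(1.0)
```

Only the between-area variance enters the formula, so the result does not change with the prevalence of the outcome, and a variance of zero gives a MOR of exactly 1.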
In this article, we also stressed the importance of taking residual between-area variation
into account when interpreting associations between contextual factors and binary dependent
variables. It is usually assumed that the usefulness of a contextual factor for identifying
high-risk areas depends on the strength of its association with the health variable studied.
However, even where there is an association, if the residual between-area variation is large,
the contextual factor is of little use for identifying high-risk areas, since the risk level of
an area is then at least as much a function of random variation due to unmeasured factors. An
indicator called the “interval odds ratio” has recently been proposed for the logistic model;
it takes residual between-area variation into account when quantifying the importance of a
contextual factor for health.97 By relating the weight of a contextual effect to the residual
between-area variation, this indicator provides information that complements the usual odds
ratio.
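A minimal sketch of an 80% interval odds ratio, using the commonly cited form exp(β ± z × sqrt(2 × V)), where β is the coefficient of the area-level variable, V is the residual between-area variance on the logit scale, and z is the 90th percentile of the standard normal distribution (about 1.2816). The coefficient and variance below are hypothetical.

```python
import math
from statistics import NormalDist

def interval_odds_ratio(beta, var_between_logit, coverage=0.80):
    """Interval covering the middle `coverage` share of odds ratios obtained
    when comparing two individuals with identical individual covariates,
    drawn from two areas that differ on the area-level variable."""
    z = NormalDist().inv_cdf(0.5 + coverage / 2.0)
    spread = z * math.sqrt(2.0 * var_between_logit)
    return math.exp(beta - spread), math.exp(beta + spread)

# Hypothetical area-level odds ratio of 2 with residual variance 0.5
lower, upper = interval_odds_ratio(math.log(2.0), 0.5)
```

In this example the interval runs from roughly 0.56 to 7.2 and therefore contains 1, meaning that residual area differences are large relative to the contextual effect.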
In the end, far from treating the correlation of individuals within areas as a mere
nuisance to be taken into account only out of necessity, the approach we suggest for contextual
analysis gives a central place to the analysis of between-area variance and its structure.
Beyond classical measures of association, measures of variation or correlation can provide
information on the impact of context on individuals’ health, and thus deserve particular
attention in public health research.
A brief conceptual tutorial of multilevel analysis in social
epidemiology – using measures of clustering in multilevel logistic regression to
investigate contextual phenomena
Basile Chaix1,2
Juan Merlo1
Henrik Ohlsson1,3
Anders Beckman1
Kristina Johnell1,4
Per Hjerpe1,5
1 Department of Community Medicine (Preventive Medicine), Malmö University Hospital, Lund University,
Malmö, Sweden
2 Research Team on the Social Determinants of Health and Health Care, National Institute of Health and Medical
Research, Paris, France
3 Skåne County Council, Regional Office for drug utilisation studies
4 Centre for Family Medicine, Karolinska Institutet, Huddinge, Sweden
5 Skaraborg Institute, Skovde, Sweden
Corresponding author:
Juan Merlo, MD, PhD, Associate Professor
Department of Community Medicine (Section of Preventive Medicine)
Malmö University Hospital, Faculty of Medicine (Malmö campus), Lund University
S-205 02 Malmö
Sweden
[email protected]
Abstract
Study objective
For technical reasons, measures of variation are easier to interpret in multilevel linear regression than in
multilevel logistic regression. Since those measures are relevant for understanding contextual phenomena and
binary outcomes are frequent in social epidemiology, we aimed to present measures of variation appropriate for
the logistic case in a didactic rather than a mathematical way.
Design and participants
We used data from the Health Survey conducted in 2000 in the county of Scania, Sweden, which comprised
10,723 individuals aged 18–80 years living in 60 areas. Conducting multilevel logistic regression, we applied
different techniques (intra-class correlation (ICC), median odds ratio (MOR), and interval odds ratio (IOR)) to
investigate whether the individual propensity to consult private physicians was dependent on the area of residence.
Results
Both the ICC and the MOR provided information on the magnitude of dependence of the individual propensity of
consulting private physicians on the residential area. The MOR was more easily interpretable than the ICC and
showed that the unexplained heterogeneity between areas was of greater relevance than the individual variables
considered in the analysis (age, gender, and education) for understanding variations of the propensity of visiting
private physicians. Residing in high-education areas increased the probability of visiting private physicians.
However, the IOR indicated that the residual unexplained variability between areas was too important to allow a
clear distinction between low- and high-propensity areas based on the area educational level.
Conclusion
Measures of variation in logistic regression are easy to compute and provide an efficient means of quantifying the
importance of the residential context for understanding disparities in health and health-related behaviour.
In the study of contextual determinants of health, considering the extent to which individual health phenomena
cluster within areas is not only necessary for obtaining correct estimates in regression analysis. It also provides
relevant information that allows assessment of the importance that the context has for different individual health
outcomes.[1] [2]
In multilevel linear regression analysis it is easy to partition the variance between different levels and
compute measures of clustering that provide intuitive information for capturing contextual phenomena.[3] [4] [5]
However, for binary outcomes, the partition of variance between different levels does not have the intuitive
interpretation of the linear model. Despite these difficulties several methods have been developed in logistic
regression to obtain suitable epidemiological information on area-level variance and clustering within areas.[6] [7] [8] [9]
In the present study we investigated whether residing in a specific area determines individual health care-seeking
behaviour over and above individual characteristics. The present paper is the last in a series of four
from a project [10] that aims to explain, in a conceptual rather than a mathematical way, how to calculate and
interpret multilevel measures of variance and clustering.[3] [4] [5] The present study focuses on measures of
variation in logistic regression. We place special emphasis on the relevance of these measures in social
epidemiology and community health.[1]
The illustrative example
BACKGROUND AND OBJECTIVES
In Sweden individual economic resources are not a major determinant for choosing private v. public healthcare
practitioners since the county council supports patient fees in both cases. The choice of a private rather than a
public practitioner may express individual preferences, demands and expectations related to socioeconomic
position. Moreover, place of residence may influence this individual decision over and above individual
characteristics. In the present study, we used multilevel measures of variance and clustering to quantify the
contextual dimension of this healthcare seeking behaviour.
POPULATION AND METHODS
Data sources and variables
Our illustrative analysis was based on the Health Survey in Scania conducted in 2000, a postal self-administered
questionnaire survey.[11] Each of the 33 municipalities of the county of Scania, Sweden, corresponded to a survey
area, except the four largest municipalities Helsingborg, Kristianstad, Lund and Malmö, which were subdivided
into six, five, ten and ten administrative areas respectively. In total there were 60 different survey areas. The initial
survey sample consisted of 23,437 individuals born between 1919 and 1981, 13,715 (59%) of whom agreed to
participate.
After approval by the Ethical Committee at the Medical Faculty of Lund, survey data were linked to the
1999 patient administrative register, which contains individual-level information on utilisation of all publicly
financed health care. The present study only considered individuals who had had at least one contact with a health
care provider during 1999 (10,723 individuals aged 18–80 years).
The binary outcome distinguished those individuals who had consulted a private physician at least once
in 1999 from those who had not. Age was introduced as a continuous variable. The educational level was divided
into two categories (9 years or less, more than 9 years). An area-level socioeconomic variable, defined as the
percentage of highly educated inhabitants, was coded in two classes with the median value as the cut-off. This area
variable was derived from data on the whole population of the county.
Multilevel analysis
We aimed to investigate whether the residential area determined the choice of a private as opposed to a
public practitioner. We first estimated an “empty” model (Model i), which includes only a random intercept and
allows us to detect a possible contextual dimension of this phenomenon.[3] Thereafter, we
included the individual characteristics in the model (Model ii) to investigate the extent to which area-level
differences were explained by the individual composition of the areas.[4] Finally we added the area variable
(Model iii) to investigate whether this contextual phenomenon was conditioned by specific area characteristics.[5]
The multilevel logistic regression models were estimated with the Markov chain Monte Carlo (MCMC)
method using MLwiN software (version 1.2., Institute of Education, London).[12] [13]
The multilevel logistic regression
In logistic regression the aim is to predict the probability pI that a phenomenon (e.g., visiting a private physician)
occurs for a given individual as a function of a certain number of variables. Since the values of pI range from 0
to 1 and a regression analysis is better performed on values between -∞ and +∞, we transform pI into logit (pI),
which takes values between -∞ and +∞.[14]
More specifically, multilevel logistic regression considers that the individual probability is also dependent
on the area of residence of the individuals. This dependence on the context needs to be accounted for to obtain
correct regression estimates, but doing so also conveys substantive information in itself.[1] [15] [16]
In Model i (i.e., the empty model) the probability of visiting a private physician is a function only of the
area in which the individuals live, which is accounted for with an area-level random intercept:
$$\operatorname{logit}(p_I) = \log(\text{odds}) = \log\left(\frac{p_I}{1 - p_I}\right) = M_C + E_{C-A}$$
Equation 1
MC = overall mean probability (prevalence) expressed on the logistic scale
EC–A = area-level residual, defined as the shrunken (footnote 1) difference between MC (which expresses the overall
prevalence on the logistic scale) and MA (which expresses the prevalence in a given area on the logistic
scale). The area-level residuals are therefore on the logistic scale and normally distributed with mean 0 and
variance VA.
VA = area residual variance expressed on the logistic scale (i.e. variance around MC)
VI = pI (1 – pI) = individual variance expressed on the probability scale, and depending on the predicted
probability pI of the outcome.
In Model i the probability of visiting a private physician for an individual living in an area A depends on
MC and EC–A.
$$p_I = \frac{\exp(M_C + E_{C-A})}{1 + \exp(M_C + E_{C-A})}$$
Equation 2
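As a minimal numerical sketch of Equations 1 and 2 (not part of the original analysis), the following Python functions convert between the probability scale and the logistic scale. The intercept and the area residual used here are hypothetical values chosen only for illustration; they are not estimates from the study.

```python
import math


def logit(p: float) -> float:
    """Log odds of a probability p in (0, 1), as in Equation 1."""
    return math.log(p / (1.0 - p))


def inv_logit(x: float) -> float:
    """Probability corresponding to a value x on the logistic scale (Equation 2)."""
    return math.exp(x) / (1.0 + math.exp(x))


# Hypothetical values, for illustration only (not estimates from the study).
m_c = logit(0.30)  # an overall prevalence of 30% expressed on the logistic scale
e_c_a = 0.5        # shrunken residual of one area, on the logistic scale

p_area = inv_logit(m_c + e_c_a)  # predicted probability in that area
print(f"M_C = {m_c:.3f}, predicted probability in the area = {p_area:.3f}")
```

A positive residual moves the predicted probability of the area above the overall prevalence, and a negative residual moves it below.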
In Model ii the probability of visiting a private physician is a function of the area of residence of the
individuals and of the individual variables (i.e. sex, age, and education).
Logit (pI) = MC + β1 sexI + β2 ageI + β3 eduI + EC–A
β1, β2, β3 = regression coefficients for the individual covariates
Equation 3
In Model iii the probability of visiting a private physician depends on the residential area of the
individuals, on the individual variables and on the area variable (percentage of individuals with a high educational
level).
Logit (pI) = MC + β1 sexI + β2 ageI + β3 eduI + β4 eduA + EC-A
β4 = regression coefficient for the area-level educational variable
Equation 4
Measures of area-level variance and clustering in multilevel logistic regression
INTRACLASS CORRELATION AND THE RELATED VARIANCE PARTITION COEFFICIENT
We have previously discussed the relevance of the intraclass correlation coefficient (ICC) (also termed variance
partition coefficient (VPC) in its most general form) for understanding contextual phenomena expressed with
continuous variables.[3] [4] [5] In the linear case, the VPC gives the proportion of total variance in the
outcome that is attributable to the area level.
VPC = VA / (VA + VI)
Equation 5
where VA is the area-level variance and VI corresponds to individual-level variance.
In the linear model, the VPC is based on the clear distinction that exists between the individual-level
variance and the area-level variance. Indeed, knowing the mean value of a continuous outcome variable in each
area, you would not be able to infer the values of the variable for each individual: the individual-level variance
within areas could be small or very large. By contrast, with a binary variable the individual-level values (0 and 1)
are immediately known from the prevalence existing in each area. This absence of a clear distinction between
individual-level variance and area-level variance makes it trickier to compute and interpret the VPC in logistic
models.
Footnote 1. In multilevel regression analysis, the area-level residuals are “shrunken” towards their mean of 0, in an attempt to disentangle
the part of the variations that may be due to true variations between areas from that part which might be better attributed to
random variations. The fewer the number of individuals in an area, or the higher the variability within areas as compared to the
variability between areas, the more the value of the area-level residual will be shrunken towards 0. More detailed explanations
are provided in a previous paper.[3]
In multilevel linear regression both the individual-level and the area-level variances are expressed on the
same scale (for example, mmHg for systolic blood pressure). Therefore, partition of variance between different
levels is easy to perform for detecting contextual phenomena.[3] [4] [5] In multilevel logistic regression, however,
the individual-level variance and the area-level variance are not directly comparable. Whereas the area-level
residual variance VA is on the logistic scale, the individual-level residual variance VI is on the probability scale.
Moreover, VI is equal to pI (1 – pI) and therefore depends on the prevalence of the outcome (i.e. the predicted
probability).
To solve these technical difficulties, Goldstein and others [6] [17] have described some alternative
approaches for computing the VPC in the case of logistic regression. Two of these methods are (a) the simulation
method [7]; and (b) the linear threshold model method, or latent variable method proposed by Snijders and
Bosker.[17] Both methods convert the individual-level and area-level components of the variance to the same
scale before computing the VPC.
a) The principle of the simulation method is to translate the area-level variance from the logistic to the
probability scale in order to have both components of variance on the probability scale. These two components of
variance can then be used on the probability scale to compute the ICC with the usual formula (Equation 5). More
details on this approach are provided in table 1 and elsewhere.[6] [7]
Table 1. Hypothetical data showing that the size of the intra-class correlation (ICC) calculated by the simulation method
[6] in a multilevel logistic model depends on the prevalence of the outcome (i.e. the predicted probability). We present
eleven cases, all with the same area variance VA but with different outcome prevalences (pI).
Prevalence pI of the outcome (probability scale) | Prevalence of the outcome on the logistic scale (intercept MC) | Area variance VA on the logistic scale | Area variance converted to the probability scale* | Individual variance** | Intra-class correlation ICC = */(*+**)
0.01 | -4.6 | 0.2 | 0.00003 | 0.0108 | 0.002 (i.e. 0.2%)
0.1  | -2.2 | 0.2 | 0.00185 | 0.0936 | 0.019 (i.e. 1.9%)
0.2  | -1.4 | 0.2 | 0.00519 | 0.1589 | 0.032 (i.e. 3.2%)
0.3  | -0.8 | 0.2 | 0.00872 | 0.2079 | 0.040 (i.e. 4.0%)
0.4  | -0.4 | 0.2 | 0.01063 | 0.2304 | 0.044 (i.e. 4.4%)
0.5  | 0.0  | 0.2 | 0.01136 | 0.2386 | 0.045 (i.e. 4.5%)
0.6  | 0.4  | 0.2 | 0.01062 | 0.2305 | 0.044 (i.e. 4.4%)
0.7  | 0.8  | 0.2 | 0.00872 | 0.2080 | 0.040 (i.e. 4.0%)
0.8  | 1.4  | 0.2 | 0.00518 | 0.1590 | 0.032 (i.e. 3.2%)
0.9  | 2.2  | 0.2 | 0.00185 | 0.0936 | 0.019 (i.e. 1.9%)
0.99 | 4.6  | 0.2 | 0.00003 | 0.0108 | 0.002 (i.e. 0.2%)
*In order to convert the area-level variance to the probability scale, we simulated 100 000 area-level residuals EC–A based
on the area-level variance VA, and calculated the predicted probability in each of these 100 000 simulated areas as p = exp
(MC + EC–A) / [1 + exp (MC + EC–A)]. We computed the area-level variance on the probability scale as the variance of these
predicted probabilities.
**The overall individual-level variance is computed as the mean of the individual-level variances computed as p (1 – p)
for each of the 100 000 simulated values.
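The procedure described in the two table notes can be sketched in Python as follows. This is an illustrative reimplementation using only the standard library, not the code used to produce Table 1; the function name, the seed and the use of the population variance are assumptions of this sketch.

```python
import math
import random
import statistics


def icc_simulation(prevalence: float, area_variance: float,
                   n_sim: int = 100_000, seed: int = 1) -> float:
    """ICC by the simulation method for an empty multilevel logistic model."""
    rng = random.Random(seed)
    m_c = math.log(prevalence / (1.0 - prevalence))  # intercept on the logistic scale
    sd = math.sqrt(area_variance)
    # Predicted probability in each simulated area (Equation 2).
    probs = []
    for _ in range(n_sim):
        residual = rng.gauss(0.0, sd)  # simulated area-level residual
        probs.append(1.0 / (1.0 + math.exp(-(m_c + residual))))
    area_var_prob = statistics.pvariance(probs)                  # "*" in Table 1
    indiv_var = statistics.fmean(p * (1.0 - p) for p in probs)   # "**" in Table 1
    return area_var_prob / (area_var_prob + indiv_var)


for prev in (0.01, 0.1, 0.5):
    print(prev, round(icc_simulation(prev, 0.2), 3))
```

Up to Monte Carlo error, the values obtained should be close to the corresponding rows of Table 1, with the largest ICC at a prevalence of 50%.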
As noted previously, the individual-level variance depends on the expected prevalence. A first
consequence is that different phenomena with a similar area-level variance but a different prevalence (MC) will
have different VPCs. As illustrated in Table 1 using hypothetical data, for a given amount of area-level variation,
the VPC will always be the highest for outcomes with a prevalence of 50%. This aspect needs to be considered
when comparing the magnitude of clustering between phenomena with a different prevalence.
A second consequence occurs in the model including covariates. Since the VPC depends on the
prevalence, which in turn depends on the characteristics of the individuals, there will be one different VPC for
each different type of individual. Note that this heterogeneity in the VPC is just a consequence of the dependence
of this definition of the VPC on the prevalence of the outcome.
b) The linear threshold model method or latent variable method converts the individual-level variance
from the probability scale to the logistic scale, on which the area-level variance is expressed. In our case, the
method assumes that the propensity for visiting a private physician is a continuous latent variable underlying our
binary response (i.e. having visited a private physician or not). In other words, every individual has a certain
propensity for visiting a private physician but only individuals whose propensity crosses a certain threshold
actually do it. The unobserved individual variable follows a logistic distribution with individual-level variance VI
equal to π²/3 (i.e. 3.29).[6] [7] [17] On this basis, the VPC is calculated as:
VPC = VA / (VA + 3.29)
Equation 6
With this method, the VPC is a function only of the area-level variance and, unlike the VPC obtained with the
simulation method, does not directly depend on the prevalence of the outcome.
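As a short sketch of the latent variable method, the function below applies the formula VPC = VA / (VA + π²/3) to the three area-level variances reported in Table 2. The function name is an assumption of this example.

```python
import math


def vpc_latent(area_variance: float) -> float:
    """VPC by the latent variable (linear threshold) method."""
    # Variance of the standard logistic distribution, about 3.29.
    individual_variance = math.pi ** 2 / 3.0
    return area_variance / (area_variance + individual_variance)


# Area-level variances reported in Table 2 for the three models.
for label, v_a in (("empty model", 0.388),
                   ("individual-level variables", 0.379),
                   ("area-level variable", 0.275)):
    print(label, round(vpc_latent(v_a), 3))
```

The results agree with the ICC (latent variable method) row of Table 2 to three decimal places.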
These methods for computing the VPC in logistic models are each statistically coherent.
However, they attempt to apply to the logistic case notions that rest on the clear distinction
between individual-level variance and area-level variance that exists in the linear case. Since this
distinction is less clear in the logistic case, the VPC for dichotomous outcomes is awkward
to interpret in epidemiological terms.[6] [8] [18]
THE MEDIAN ODDS RATIO
The aim of the median odds ratio (MOR) [8] [9] is to translate the area-level variance into the widely used odds ratio
(OR) scale, which has a consistent and intuitive interpretation. In the present study, the MOR shows the extent to
which the individual probability of visiting a private physician is determined by residential area and is therefore
appropriate for quantifying contextual phenomena. The MOR is independent of the prevalence of the
phenomenon, and can be easily computed in the empty model and in more elaborate models.
To intuitively understand the rationale for the MOR, imagine that we consider all possible pairs of
individuals with similar covariates but residing in different areas. In Figure 1 we consider two different fictive
cases, one with weak variations between areas, the other with very important variations. Using the area-level
residuals of the multilevel model we compute the OR for each pair of individuals with the subject with the higher
odds always placed in the numerator (the OR is always larger than or equal to one). This procedure yields a
distribution of the OR. Figure 2 gives the distribution of the OR that we obtained in considering the 56 million
pairs of individuals from different areas that can be formed in our dataset. The MOR is the median of this
distribution.
Figure 1 Heterogeneity between areas in the utilisation of private health care providers as expressed using the
median odds ratio (MOR) computed from the empty multilevel logistic model. Two fictive cases including four
different areas are presented in the Figure. In the top part of the Figure we present a situation with very weak
variations between areas. In the bottom part of the Figure, area-level variations were much more important, which
will be reflected in a higher MOR. Considering the area-level residuals of the multilevel model, the odds ratio
between the individual at lowest risk and the individual at highest risk is computed for each pair of individuals
from different areas. The MOR is defined as the median value of the distribution of this odds ratio.
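[Figure 1: panel A (top), weak variations between areas; panel B (bottom), large variations between areas. Image not reproduced.]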
Figure 2 Considering the area-level residuals of the multilevel model, we computed the odds ratio between the
individual at lowest risk and the individual at highest risk for each pair of individuals from different areas. We
present the distribution of this odds ratio for the 56 million pairs of individuals from different areas that can be
formed in our sample of 10,723 individuals. As indicated in the Figure, the MOR is defined as the median value of
the distribution.
In practice, it is not necessary to empirically consider all possible pairs of individuals from different
areas. The MOR depends directly on the area-level variance and can be computed with the following formula:
MOR = exp [√(2 × VA) × 0.6745] ≈ exp(0.95 √VA)
Equation 7
where VA is the area-level variance.
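The formula follows from the fact that the difference between two independent area residuals, each with variance VA, is normally distributed with variance 2 × VA, and that 0.6745 is the 75th percentile of the standard normal distribution (the median of the absolute value of a standard normal variable). The sketch below, which is an illustration and not the authors' code, computes the MOR from the formula and checks it against a Monte Carlo version of the "all pairs" reasoning; the function names, number of pairs and seed are assumptions.

```python
import math
import random
import statistics

# 75th percentile of the standard normal distribution, about 0.6745.
Z_75 = statistics.NormalDist().inv_cdf(0.75)


def median_odds_ratio(area_variance: float) -> float:
    """MOR computed directly from the area-level variance."""
    return math.exp(math.sqrt(2.0 * area_variance) * Z_75)


def median_odds_ratio_by_pairs(area_variance: float, n_pairs: int = 200_000,
                               seed: int = 1) -> float:
    """Monte Carlo check: median OR over randomly drawn pairs of areas."""
    rng = random.Random(seed)
    sd = math.sqrt(area_variance)
    # The higher odds go in the numerator, so each OR is
    # exp(|difference between the two area residuals|) and is >= 1.
    odds_ratios = [math.exp(abs(rng.gauss(0.0, sd) - rng.gauss(0.0, sd)))
                   for _ in range(n_pairs)]
    return statistics.median(odds_ratios)


v_a = 0.388  # area-level variance of the empty model (Table 2)
print(round(median_odds_ratio(v_a), 2), round(median_odds_ratio_by_pairs(v_a), 2))
```

With the area-level variance of the empty model, the formula gives a value that rounds to the MOR of 1.81 reported in Table 2, and the simulated median should agree with it up to Monte Carlo error.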
If the MOR were equal to one, there would be no differences between areas in the probability of seeking a
private physician (as in the fictive case presented in Figure 1A). If there were large area-level differences (as
in Figure 1B), the MOR would be large and the area of residence would be relevant for understanding variations
in the individual probability of visiting a private physician.
The standard error of the area-level variance indicates the precision of the estimate. Also, using the
MCMC method available in MLwiN [12] and other software, we can directly compute a 95% credible interval
(CI) for the MOR. The MCMC method consists in running a chain in which values of the different parameters are
simulated until convergence. When the chain has converged, it just remains to consider subsequent simulated
values of the area-level variance (which describe the posterior distribution of the area variance) and compute the
MOR for each of these values. Considering the 2.5th and 97.5th percentiles of the resulting distribution of the
MOR yields a 95% CI for the MOR. In our example the MOR was equal to 1.81 in the empty model, with a 95%
CI (1.62 to 2.06) that clearly excluded the value 1 (Table 2).
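The percentile step described above can be sketched as follows. The posterior draws of the area-level variance used here are a hypothetical stand-in generated at random, since the actual MCMC chain from MLwiN is not available in the text; only the transformation and the percentile computation illustrate the procedure.

```python
import math
import random
import statistics

Z_75 = statistics.NormalDist().inv_cdf(0.75)  # about 0.6745


def mor_credible_interval(variance_draws):
    """Median and 95% credible interval of the MOR from draws of the area variance."""
    # Transform each simulated area-level variance into a MOR.
    mors = [math.exp(math.sqrt(2.0 * v) * Z_75) for v in variance_draws]
    # With n=40 the cut points are the 2.5th, 5th, ..., 97.5th percentiles.
    cuts = statistics.quantiles(mors, n=40, method="inclusive")
    return statistics.median(mors), cuts[0], cuts[-1]


# Hypothetical stand-in for a converged chain: NOT the posterior of the study.
rng = random.Random(1)
fake_draws = [rng.lognormvariate(math.log(0.38), 0.2) for _ in range(5000)]

median_mor, lower, upper = mor_credible_interval(fake_draws)
print(f"MOR = {median_mor:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Because the MOR is a monotonic function of the area-level variance, the same interval is obtained by taking the percentiles of the variance draws first and transforming the two bounds afterwards.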
One feature of interest of the MOR is that it is directly comparable with the ORs of individual or area
variables. In the model including individual-level variables (Table 2) the MOR was equal to 1.80, which indicates
that, in the median case, the residual heterogeneity between areas multiplied the individual odds of
seeking a private physician by 1.80 when comparing two randomly chosen individuals from different areas. The residual
heterogeneity between areas (MOR = 1.80) was of greater relevance than the impact of the individual’s level
of education (OR = 1.25) for understanding variations in the odds of seeking a private physician.
Table 2. Measures of association between individual and area characteristics and the outcome and measures of
variation and clustering in the utilisation of private providers in the county of Scania, Sweden, 2000, obtained from
multilevel logistic models*
 | Empty model | Model with individual-level variables | Model with the area-level variable
Measures of association (OR, 95% CI) | | |
Individual-level variables | | |
Female (v. male) | | 1.57 (1.44 to 1.70) | 1.56 (1.44 to 1.70)
Age (in 10 years unit) | | 1.13 (1.11 to 1.15) | 1.13 (1.11 to 1.15)
High (v. low) educational achievement | | 1.25 (1.13 to 1.38) | 1.24 (1.12 to 1.38)
Area-level variable | | |
High (v. low) percentage of highly educated inhabitants | | | 1.95 (1.45 to 2.62)
Interval odds ratio (IOR) | | | [0.75 to 5.05]
Measures of variation or clustering | | |
Area-level variance (SE) | 0.388 (0.080) | 0.379 (0.078) | 0.275 (0.059)
PCV† | | -2.3% | -27.4%
MOR (95% CI) | 1.81 (1.62 to 2.06) | 1.80 (1.62 to 2.04) | 1.65 (1.50 to 1.84)
ICC (latent variable method) | 0.105 | 0.103 | 0.077
ICC (simulation method) | 0.082 | 0.070 – 0.080‡ | 0.044 – 0.061‡
CI = credible interval; ICC = intraclass correlation; IOR = interval odds ratio; MOR = median odds ratio; OR = odds
ratio; PCV = proportional change in variance; SE = standard error.
*Multilevel models were estimated with the Markov Chain Monte Carlo method implemented in MLwiN (version
1.2., Institute of Education, London).
†The proportional change in variance expresses the change in the area-level variance between the empty model and
the individual-level model, and between the individual-level model and the model further including the area-level
covariate.
‡As discussed in the text, in a model including explanatory factors one different ICC is computed for each
combination of the explanatory factors. Note that this heterogeneity in the ICC is merely a consequence of the
dependence of the ICC on the prevalence.
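The proportional change in variance defined in the second table note can be checked with a few lines of Python. The explicit formula and the sign convention (a negative value means a reduction in area-level variance) are inferred here from the values reported in Table 2 rather than stated in the text, so they should be read as an assumption of this sketch.

```python
def proportional_change_in_variance(reference_variance: float,
                                    new_variance: float) -> float:
    """PCV in percent; a negative value means the area-level variance decreased."""
    return 100.0 * (new_variance - reference_variance) / reference_variance


# Area-level variances from Table 2.
pcv_individual = proportional_change_in_variance(0.388, 0.379)  # empty -> individual model
pcv_area = proportional_change_in_variance(0.379, 0.275)        # individual -> area model
print(round(pcv_individual, 1), round(pcv_area, 1))
```

With the variances of Table 2 this reproduces the reported values of -2.3% and -27.4%.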
TAKING AREA-LEVEL VARIANCE INTO ACCOUNT WHEN INTERPRETING ASSOCIATIONS BETWEEN AREA VARIABLES
AND HEALTH WITH THE INTERVAL ODDS RATIO
In multilevel models regression coefficients are adjusted for the dependence of the outcome within areas by
including the area-level residuals in the equation (Equations 1, 3 and 4). The regression coefficients for individual
variables, in being adjusted for area-level residuals, reflect the association between the individual-level variables
and the outcome within a specific area (and are termed “area-specific coefficients”, or “cluster-specific
coefficients”). However, for area variables, regression coefficients cannot be interpreted as being area-specific in
the same way as with individual variables: since area variables only take one value in each area it is necessary to
compare individuals with different area-level residuals to quantify the area-level effect.
In our data we found that living in areas with a high percentage of highly educated people increased the
individual probability of visiting a private physician. However, if residual variability between areas remains
important, the likelihood is high of finding an individual in a low-education area who presents higher odds of
consulting private providers than an individual in a highly educated area. It is therefore particularly useful to
consider the magnitude of area-level residual variations when interpreting effects of area-level variables. In order
to integrate the area-level fixed effect and the random residual variations we suggest using the 80% Interval Odds
Ratio (IOR-80), as described in detail elsewhere.[8] [9] As indicated in the two contrasted fictive cases in Figure
3, the usual OR consists in comparing the mean odds in low- and high-education areas. By contrast, when
comparing individuals in areas with low education with individuals in areas with high education, the IOR also
takes into account the specific area-level residuals.
Figure 3 Illustration of the rationale of the Interval Odds Ratio. Low-education areas are grouped on the left and
high-education areas on the right. The thick grey lines represent the mean odds of consulting private providers in
low-education and high-education areas. The log odds of consulting private providers in each of the 60 areas are
a function of the area educational level and of the area-level residual, and are represented as black segments over
and above the thick grey lines. The common odds ratio consists in comparing the thick grey lines. By contrast, the
interval odds ratio also takes into consideration the unexplained area-level variations, and therefore compares the
black segments of one individual selected in a low-education area and one individual from a high-education area.
We present two contrasted fictive cases. In the top part we present a situation in which area-level residual
variations are weak compared with the effect of the area educational level. Conversely, in the bottom part, the
area-level variations are much more important than the area educational effect. In that case, the likelihood is high
of finding an individual in a low-education area who presents higher odds of consulting private providers than
does an individual in a highly educated area.
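[Figure 3: panel A (top), weak area-level residual variations relative to the effect of the area educational level; panel B (bottom), large area-level residual variations. Image not reproduced.]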
Imagine we consider all possible pairs of individuals with similar covariates, in which one individual
resides in a low-education area and the other in a high-education area. For each pair, taking into account the
educational level and the residual of these areas, we compute the OR between the individual in the low-education
area and the individual in the high-education area (the latter individual is always placed in the
numerator of the OR, which may therefore be below or above one). Considering all possible pairs, we then
obtain the distribution of this OR. The IOR-80 is defined as the interval centred on the median of the distribution
that comprises 80% of the values of the OR. In Figure 4 we present the distribution of the OR for the area
educational level in our empirical example, and give the lower and upper bounds of the IOR.
In practice, it is not necessary to calculate the OR for each possible pair. Rather, the lower and upper
bounds of the IOR can be computed with the following equations:
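IORlower = exp[β + √(2 × VA) × (-1.2816)] ≈ exp(β – 1.81 √VA)
Equation 8
IORupper = exp[β + √(2 × VA) × (1.2816)] ≈ exp(β + 1.81 √VA)
Equation 9
where β is the regression coefficient for the area-level variable, VA is the area-level variance, and -1.2816 and
1.2816 are the 10th and 90th percentiles of the standard normal distribution.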
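As an illustrative sketch of the two bounds (not the authors' code), the function below computes the IOR-80 from a regression coefficient and an area-level variance, using the odds ratio of the area educational variable and the area-level variance of the corresponding model in Table 2. The function name is an assumption of this example.

```python
import math
import statistics

# 90th percentile of the standard normal distribution, about 1.2816.
Z_90 = statistics.NormalDist().inv_cdf(0.90)


def interval_odds_ratio_80(beta: float, area_variance: float):
    """Lower and upper bounds of the 80% interval odds ratio."""
    half_width = math.sqrt(2.0 * area_variance) * Z_90
    return math.exp(beta - half_width), math.exp(beta + half_width)


beta_area = math.log(1.95)  # log OR of the area educational variable (Table 2)
v_a = 0.275                 # area-level variance of the same model (Table 2)

lower, upper = interval_odds_ratio_80(beta_area, v_a)
print(f"IOR-80: {lower:.2f} to {upper:.2f}")
```

The bounds obtained are close to the interval of 0.75 to 5.05 reported in Table 2; small differences in the last decimal can arise because the inputs used here are rounded.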
Figure 4 Computation of the interval odds ratio (IOR) for the impact of the area educational variable on the
utilisation of private providers (continuation of figure 3). We consider all possible pairs of individuals with similar
individual covariates, in which one individual resides in a low-education area and the other in a high-education
area. For each pair, taking into account the educational level and the residual of these areas, we compute the odds
ratio between the individual in the low-education area and the individual in the high-education area (the latter
individual is always placed in the numerator of the odds ratio). Considering all possible pairs of
individuals from a low- and a high-education area in our sample, we obtain the distribution of the odds ratio
shown in the Figure. The IOR is defined as the interval centred on the median of the distribution that comprises
80% of the values of the odds ratio. In the Figure we give the lower and upper bounds of the IOR.
The IOR-80 is not a common confidence interval. The interval is narrow if the residual variation between
areas is small (Figure 3, top), and wide if the variation between areas is large (Figure 3, bottom). If the interval
contains the value one, this indicates that the effect of the area characteristic under scrutiny is not that important
when compared with the remaining residual area-level heterogeneity.
In our case, individuals residing in high- v. low-education areas had higher odds of visiting private
physicians (OR = 1.95, 95% CI: 1.45 to 2.62; Table 2). However, the IOR-80 was fairly wide (0.75 to 5.05) and comprised
the value one (Figure 4). In other words, in comparison with residual area-level variations, the educational variable
was not that important for understanding area-level variations in the individual propensity for seeking a private
practitioner. The IOR therefore brings complementary information to the information provided by the usual OR.
Discussion
We followed a didactic example on health care utilisation in Sweden to indicate how to calculate and interpret
several measures of variance which are appropriate for investigating contextual phenomena of a binary nature.
Measuring clustering of binary phenomena within areas is certainly more problematic than measuring
clustering in the linear case. Different methods have been developed to calculate the VPC in logistic models.[6]
[7] However, the simulation method leads to VPCs that are dependent on the prevalence of the outcome, and can
therefore not be used to compare the magnitude of clustering between phenomena with a different prevalence. On
the other hand, the threshold method for computing the VPC necessitates conversion of binary outcomes into
continuous linear latent variables, which may not be adequate for all phenomena. Furthermore, these methods for
calculating the VPC in logistic regression have interpretative drawbacks when it comes to measuring clustering of
phenomena, owing to the inherent difficulty of distinguishing the individual-level and the area-level variance in
the logistic case.[6] [8] [18]
Computing the MOR is an epidemiologically more suitable option for obtaining measures of variance in
logistic regression. It is not dependent on the prevalence of the outcome and furthermore allows expression of the
area-level variance on the well-known OR scale.[19] Therefore, it allows comparison of the magnitude of area-level variations with the impact of specific factors.[8] [9]
As previously discussed,[1] [5] it is useful to take into account the magnitude of residual random
variations between areas when interpreting associations between contextual factors and the outcome. In multilevel
logistic models this information is conveyed by the IOR, which indicates whether the contextual factor is useful to
identify high-risk areas, or whether area-level variations are too important to use the contextual factor in
distinguishing high-risk from low-risk areas.
It is noteworthy that measures of variance in logistic regression can be extended to include more complex
patterns of heterogeneity, following reasoning analogous to that presented for random slopes in linear regression
analysis.[3] [4] [5]
CONCLUSION:
As previously indicated by one of us [1] and explained in greater detail elsewhere,[5] strategies of
disease prevention need to combine a person-centred approach with approaches aimed at changing the residential
environment.[20] In order to gather information on cross-level causal pathways, which is useful in implementing
these interventions, it is relevant to investigate traditional measures of association between area socioeconomic
characteristics and individual health. However, for assessing the public health relevance of specific geographical
units (e.g. neighbourhoods, municipalities, or districts),[2] multilevel measures of health variation present
themselves as the appropriate epidemiological approach in social epidemiology.
The aim of our paper was to explain why measures of variation available in logistic regression should be
promoted in social epidemiological and public health research as efficient means of quantifying the importance of
the context of residence for understanding disparities in health and health-related behaviour.
References
1. Merlo J. Multilevel analytical approaches in social epidemiology: measures of health variation compared with traditional measures of association. J Epidemiol Community Health, 2003; 57:550-52.
2. Boyle MH, Willms JD. Place effects for areas defined by administrative boundaries. Am J Epidemiol, 1999; 149:577-85.
3. Merlo J, Chaix B, Yang M, et al. A brief conceptual tutorial of multilevel analysis in social epidemiology - linking the statistical concept of clustering to the idea of contextual phenomenon. J Epidemiol Community Health, 2004.
4. Merlo J, Yang M, Chaix B, et al. A brief conceptual tutorial of multilevel analysis in social epidemiology - investigating contextual phenomena in different groups of individuals. J Epidemiol Community Health, 2004.
5. Merlo J, Yang M, Chaix B, et al. A brief conceptual tutorial of multilevel analysis in social epidemiology - interpreting neighbourhood differences and the effect of neighbourhood characteristics on individual health. J Epidemiol Community Health, 2004.
6. Goldstein H, Browne W, Rasbash J. Partitioning variation in generalised linear multilevel models. Understanding Statistics, 2002; 1:223-32.
7. Rasbash J, Steele F, Browne W. Logistic models for binary and binomial responses, in A User's Guide to MLwiN. Version 2.0. Documentation Version 2.1e. 2003, Centre for Multilevel Modelling, Institute of Education, University of London: London, UK.
8. Larsen K, Merlo J. Appropriate assessment of neighborhood effects on individual health - integrating random and fixed effects in multilevel logistic regression. Am J Epidemiol, 2004; In press.
9. Larsen K, Petersen JH, Budtz-Jorgensen E, et al. Interpreting parameters in the logistic regression model with random effects. Biometrics, 2000; 56:909-14.
10. Merlo J. FAS - Swedish Council for Working Life and Social Research: "Socioeconomic disparities in cardiovascular diseases - a longitudinal multilevel analysis" (# 2003-0580). http://wwwfasforskningse/projekt/, 2003.
11. Hanson BS, Ostergren PO, Merlo J, et al. Halsoforhallande i Skane. Folkhalsoenkat Skane 2000 [Health Conditions in Scania. Public Health Questionnaire, Scania, 2000]. 2001, Malmö, Sweden: Department of Community Medicine, Malmö University Hospital.
12. Browne WJ. MCMC estimation in MLwiN. Version 2.0. 2003, London, UK: Centre for Multilevel Modelling, Institute of Education, University of London. 297p.
13. Rasbash J, Steele F, Browne W. A User's Guide to MLwiN. Version 2.0. Documentation Version 2.1e. Centre for Multilevel Modelling, Institute of Education, University of London, 2003.
14. Hosmer DW, Lemeshow S. Applied logistic regression. 2nd ed. 2000, Chichester, England: Wiley & Sons Ltd. 392p.
15. Rodriguez G, Goldman N. An assessment of estimation procedures for multilevel models with binary responses. J R Statist Soc A, 1995; 158:73-78.
16. Snijders TAB, Bosker RJ. Statistical treatment of clustered data, in Multilevel analysis - an introduction to basic and advanced multilevel modeling. 1999, SAGE Publications: Thousand Oaks, CA. p. 13-37.
17. Snijders TAB, Bosker RJ. Multilevel analysis - an introduction to basic and advanced multilevel modeling. 1st ed. 1999, Thousand Oaks, CA: SAGE Publications.
18. Diez Roux AV. Estimating neighborhood health effects: the challenges of causal inference in a complex world. Soc Sci Med, 2004; 58:1953-60.
19. Chaix B, Merlo J, Bobashev G, et al. Re: "Detecting patterns of occupational illness clustering with alternating logistic regressions applied to longitudinal data". Am J Epidemiol, 2004; 160:505-06.
20. Macintyre S, Ellaway A. Ecological approaches: rediscovering the role of the physical and social environment, in Social epidemiology, Berkman LF, Kawachi I, Editors. 2000, Oxford University Press: New York. p. 332-48.
Chapter II – Preliminary examples of contextual analysis
1) Analysis of the effects of the residential context on various health-related behaviours
Alongside our reflection on the methods to be used in contextual analysis, we sought to apply
these tools to the study of health-related behaviours in France. In a first series of studies,
we examined the contextual determinants of cardiovascular risk factors such as tobacco and
alcohol consumption, sedentary lifestyle, and overweight.86, 87 In a second series of studies,
we examined the impact of the residential context on patterns of ambulatory care
utilisation.88, 89 We summarise below the content of the four articles drawn from these studies,
which were published, or accepted for publication, in British and European journals (Public
Health, European Journal of Public Health, European Journal of Epidemiology).
Before describing the main results of this work, we should note its limitations, which we
sought to overcome in a later series of studies. The analyses in the four articles mentioned
above relied on survey data from the Baromètre Santé (Health Barometer), produced by the
Institut National de Prévention et d'Education pour la Santé (the French National Institute for
Prevention and Health Education).90, 91 The use of these data, like that of many other
information sources in France, places important limits on the study of contextual effects on
health.
Indeed, in terms of data, contextual analysis in social epidemiology has several requirements:
- First, it is necessary to have data on morbidity or healthcare utilisation, together with
demographic and socio-economic data that can be linked to them at the individual level.
- Second, such analyses require large samples that provide sufficient power to detect and
study geographical variations in health phenomena.
- Finally, it is of decisive importance to be able to locate individuals geographically as
precisely as possible. It is not only useful to have information on the residential context
that has been linked to the individual data used; it is also necessary to have an identifier of
each individual's place of residence, which alone makes it possible to quantify the degree of
similarity between individuals living in the same place; and it is useful to be able to map the
different places of residence in order to display the results of the analyses.
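To illustrate why an identifier of the place of residence matters, the sketch below groups outcome values by a hypothetical area identifier and computes a one-way ANOVA intraclass correlation, one common way of quantifying the similarity between individuals living in the same place. It is a simplified illustration under stated assumptions (continuous outcome, equal group sizes, made-up identifiers "A" and "B"); it is not the multilevel modelling approach used in the studies summarised here.

```python
from statistics import mean


def anova_icc(groups):
    """One-way ANOVA intraclass correlation for balanced groups.

    `groups` maps a (hypothetical) area identifier to the list of outcome
    values observed for the individuals living in that area.
    """
    sizes = {len(values) for values in groups.values()}
    if len(sizes) != 1:
        raise ValueError("this sketch assumes equal group sizes")
    k = sizes.pop()          # individuals per area
    g = len(groups)          # number of areas
    if k < 2 or g < 2:
        raise ValueError("need at least two areas and two individuals per area")

    grand_mean = mean(x for values in groups.values() for x in values)
    # Between-area and within-area mean squares.
    msb = k * sum((mean(v) - grand_mean) ** 2 for v in groups.values()) / (g - 1)
    msw = sum((x - mean(v)) ** 2 for v in groups.values() for x in v) / (g * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


# Made-up data: two areas, two individuals each.
example = {"A": [1.0, 2.0], "B": [3.0, 4.0]}
icc = anova_icc(example)
```

Without the area identifier, the grouping in `groups` cannot be built, and the between-area share of the variation cannot be estimated at all.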
With respect to these requirements, the various studies we undertook using the Baromètre Santé
data have notable limitations. We did have a large amount of information at the individual
level, on both health variables and individuals' socio-economic characteristics. However,
because these are survey data representative of the population of metropolitan France, the
sample size (13 000) was perhaps insufficient, and certainly did not allow geographical
variations in health phenomena to be captured over such a large territory. The most important
limitation, however, stems from the possibilities for locating individuals geographically. We
were able to locate survey respondents only at the level of their département of residence.
Such a level of analysis places our studies, from the outset, below the standards of the
international literature, in which the neighbourhood of residence (that is, a level below that
of the municipality) is often regarded as the appropriate level for studying the impact of the
residential context on health.5 It is indeed hard to believe that most contextual processes
operate at the geographical level of French départements. Consequently, in these studies we
focused primarily on processes likely to operate at a macroscopic level.
We now briefly describe the main results of these four articles, which are reproduced in full
at the end of this chapter. The first of these articles was published in the European Journal
of Epidemiology in 2003 under the title "Tobacco and alcohol consumption, sedentary lifestyle
and overweightness in France: a multilevel analysis of individual and area-level
determinants".87 In this study, we examined the impact of the characteristics of the wider area
of residence on tobacco and alcohol consumption and on the risks of sedentary lifestyle and
overweight.46 In particular, we sought to determine whether the prevalence of consumerist
attitudes in the area of residence, which we assumed to be correlated with the level of
economic development, could influence these individual health-related behaviours. Our
hypotheses started from the observation that there is more advertising in the wealthiest areas
(defined on the basis of gross domestic product), and that bars, restaurants and fast-food
outlets are more numerous there and stay open later in the evening. This could help create a
consumerist context in these areas of more intense economic activity, likely to push up the
consumption of food, tobacco and alcohol. The most excessive patterns of consumption might be
particularly encouraged there.
After adjustment for a series of individual factors, the risk of moderate smoking did not
appear to be associated with the economic wealth of the département of residence. By contrast,
the prevalence of heavily tobacco-dependent smokers tended to increase with gross domestic
product per capita. The same pattern was found for alcohol consumption: the proportion of
moderate drinkers was not related to gross domestic product, but the prevalence of heavily
alcohol-dependent drinkers tended to increase with the economic wealth of the area of
residence, although this latter effect was identified only among women. Because of this
interaction between gender and the contextual effect, we observed that the gap between men and
women in the risk of heavy alcohol dependence was smaller in economically wealthy areas.
Furthermore, after adjustment for various individual factors, we found that the risk of
overweight increased with the economic level of the area of residence, but this effect was
observed only among manual workers.
In a parallel study published in the European Journal of Public Health under the title
"A multilevel analysis of tobacco use and tobacco consumption levels in France: are there any
combination risk groups?",86 we focused specifically on tobacco consumption, seeking to
determine whether some of the individual and contextual factors associated with the risk of
smoking were also associated with the quantity of tobacco consumed among smokers. At the
individual level, we observed that men, individuals with a low level of education, and divorced
people had an increased risk of smoking, and that they consumed larger quantities of tobacco
when they were smokers. Extending the previous study,87 at the level of the département of
residence, it emerged after adjustment for a series of individual factors that individuals
living in economically wealthy areas were both more likely to smoke and consumed larger
quantities of tobacco when they smoked.
The results of these two studies are consistent with the hypotheses we had put forward. It
should be noted, however, that the associations were weak. The public health relevance of these
studies should therefore be considered with caution. Moreover, the data used place important
limits on the analyses. To identify an effect of the wider residential context linked to the
ambient level of consumerism, it would be necessary to adjust the models for the
characteristics of individuals' local areas of residence, in order to distinguish contextual
effects operating at different levels. Being able to locate individuals only at the level of
their département of residence, we could not pursue this approach in these studies, whose only
merit is therefore to indicate avenues for future research.
The next two studies based on Baromètre Santé data dealt with the utilisation of ambulatory
care. The first of these studies will soon be published in the British journal Public Health
under the title "Area-level determinants of specialty care utilisation in France: a multilevel
analysis".88 Starting from the observation, reported in the literature, that specialists are
consulted less in rural than in urban settings,101, 102, 103, 104 we sought to determine whether
this urban–rural contrast could be linked to variations across the territory in the density of
specialists and in the socio-economic level of the residential context. A multilevel model
revealed modest but nevertheless significant variations between départements in the propensity
to consult specialist physicians. A large part of the contrast observed between urban and rural
areas in the propensity to consult specialist physicians could be explained by the
département-level socio-economic and physician density variables. However, we observed that
these effects differed in magnitude and nature between men and women. Among men, the propensity
to consult specialists increased strongly with the département-level density of specialists.
After mutual adjustment of the factors, this effect was not significant among women. Their
consultation behaviour seemed instead to be associated, albeit less strongly, with the
socio-economic level of the département of residence. We formulated hypotheses to explain this
apparently differentiated effect of context on the healthcare utilisation patterns of men and
women; these hypotheses provide a starting point for possible future work on the question.
The last of the four studies based on Baromètre Santé data will soon be published in the
European Journal of Public Health under the title "Access to general practitioners: the
disabled elderly lag behind in underserved areas".89 In this study, we started from the
observation in the literature that individuals living in areas with a low density of physicians
are at increased risk of not consulting a physician over a given period.102, 105 Focusing on
general practice, we sought to determine whether, behind this average effect for the population
as a whole, subgroups of individuals with reduced mobility (such as elderly or disabled people)
had particularly low use of physicians when they lived in areas with a low density of
physicians. Using multilevel Poisson models to study variations in the number of general
practitioner consultations reported over the previous 12 months, we found, after adjustment for
a series of individual and contextual characteristics, that living in an area with a low rather
than a high density of physicians was associated with a larger reduction in the number of
general practitioner consultations for older individuals than for younger ones, and more
particularly still for elderly people with a disability. Elderly people who reported a
disability had 244% more general practitioner consultations (95% confidence interval:
79%–562%) when they lived in high-density rather than low-density areas (these areas being
defined from the 10th and 90th percentiles).
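As a reading aid, the percentage reported above can be related to the rate ratio that a Poisson model estimates: a rate ratio RR corresponds to 100 × (RR − 1) percent more consultations, and the confidence limits are obtained on the log scale. The sketch below assumes a log rate ratio of ln(3.44) and a standard error of 0.3337; both values are back-calculated here from the reported figures purely for illustration and are not quantities reported in the study.

```python
import math


def percent_change_from_log_rate_ratio(beta, se, z=1.96):
    """Convert a log rate ratio and its standard error into percent changes.

    Returns (estimate, lower limit, upper limit), each expressed as the
    percentage of additional events relative to the reference group.
    """
    limits = (beta, beta - z * se, beta + z * se)
    return tuple(100.0 * (math.exp(b) - 1.0) for b in limits)


# Illustrative inputs only: back-calculated assumptions, not study output.
assumed_beta = math.log(3.44)
assumed_se = 0.3337
estimate, lower, upper = percent_change_from_log_rate_ratio(assumed_beta, assumed_se)
```

The interval is asymmetric around the estimate on the percentage scale because it is symmetric on the log scale, which is why the upper limit lies much further from 244% than the lower one.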
These two studies of geographical variations in patterns of ambulatory care utilisation reveal
substantial contextual effects related to physician density and to the socio-economic level of
the residential environment, which persist after the effects associated with individual
characteristics have been taken into account. On the basis of these preliminary studies,
however, it remains difficult to draw definitive conclusions that could be used in public
health. The territorial disparities identified probably reflect, on the one hand, real
difficulties of access to specialist physicians in certain areas and, on the other hand,
patterns of healthcare utilisation centred on the use of specialists in the most urban and
advantaged areas. Confirmation of these hypotheses through more in-depth studies would justify
intervention by public authorities, both to fill the gaps in the territorial coverage of
specialist physicians and to inform the populations of the most advantaged areas of the value
of consulting a general practitioner for good coordination of care. In the remainder of our
thesis work, we sought to take further these analyses of geographical variations in healthcare
utilisation and of contextual effects on the use of care. This required working at a finer
spatial scale than the département of residence.
Beyond the questions specific to the analysis of particular phenomena, these first studies
therefore raised the question of which level to use to capture geographical variations in
phenomena and to define contextual effects. Moreover, at whatever scale of analysis, one may
question the relevance of relying on areas defined by administrative boundaries, which may
prove arbitrary with respect to the various phenomena studied.
2) Analysis of the effects of the household of residence on patterns of healthcare utilisation
In most of our thesis work, we focused on the determinants related to the geographical context
of residence. However, contextual factors can be considered at many other levels since, beyond
the geographical environment of residence, people's living context also includes the family
environment and the workplace.23, 94, 106 In a study based on data from the Enquête Permanente
sur les Conditions de Vie des Ménages (the Permanent Survey of Household Living Conditions) of
INSEE, carried out in collaboration with a researcher from the Center for Home Care Policy and
Research in New York, we examined various processes operating within the household that may
influence individuals' patterns of healthcare utilisation. This study was submitted to the
European Journal of Public Health, which asked us to make the few corrections suggested by its
reviewers. It is currently under revision.
In this study, the contextual factors considered were not defined at the household level and
assigned to all individuals in these households, as is usually done when studying the effects
of the residential context. Instead, we sought to capture the impact that certain
inter-individual dynamics could have on the healthcare utilisation patterns of household
members. In particular, we examined whether individuals who lived with people in poor health
used less care than those who lived with people in good health. The underlying hypothesis was
that the financial resources and available time of the various household members are spent
first on the household members whose health needs are most urgent.107, 108, 109, 110 We
therefore expected to observe lower healthcare utilisation among individuals living with people
in poor health.
In line with our hypotheses, we observed that the probability that an individual used
ambulatory care decreased as the health of the people with whom he or she lived worsened, and
as the number of co-residents in poor health in the household increased. These two associations
were observed separately for three types of healthcare utilisation: consultations with general
practitioners, consultations with specialists, and use of preventive examinations and tests.
This study was based on data from a general population survey by INSEE, which had not been
designed to address such research objectives. Consequently, despite the originality of our
results concerning associations that had not been studied in the literature, we were unable to
go further in analysing the mechanisms underlying this phenomenon. If future studies confirm
these results, it could prove both legitimate and cost-effective to ensure regular access to
care for individuals who live with people in poor health, enabling them both to remain in good
health in the long term and to provide care or support to their sick co-residents.
ARTICLE IN PRESS
Public Health (0000) xx, xxx–xxx
Area-level determinants of specialty care
utilization in France: a multilevel analysis
B. Chaix a,*, P-Y. Boëlle a, P. Guilbert b, P. Chauvin a
a Research Unit in Epidemiology and Information Sciences, National Institute
of Health and Medical Research (INSERM U444), Paris, France
b National Institute for Prevention and Health Education, Vanves, France
Received 4 August 2003; received in revised form 4 May 2004; accepted 4 May 2004
KEYWORDS
Health services research; Referral and consultation; Health services accessibility;
Socio-economic factors
Summary Objectives. We investigated the effects of the density of specialists and of
the area-level percentage of highly educated individuals on the odds of consulting a
specialist, and examined whether these variables could explain the observed
urban/rural contrast in utilization of specialty care.
Study design. The study sample, representative of the French population aged 18–75
years in 1999, comprised 12,435 individuals.
Methods. Multilevel logistic models allowed us to investigate predictors of the odds
of consulting a specialist occasionally, regularly and frequently over the previous 12
months.
Results. We observed a modest but significant clustering within areas of the
utilization of specialty care, with higher levels of clustering for behaviours
representing heavy consumption of care. After adjustment for individual factors,
the odds of consulting a specialist were higher in larger cities compared with rural
areas, but most of this effect was attributable to other area-level variables. These
area-level effects were different in magnitude and nature among males and females.
Among males, the odds of consulting a specialist increased with the area-level density
of specialists. Among females, such an effect was not significant, but the odds of
consulting a specialist increased with the area-level percentage of highly educated
individuals.
Conclusions. Further investigation is required to better understand the processes
operating at the area level that were shown to affect healthcare utilization in a
different way for males and females. Policies may be needed to address problems of
geographical access to specialty care, as well as situations of overuse of specialty care
without regular recourse to primary care.
© 2004 Published by Elsevier Ltd. on behalf of The Royal Institute of Public Health.
*Corresponding author. Address: INSERM U444, Faculté de
Médecine Saint-Antoine, 27 rue Chaligny, 75571 Paris cedex 12,
France. Tel.: +33-1-4473-8443; fax: +33-1-4473-8663.
E-mail address: [email protected]
0033-3506/$ - see front matter © 2004 Published by Elsevier Ltd. on behalf of The Royal Institute of Public Health.
doi:10.1016/j.puhe.2004.05.006
Introduction
In most industrialized countries, significant territorial variations have been reported in the
utilization of healthcare services. This may reflect an important public health problem since
access to medical resources should be made equal wherever individuals reside. A large number of
studies depended on the contrast between urban and rural areas to describe these territorial
variations in utilization of healthcare services.1 Many of these studies reported that
utilization was particularly reduced in rural compared with urban settings with regard to
specialty care services.2–5
The lower utilization of physician services in rural areas may be due to the shorter supply of
physicians in these areas, and to attitudes and beliefs associated with the lower
socio-economic status of rural settings. Several studies have investigated whether the
area-level supply of physicians and socio-economic indicators were associated with utilization
and related outcomes.6,7 Certain studies found some effects of the area-level density of
physicians on the odds of consulting,8,9 whereas others found little or no such effects.10–13
Reduced utilization of specialty care has also been reported in socio-economically deprived
areas (after adjustment for the individual-level socio-economic status).13 Of particular
interest for the present analysis are the few studies that examined whether a significant
proportion of the urban/rural contrast in the utilization of physician services was
attributable to these area-level factors of physician supply and socio-economic deprivation.
For example, one US study of Medicaid beneficiaries reported that rural individuals had less
access to specialty mental health care than urban dwellers, and found that this difference was
largely explained by variations in the supply of specialty mental health providers.3 However,
this study suffers from major methodological limitations as it did not use multivariate methods
to control for the effects of other factors influencing utilization, nor did it use statistical
methods (such as multilevel models) to take the hierarchical structure of the data into
account.
In our study, while addressing these shortcomings, we examined the impact of the number of
specialists per 100,000 inhabitants as a proxy for accessibility to specialists,10,11 and
looked into the effect of the area-level percentage of highly educated individuals on the odds
of consulting a specialist. In France, patients may consult any physician of their choice
(general practitioner (GP) or specialist) at any time and as frequently as they wish (although
higher fees prevail for specialists). As a general rule, patients have to pay for outpatient
care services at the point of delivery, and later obtain partial reimbursement from the social
security system.14 User charges that are not reimbursed in this way may be refunded by
supplementary elective health insurance. This French system leads to access to medical
resources that is not entirely equitable, since poorer people have restricted access to
specialty care.15
The present study had three main objectives. First, we used logistic multilevel models16 and
the recently developed median odds ratio (MOR)17 to quantify the variations in specialty care
utilization from one area to another. Since social segregation18 and an unequal distribution of
specialist physicians19 have long been known to prevail in France, our second objective was to
examine whether the odds of consulting a specialist were related to the density of specialists
and to the area-level percentage of highly educated individuals. Thirdly, we sought to
determine whether the contrast between rural and urban environments with respect to the odds of
consulting a specialist could be attributed to these area-level factors, and whether these
contextual factors were sufficient to account for the territorial variations in the utilization
of specialty care.
Methods
The analyses used a representative sample of the French population made up of individuals aged
18–75 years (n = 12,435) enrolled in a 1999 telephone survey by the National Institute for
Prevention and Health Education (INPES).20 In this survey, the response rate was 0.69.
Data
Individuals were asked about the number of times over the previous 12 months they had
consulted, for health concerns of their own, a psychiatrist, psychologist or psycho-analyst; a
gynaecologist; an acupuncturist, mesotherapist or osteopath; or any other specialist (the
following examples were given to the surveyed individuals: a dermatologist; a paediatrician;
and an allergist). These consultations were totalled and three binary variables were defined to
indicate whether the individual in question saw a specialist at least once (occasionally), at
least four times (regularly) or at least six times (frequently) over the previous 12 months.
People were also asked about the number of times they saw a GP over the same period.
Regarding health status, we first considered whether the individual reported a chronic disease
and whether he or she was disabled. Secondly, four of the scales of the Duke Health Profile
(based on a 17-item generic questionnaire21) were used as continuous variables: physical
health, mental health and perceived health (range: 0–100), with higher scores indicating better
health; and disability (range: 0–100), with higher scores indicating more acute dysfunction.
Finally, as an additional proxy for health status, we took the number of GP consultations over
the previous 12 months into account.
With regard to socio-economic status, we divided educational achievement into three categories,
the intermediate category comprising individuals who only graduated from secondary school.
Monthly household income was adjusted for household size, and then divided into four categories
(€610 or less, €611–1100, €1101–1350 and above €1351/person). With regard to occupational
status, we differentiated among farmers, craftsmen-shopkeepers, blue-collar workers,
lower-level white-collar workers, intermediate professions and upper-level white-collar
workers. We also took employment status (inactivity, unemployment, government-subsidized
employment, full-time work or part-time work) and marital status (never married, married,
divorced or widowed) into account. In order to distinguish rural from urban contexts, we
classified the size of the municipality of residence into four categories: rural
municipalities; small towns (population 2000–20,000); medium-sized towns (population
20,000–200,000); and larger cities.
Regarding the area-level variables of interest, the level of accessibility to specialists was
measured by the number of specialists per 100,000 people, as obtained from the French Ministry
of Health. Using census data, we defined the socio-economic level of the context as the
percentage of residents with some higher education (university degree or equivalent). These
variables were defined at the level of French administrative 'departments'. Mainland France
(excluding overseas territories) is subdivided into 95 administrative 'departments', referred
to below as areas of residence. In 1999, between 75,000 and 2,500,000 individuals resided in
each of these areas. To verify that departments were homogeneous with respect to area-level
variables, we considered the 329 subdepartmental administrative areas and computed
intradepartmental correlation coefficients in order to measure the correlation of an area-level
variable between subdepartmental areas belonging to the same department.22 These coefficients
were very high (0.50 for specialist density and 0.67 for mean educational level) and highly
significant (P < 0.0001), indicating that it is relevant to measure these area-level variables
at the departmental level. Each area-level variable was divided into four categories (low,
medium-low, medium-high, high) with the 15th, 50th, and 85th percentiles as cut-off points.
Data analysis
In order to confirm the previously reported socio-economic differences in access to specialists
and compare them with access to GPs in a country with universal health insurance, mean numbers
of consultations with GPs and specialists were computed for ordered socio-economic classes
(education, occupation, income). The variations were put through the non-parametric
Jonckheere-Terpstra test.
Multilevel logistic models16 (with individuals nested within areas) containing individual
factors (whether these turned out to be significant or not) were fitted to the data for the
odds of consulting a specialist occasionally, regularly and frequently (model 1). In order to
quantify the heterogeneity between areas in utilization of specialty care on the OR scale, we
used Larsen's MOR.17 If we chose two individuals with similar covariates in two different areas
at random and computed the OR between the individual at lowest risk and the individual at
highest risk, the MOR is defined as the median value of the distribution of this OR. The MOR
can be directly computed from the area-level variance of the multilevel models. In model 2, we
introduced the size of the municipality of residence. In model 3, we also introduced the two
area-level variables, but only retained them when they were significantly associated with the
outcomes. At each step, area-level residuals were estimated.
The fully adjusted models were stratified by gender to determine whether area-level effects
differed between males and females. For confirmation, the analyses were repeated for females
without taking consultations with gynaecologists into account, allowing for a better comparison
with the results for males.
The parameters of the multilevel models were estimated using MLwiN 1.2 software (Institute of
Education, London). The ORs and their 95% confidence intervals (CI) were computed.
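The four-category coding of the area-level variables described in the Data section (cut-off points at the 15th, 50th and 85th percentiles) can be sketched as follows. This is only an illustration: the function names and the example values are hypothetical, the percentile interpolation rule is that of Python's `statistics.quantiles`, and the handling of values that fall exactly on a cut-off is an assumption, since the article does not state it.

```python
from bisect import bisect_right
from statistics import quantiles

LABELS = ("low", "medium-low", "medium-high", "high")


def percentile_cutoffs(values, percentiles=(15, 50, 85)):
    """Return the requested percentiles of `values` as cut-off points."""
    # quantiles(..., n=100) returns the 99 cut points between percentiles;
    # index p - 1 holds the p-th percentile.
    cut_points = quantiles(values, n=100, method="inclusive")
    return [cut_points[p - 1] for p in percentiles]


def categorise(value, cutoffs):
    """Map a value to one of the four ordered categories.

    Assumption: a value exactly equal to a cut-off goes to the upper category.
    """
    return LABELS[bisect_right(cutoffs, value)]


# Hypothetical area-level values (for example, specialists per 100,000 people).
area_values = list(range(1, 101))
cutoffs = percentile_cutoffs(area_values)
categories = [categorise(v, cutoffs) for v in area_values]
```

With these cut-offs the two extreme categories each hold about 15% of the areas and the two middle categories about 35% each, so comparisons of "high" with "low" areas contrast the tails of the distribution.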
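The statement in the Data analysis section that the MOR can be computed directly from the area-level variance can be made concrete with the usual closed form, MOR = exp(sqrt(2 × variance) × z), where z is the 75th percentile of the standard normal distribution (about 0.6745). The sketch below is an illustration under that standard formula; the function names are hypothetical and the code is not taken from the article or from MLwiN.

```python
import math
from statistics import NormalDist

# 75th percentile of the standard normal distribution (about 0.6745).
Z_75 = NormalDist().inv_cdf(0.75)


def median_odds_ratio(area_variance):
    """Median odds ratio implied by an area-level variance on the logit scale."""
    return math.exp(math.sqrt(2.0 * area_variance) * Z_75)


def variance_from_mor(mor):
    """Inverse transformation: area-level variance implied by a given MOR."""
    return (math.log(mor) / (math.sqrt(2.0) * Z_75)) ** 2


# An MOR of 1 means no variation between areas.
no_variation = median_odds_ratio(0.0)

# Variance implied by an MOR of 1.20 (illustrative back-calculation).
implied_variance = variance_from_mor(1.20)
```

Under this formula an MOR of 1.20 corresponds to an area-level variance of roughly 0.037 on the logit scale, which gives a sense of how modest the between-area heterogeneity is when the MOR is close to 1.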
Results
In our study sample, 53% of the individuals had
consulted a specialist occasionally, 21% regularly
and 9% frequently. The mean number of consultations with a GP decreased with increasing socio-economic status of the individual (Fig. 1). Conversely, the mean number of consultations with
Table 1 Variations between areas in the utilization of specialist physicians, as estimated from multilevel logistic models with
individual-level variables.
Rows (each carrying footnote marker a): Model for consulting a specialist occasionally; Model for consulting a specialist regularly; Model for consulting a specialist frequently.
compared with people living in rural municipalities
(Table 2, model 2). When including area-level
variables in this model taking males and females
together, the percentage of highly educated individuals showed no association with consulting a
specialist occasionally, and was therefore removed
from the model (Table 2, model 3). A dose-response
relationship indicated that the odds of occasional
consultation increased with the area-level density
of specialists (OR 1.25, 95% CI 1.04–1.52 for
medium-high density, OR 1.36, 95% CI 1.10–1.68
for high density, compared with areas with a low
density). It is notable that when the area-level
density of specialists was added to the model, the
odds of consulting a specialist occasionally were no
longer significantly higher for residents of large
cities than for those of rural municipalities. When
we shifted from model 1 to model 3, the area-level
unexplained variations decreased by 39% (Table 2).
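
As a rough check on the 39% figure, the proportional change in area-level variance can be recomputed from the variances reported in Table 2 (0.036 for model 1 and 0.022 for model 3). The function name below is hypothetical and the calculation is only an illustrative sketch, not the authors' code.

```python
def proportional_change_in_variance(var_reference: float, var_model: float) -> float:
    """Share of the reference area-level variance that is no longer unexplained in the fuller model."""
    if var_reference <= 0:
        raise ValueError("reference variance must be positive")
    return (var_reference - var_model) / var_reference


# Area-level variances reported in Table 2 for model 1 and model 3.
pcv = proportional_change_in_variance(0.036, 0.022)
print(f"{pcv:.0%}")
```
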
The MOR decreased from 1.20 to 1.15, indicating
that some part of the variability between areas had
been explained. The area-level residuals from
model 1 and model 3 were plotted in Fig. 2 in order of
increasing value from left to right. This figure
shows the decrease in the variance of the area-level
a specialist increased with increasing socio-economic status.
Multilevel logistic models indicated that there
were weak but significant variations between areas
in the odds of consulting a specialist (Table 1). The
magnitude of area-level variations was stronger for
regular consultations than for occasional consultations, and was stronger still for frequent consultations. In the empty model for occasional
consultations, the MOR was equal to 1.20
(Table 1). In other words, when two individuals with
identical covariates were selected at random from two
different areas, the OR comparing the individual at
higher risk with the individual at lower risk
exceeded 1.20 in half of the cases,
indicating a certain level of heterogeneity in
consulting behaviour between areas. The MOR
increased to 1.22 for regular consultations and
1.26 for frequent consultations, indicating a
higher heterogeneity between areas for patterns
of frequent utilization of specialty care.
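
The MOR values quoted above can be approximately recovered from the area-level variances in Table 1 with the usual expression for a logistic model with a normally distributed random intercept, MOR = exp(sqrt(2 × variance) × z), where z ≈ 0.6745 is the 75th percentile of the standard normal distribution. The article does not print this formula, so the following Python sketch rests on that assumption; the function name is illustrative.

```python
from math import exp, sqrt
from statistics import NormalDist


def median_odds_ratio(area_variance: float) -> float:
    """MOR implied by an area-level random-intercept variance on the logit scale."""
    if area_variance < 0:
        raise ValueError("variance must be non-negative")
    z75 = NormalDist().inv_cdf(0.75)  # 75th percentile of the standard normal, about 0.6745
    return exp(sqrt(2.0 * area_variance) * z75)


# Area-level variances from Table 1 (occasional, regular, frequent consultations).
for variance in (0.036, 0.044, 0.057):
    print(f"{variance:.3f} -> MOR {median_odds_ratio(variance):.2f}")
```

Under this assumption the three variances give 1.20, 1.22 and 1.26, in line with the values reported in Table 1.
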
After adjustment for health needs and sociodemographic factors, the odds of consulting a
specialist occasionally were higher for people living
in medium-sized towns (OR 1.18, 95% CI 1.01–1.38)
or large cities (OR 1.20, 95% CI 1.04–1.38)
Figure 1 Number of consultations of general practitioners and specialists per capita over the prior 12 months, France,
1999.
Model for consulting a specialist | Occasionally (a) | Regularly (a) | Frequently (a)
Area-level variance σ²u0 (SE) | 0.036 (0.012)** | 0.044 (0.016)** | 0.057 (0.020)**
Median odds ratio | 1.20 | 1.22 | 1.26

*P < 0.05; **P < 0.01. SE, standard error. Models were adjusted for age, sex, education, income, occupation, employment and marital status.
(a) Consulting occasionally, regularly and frequently were defined as having consulted at least one, three and six times, respectively, over the prior 12 months.
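
The three outcomes defined in the footnote are nested: anyone counted as a frequent consulter is also counted as a regular and an occasional one, which is why the reported shares fall from 53% to 21% to 9%. A minimal sketch of that coding, with a hypothetical function name, might look as follows.

```python
def consultation_indicators(n_visits: int) -> dict:
    """Map a 12-month count of specialist visits to the three nested binary outcomes."""
    if n_visits < 0:
        raise ValueError("number of visits cannot be negative")
    return {
        "occasional": n_visits >= 1,  # at least one consultation
        "regular": n_visits >= 3,     # at least three consultations
        "frequent": n_visits >= 6,    # at least six consultations
    }


print(consultation_indicators(4))
```
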
Area-level determinants of specialty care utilization in France
residuals from model 1 to model 3. The variance parameter remained significantly different from zero in the final model (Table 2, model 3).

Table 2 Impact of the rural/urban status of municipality of residence and area-level factors (a) on the odds of consulting a specialist at least once over the prior 12 months. Fully adjusted odds ratios (and 95% confidence intervals), France, 1999.

Variable name | Model 1 (b) OR (95% CI) | Model 2 (b) OR (95% CI) | Model 3 (b) OR (95% CI)
Municipality of residence (compared with rural municipality) | | |
Small town | | 0.97 (0.83–1.15) | 0.97 (0.82–1.14)
Medium-sized town | | 1.18 (1.01–1.38) | 1.18 (1.01–1.37)
Large city | | 1.20 (1.04–1.38) | 1.13 (0.97–1.30)
Number of specialists (c) (compared with low) | | |
Medium–low | | | 1.07 (0.88–1.30)
Medium–high | | | 1.25 (1.04–1.52)
High | | | 1.36 (1.10–1.68)
Random components | | |
σ²u0 (SE) | 0.036** (0.012) | 0.031** (0.010) | 0.022* (0.008)
MOR | 1.20 | 1.18 | 1.15

*P < 0.05; **P < 0.01. OR, odds ratio; CI, confidence interval; SE, standard error; MOR, median odds ratio.
(a) The area-level percentage of highly educated individuals was not significant in the model for both genders, so it was removed from the model.
(b) Models 1–3 were adjusted for age, sex, education, income, occupation, employment and marital status.
(c) Per 100,000 inhabitants: low, <90.5; medium–low, 90.5 to <132; medium–high, 132 to <201; high, ≥201.

Figure 2 Area-level residuals from the individual-level model (model 1) and the final model (model 3), France, 1999.

In estimating the models among males, we found that the percentage of highly educated individuals showed no association with the utilization of specialty care (Table 3), whereas the density of specialists did. The association was stronger for regular consultations (OR 1.72, 95% CI 1.14–2.58 for a high compared with a low density of specialists) than for occasional consultations (OR 1.53, 95% CI 1.21–1.94), and was stronger still for frequent consultations (OR 2.84, 95% CI 1.62–4.99).

Among females, area-level variables were not significantly associated with occasional consultations (Table 3). In the models for females for regular consultations and frequent consultations, the density of specialists was not significant as it was for males; however, the odds of consulting a specialist increased with the percentage of highly educated individuals. The association was stronger for frequent consultations (OR 1.72, 95% CI 1.31–2.26 for a high compared with a low percentage of highly educated individuals) than for regular consultations (OR 1.58, 95% CI 1.20–2.09). Associations of a comparable magnitude were found among females when the analyses were repeated without taking consultations with gynaecologists into account. Overall, therefore, area-level variables had a weaker impact on consulting behaviour for females than for males.

Discussion

Beyond the barriers associated with individual socio-economic status, we found that area-level variables are associated with the utilization of specialty care; specifically, the density of specialists among males, and the percentage of highly educated individuals among females.

One strength of our analysis was that our models were adjusted for more individual health variables than in many previous studies.23,24 However, our study had a few limitations that should not be overlooked. First, we used self-reported data. These may be subject to memory bias since individuals had to remember the number of times they visited a specialist over a long period of time (12 months). Nevertheless, there is no evidence to
suggest that the accuracy of recollection might be lower in low-density areas or in low-educated areas, after adjustment for individual factors and type of municipality of residence. Secondly, owing to the design of the original INPES questionnaire, consultations with psychologists or psycho-analysts (who are not considered to be physicians), and with acupuncturists (who are registered as GPs in France) had to be taken into account in the outcome variable. Thirdly, we defined areas of residence in terms of French ‘departments’, which are quite large areas. Thus, the area-level indicators (density of specialists and percentage of highly educated individuals) were rather crude. However, considering the administrative subdivisions of these departments, we observed that the subdepartmental areas belonging to the same department were similar to a significant degree with respect to the area-level variables investigated. Such intradepartmental homogeneity indicates that it is meaningful to conduct the analysis at the departmental level.

Table 3 Gender-specific impact of area-level density of specialists and percentage of highly educated individuals on the odds of consulting a specialist occasionally, regularly or frequently (at least one, three or six times, respectively, over the prior 12 months). Fully adjusted odds ratios (and 95% confidence intervals), France, 1999.

Variable name | For consulting a specialist occasionally (a): OR (95% CI) | For consulting a specialist regularly (a): OR (95% CI) | For consulting a specialist frequently (a): OR (95% CI)
Among males | | |
Percentage of highly educated individuals | Non-significant (b) | Non-significant (b) | Non-significant (b)
Density of specialists (c) | | |
Low | 1.00 | 1.00 | 1.00
Medium–low | 1.13 (0.89–1.43) | 1.31 (0.94–1.82) | 1.49 (0.90–2.47)
Medium–high | 1.36 (1.09–1.71) | 1.56 (1.09–2.23) | 2.53 (1.51–4.26)
High | 1.53 (1.21–1.94) | 1.72 (1.14–2.58) | 2.84 (1.62–4.99)
Among females | | |
Percentage of highly educated individuals (d) | Non-significant (b) | |
Low | – | 1.00 | 1.00
Medium–low | – | 1.12 (0.91–1.38) | 1.17 (0.92–1.47)
Medium–high | – | 1.25 (1.02–1.55) | 1.24 (0.98–1.57)
High | – | 1.58 (1.20–2.09) | 1.72 (1.31–2.26)
Density of specialists | Non-significant (b) | Non-significant (b) | Non-significant (b)

OR, odds ratio; CI, confidence interval.
(a) Models were adjusted for age, sex, education, income, occupation, employment, marital status, and type of municipality of residence.
(b) Area-level variables that were not significant were removed from the models.
(c) Density of specialists per 100,000 inhabitants: low, <90.5; medium–low, 90.5 to <132; medium–high, 132 to <201; high, ≥201.
(d) Percentage of highly educated individuals: low, <5.1%; medium–low, 5.1% to <7.4%; medium–high, 7.4% to <12.0%; high, ≥12.0%.

Principal findings

After adjustment for several individual-level variables, we observed that specialty care utilization, especially at high levels of consumption, clustered within some areas rather than others. The context dependence of this healthcare utilization behaviour was partly attributable to the difference between urban areas and settings that were more rural in nature. However, we did observe that the contrast in specialty care utilization between urban and rural areas could be explained by other area-level variables, namely the density of specialists and the socio-economic level of the area.

Specialty care utilization appeared to be associated with different area-level variables among males and females. Among males, the odds of consulting a specialist increased with the area-level density of specialists. Among females, such odds increased with the area-level percentage of highly educated individuals. By the usual standards of contextual analysis for such large areas, the associations were strong and remarkably linear.

The two area-level variables investigated here may be related to different processes operating at the area level. The independent effect of the density of specialists may stem from several mechanisms. First, the distance to the nearest specialist is likely to be greater in areas with a low density of specialists, thereby impeding access to them.9 Secondly, it may also be more difficult to get
an appointment with a specialist in these areas because of their heavier workload.8 Thirdly, there may be a higher physician-induced demand in high medical density areas.25 The independent effect of the percentage of highly educated individuals is likely to stem from different mechanisms. Since we adjusted for several socio-economic indicators at the individual level, this effect may result from the fact that individuals residing in more socially advantaged areas have different beliefs, expectations and attitudes regarding the healthcare system, especially regarding the differences between GPs and specialists.

One hypothesis to account for differences in area-level predictors of specialty care utilization for males and females is related to the fact that they have very different attitudes regarding specialty care utilization. In the model for occasional consultations that includes individual and contextual factors, the OR for being a female compared with a male was equal to 6.44 (P < 0.0001), and was still significant when consultations with gynaecologists were excluded. Males may feel the need to consult with a specialist much less frequently than females. Therefore, difficulties of access to specialists, as expressed by a low density of specialists in the area of residence, may discourage males from consulting a specialist. Females, on the other hand, whether because of their regular need to visit gynaecologists or because they may be more exposed to health information through the media (e.g. the women’s press), may be much more accustomed to consulting a specialist. They may be particularly sensitive to the common values, beliefs and expectations shared in the context of their place of residence, which may have been captured in our study in the area-level percentage of highly educated individuals. The data available did not allow us to confirm this hypothesis; further investigation is clearly warranted.

We did observe that the area-level effects on consulting patterns were stronger for regular consultations than for occasional consultations, and were stronger still for frequent consultations for both genders. This gradient lends support to our findings regarding area-level effects. On the other hand, area-level effects were found to be weaker among females. There is no reason to expect that, for example, males and females differed in their ability to recall whether they had consulted at least once over the previous 12 months. Rather than being attributable to measurement error, this difference in the magnitude of the area-level effects could be due to the fact that different area-level processes are involved among males and females.

Conclusion

Utilization of specialty care appears to be related to the density of specialists among males, and to the area-level educational level among females. These geographic variations in the utilization of specialty care suggest potential risks of underuse and inappropriate use of specialty care. On the one hand, underuse of specialty care in certain areas may result in suboptimal diagnosis and treatment options.26,27 Conversely, in other areas, individuals may frequently self-refer to specialists without regular recourse to GPs, which may lead to a lack of co-ordination in health care.28,29 Policies may need to be developed to address potential problems of access to specialty care,30,31 and educational programmes instituted to clarify the respective roles of GPs and specialists.

Acknowledgements

We gratefully thank the National Institute for Prevention and Health Education for providing the data for the present study. The first author carried out this work with a doctoral grant, and a grant from the French Ministry of Research (TTT027). The project was supported by the Avenir 2002 programme of INSERM (the French National Institute of Health and Medical Research).
References
1. Blazer DG, Landerman LR, Fillenbaum G, Horner R. Health services access and use among older adults in North Carolina: urban vs rural residents. Am J Public Health 1995;85:1384–90.
2. Rosenthal TC, Fox C. Access to health care for the rural elderly. JAMA 2000;284:2034–6.
3. Lambert D, Agger MS. Access of rural AFDC Medicaid beneficiaries to mental health services. Health Care Financ Rev 1995;17:133–45.
4. Halldorsson M, Kunst AE, Kohler L, Mackenbach JP. Socioeconomic differences in children’s use of physician services in the Nordic countries. J Epidemiol Commun Health 2002;56:200–4.
5. Casey MM, Thiede Call K, Klingner JM. Are rural residents less likely to obtain recommended preventive healthcare services? Am J Prev Med 2001;21:182–8.
6. Parchman ML, Culler SD. Preventable hospitalizations in primary care shortage areas. An analysis of vulnerable Medicare beneficiaries. Arch Fam Med 1999;8:487–91.
7. Saag KG, Doebbeling BN, Rohrer JE, Kolluri S, Mitchell TA, Wallace RB. Arthritis health service utilization among the elderly: the role of urban-rural residence and other utilization factors. Arthritis Care Res 1998;11:177–85.
8. Ettner SL, Hermann RC. Provider specialty choice among Medicare beneficiaries treated for psychiatric disorders. Health Care Financ Rev 1997;18:43–59.
9. Carr-Hill RA, Rice N, Roland M. Socio-economic determinants of rates of consultation in general practice based on the Fourth National Morbidity Survey of General Practices. BMJ 1996;312:1008–12.
10. Earle CC, Neumann PJ, Gelber RD, Weinstein MC, Weeks JC. Impact of referral patterns on the use of chemotherapy for lung cancer. J Clin Oncol 2002;20:1786–92.
11. Hendryx MS, Ahern MM, Lovrich NP, McCurdy AH. Access to health care and community social capital. Health Serv Res 2002;37:87–103.
12. Briggs LW, Rohrer JE, Ludke RL, Hilsenrath PE, Phillips KT. Geographic variation in primary care visits in Iowa. Health Serv Res 1995;30:657–71.
13. Gresenz CR, Stockdale SE, Wells KB. Community effects on access to behavioral health care. Health Serv Res 2000;35:293–306.
14. Busse R, Dixon A, Healy J. Health care systems in eight countries: trends and challenges. London: London School of Economics and Political Science; 2002.
15. Auvray L, Dumesnil S, Le Fur P. Santé, Soins et Protection Sociale en 2000 [Health, Healthcare and Insurance in 2000] (in French). Paris: CREDES; 2000.
16. Snijders T, Bosker R. Multilevel analysis. An introduction to basic and advanced multilevel modelling. London: Sage; 1999.
17. Larsen K, Petersen JH, Budtz-Jorgensen E, Endahl L. Interpreting parameters in the logistic regression model with random effects. Biometrics 2000;56:909–14.
18. Tabard N. Représentation Socio-économique du Territoire. Typologie des Quartiers et Communes Selon la Profession et l’Activité Économique de Leurs Habitants. Paris: INSEE; 1993.
19. Lucas-Gabrielli V, Tonnellier F. Déserts médicaux ou zones défavorisées? Démographie médicale et indicateurs de besoins. Technologie et Santé 2001;45:32–8.
20. Guilbert P, Baudier F, Gautier A, Arwidson A, Janvrin P, Baromètre M. Baromètre Santé 2000. Méthodes. Vanves; 2001. Editions CFES.
21. Guillemin F, Paul-Dauphin A, Virion JM, Bouchet C, Briancon S. The DUKE health profile: a generic instrument to measure the quality of life tied to health. Sante Publique 1997;9:35–44.
22. Leyland AH, Goldstein H. Multilevel modelling of health statistics. Chichester: Wiley; 2001.
23. Birch S, Eyles J, Newbold KB. Equitable access to health care: methodological extensions to the analysis of physician utilization in Canada. Health Econ 1993;2:87–101.
24. Kephart G, Thomas VS, MacLean DR. Socio-economic differences in the use of physician services in Nova Scotia. Am J Public Health 1998;88:800–3.
25. Tussing AD, Wojtowycz MA. Physician-induced demand by Irish GPs. Soc Sci Med 1986;23:851–60.
26. Soloway B. Primary care and specialty care in the age of HAART. AIDS Clin Care 1997;9:37–9.
27. Baker DW, Hayes RP, Massie BM, Craig CA. Variations in family physicians’ and cardiologists’ care for patients with heart failure. Am Heart J 1999;138:826–34.
28. Grumbach K, Selby JV, Damberg C, et al. Resolving the gatekeeper conundrum: what patients value in primary care and referrals to specialists. JAMA 1999;282:261–6.
29. Bodenheimer T, Lo B, Casalino L. Primary care physicians should be coordinators, not gatekeepers. JAMA 1999;281:2045–9.
30. Bensadon A-C. Perspectives de la Démographie Médicale. Paris: DGS; 2001.
31. Nicolas G, Duret M. Propositions sur les Options à Prendre en Matière de Démographie Médicale. Paris: DGS; 2001.
Access to general practitioner services: the disabled elderly lag behind
in underserved areas
BASILE CHAIX, PAUL J. VEUGELERS, PIERRE-YVES BOELLE, PIERRE CHAUVIN *
* B. Chaix (PhD)1, P. J. Veugelers (PhD)2, P.-Y. Boëlle (PhD)1, P. Chauvin (MD, DSc)1
1 Research Unit in Epidemiology and Information Sciences, National Institute of Health and Medical Research
(INSERM U444), France.
2 Department of Community Health and Epidemiology, Dalhousie University, Halifax, Canada.
Correspondence: Basile Chaix, INSERM U444, Faculté de Médecine Saint-Antoine, 27 rue Chaligny, 75571
Paris Cedex 12, France, tel. +33 (0)1 44 73 84 43, fax +33 (0)1 44 73 84 62, e-mail : [email protected]
Background: Several studies have shown that people living in areas underserved in physicians have
reduced odds of consulting. However, beyond the magnitude of this effect averaged for the whole
population, policymakers need to know whether specific subgroups faced with transportation difficulties,
such as the elderly and especially the disabled elderly, have a particularly restricted access to physicians
when residing in underserved areas. Methods: The study sample, representative of the French population
aged 18−75 in 1999, comprised 12 405 individuals. Multilevel Poisson models were used to investigate the
impact of the area-level density of general practitioners (GPs) on the number of GP consultations reported
over the previous 12 months. Results: The mean number of GP consultations over the previous 12 months
was 3.8 (standard deviation = 4.9). Multivariate analyses indicated that living in areas underserved in GPs
led to a greater reduction in primary care utilisation for the elderly, and especially for the disabled
elderly, than for younger age groups. The disabled elderly had 244% more GP consultations (95% CI:
+79%, +562%) when they lived in areas with high vs. low GP density (defined with the 10th and the 90th
percentiles as cut-offs). Conclusion: If further research confirms our findings, this increasingly disturbing
public health issue in industrialised countries where populations are ageing will require priority policy
measures. Ensuring that elderly people living in underserved areas have adequate access to primary care
may prevent future hospitalisations, use of home care services, and institutionalisation.
Keywords: Access to care, frail elderly, geography of health, primary health care.
In France and in other Western industrialised countries, several studies have shown that the uneven distribution
of physicians throughout the country leads to variations in the rate of consultations.1,2 Since the elderly have
special transport problems because of their impaired level of mobility,3,4 their access to physician services may
be more sharply reduced than average in areas where physician availability is low.5 The validation of this
hypothesis would highlight an increasingly disturbing public health issue in industrialised countries where
populations are ageing. Implementing policies to address this public health issue would not only be requisite for
attaining greater equity in access to healthcare; it may also be cost-efficient since ensuring that the elderly have
regular access to physicians may prevent future hospitalisations, use of home care services and
institutionalisations.4,6-8
Very few studies have adequately addressed this public health issue despite its importance. Several studies
have examined whether the access of the rural elderly to physician services is more restricted than their urban
counterparts’.9,10 However, since low physician availability is also often reported in deprived urban areas,11 the
rural − urban difference cannot be thought of as an adequate proxy of physician availability. Closer to our topic,
one US study has reported that Medicare beneficiaries (aged 65 or over) had a higher probability of using mental
health specialty care when they lived in counties with a higher density of psychiatrists.5 However, as the
magnitude of this effect was not estimated for the non-elderly, it was not possible to conclude whether the
density effect only affected the elderly or the entire population in the same way. This information, without
which it is hard to tailor an adequate public health response, was provided by a German study: the rate of
outpatient utilisation of psychiatric facilities was found to be significantly higher when the distance from
patients’ place of residence to the facility was short, and this association was about three times stronger among
patients over age 75.12 However, this finding was based on univariate analyses, and potential confounders of the
distance effect such as the individual socioeconomic status or the rural / urban environment of residence were
not considered.
While addressing these shortcomings, we chose to focus on access to primary care. Because the regular
access to primary care services allows for a continuity of care and a global management of patient health, it is
crucial for maintaining the health of the elderly over the long term, and thus may contribute to reducing the odds of
future hospitalisation, use of home care services or institutionalisation.6,7,13
As in most European countries,14,15 the elderly French are unlikely to face major income related barriers in
the access to primary care. Indeed, every legal resident in France is entitled to basic health coverage. User
charges that are not reimbursed by the national Social Security (6 euros for a GP consultation) are refunded by
supplementary elective insurance schemes (in 2000,16 93% of the population carried this extra insurance).
However, geographical variations in the density of GPs may lead to inequity of access to primary care. It would
be warranted to implement policies to address the issue of the disparities in the availability of primary care
services if the whole population were found to be affected to a certain extent, or alternatively if certain
subgroups had dramatically reduced odds of using primary care in underserved areas. Accordingly, to identify
the top priority subgroups that should be targeted by a policy addressing this public health issue, we investigated
the two following questions. Our first objective was to examine whether living in an area underserved in GPs
leads to a greater restriction in the access to primary care for the elderly than for younger age groups. Secondly,
we tested whether living in an area underserved in GPs affects the whole population of the elderly, or only the
mobility impaired, who may suffer specifically from this residential disadvantage. Beyond the size of the
medical density effect averaged for the whole population, it is particularly important to quantify the magnitude
of this effect for the subgroups that are expected to be particularly at risk, to identify major situations of
underutilisation that would require priority interventions.
METHODS
Data sources
We used data collected in 1999 by the French National Institute for Prevention and Health Education (INPES)
through the Baromètre Santé survey, a two level (households, individuals) random sample telephone survey (in
each selected household, one individual was randomly picked for an interview).17 The response rate was 0.69.
The study sample included 12 405 individuals aged 18-75. Each individual reported the number of consultations
with GPs he or she had had over the previous 12 months (office visits or house calls) as well as
sociodemographic characteristics. Weighting coefficients were computed a posteriori by the INPES to ensure
that the sample was representative of the French population.
The National Sickness Insurance Fund provided us with the number of GPs per 100 000 inhabitants (range:
69-135) in each of the 95 administrative French departments (henceforth designated as areas of residence). To
verify that the departments were homogeneous with respect to medical density, we considered the 324 sub-departmental administrative areas and computed the intra-department correlation coefficient, which measures the
correlation of GP density between sub-departmental areas belonging to the same department.18 This coefficient
was very high (equal to 0.50) and highly significant (p < 0.0001).
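
To make the idea of an intra-department correlation concrete, the sketch below computes a one-way ANOVA estimator of the intraclass correlation for sub-areas grouped within departments. This is an assumption for illustration only: the article cites a multilevel modelling text for its coefficient and does not state the estimator used, and the densities in the example are made up.

```python
from statistics import mean


def anova_icc(groups: dict) -> float:
    """One-way ANOVA estimator of the intraclass correlation for unbalanced groups."""
    sizes = [len(values) for values in groups.values()]
    n_total, n_groups = sum(sizes), len(groups)
    if n_groups < 2 or n_total <= n_groups:
        raise ValueError("need at least two groups and some within-group replication")
    grand_mean = mean(x for values in groups.values() for x in values)
    ss_between = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
    ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)
    ms_between = ss_between / (n_groups - 1)
    ms_within = ss_within / (n_total - n_groups)
    # Average group size, adjusted for unequal numbers of sub-areas per department.
    k0 = (n_total - sum(s * s for s in sizes) / n_total) / (n_groups - 1)
    return (ms_between - ms_within) / (ms_between + (k0 - 1) * ms_within)


# Made-up GP densities (per 100 000 inhabitants) for sub-areas of three hypothetical departments.
density = {"A": [80.0, 84.0, 82.0], "B": [100.0, 104.0, 102.0], "C": [120.0, 124.0, 122.0]}
print(round(anova_icc(density), 3))
```

A value close to 1 means that sub-areas of the same department resemble each other far more than sub-areas of different departments, which is the property the authors rely on when measuring density at the departmental level.
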
Statistical analysis
We first used the nonparametric Jonckheere-Terpstra test19 (implemented with SAS, version 8.02, SAS Institute,
Cary, USA) to examine whether there was a monotonic relationship between the mean number of GP
consultations reported over the previous 12 months and the area-level number of GPs per 100 000 inhabitants
(first divided into quartiles).
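
The authors ran this test in SAS. For readers who want to see what the statistic measures, here is a minimal Python sketch of the Jonckheere-Terpstra statistic with its large-sample normal approximation. It assumes the groups are supplied in their hypothesised order and applies no tie correction to the variance, so it should not be expected to reproduce SAS output exactly on heavily tied data such as visit counts.

```python
from math import sqrt
from statistics import NormalDist


def jonckheere_terpstra(groups):
    """Return (J, z, two-sided p) for samples listed in their hypothesised order."""
    if len(groups) < 2:
        raise ValueError("need at least two ordered groups")
    sizes = [len(g) for g in groups]
    n = sum(sizes)
    j_stat = 0.0
    # Count, over every ordered pair of groups, how often the later group has the larger value.
    for i in range(len(groups) - 1):
        for k in range(i + 1, len(groups)):
            for x in groups[i]:
                for y in groups[k]:
                    j_stat += 1.0 if x < y else 0.5 if x == y else 0.0
    expected = (n * n - sum(s * s for s in sizes)) / 4.0
    # Null variance without tie correction.
    variance = (n * n * (2 * n + 3) - sum(s * s * (2 * s + 3) for s in sizes)) / 72.0
    z = (j_stat - expected) / sqrt(variance)
    p_two_sided = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return j_stat, z, p_two_sided


# Made-up consultation counts for three ordered density classes.
j, z, p = jonckheere_terpstra([[1, 2], [3, 4], [5, 6]])
print(j, round(z, 2), round(p, 3))
```
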
Multilevel Poisson models18 with individuals nested within areas were then used to investigate the impact of
the area-level density of GPs on the number of GP consultations reported over the previous 12 months, while
appropriately taking into account the hierarchical structure of the data. Our models were adjusted for several
sociodemographic and health characteristics of the individuals (full details about the variables and the way they
were coded are given in table 1): age, gender, chronic disease status, disability, Duke health profile scores20
(physical, mental and perceived health, and incapacity), education, occupation, income, employment status,
marital status, type of municipality of residence (rural or urban) and gross domestic product per capita in the area
of residence (provided by INSEE, the French National Institute of Statistics and Economic Studies). To identify
areas where it may be particularly urgent to adopt measures, we defined contrasted classes of areas with respect
to the density of GPs: the sample was divided into three categories, with the 10th and the 90th percentiles as cutoffs.
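
Coefficients from a Poisson model are on the log scale, so a contrast between density classes is usually read as a rate ratio, exp(coefficient), or as a percentage change, (exp(coefficient) − 1) × 100. The sketch below shows that conversion. The coefficient and standard error are back-calculated here from the +244% (95% CI +79% to +562%) reported in the abstract, purely for illustration; they are not taken from the authors' model output.

```python
from math import exp, log


def percent_change(log_rate_ratio: float) -> float:
    """Percentage change in the expected count implied by a log rate ratio."""
    return (exp(log_rate_ratio) - 1.0) * 100.0


def percent_change_ci(log_rate_ratio: float, standard_error: float, z: float = 1.96):
    """Wald interval computed on the log scale, then converted to percentage change."""
    lower = percent_change(log_rate_ratio - z * standard_error)
    upper = percent_change(log_rate_ratio + z * standard_error)
    return lower, upper


# Illustrative values back-calculated from the abstract: rate ratio 3.44 (1.79 to 6.62).
beta = log(3.44)
se = (log(6.62) - log(1.79)) / (2 * 1.96)
print(round(percent_change(beta)), [round(v) for v in percent_change_ci(beta, se)])
```
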
Using a fully adjusted model fitted to the whole study sample, we first tested interaction effects between age
groups (60−69, 70−75) and area-level density of GPs. Secondly, in each age group taken separately (18−59,
60−69, 70−75), interaction effects were used to estimate whether the density effect was stronger for disabled
individuals (defined as those who reported a handicap leading to functional limitations) than for those who were
not disabled. Finally, the model was estimated in all age × disability status groups separately.
To verify that underconsultation of GPs in areas with a low density of GPs could not be attributed to
more frequent consultation of specialist physicians in these areas (substitution), we estimated a fully adjusted model in
all age × disability status groups with the number of consultations of specialists over the previous 12 months as
the outcome variable.
Table 1 List of the variables used as adjustment factors in the models

Variable | Type of variable | Categories of the qualitative variable(a) / unit of the quantitative variable
Age | Qualitative | Less than 30; 30–44; 45–59; 60–69; 70–75
Gender | Qualitative | Male; female
Physical health (Duke scale) | Quantitative | A score from 0 to 100 (a high score indicates better health)
Mental health (Duke scale) | Quantitative | A score from 0 to 100 (a high score indicates better health)
Perceived health (Duke scale) | Quantitative | A score from 0 to 100 (a high score indicates better health)
Disability (Duke scale) | Quantitative | A score from 0 to 100 (a high score indicates greater dysfunction)
Chronic disease status | Qualitative | Reporting no chronic disease; reporting a chronic disease
Disability | Qualitative | Reporting no handicap; reporting a handicap leading to functional limitations
Marital status | Qualitative | Married; never married; divorced; widowed
Educational achievement | Qualitative | University; secondary school; primary school or less; still at school
Employment status | Qualitative | Full-time employment; part-time employment; subsidised employment; unemployment; other
Occupation | Qualitative | Upper white-collar worker; intermediate; low white-collar worker; blue-collar worker; farmer; craftsman-shopkeeper
Household income per capita(b) | Qualitative | Over €1351 per person; €1101–€1350; €611–€1100; €610 or less
Type of municipality of residence | Qualitative | Large city (population over 200 000); medium sized town (>20 000–200 000); small town (>2 000–20 000); rural municipality
Area-level gross domestic product per capita | Qualitative | First quartile; second quartile; third quartile; fourth quartile

a: The category of reference is in bold.
b: Monthly household income was divided by the number of units in the household (estimated with the method of the
Organisation for Economic Cooperation and Development).
Since some of the subgroups were small, the multilevel model parameters were estimated with the Markov
chain Monte Carlo estimation method implemented in MLwiN software (version 1.2, Institute of Education,
London), to obtain accurate interval estimates.21 Associations were expressed as percentage differences in the
number of consultations (95% confidence intervals were computed).
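Since the models are Poisson models with a log link, a coefficient on the log scale translates into a percentage difference in the expected number of consultations. The short sketch below shows this conversion; the function names and the numerical inputs are illustrative assumptions, not values taken from the fitted models.

```python
import math

def percent_difference(log_rate_ratio):
    """Convert a log rate ratio into a percentage difference in the expected count."""
    return 100.0 * (math.exp(log_rate_ratio) - 1.0)

def percent_difference_interval(estimate, lower, upper):
    """Apply the same conversion to a point estimate and its interval bounds."""
    return tuple(percent_difference(v) for v in (estimate, lower, upper))

# Hypothetical coefficient and interval bounds on the log scale.
print(percent_difference_interval(0.765, 0.19, 1.34))
```

Because the exponential is monotonic, interval bounds obtained on the log scale can be converted bound by bound, which is why the resulting intervals are asymmetric around the point estimate.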
RESULTS
In the sample (12 405 individuals aged 18−75), the weighted proportion of individuals aged 60 or over was 0.20.
There were 12% of disabled individuals in the under 60 age group, 20% in the 60−69 age group, and 25% in the
70−75 age group. The mean number of consultations with a GP over the previous 12 months was 3.5 (standard
deviation (SD) = 4.9) in the under 60 age group, 5.0 (SD = 4.7) in the 60–69 age group, and 5.8 (SD = 4.9) in the
70–75 age group.
Figure 1 indicates that a statistically significant and positive dose-response relationship between the mean
number of GP consultations and the area-level density of GPs was only found for the disabled in the 70–75 age
group (p = 0.005, bilateral Jonckheere-Terpstra test).
A fully adjusted model fitted to the whole sample indicated that women, individuals with poor health status,
the unemployed, people with low levels of educational attainment or low income reported a higher number of
GP consultations. In this model fitted to the whole sample, interaction effects indicated that the impact of the
area-level density of GPs was significantly stronger for individuals in the 60–69 age group than for those in the
under 60 age group, and still stronger for those in the 70–75 age group (results not shown). When the model was
fitted for each age group separately, the interaction term disability × density of GPs was only found to be
strongly significant for individuals in the 70–75 age group, indicating a stronger effect of the density of GPs for
the disabled vs. the non disabled in this age group (results not shown).
Analyses stratified by disability × age groups (see table 2) confirmed that the disabled elderly (age 70–75)
had a markedly higher number of GP consultations when they lived in areas with medium GP density (+115%,
95% CI: +21%, +282%) or high GP density (+244%, 95% CI: +79%, +562%) vs. low GP density. Such a strong
effect was not found in any other group. As indicated in the model for the disabled elderly (table 3), the area-level
unexplained variations diminished by 27% when the contextual variables (type of municipality of
residence, gross domestic product per capita, and density of GPs) were added to the model containing individual-level
variables. At each step, the area-level residuals were estimated. These residuals were plotted on figure 2
(where they are represented in an ascending order from left to right). This graph shows that the variance of the
area-level residuals decreased when the contextual variables were introduced into the model.
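The 27% reduction quoted above can be reproduced from the two values reported in table 3 for the disabled elderly (0.188 before and 0.138 after adding the contextual variables). The helper below is a minimal sketch; its name is an assumption.

```python
def proportional_change_in_variance(var_before, var_after):
    """Percentage reduction in area-level variance between two nested models."""
    return 100.0 * (var_before - var_after) / var_before

# Values from table 3, disabled individuals aged 70-75.
print(round(proportional_change_in_variance(0.188, 0.138), 1))
```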
The disabled elderly did not have a higher number of consultations of specialists when they lived in areas
with a high GP density vs. low GP density (results not shown).
Figure 1 Mean number of consultations with general practitioners (GPs) over the previous 12 months according
to the area-level density of GPs, France, 1999.
Table 2 Effect of the area-level density of general practitioners (GPs) on the number of GP consultations reported over the
previous 12 months in all age × disability status groups separately, France, 1999. Values are percent differences(a) (95% CI).

Age group and density of GPs | Both | Disabled | Non disabled
In the under 60 age group | (n=9978) | (n=1172) | (n=8806)
Low density areas(b) | 0% (baseline) | 0% (baseline) | 0% (baseline)
Medium density areas(b) | +2% (-11%, +16%) | -5% (-32%, +31%) | +5% (-9%, +21%)
High density areas(b) | +14% (-3%, +34%) | +6% (-30%, +59%) | +22% (+3%, +44%)
In the 60-69 age group | (n=1681) | (n=361) | (n=1320)
Low density areas(b) | 0% (baseline) | 0% (baseline) | 0% (baseline)
Medium density areas(b) | +15% (-8%, +45%) | +22% (-23%, +93%) | +17% (-9%, +50%)
High density areas(b) | +26% (-5%, +66%) | +37% (-23%, +141%) | +28% (-6%, +73%)
In the 70-75 age group | (n=746) | (n=182) | (n=564)
Low density areas(b) | 0% (baseline) | 0% (baseline) | 0% (baseline)
Medium density areas(b) | +36% (+1%, +83%) | +115% (+21%, +282%) | +22% (-12%, +69%)
High density areas(b) | +67% (+17%, +139%) | +244% (+79%, +562%) | +40% (-6%, +109%)

a: Adjusted for age, gender, Duke Health Profile scores, chronic disease status, disability, marital status, education,
employment status, occupation, income, type of municipality of residence and area-level gross domestic product per capita.
b: Low density areas contain 10%, medium density areas contain 80% and high density areas contain 10% of the population,
with cut-offs equal to 73 and 115 GPs per 100 000 inhabitants.
Table 3 Random effects of the multilevel models estimated in all age × disability status groups
separately before and after including contextual variables

Age group and disability status | Individual-level model(a) | Model including contextual variables(b)
Under 60 age group, disabled | 0.110 (0.021)*** | 0.114 (0.023)***
Under 60 age group, non disabled | 0.019 (0.004)*** | 0.018 (0.004)***
60-69 age group, disabled | 0.161 (0.037)*** | 0.177 (0.042)***
60-69 age group, non disabled | 0.057 (0.013)*** | 0.053 (0.013)***
70-75 age group, disabled | 0.188 (0.052)** | 0.138 (0.044)*
70-75 age group, non disabled | 0.096 (0.022)*** | 0.093 (0.023)***

* p < 0.01; ** p < 0.001; *** p < 0.0001
a: The individual-level model included age, gender, Duke Health Profile scores, chronic disease
status, disability, marital status, education, employment status, occupation and income.
b: The contextual model further included type of municipality of residence, area-level gross
domestic product per capita and area-level density of general practitioners.
Figure 2 Area-level residuals from the individual-level model and from the contextual model for the disabled
elderly aged 70–75, France, 1999
DISCUSSION
To our knowledge, our study is the first to examine whether living in an area underserved in GPs leads to a
greater restriction in the access to primary care for the elderly and especially for the disabled elderly than for
younger age groups. Behind a moderate effect of the GP density for the whole population, we found that the
disabled elderly were dramatically affected in underserved areas. If our findings can be replicated in other
industrialised countries, addressing this public health issue through specific policies will have to be given
priority.
Limitations of the study and potential biases
As the study sample consisted of individuals aged 18–75, we were unable to assess the impact of living in an
area with low GP density for individuals over 75. Additional investigation is therefore required to examine
whether the magnitude of the medical density effect on access to primary care further increases with age over
age 75 for individuals not living in institutions.
We must consider whether potential biases may account for the strong effect of the area-level GP density,
which was found among the oldest (70–75) disabled in our sample. First, it may be argued that this effect
stemmed partly or entirely from a selective migration bias which would occur if individuals with significant
health concerns and a resulting high consumption of GP consultations moved from low to high medical density
areas.22 However, this bias is unlikely here since we adjusted for a wide set of health indicators. Secondly, GP
consultations were self-reported rather than drawn from medical records. However, since there is no reason to
suspect that consultations were particularly underreported in low medical density areas, the effect of the GP
density is unlikely to result from a measurement error.
Main findings
The extent to which living in an area with low GP density leads to a reduction in the number of GP consultations
reported over the previous 12 months increased with age. Moreover, for the oldest (70–75) individuals in the
study sample, we found that the medical density effect was mainly attributable to the disabled in this particular
group. Therefore, and after adjustment for a wide set of sociodemographic and health variables, our main finding
is that the disabled elderly reported a markedly lower number of GP consultations when they lived in an area
with low GP density.
This finding raises the following question: can we interpret the lower reported number of GP consultations
for the disabled elderly living in underserved areas in terms of underconsultation (underconsultation being
defined as a lower use of primary care services than would be recommended based on healthcare needs)? Even if
the kind of study undertaken here is not appropriate to decide whether a difference of use between two groups is
attributable to underconsultation in one of them or overconsultation in the other, some arguments can be put
forward in support of the hypothesis of underconsultation in low density areas. In areas with a medium level of
medical density (80% of the sample), the disabled elderly should not be suspected of overconsulting, since they
had slightly fewer GP consultations than individuals under 30 after adjustment for sociodemographic and health
variables (–10%, 95% CI: –3%, –16%, results not shown in tables). Therefore, in underserved areas where the
disabled elderly consulted significantly fewer times than in areas with medium GP density, the disabled elderly
may be expected to underconsult to a certain extent: in these underserved areas, they had 48% fewer
consultations over the previous 12 months (95% CI: 24%, 66%) than individuals under 30, after adjustment for
health needs and sociodemographic factors (results not shown in tables).
It is important to notice that the medical density effect among the disabled elderly is not confounded either
by the type of municipality of residence (rural or urban) or by the global wealth in the area of residence since our
models were adjusted for such potential confounders. Whereas living in a rural municipality vs. a large city had
no impact on access to primary care, living in an area underserved in GPs was a barrier to the access to primary
care for the disabled elderly.
Implications for policy, practice and research
It is important to verify whether our findings can be replicated in other industrialised countries. In countries
where GP density is lower than in France23 or where a markedly smaller percent of patient-physician contacts
takes place at patients’ homes,24-27 living in an area underserved in GPs may affect the access of the elderly to
primary care to a greater extent than in France. On the other hand, additional studies comparing the access to
care of the elderly and the non elderly would be required for a more comprehensive insight into the interrelated
impact of the personal ability to move, the availability of transport means (car, public transport) and the
availability of healthcare services.
Several policies may be suggested for implementation. A first option would be a policy aimed at reducing
geographic disparities in GP density, which have long prevailed in France.28 For instance, financial incentives for
physicians to set up their practice in low medical density areas may be suggested, but some analysts have warned
that this may not be sufficient.29 It has therefore recently been suggested that a regulation of the place where
physicians set up their practice may be required.30 Another type of policy, among other possibilities,
would be a programme specifically targeted at the disabled elderly living in underserved areas. House calls for
health checks may be offered to the disabled elderly living in underserved areas, who would have been identified
as underconsulting by the local social services, and approaches used in the British annual health checks of the
over 75s to ensure that a high proportion of the elderly had a check should be considered (invitation letter to
undergo a check, follow-up of non responders by a telephone call or a visit31,32).
Our finding that living in an underserved area affected to a significant extent only the disabled elderly aged
70 or over – namely a small proportion of the population – should not be regarded as sufficient evidence that a
global policy aimed at reducing geographic disparities in the availability of primary care services is unwarranted.
Indeed, our analysis stratified by age and disability status may have been unable to identify some other
subgroups that may benefit from this policy, such as subgroups with other mobility problems (with no car for
example) or with specific needs for regular follow-ups. More broadly, choosing the requisite intervention should
be based on a comparative analysis of the cost-effectiveness of each option. Therefore, recommending a definite
policy is beyond the scope of the present study.
CONCLUSION
Our study suggests that the elderly combining a personal disadvantage (impaired mobility) with a residential
disadvantage (living in an underserved area) have dramatically reduced access to primary care. Therefore, if
further research confirms our findings, policymakers will be faced with a disturbing public health issue in
Europe and North America, all the more so as the elderly are a growing fraction of the population. This would
justify the high-priority rollout of policy measures to ensure that the elderly have adequate access to primary
care, which may prevent future hospitalisations, use of home care services, and institutionalisation.
REFERENCES
1 Shannon GW, Bashshur RL, Lovett JE. Distance and the use of mental health services. Milbank Q
1986;64:302-330.
2 Lambert D, Agger MS. Access of rural AFDC Medicaid beneficiaries to mental health services. Health
Care Financ Rev 1995;17:133-145.
3 Saag KG, Doebbeling BN, Rohrer JE, Kolluri S, Mitchell TA, Wallace RB. Arthritis health service
utilization among the elderly: the role of urban-rural residence and other utilization factors. Arthritis Care Res
1998;11:177-185.
4 Vetter N, George M, Lewis P. A district-wide examination of 75-year olds suggests discrimination in the
provision of services. Aging (Milano) 1996;8:205-210.
5 Ettner SL, Hermann RC. Provider specialty choice among Medicare beneficiaries treated for psychiatric
disorders. Health Care Financ Rev 1997;18:43-59.
6 Hendriksen C, Lund E, Stromgard E. Consequences of assessment and intervention among elderly people:
a three year randomised controlled trial. BMJ (Clin Res Ed) 1984;289:1522-1524.
7 Parchman ML, Culler SD. Preventable hospitalizations in primary care shortage areas. An analysis of
vulnerable Medicare beneficiaries. Arch Fam Med 1999;8:487-491.
8 Niefeld MR, Braunstein JB, Wu AW, Saudek CD, Weller WE, Anderson GF. Preventable Hospitalization
Among Elderly Medicare Beneficiaries With Type 2 Diabetes. Diabetes Care 2003;26:1344-1349.
9 Blazer DG, Landerman LR, Fillenbaum G, Horner R. Health services access and use among older adults in
North Carolina: urban vs rural residents. Am J Public Health 1995;85:1384-1390.
10 Casey MM, Thiede Call K, Klingner JM. Are rural residents less likely to obtain recommended preventive
healthcare services? Am J Prev Med 2001;21:182-188.
11 Lucas-Gabrielli V, Tonnellier F. Déserts médicaux ou zones défavorisées? Démographie médicale et
indicateurs de besoins. Technologie et Santé 2001;45:32-38.
12 Dilling H, Weyerer S. Incidence and prevalence of treated mental disorders. Health care planning in a
small-town-rural region of Upper Bavaria. Acta Psychiatr Scand 1980;61:209-222.
13 Gulliford MC. Availability of primary care doctors and population health in England: is there an
association? J Publ Hlth Med 2002;24:292-298.
14 Halldorsson M, Kunst AE, Kohler L, Mackenbach JP. Socioeconomic differences in children's use of
physician services in the Nordic countries. J Epidemiol Community Health 2002;56:200-204.
15 McNiece R, Majeed A. Socioeconomic differences in general practice consultation rates in patients aged
65 and over: prospective cohort study. BMJ 1999;319:26-28.
16 Busse R, Dixon A, Healy J, Krasnik A, Leon S, Paris V, et al. Health care systems in eight countries:
trends and challenges. London: London School of Economics & Political Science, 2002.
17 Guilbert P, Baudier F, Gautier A, Goubert A, Arwidson P, Janvrin M. Baromètre Santé 2000. Méthodes.
Vanves: Editions CFES, 2001.
18 Leyland AH, Goldstein H. Multilevel modelling of health statistics. Chichester: Wiley, 2001.
19 Weller EA, Ryan LM. Testing for trend with count data. Biometrics 1998;54:762-773.
20 Guillemin F, Paul-Dauphin A, Virion JM, Bouchet C, Briancon S. [The DUKE health profile: a generic
instrument to measure the quality of life tied to health]. Sante Publique 1997;9:35-44.
21 Browne W. MCMC estimation in MLwiN. London: Center for Multilevel Modelling, Institute of
Education, University of London, 2002.
22 Gillanders WR, Buss TF. Access to medical care among the elderly in rural northeastern Ohio. J Fam
Pract 1993;37:349-355.
23 OECD Health Data 2002. Paris: Organization for Economic Cooperation and Development, 2002.
24 Auvray L, Dumesnil S, Le Fur P. Santé, soins et protection sociale en 2000. Paris: CREDES, 2001.
25 Unwin BK, Jerant AF. The home visit. Am Fam Physician 1999;60:1481-1488.
26 Aylin P, Majeed FA, Cook DG. Home visiting by general practitioners in England and Wales. BMJ
1996;313:207-210.
27 Boerma WGW, Groenewegen PP. GP home visiting in the 18 European countries: adding the role of
health system features. Eur J Gen Pract 2001;7:132-137.
28 Tonnellier F. Les inégalités géographiques de densités médicales sont stables depuis plus d'un siècle:
l'encombrement médical était déjà dénoncé en 1900. Solidarité Santé: Etudes Statistiques 1991;3:45.
29 Bensadon A-C. Perspectives de la démographie médicale. Paris: DGS, 2001.
30 Nicolas G, Duret M. Propositions sur les options à prendre en matière de démographie médicale. Paris:
DGS, 2001.
31 Chew CA, Wilkin D, Glendenning C. Annual assessment of patients aged 75 years and over: general
practitioners' and practice nurses' views and experiences. Br J Gen Pract 1994;44:263-267.
32 Brown K, Williams EI, Groom L. Health checks on patients 75 years and over in Nottinghamshire after
the new GP contract. BMJ 1992;305:619-621.
Reduced use of primary, specialty and preventive care services by
individuals residing with persons in poor health
BASILE CHAIX, MARYAM NAVAIE-WALISER, CECILE VIBOUD, ISABELLE PARIZOT, PIERRE
CHAUVIN *
* B. Chaix (PhD)1, M. Navaie-Waliser (DrPH)2, C. Viboud (PhD)3, I. Parizot (PhD)1, P. Chauvin (MD, DSc)1
1 Research Unit in Epidemiology and Information Sciences, National Institute of Health and Medical Research
(INSERM U444), France.
2 Center for Home Care Policy and Research, Visiting Nurse Service of New York, New York, U.S.A.
3 Fogarty International Center, National Institutes of Health, Bethesda, Maryland, U.S.A.
Correspondence: Basile Chaix, INSERM U444, Faculté de Médecine Saint-Antoine, 27 rue Chaligny, 75571
Paris Cedex 12, France, tel. +33 (0)1 44 73 84 43, fax +33 (0)1 44 73 84 62.
Background: Since household time resources and financial resources for healthcare are primarily spent
for the household members with the most urgent health needs, individuals residing with persons in poor
health may be at risk of underusing healthcare services. We examined whether they had increased risks of
underusing primary, specialty and preventive care. Methods: Data collected in 2000 from a representative
sample of 8,210 French individuals aged 18 years or older from 3,810 households were analysed with
logistic regression models adjusted for health, demographic and socioeconomic variables. Results: We
found that individuals residing with 1 other survey respondent had a higher risk of not using primary
care, specialty care and preventive care in the 12 months preceding the study when the health status of the
other survey respondent was poorer (fair or alternatively poor vs. good). Furthermore, individuals
residing with 2 other survey respondents had a higher risk of not using primary care, specialty care and
preventive care in the 12 months preceding the study when they resided with a higher number of
respondents in fair or poor health (1 or alternatively 2 vs. 0). Conclusion: Underuse of health services by
individuals residing with persons in poor health signals a need for health practitioners to broaden the
scope of care beyond their patients, and for policymakers to consider the long term impact of this
situation on the healthcare system.
Keywords: Family caregivers, family health, health service use.
Today it is much more common to find individuals residing with persons in poor health. This is due to medical
advances which enable people with serious and chronic illness to survive longer despite their health problems1-3
and to the current trend in healthcare systems to shorten hospital stays and expand outpatient care services.4-7
Many individuals residing with persons in poor health play an important part as family caregivers for health care
delivery. Because of the hardship of their task, these family caregivers have increased risks of stress,1-3,8-14
distress,4,5 depressive symptoms,3,4,12,14-19 and poor physical health.4,6,12,17,19,20 Public health researchers have
extensively investigated the utilisation patterns of the services providing support to family caregivers.7,21,22
However, few studies have examined whether individuals residing with persons in poor health do receive
adequate health care for their own health concerns. Since household time resources1,3,4,8,23 and financial
resources3,11,24 for healthcare are primarily spent for the members with the most urgent health needs, we expected
that individuals residing with persons in poor health were at risk of underusing healthcare services.
The literature on this question is very scarce. A North American study has ascertained that caregivers of
senile dementia patients had a greater number of recent physician visits and a greater number of prescription
medications (for their own health concerns) than their matched noncaregiver controls.12 On the other hand, a
Californian study of elderly members of a large health maintenance organisation reported no significant
difference in routine physical examinations between caregivers and noncaregivers.25 However, in both studies,
measures of association were not adjusted for the individuals’ health status. Therefore, caregivers’ and
noncaregivers’ use of healthcare services cannot be appropriately compared since the two groups are not
comparable in terms of their health status and their resulting healthcare needs (see references above).
Considering the shortcomings in the literature, (a) we took into account the potential confounding effects of
the health status and sociodemographic characteristics; (b) we considered all the adults residing with persons in
poor health rather than just the actual family caregivers, so that our findings would be more widely
generalisable; (c) we investigated utilisation patterns of several types of healthcare services. Our study
expands past research by examining whether individuals residing with persons in poor health have increased
risks of underusing primary, specialty and preventive care.
METHODS
Source of Data
Cross sectional data were collected in 2000 by the French National Institute of Statistics and Economic Studies
(INSEE) through a face to face interview survey. Households were randomly drawn from the INSEE census
based master sample. Survey questionnaires were completed by 5,413 (79%) out of the 6,824 selected
households. Up to 3 persons aged 15 years or older were surveyed in each household. When there were more
than 3 persons aged 15 years or older in the household, 3 of them were randomly selected for an interview.
During scheduled interview times, 28% of the preselected individuals were absent. Their questionnaires were
completed by another household member. Data were collected by trained interviewers using structured survey
questionnaires, which captured demographic characteristics, health characteristics, socioeconomic variables
including precise financial indicators, and information on healthcare utilisation.
For the purposes of this study, surveyed individuals aged under 18 years (n = 464) were excluded from the
study sample, so that individuals who may have little decision making power for healthcare utilisation were not
included. Individuals who had no other surveyed household member (n = 1,599) were also excluded from the
analyses. Twenty-one individuals were further excluded because of incomplete information on healthcare
utilisation. In the end, the study sample consisted of 8,210 individuals aged 18 years or older from 3,810
households with 2 or 3 survey respondents. Weighting coefficients were computed by INSEE to ensure that the
sample was representative of the French population in terms of age, gender, and employment status.
Statistical Analysis
Three binary outcome variables based on survey responses were defined. We considered whether each individual
had or had not used (a) primary care physician consultations, (b) specialist physician consultations, and (c)
preventive care (including preventive medical tests and preventive clinical examinations) in the 12 months
preceding the study (1 = no use; 0 = at least one utilisation).
Weighted multilevel logistic models26,27 with individuals nested within households were fitted for each
outcome variable. Health, demographic and socioeconomic adjustment factors were introduced in the models,
including many factors that have been shown repeatedly to be associated with healthcare utilisation. These
variables are listed and extensively detailed in table 1. Our purpose was to disentangle the effect of residing with
persons in poor health from other interactions between household members, such as mimicry of healthcare
utilisation behaviour between household members. Accordingly, for improved model adjustment, we considered
whether individuals residing with persons who did not use a given service in the 12 months preceding the study
had increased risks of not using that service (see bottom of table 1). Furthermore, we took into account the
potential confounding effect of other-reported rather than self-reported health service utilisation for the
individuals who were absent at the time of the interview: the models were adjusted for the presence/absence of
the individuals.
For every individual aged 18 years or older, we considered the other persons aged 15 years or older
surveyed in their household, to define the explanatory variable of interest (health status of the other persons
surveyed in the household). Therefore, a given individual was taken into account both as an individual from the
study sample and as a household member for 1 or 2 other individuals in the sample.
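The construction of the explanatory variable can be sketched as follows: each adult is a study subject, and every other surveyed household member, including those aged 15 to 17, contributes to that subject's exposure. The record layout (keys "id", "household", "age", "health") and the function name are hypothetical and were chosen only for this illustration.

```python
from collections import defaultdict

def household_exposure(respondents):
    """For each adult respondent, summarise the health of the other surveyed household members."""
    by_household = defaultdict(list)
    for r in respondents:
        by_household[r["household"]].append(r)

    exposure = {}
    for r in respondents:
        if r["age"] < 18:
            continue  # not a study subject, but still counted as a household member for others
        others = [o for o in by_household[r["household"]] if o["id"] != r["id"]]
        if not others:
            continue  # individuals with no other surveyed household member are excluded
        exposure[r["id"]] = {
            "n_others": len(others),
            "other_health": sorted(o["health"] for o in others),
            "n_fair_or_poor": sum(o["health"] in ("fair", "poor") for o in others),
        }
    return exposure

# Invented records for three households (illustration only).
sample = [
    {"id": 1, "household": "A", "age": 40, "health": "good"},
    {"id": 2, "household": "A", "age": 16, "health": "poor"},
    {"id": 3, "household": "B", "age": 70, "health": "fair"},
    {"id": 4, "household": "B", "age": 68, "health": "poor"},
    {"id": 5, "household": "B", "age": 35, "health": "good"},
    {"id": 6, "household": "C", "age": 50, "health": "good"},
]
print(household_exposure(sample))
```

In this sketch the same person appears once as a subject and once or twice inside the exposure of other household members, which mirrors the double role described above.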
Separate regression models were fitted for individuals residing with 1 other survey respondent and for those
residing with 2 other survey respondents. The models were used to test the following hypotheses: (a) Individuals
residing with 1 other survey respondent had a higher risk of not using healthcare services in the 12 months
preceding the study when the health status of the other respondent was poorer (fair or alternatively poor vs.
good). (b) Individuals residing with 2 other survey respondents had a higher risk of not using healthcare services
in the 12 months preceding the study when they resided with a higher number of respondents in fair or poor
health (1 or alternatively 2 vs. 0).
All multilevel model parameters were estimated with MLwiN 1.2 software (Institute of Education, London,
UK). Adjusted odds ratios (ORs) and 95% confidence intervals (CIs) were computed.
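As a hedged formal sketch, the two-level logistic model described above may be written as follows, with y equal to 1 when individual i of household h did not use the service. The notation (exposure indicators e for the health status of the other respondents, household-level random intercept u) is introduced here for illustration and is not taken from the original text.

```latex
\begin{aligned}
y_{ih} \mid \pi_{ih} &\sim \mathrm{Bernoulli}(\pi_{ih}), \\
\operatorname{logit}(\pi_{ih}) &= \beta_0 + \mathbf{x}_{ih}^{\top}\boldsymbol{\beta}
  + \sum_{c} \gamma_c\, e_{c,ih} + u_h, \qquad u_h \sim \mathcal{N}(0, \sigma_u^2), \\
\mathrm{OR}_c &= \exp(\gamma_c).
\end{aligned}
```

Here c indexes the non-reference exposure categories (for instance, the other respondent being in fair or in poor health rather than in good health), so that each adjusted odds ratio compares the odds of non-use in category c with the odds in the reference category.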
Table 1 Variables used as adjustment factors in regression models

Variables | Categories of the variable
Age | Under 30(a); 30-44; 45-59; 60 or over
Gender | Male(a); female
Marital status | Married(a); never married; divorced; widowed
Health status | Good(a); fair; poor
Chronic disease | No(a); yes
Sick leave in the previous 12 months | None(a); one week or less; one week to 1 month; more than 1 month
Received home assistance in the previous 12 months | No(a); yes
Educational achievement | Primary school or less(a); secondary school; university; still a student
Employment status | Working(a); unemployed; student; retired; housewife; other
Health insurance status | Supplementary insurance(a); only basic insurance; fully insured for medical reasons
Number of other respondents with only basic insurance | For individuals residing with 1 other respondent: 0(a); 1. For individuals residing with 2 other respondents: 0(a); 1; 2
Unemployment allowance recipient | No(a); yes
Allowance recipient | No(a); yes
Unearned income recipient | No(a); yes
Household income per capita(b) | First quartile(a); second quartile; third quartile; fourth quartile
Housing tenure | Owner occupier(a); tenant; non-rent paying occupant
Score for ownership of several goods(c) | Low score (4 goods or less out of 12)(a); mid-low score (5 or 6 goods); mid-high score (7 or 8 goods); high score (9 goods or more)
Financial problems for heating the home | No(a); yes
Family status | Couple with children(a); couple without children; single parent family
More than 3 persons aged 15 years or older in the household(d) | No(a); yes
Absence of the individual at the time of the interview | No(a); yes
Number of other respondents who did not use the service | For individuals residing with 1 other respondent: 0(a); 1. For individuals residing with 2 other respondents: 0(a); 1; 2

a: This category is the reference category.
b: Household income was adjusted for household size.
c: Twelve goods were considered: refrigerator, freezer, refrigerator-freezer, washing machine, microwave oven,
television set, hi-fi system, Minitel (electronic directory), cell phone, car, laptop, desktop computer.
d: The variable was only introduced in the model for individuals residing with 2 other respondents.
RESULTS
In the sample, the weighted proportion of women was 0.50. The mean age was 45 (standard deviation = 17).
Twenty-seven percent of the individuals were in poor health and 45% in fair health. Eighteen percent of the
individuals did not use primary care services and 45% did not use specialty care services in the 12 months
preceding the study. Fifty-eight percent of the individuals did not use preventive care in the 12 months preceding
the study.
In all our models, individuals residing with survey respondents who did not use a given healthcare service
had increased risks of not using that service in the 12 months preceding the study (table 2).
Individuals residing with 1 other survey respondent had a higher risk of not using healthcare services in the
12 months preceding the study when the health status of the other survey respondent was poorer (fair or
alternatively poor vs. good) (table 2 and figure 1). The association was linear and significant for each of the 3
types of healthcare services (i.e., primary, specialty and preventive care). Moreover, individuals residing with 2
other survey respondents had a higher risk of not using healthcare services in the 12 months preceding the study
when they resided with a higher number of respondents in fair or poor health (1 or alternatively 2 vs. 0) (table 2
and figure 1). The association was linear and significant for each of the 3 types of healthcare services.
Table 2 Adjusted effect of (a) residing with persons in poor health and (b) residing with non users of healthcare services, on
the risk of not using primary, specialty and preventive care in the 12 months preceding the study. Fully adjusted odds ratio
(OR) and 95% confidence interval (CI)

Each cell gives OR (a) (95% CI).

 | No primary care in the previous 12 months | No specialty care in the previous 12 months | No preventive care in the previous 12 months
----------------------------------------
For individuals residing with 1 other respondent (n = 5423)
Health status of the other respondent
  Good | 1.00 | 1.00 | 1.00
  Fair | 1.56** (1.21, 2.01) | 1.39** (1.16, 1.67) | 1.34* (1.11, 1.61)
  Poor | 1.89** (1.39, 2.56) | 1.69** (1.36, 2.10) | 1.67** (1.34, 2.07)
Number of other respondents who did not use the service
  Zero | 1.00 | 1.00 | 1.00
  One | 3.88** (2.92, 5.17) | 1.38* (1.13, 1.67) | 2.92** (2.45, 3.50)
For individuals residing with 2 other respondents (n = 2787)
Number of other respondents in poor or fair health
  Zero | 1.00 | 1.00 | 1.00
  One | 1.24 (0.83, 1.86) | 1.65* (1.13, 2.41) | 1.28 (0.97, 1.68)
  Two | 1.69* (1.15, 2.48) | 1.77* (1.21, 2.58) | 1.70** (1.27, 2.26)
Number of other respondents who did not use the service
  Zero | 1.00 | 1.00 | 1.00
  One | 3.35** (2.34, 4.79) | 1.72* (1.22, 2.41) | 2.20** (1.47, 3.29)
  Two | 5.52** (3.27, 9.34) | 2.60** (1.86, 3.63) | 5.33** (3.73, 7.61)

a: Adjusted for all the factors listed in table 1.
*p < 0.01; **p < 0.001
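As a reading aid for the odds ratios in table 2, the sketch below recovers the log odds ratio and its standard error from one reported entry, 1.56 (1.21, 2.01), and rebuilds the interval. It assumes the intervals are Wald intervals on the log-odds scale, which the article does not state; the function names are hypothetical and chosen for illustration only.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def log_or_and_se(odds_ratio, ci_low, ci_high, z=Z_95):
    """Recover the log odds ratio and its standard error from a reported
    OR and CI, ASSUMING a Wald interval symmetric on the log scale."""
    log_or = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z)
    return log_or, se


def wald_ci(log_or, se, z=Z_95):
    """Rebuild the confidence interval on the odds-ratio scale."""
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


def two_sided_p(log_or, se):
    """Two-sided p-value of the Wald test of OR = 1 (normal approximation)."""
    z_stat = abs(log_or / se)
    return math.erfc(z_stat / math.sqrt(2.0))


# Entry reported in table 2: OR 1.56, 95% CI (1.21, 2.01)
log_or, se = log_or_and_se(1.56, 1.21, 2.01)
low, high = wald_ci(log_or, se)
p_value = two_sided_p(log_or, se)
```

Under this assumption the rebuilt interval should agree with the reported one to rounding, which is a quick check that a reported interval is symmetric on the log scale.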
DISCUSSION
Our study addresses an important topic that has received minimal attention in the scientific literature, namely the
utilisation of healthcare services by individuals residing with persons in poor health. Building on earlier
literature, the study provides a broader outlook by showing that residing with persons in poorer health or with a
higher number of persons in fair or poor health has adverse and dose-response effects on the likelihood of using
3 different types of healthcare services (primary, specialty and preventive care).
Limitations of the study
There are several limitations to our study. First, for certain individuals in the study sample, we did not have
information for all the household members (household residents aged under 15 years and certain individuals in
households where there were more than 3 persons aged 15 years or older were not surveyed). Survey data with
information on all the household members would be useful to obtain more accurate estimates of the risks
incurred by individuals residing with persons in poor health.
Figure 1 Adjusted effect (a) of residing with persons in poor health on the risk of not using primary, specialty and
preventive care in the 12 months preceding the study.
a: Odds ratios are adjusted for all the factors listed in table 1.
Secondly, utilisation of healthcare services was other-reported rather than self-reported for the preselected
individuals absent at the time of the interview. The inclusion in models of a dummy variable for the
presence/absence of the individuals indicated that the individuals who did not personally complete the survey
questionnaire had a higher risk of being classified as non users of specialty care (the effect was not significant
for primary care and preventive care). Therefore, our estimates of the percentage of individuals who did not use
specialty care in the 12 months preceding the study may be biased towards overestimation. However, the impact
of residing with persons in poor health on specialty care utilisation remained unchanged after adjusting the
model for the presence/absence of the individuals.
Interpretation of the findings
Several causal pathways may be suggested for the associations between residing with persons in poor health and
the utilisation of healthcare services. One, individuals residing with persons in poor health may have to spend
less money for their own health to allow for the increased health expenses of their ill household members. The
financial barrier may be reinforced because individuals residing with persons in poor health may consider
spending money for their healthcare unwarranted in view of the more serious and urgent healthcare needs of
their ill household members. Two, other mechanisms that are not directly related to financial resources may also
play a part, i.e., residing with ill persons may be time consuming and draining on affective resources. The
caregiving literature reports that family caregivers experience subjective and objective burdens [13,18,21,22,28-32]
leading to the disruption of daily life and the restriction of activity [3-5,12,16,33-35]. Therefore, certain individuals
residing with persons in poor health may be objectively and subjectively too overburdened by their caregiving
activity to mind their own health [23,36]. Finally, individuals residing with persons in poor health may downplay the
importance of their own health problems in view of the problems of their ill household members. Therefore, they
may have a lower than expected utilisation of healthcare services.
Our findings may be generalized to the entire population of adults residing with persons in poor health, and
not only to the effective family caregivers. At least 2 of the 3 aforementioned causal pathways may affect
individuals residing with persons in poor health, whether they provide effective care or not. Future research
should compare the utilisation of healthcare for caregivers and non caregivers residing with ill persons.
Implications for policy and practice
We identified a risk factor for underuse of several types of healthcare services, which has almost never been
investigated in Europe or in North America. Findings similar to the ones reported here may be expected in other
industrialised countries, albeit with minor changes due to differences in the healthcare systems.
In a public health perspective, underuse of healthcare services by individuals residing with persons in poor
health signals a need for health practitioners to broaden the scope of care beyond the patients themselves and to
move toward a household-centred model of care. For example, in accordance with a study that has underscored
that primary care physicians are in a good position to identify caregivers at risk [20], physicians may be reminded to
turn their attention to the individuals residing with their very ill patients.
In addition, policymakers should consider the long term impacts of the situation described in the present
study on the healthcare system. First, healthcare costs may be higher in an intervention driven model of care than
in a prevention driven model of care where individuals residing with persons in poor health could benefit from
the regular use of ambulatory care. Secondly, many individuals residing with persons in poor health play an
important role as family caregivers. Their underuse of healthcare may not allow them to stay healthy in the long
run and may lead to the increased use of the formal care system by the care-receiving household members or to
the institutionalisation of these care recipients. Therefore, tailoring policies to ensure that individuals residing
with persons in poor health could benefit from the regular use of ambulatory care including preventive care may
be a cost saving strategy.
REFERENCES
1 Tak YR, McCubbin M. Family stress, perceived social support and coping following the diagnosis of a
child's congenital heart disease. J Adv Nurs 2002;39:190-8.
2 Hamlett KW, Pellegrini DS, Katz KS. Childhood chronic illness as a family stressor. J Pediatr Psychol
1992;17:33-47.
3 Pearlin LI, Mullan JT, Semple SJ, Skaff MM. Caregiving and the stress process: an overview of concepts
and their measures. Gerontologist 1990;30:583-94.
4 Weitzner MA, Haley WE, Chen H. The family caregiver of the older cancer patient. Hematol Oncol Clin
North Am 2000;14:269-81.
5 Schumacher KL, Dodd MJ, Paul SM. The stress process in family caregivers of persons receiving
chemotherapy. Res Nurs Health 1993;16:395-404.
6 Navaie-Waliser M, Feldman PH, Gould DA, Levine C, Kuerbis AN, Donelan K. When the caregiver needs
care: the plight of vulnerable caregivers. Am J Public Health 2002;92:409-13.
7 Emanuel EJ, Fairclough DL, Slutsman J, Alpert H, Baldwin D, Emanuel LL. Assistance from family
members, friends, paid care givers, and volunteers in the care of terminally ill patients. N Engl J Med
1999;341:956-63.
8 Hodapp RM, Fidler DJ, Smith AC. Stress and coping in families of children with Smith-Magenis
syndrome. J Intellect Disabil Res 1998;42(Pt 5):331-40.
9 Dyson LL. Response to the presence of a child with disabilities: parental stress and family functioning over
time. Am J Ment Retard 1993;98:207-18.
10 Beckman PJ. Influence of selected child characteristics on stress in families of handicapped infants. Am J
Ment Defic 1983;88:150-6.
11 Stewart MJ, Hart G, Mann K, Jackson S, Langille L, Reidy M. Telephone support group intervention for
persons with hemophilia and HIV/AIDS and family caregivers. Int J Nurs Stud 2001;38:209-25.
12 Haley WE, Levine EG, Brown SL, Berry JW, Hughes GH. Psychological, social, and health
consequences of caring for a relative with senile dementia. J Am Geriatr Soc 1987;35:405-11.
13 Poulshock SW, Deimling GT. Families caring for elders in residence: issues in the measurement of
burden. J Gerontol 1984;39:230-9.
14 Pinelli J. Effects of family coping and resources on family adjustment and parental stress in the acute
phase of the NICU experience. Neonatal Netw 2000;19:27-37.
15 Dura JR, Haywood-Niler E, Kiecolt-Glaser JK. Spousal caregivers of persons with Alzheimer's and
Parkinson's disease dementia: a preliminary comparison. Gerontologist 1990;30:332-6.
16 Weitzner MA, McMillan SC, Jacobsen PB. Family caregiver quality of life: differences between curative
and palliative cancer treatment settings. J Pain Symptom Manage 1999;17:418-28.
17 Baumgarten M, Hanley JA, Infante-Rivard C, Battista RN, Becker R, Gauthier S. Health of family
members caring for elderly persons with dementia. A longitudinal study. Ann Intern Med 1994;120:126-32.
18 Pruchno RA, Resch NL. Husbands and wives as caregivers: antecedents of depression and burden.
Gerontologist 1989;29:159-65.
19 Schulz R, Newsom J, Mittelmark M, Burton L, Hirsch C, Jackson S. Health effects of caregiving: the
caregiver health effects study: an ancillary study of the Cardiovascular Health Study. Ann Behav Med
1997;19:110-6.
20 Schulz R, Beach SR. Caregiving as a risk factor for mortality: the Caregiver Health Effects Study. JAMA
1999;282:2215-9.
21 Caserta MS, Lund DA, Wright SD, Redburn DE. Caregivers to dementia patients: the utilization of
community services. Gerontologist 1987;27:209-14.
22 Angold A, Messer SC, Stangl D, Farmer EM, Costello EJ, Burns BJ. Perceived parental burden and
service use for child and adolescent psychiatric disorders. Am J Public Health 1998;88:75-80.
23 Burton LC, Newsom JT, Schulz R, Hirsch CH, German PS. Preventive health behaviors among spousal
caregivers. Prev Med 1997;26:162-9.
24 Covinsky KE, Goldman L, Cook EF, Oye R, Desbiens N, Reding D, et al. The impact of serious illness on
patients' families. SUPPORT Investigators. Study to Understand Prognoses and Preferences for Outcomes and
Risks of Treatment. JAMA 1994;272:1839-44.
25 Scharlach AE, Midanik LT, Runkle MC, Soghikian K. Health practices of adults with elder care
responsibilities. Prev Med 1997;26:155-61.
26 Diez-Roux AV. Bringing context back into epidemiology: variables and fallacies in multilevel analysis.
Am J Public Health 1998;88:216-22.
27 Diez-Roux AV. Multilevel analysis in public health research. Annu Rev Public Health 2000;21:171-92.
28 Zarit SH, Todd PA, Zarit JM. Subjective burden of husbands and wives as caregivers: a longitudinal
study. Gerontologist 1986;26:260-6.
29 Hinrichsen GA, Ramirez M. Black and white dementia caregivers: a comparison of their adaptation,
adjustment, and service utilization. Gerontologist 1992;32:375-81.
30 Wackerbarth SB, Johnson MM. Essential information and support needs of family caregivers. Patient
Educ Couns 2002;47:95-100.
31 Cousineau N, McDowell I, Hotz S, Hebert P. Measuring chronic patients' feelings of being a burden to
their caregivers: development and preliminary validation of a scale. Med Care 2003;41:110-8.
32 Yaffe K, Fox P, Newcomer R, Sands L, Lindquist K, Dane K, et al. Patient and caregiver characteristics
and nursing home placement in patients with dementia. JAMA 2002;287:2090-7.
33 Navaie-Waliser M, Spriggs A, Feldman PH. Informal caregiving: differential experiences by gender. Med
Care 2002;40:1249-59.
34 Boaz RF. Full-time employment and informal caregiving in the 1980s. Med Care 1996;34:524-36.
35 Mant J, Carter J, Wade DT, Winner S. Family support for stroke: a randomised controlled trial. Lancet
2000;356:808-13.
36 O'Brien MT. Multiple sclerosis: health-promoting behaviors of spousal caregivers. J Neurosci Nurs
1993;25:105-12.
Chapter III – Multilevel perspective and spatial
perspective in contextual analysis
The great majority of contextual analysis studies in social epidemiology thus refer to the
multilevel analysis paradigm [4]. This choice appears justified insofar as it expresses
opposition to the ecological approach, which works from aggregated data [16]. However,
beyond the progress it has allowed in understanding contextual effects, the multilevel
approach has important limitations. For example, several authors have questioned whether
individual effects can be distinguished from properly contextual effects, given how closely
the two are intertwined [30]. It is not, however, this debate that we wish to enter first; we
are more particularly interested in a question that has been far less discussed in the
literature. We start from the hypothesis that the multilevel approach relies on a conception
of space that limits its understanding and places blinkers on its view of the geographical
variations of health phenomena.
The notion of space adopted by the multilevel approach amounts to a negation of space
itself. Most studies following this approach do not take the trouble to represent the spatial
variations of the phenomenon studied [44, 48, 64]. In general, researchers have an identifier
of the area of residence of each individual considered. Separately, in other databases, drawn
for example from population censuses, they have access to characteristics describing the
different areas. Using the identifier present in both databases, the contextual data are
matched to the individual data. The resulting database is then used to conduct multivariate
analyses that seek to identify contextual effects. The analyses are thus conducted without
the researchers having to take account of how the different areas are arranged in space, and
space remains an abstraction.
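The matching step described above can be sketched as a simple join on the area identifier. This is a minimal illustration, not the procedure used in the thesis; the field names (`area_id`, `median_income`, `gp_density`) and the toy records are hypothetical.

```python
def attach_context(individuals, area_table, key="area_id"):
    """Attach area-level characteristics to each individual record.

    individuals: list of dicts, each carrying an area identifier under `key`.
    area_table:  dict mapping area identifier -> dict of area characteristics.
    Returns (merged, unmatched): records with context attached, and records
    whose area identifier is absent from the area table.
    """
    merged, unmatched = [], []
    for person in individuals:
        context = area_table.get(person[key])
        if context is None:
            unmatched.append(person)
        else:
            # Individual fields first, then area-level fields
            merged.append({**person, **context})
    return merged, unmatched


# Hypothetical survey records and census-type area characteristics
individuals = [
    {"person_id": 1, "area_id": "A", "used_primary_care": True},
    {"person_id": 2, "area_id": "B", "used_primary_care": False},
    {"person_id": 3, "area_id": "Z", "used_primary_care": True},
]
area_table = {
    "A": {"median_income": 21000, "gp_density": 9.5},
    "B": {"median_income": 15500, "gp_density": 7.1},
}

merged, unmatched = attach_context(individuals, area_table)
```

Nothing in the merged records says where areas A and B lie relative to each other, which is the abstraction of space that the paragraph above points to.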
The main hypothesis of our work is that this conception of space limits the knowledge
that can be gained on the spatial distribution of health phenomena. In most cases, phenomena
show a certain continuity over the territory, which multilevel models are unable to take into
account. We therefore propose to use methods that better account for the continuity of space,
and seek to show that they yield information of some use in public health that could not be
obtained with the multilevel approach. On the one hand, our work consisted of a comparative
epistemological reflection on multilevel analysis and various spatial analysis approaches
[111, 112, 113, 114]. The aim was to compare the presuppositions and conceptions of space
underlying these methods, and to assess how certain conceptions handicap knowledge. On the
other hand, using different databases, we sought to apply the different approaches, in order
to see from concrete cases whether methods that treat space as continuous yield information
useful in public health that could not be obtained by relying on a territory fragmented into
areas with arbitrary boundaries.
We conducted a first study using data from the Enquête sur la Santé et la Protection
Sociale (SPS, the Health and Social Protection Survey) of IRDES [92]. We were able to locate
individuals at the level of their commune (municipality) of residence, that is, far more
precisely than in our analyses based on the Baromètre Santé. We sought to describe and
explain the spatial variations in patterns of healthcare utilisation over metropolitan France.
One limitation of these analyses relates to sample size: between 5000 and 8000 individuals,
depending on the analysis, provide insufficient information when studying the variations of a
phenomenon over so large a territory. A second limitation relates to the impossibility of
locating individuals more precisely than at the level of their commune of residence, which
may prevent certain processes operating at a very local level from being captured. This first
study was submitted to the Journal of Epidemiology and Community Health. The Journal's
reviewers underlined the innovative nature of our work and suggested a number of corrections
to improve its quality and readability. A revised version is currently under review by the
Journal.
A second study was conducted using Swedish data from the Population Register. In terms
of study material, we have endeavoured since the beginning of the thesis to use data
increasingly suited to contextual analysis: the first stage used Baromètre Santé data on
individuals within départements [90, 91], the second stage used SPS survey data on
individuals within communes [92], and the last stage uses these Swedish data. The database
contains information on all 270 000 inhabitants of the city of Malmö. Moreover, we were able
to locate all these individuals very precisely, to the nearest metre, at their home address.
For our study of mental and behavioural disorders due to psychoactive substance use, these
data, almost unique in the world, provide considerable power and precision, allowing
prevalence variations to be identified in fine detail across the city. We have just completed
this study, in collaboration with Juan Merlo of Malmö University Hospital in Sweden, SV
Subramanian of the Harvard School of Public Health in Boston, and John Lynch of the
University of Michigan. The study will very shortly be submitted to an epidemiology journal.
Both studies take as their starting point our epistemological reflections on the usefulness
of multilevel models in social epidemiology, presented in the first chapter of this document.
The guiding thread of our comparison of the multilevel and spatial analysis approaches was
the distinction, set out above, between measures of association and measures of variation or
correlation [13, 68, 69]. With respect to measures of variation, we were able to show that
modelling the variance of health phenomena provides more complete information on their
spatial distribution when space is treated as continuous rather than fragmented into a
multitude of disjoint areas with arbitrary boundaries. With respect to measures of
association, measuring contextual factors at the level of administrative areas is also an
important limitation, since these areas may be too narrow or, on the contrary, too wide to
capture certain contextual effects on health [38]. We now report the main contributions of
our two studies on these points.
1) Description of the spatial distribution of phenomena
Planning public health programmes requires precise knowledge of the spatial distribution
of phenomena. In our first study, we examined the spatial distribution of patterns of
healthcare utilisation over the whole of metropolitan France. Our analyses show that these
patterns vary on a fairly large scale over the territory. Consequently, analyses relying on a
territory fragmented into areas of modest size appear unable to capture the geographical
coherence of healthcare utilisation patterns. In our second study, we examined the spatial
distribution of mental and behavioural disorders due to psychoactive substance use in the
Swedish city of Malmö. In this case, neighbourhoods where the risk is high tend to cluster in
the north and centre of the city. Using methods that disregard this spatial coherence results
in a loss of information that is important for public health.
Many studies within the multilevel approach are interested in the spatial distribution of
phenomena [43, 57]. However, most of these studies confine themselves to providing
information on the magnitude of spatial variations. They remain silent on the shape of these
variations, on the way this variability is configured in space. It seems useful to insist
that simply quantifying the magnitude of the geographical variations of a phenomenon is not
enough to describe its spatial distribution. From a public health point of view, it is also
useful to know whether the variations of the phenomenon in space are random, or whether the
phenomenon shows strong geographical coherence, such that high-risk areas are grouped in one
or more places, forming high-risk spaces that transcend the boundaries of the administrative
areas considered. Such information makes it possible to assess whether public health
interventions would benefit from being coordinated at a scale larger than that of the areas
considered.
Simply mapping the spatial variations of phenomena provides useful visual information on
this point [115, 116]. Rather than directly mapping crude means or rates, it is preferable to
map the area-level residuals of multilevel models. This approach takes account of the
uncertainty of estimates in areas with small numbers [117] and produces maps adjusted for
factors such as sex or age. Furthermore, in the second study, where we had information on the
exact place of residence of individuals, we show that geoadditive models (which account for
spatial variations by means of a smoothing function) yield precise maps of the variations of
the phenomenon that are independent of administrative boundaries [118, 119, 120]. However,
whatever the precision and interest of the various maps, it is obviously not possible to make
inferences, either on the magnitude of the variations of the phenomenon or on the shape these
variations take in space, from an approximate judgement based on visual information.
Beyond this cartographic information, it is therefore useful to estimate parameters that
inform on the spatial distribution of phenomena. From this perspective, the multilevel
approach [3, 50] has certain limitations, which we first sought to characterise. Multilevel
models provide a parameter that indicates the magnitude of the variations occurring from one
area to another [13]. First, as established in the geographical literature on the "modifiable
areal unit problem" [70, 71, 72, 73], this information depends on the particular size and
shape of the areas used: with a finer or coarser zoning, or with areas configured
differently, one would certainly obtain different information on the magnitude of
between-area variations [121]. Beyond this aspect, however, we identified a considerably more
important limitation of multilevel models: they take account of the similarity of individuals
residing in the same area, but completely ignore the spatial relationships between areas, and
are therefore unable to examine whether individuals from areas close to each other have a
more similar level of risk than individuals from more distant areas. Multilevel models thus
allow inferences on the magnitude of between-area variations, but allow neither a test of the
hypothesis that nearby areas have similar risk, nor an examination of the scale at which
areas are correlated.
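The contrast drawn above can be written compactly. The notation below is generic and is not taken from the thesis: $p_{ij}$ is the probability of the outcome for individual $i$ in area $j$, $u_j$ is the area-level random effect of a two-level logistic model, $s_j$ is a spatially structured area effect, $d_{jk}$ is the distance between areas $j$ and $k$, and the exponential correlation function with range parameter $\phi$ is only one common choice, assumed here for illustration.

```latex
\begin{align*}
\text{Multilevel model:}\quad
  & \operatorname{logit}(p_{ij}) = \beta_0 + \boldsymbol{\beta}^{\top}\mathbf{x}_{ij} + u_j,
  \qquad u_j \sim \mathcal{N}(0, \sigma_u^2),
  \qquad \operatorname{Cov}(u_j, u_k) = 0 \quad (j \neq k) \\
\text{Spatially structured effect:}\quad
  & \operatorname{Cov}(s_j, s_k) = \sigma_s^2 \exp\!\left(-\,d_{jk} / \phi\right)
\end{align*}
```

In the first line, $\sigma_u^2$ measures the magnitude of between-area variation, but the zero covariance between distinct areas means the model carries no information on proximity. In the second line, the covariance decays with distance, so $\phi$ indicates the scale over which areas remain correlated.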
In our two studies, we explored various spatial analysis options in order to obtain less
partial information on the distribution of phenomena in space. In the study of variations in
healthcare utilisation patterns over France, we used spatial mixed models, which specify a
spatial correlation structure at the individual level [79, 122]. These confirmed that
healthcare utilisation patterns were correlated over the territory on a scale far exceeding
that of the communes of residence. However, estimating a spatial correlation structure at the
individual level is computationally extremely demanding, and quickly becomes impossible as
sample size grows. This led us to explore other modelling options in the second study. We
used a very recently developed hierarchical geostatistical model [111, 112, 113, 114, 123],
which provides several parameters for assessing whether the spatial variations of the
phenomenon studied are spatially structured or, on the contrary, completely random. This
model confirmed that neighbourhoods with a high prevalence of disorders due to psychoactive
substance use were concentrated in the centre and north of the city, forming a highly
statistically significant cluster of neighbourhoods between which collaboration could be
useful if an intervention programme were to be set up.
2) Measurement of contextual factors in a continuous space centred on the place of
residence of individuals
Beyond this description of the spatial distribution of phenomena, the aim is to understand
the origin of these variations by looking for associated factors. Individual demographic and
socio-economic factors are rarely distributed evenly in space, which gives rise to
compositional effects. However, such effects do not always explain all the spatial
variations, and one then examines whether characteristics of the context are independently
associated with the phenomena [3].
The dominant approach in the literature is to measure contextual factors at the level of
administrative areas, for which data are generally directly available. One limitation of this
approach is that contextual effects do not necessarily operate at the geographical scale
chosen for the analyses [38]. In many cases, contextual factors are likely to exert their
effects at a far more local level than that of the administrative areas used. Conversely, it
is also possible that the modestly sized areas usually chosen in contextual analysis are too
small to capture certain contextual effects. In our two spatial analysis studies, we appear
to have encountered both situations, which led us to propose entirely innovative approaches
to measuring contextual factors.
In our study of spatial variations in healthcare utilisation patterns over metropolitan
France, we found that the densities of general practitioners and of specialists, as well as
the socio-economic level of the context of residence, were associated with the propensity of
individuals to rely primarily on their general practitioner or, on the contrary, to consult
various specialists. We first sought to measure these factors at the level of the commune of
residence, then at the level of the "zone d'emploi" (employment area) of residence (INSEE has
divided the territory into 348 zones d'emploi [124], which are therefore much larger than
communes). Measuring physician densities at the commune level is certainly inadequate, since
individuals frequently cross the boundaries of their commune to consult a specialist [125].
This reasoning in fact also applies to the effect of the socio-economic level of the place of
residence [38]. Since we adjusted for various individual socio-economic factors, our
hypothesis is that the effect of the socio-economic level of the context is linked to the
values, attitudes and expectations regarding the healthcare system that prevail in the
environment of residence. Yet the values prevailing in a town of 20 000 inhabitants are
certainly not the same when that town belongs to an urban fabric of larger communes as when
it is the sole urban centre of a predominantly rural area. Under this hypothesis, the effects
of the socio-economic level of the context of residence may be better captured at a level
wider than the commune of residence. However, measurements made at the level of zones
d'emploi may not offer a satisfactory solution. Such measurements are certainly particularly
inadequate for individuals residing on the margins of these areas, since they do not truly
capture the influence of the context in the space extending around their place of residence.
We therefore proposed an innovative approach to measuring the socioeconomic level of the
residential context: we positioned points at every kilometre over the whole of mainland France
and assigned to each point the characteristics of the municipality in which it was located;
for each individual, located at the centroid of his or her municipality, we computed the
contextual factor as the average of the contextual values at the points lying within a
circular space centred on the individual, the size of which far exceeded the area of the
municipality of residence. In computing this average, we used weights to account for the fact
that points close to individuals probably had a greater impact on their patterns of healthcare
utilisation than points at a greater distance.[14] The article resulting from this work,
reproduced at the end of this chapter, includes a didactic figure that makes it possible to
visualise the differences between the various approaches to measuring contextual factors
(figure 2). The results of this work indicate that we explained spatial variations in patterns
of healthcare utilisation better by measuring contextual factors in a continuous space around
individuals' place of residence than at the level of administrative areas.
In our second study, we found that the prevalence of disorders related to psychoactive
substance use increased as the mean income of the administrative neighbourhood of residence
decreased, after adjustment for various socioeconomic factors. Unlike in the previous study,
variations in prevalence within the city of Malmö were not better accounted for when the
socioeconomic level of the context was considered in a space extending beyond the
administrative boundaries of the neighbourhood of residence. Since this study provided
information on individuals' exact place of residence, we also examined whether spatial
variations in prevalence were better accounted for by measuring mean income in the residential
context at a more local level than that of administrative neighbourhoods. The preliminary
tests we performed seemed to confirm this hypothesis. However, measuring the socioeconomic
level within small circular areas centred on individuals' place of residence raises a major
problem: since the distribution of inhabitants across the city of Malmö is very uneven, such
an approach results in missing values, or in measures based on very little information, for
individuals living in sparsely populated areas. We finally developed an innovative procedure,
adapted to our context from the "spatially adaptive filters" used as a smoothing technique in
health geography to obtain continuous maps of variations in disease incidence.[115, 126] This
approach, which takes into account the surrounding population rather than the surrounding
space, consists in computing the mean income within a circular area centred on each individual
and containing the same number of inhabitants for every individual. The size of the area thus
adapts to population density, being larger in the least populated parts of the city. This
measurement approach yields markedly stronger associations than when the contextual factor is
measured at the level of administrative neighbourhoods, and therefore better identifies the
places where prevalence is highest. It thus appears that the prevalence of mental and
behavioural disorders related to psychoactive substance use increases very sharply in the most
deprived locations of the city, which can be pinpointed very precisely on a map.
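The population-based definition of the surrounding area can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the study: the function name `adaptive_mean_income`, the toy coordinates and incomes, and the target population size are all hypothetical, and each inhabitant is treated as one point with planar coordinates in kilometres.

```python
import math

def adaptive_mean_income(focal, residents, target_population):
    """Mean income among the `target_population` residents nearest to `focal`.

    focal: (x, y) location in km.
    residents: list of (x, y, income) tuples, one per inhabitant.
    Returns (mean_income, radius_km); radius_km is the distance to the
    farthest resident included, i.e. the radius of the adaptive circle.
    """
    if target_population < 1 or target_population > len(residents):
        raise ValueError("target_population must be between 1 and len(residents)")
    # Sort residents by Euclidean distance to the focal location.
    by_distance = sorted(
        (math.hypot(x - focal[0], y - focal[1]), income)
        for x, y, income in residents
    )
    nearest = by_distance[:target_population]
    mean_income = sum(income for _, income in nearest) / target_population
    radius_km = nearest[-1][0]
    return mean_income, radius_km

# Hypothetical toy city: a dense cluster near (0, 0), a sparse one near (10, 0).
residents = [
    (0.0, 0.0, 100.0), (0.1, 0.0, 120.0), (0.0, 0.1, 110.0), (0.1, 0.1, 130.0),
    (10.0, 0.0, 200.0), (12.0, 0.0, 220.0), (10.0, 3.0, 240.0),
]
dense_mean, dense_radius = adaptive_mean_income((0.0, 0.0), residents, 3)
sparse_mean, sparse_radius = adaptive_mean_income((10.0, 0.0), residents, 3)
```

With the same target population, the circle around the sparsely populated location has to reach much further than the one in the dense cluster. This is the behaviour described above: the area widens where few people live, so the measure never rests on too few inhabitants.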
Because it relies on data covering all individuals in the city, geocoded at their exact place
of residence, this last study allowed us to make significant progress in developing the
approaches to be used to describe and account for the spatial distribution of health
phenomena. However, one important limitation of the study stems from the cross-sectional
nature of the data used. We were able to identify factors associated with the disorders
considered, which is already useful from a public health standpoint, but could not test the
hypothesis of causal relationships between the socioeconomic factors of the residential
context and the onset of disorders. In the coming months, the Swedish database that we are
analysing as part of our collaboration with Malmö University Hospital will acquire a
longitudinal dimension, which will allow us to extend these first analyses.
Comparison of a spatial approach with the multilevel approach for investigating place
effects on health: the example of healthcare utilisation in France
Basile Chaix, Juan Merlo, Pierre Chauvin
B Chaix, P Chauvin, Research Team on the Social Determinants of Health and Healthcare (INSERM U444),
National Institute of Health and Medical Research, Paris, France
J Merlo, Department of Community Medicine (Preventive Medicine), Malmö University Hospital, Lund
University, Malmö, Sweden
Correspondence to:
Basile Chaix
INSERM U444, Faculté de Médecine Saint-Antoine, 27 rue Chaligny, 75571 Paris Cedex 12, France
Tel: +33 (0)1 44 73 84 43; Fax: +33 (0)1 44 73 86 63; Email: [email protected]
Abstract:
Study objective: Most studies of place effects on health have followed the multilevel analytic approach,
which investigates geographic variations of health phenomena by fragmenting space into disconnected areas. We
examined whether analysing geographic variations across continuous space with spatial modelling techniques
and place indicators that capture space as a continuous dimension surrounding individual residences provided
more relevant information on the spatial distribution of outcomes. Healthcare utilisation in France was taken as
an illustrative example in comparing the spatial approach to the multilevel approach.
Design: Multilevel and spatial analyses of cross-sectional data.
Participants: 10 955 beneficiaries of the three main national health insurance funds, surveyed in 1998 and
2000 in mainland France.
Main results: Multilevel models showed significant geographic variations in healthcare utilisation.
However, the Moran’s I statistic indicated spatial autocorrelation unaccounted for by multilevel models.
Modelling the correlation between individuals as a decreasing function of the spatial distance between them,
spatial mixed models informed us not only on the magnitude, but also on the shape of spatial variations, and
provided more accurate standard errors for risk factors effects. The socioeconomic level of the residential
context and the supply of physicians were independently associated with healthcare utilisation. Place indicators
measured across continuous space, rather than within administrative areas, better explained spatial variations in
healthcare utilisation.
Conclusions: The conceptualization of space used during analysis influences our understanding of place
effects on health. Viewing space as a continuum may yield more relevant information on the spatial distribution
of outcomes in many contextual studies.
Key words: epidemiologic methods; logistic models; multilevel analysis; social environment; spatial
analysis
The past decade has seen a growing interest in the effects that places of residence have on health.[1, 2, 3, 4]
Most contextual studies conducted with individual-level data have followed the multilevel analytic approach
(based on usual random-coefficient multilevel models[5, 6] or alternating logistic regression[7, 8]). In this
approach, measures of association between contextual factors and health have their standard errors corrected for
the nonindependence of individuals within areas.[9] Furthermore, as Merlo has emphasized for some years,[10]
multilevel models provide measures of variation based on random effects (such as the area-level variance or the
variance partition coefficient) that inform us on the distribution of health outcomes across areas.[11, 12] In the
present study, as part of this project, we aim to show that the multilevel analytic approach fails to provide
optimal epidemiological information for both measures of association and measures of variation in many
analytic cases, due to dependence on a space fragmented into disconnected administrative areas.
Rewording the modifiable areal unit problem considered in geography,[13, 14, 15, 16] measures of variation
in multilevel models are dependent on the arbitrary size and shape of the areas.[17] More importantly for social
epidemiologists, even if appropriate size and shape are considered, the usual multilevel models neglect spatial
connections between areas, and assume independence for individuals from different areas, even if the areas are
close or adjacent.[9] Accordingly, the multilevel analytic approach fundamentally assumes that all spatial
correlation can be reduced to within-area correlation, and measures of variation only provide partial information
on the spatial distribution of health outcomes in quantifying the magnitude of correlation within areas but not the
range of correlation in space.
In order to obtain this epidemiologically relevant information, we suggest building continuous notions of
space into statistical models. This has been advocated in ecological studies for disease mapping,[18, 19, 20]
identification of clusters of disease,[19, 21] or implementation of spatial regression.[22, 23, 24, 25, 26, 27, 28]
However, there has been much less effort to do so in studies based on individual data.[29, 30, 31] Some authors
have modelled spatial variations of individual outcomes with nonparametric functions of the spatial location.[32,
33] However, this approach and others such as geographically weighted regression[34] do not provide the
parametric information of interest on the spatial distribution of outcomes. In any case, there has been almost no
attempt to examine whether investigating variations across continuous space provides more relevant information
than the multilevel approach in the social epidemiological field of contextual analysis.
Beyond individual factors, one generally uses contextual factors measured within administrative areas to
explain spatial variations of outcomes.[1] However, individuals may be affected not only by the characteristics
of their local administrative area of residence, but also by the context beyond these administrative boundaries,
since their social activities encompass a broader space.[35] Therefore, we propose a new approach for defining
the social factors of the context, an approach that considers spatial neighbourhoods, defined as continuous
spaces around individuals’ places of residence, rather than territorial neighbourhoods arbitrarily defined by
administrative boundaries.[36, 37, 38, 39]
In France, geographic variations in healthcare utilisation operate on a larger scale than the usual
administrative areas considered in multilevel analysis,[40] and therefore illustrate well the value of
considering space as a continuum, rather than as fragmented into disconnected areas. Individuals
in France can access specialist physicians directly, i.e., without any referral, as frequently as they wish, and
obtain partial or total reimbursement depending on their insurance status. Regarding utilisation behaviour,
although underuse of specialty care may result in suboptimal diagnosis and treatment options,[41, 42] frequent
self-referral to specialists without regular recourse to a primary care physician (PCP) leads to a lack of
coordination of care.[43, 44] We investigated whether the relative utilisation of PCPs or specialists was related
to the availability of physicians (a determinant in convenience of geographical access) and to the socioeconomic
level of the context (and related beliefs and expectations about the healthcare system).
Using a nationwide French survey sample, (a) we undertook a multilevel analysis of healthcare utilisation
and examined whether there was spatial autocorrelation unaccounted for by multilevel models; (b) we examined
whether spatial models (such as spatial mixed models[45]) better accounted for geographic variability and
provided more accurate information on spatial distributions than multilevel models; and (c) we investigated
whether measuring specific place characteristics across continuous space, rather than within administrative areas,
better explained the spatial variability of behaviour.
Methods
Datasets and outcomes
Our data came from the Survey on Health and Health Insurance conducted by the French Research and
Information Institute for Health Economics (IRDES).[46] Half of the sample was surveyed in 1998, the other
half in 2000. The nationwide population sample is representative of the persons insured through the three main
national health insurance funds (for salaried employees, farmers, self-employed people, and retirees in each
category, i.e., 96% of the population). After approval by the French National Commission for Data Protection,
survey data were merged with administrative files containing information on physician consultations for each
individual over a one-year period.
Two complementary binary outcomes were examined. The first indicated whether or not each individual had
a regular PCP. This outcome was derived from a question in the survey. Analysis of this outcome was restricted
to the individuals surveyed in 2000 (n = 5 227), the only year in which this question was asked. The second
binary outcome indicated whether more than 50% of an individual’s consultations over the course of the year
had been with specialists, rather than PCPs. This outcome was computed from the administrative data on
healthcare consumption. These data were successfully merged with survey data for 9 309 out of 10 955
individuals. Analysis of this outcome was undertaken among individuals who had had at least one consultation
over the one-year period (n = 8 102). We used a binary outcome because of the non-normality of the residuals in
a multilevel linear model for the proportion of specialist consultations expressed in its continuous form, which
also facilitated comparison with the model for the first binary outcome. Very similar results were obtained when
cut-offs other than 50% of specialist consultations were used to define this binary outcome. After excluding
individuals under 18 years of age, the final sample sizes were 5 217 for the PCP analyses, and 8 093 for the
analyses of the percentage of specialist consultations.
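The derivation of the second binary outcome can be sketched as below. This is an illustrative sketch only: the function name `high_specialist_share` and the consultation counts are hypothetical, and the strict inequality reflects the wording "more than 50%".

```python
def high_specialist_share(n_pcp, n_specialist, cutoff=0.5):
    """Return True if more than `cutoff` of consultations were with specialists.

    Returns None when the person had no consultation at all, mirroring the
    restriction of the analysis to individuals with at least one consultation.
    """
    total = n_pcp + n_specialist
    if total == 0:
        return None
    return n_specialist / total > cutoff

# Hypothetical yearly consultation counts: (PCP visits, specialist visits).
records = [(4, 1), (1, 3), (2, 2), (0, 0)]
outcomes = [high_specialist_share(p, s) for p, s in records]
```

Changing the `cutoff` argument is one way to rerun the kind of sensitivity check on alternative cut-offs mentioned above.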
Municipality-level data, including socioeconomic data (from the 1999 census) and information on the
number of places where physicians could be consulted (from the ADELI database of the French Ministry of
Health), were linked to the samples described above.
Explanatory variables
Definition of contextual indicators in administrative areas: Mainland France is divided into 36 500
municipalities, as well as into 348 broad areas defined by aggregating adjacent municipalities between which
significant commuting occurs.[47] In the dataset for the PCP outcome, 3 233 municipalities and 338 broad areas
were represented. In the dataset for the percentage of specialist consultations, 4 421 municipalities and 340 broad
areas were represented. Considering areas in the latter dataset, the median population size was 2 185
(interquartile range: 794–6 533) for municipalities, and 98 495 (61 818–178 720) for the broad areas.
Municipalities in which individuals had been surveyed were distributed across the entire territory of France
(figure 1).
Figure 1 Distribution of municipalities in which individuals were surveyed. Individual information on
healthcare utilisation was plotted over all of mainland France, providing a large quantity of information for
spatial regression analysis.
We had no locational information more precise than municipality affiliation (see the discussion
on this aspect). Individuals were located at the centroid of their municipality when computing contextual factors
across continuous space, but were randomly located within municipalities during spatial regression analysis (see
appendices 1 and 2 for rationale).
At the level of administrative areas, place indicators investigated were the percentage of inhabitants with
minimal education (incomplete low secondary schooling or less) and the densities of PCPs and specialists
(number of places of consultation per square kilometre). All these variables were computed at the municipality
level and at the broad area level.
Place indicators measured across continuous space: The three contextual factors were also measured across
continuous space surrounding each individual’s place of residence.[34, 39] This procedure is described in detail
in appendix 1. As illustrated on the bottom of figure 2, it consists in positioning points on every kilometre of
French territory, attributing to these points the socioeconomic characteristic of the municipality in which the
points are located, and computing the socioeconomic contextual factor for each individual as a weighted average
of the contextual values for the points located around that person. We used weights when computing this average
to indicate that points at a greater distance from an individual may impact that person less than points that are
closer.[34] Due to the weighting function used, our approach for computing contextual factors considers
contextual information in a radius of approximately 35 kilometres around individuals, a space far exceeding the
size of municipalities of residence. Therefore, as illustrated in figure 2, measures across continuous space clearly
differ from measures at either the municipality level or the broad area level.
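A distance-weighted average of this kind can be sketched as follows. The actual weighting function is given in appendix 1 and is not reproduced here. The tricube kernel, the 35-kilometre bandwidth used as its cut-off, the function names, and the toy grid values are assumptions made purely for illustration.

```python
import math

def tricube_weight(distance_km, bandwidth_km):
    """Tricube kernel: 1 at distance 0, falling smoothly to 0 at the bandwidth."""
    if distance_km >= bandwidth_km:
        return 0.0
    return (1.0 - (distance_km / bandwidth_km) ** 3) ** 3

def contextual_factor(residence, grid_points, bandwidth_km=35.0):
    """Distance-weighted average of the values carried by grid points.

    residence: (x, y) in km (for example the municipality centroid).
    grid_points: list of (x, y, value); value is the characteristic of the
        municipality in which the grid point falls.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in grid_points:
        distance = math.hypot(x - residence[0], y - residence[1])
        w = tricube_weight(distance, bandwidth_km)
        weighted_sum += w * value
        weight_total += w
    if weight_total == 0.0:
        raise ValueError("no grid point within the bandwidth")
    return weighted_sum / weight_total

# Hypothetical grid: percentage of minimally educated inhabitants at four points.
grid = [(0.0, 0.0, 20.0), (1.0, 0.0, 20.0), (30.0, 0.0, 40.0), (60.0, 0.0, 80.0)]
value_here = contextual_factor((0.0, 0.0), grid)
```

In this toy example the point 30 kilometres away pulls the average only slightly above the local value, and the point 60 kilometres away does not contribute at all.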
As described in appendix 1, we also measured the supply of PCPs and supply of specialists across
continuous space by computing the weighted number of places of consultation within a radius of 50 kilometres
around an individual’s residence (we used weights to account for the fact that physicians at a great distance were
less accessible than closer ones).
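The supply measure differs from the socioeconomic measure in that weights are summed rather than averaged. A minimal sketch, in which the linear distance decay, the function name `weighted_supply`, and the coordinates are hypothetical stand-ins for the weighting described in appendix 1:

```python
import math

def weighted_supply(residence, consultation_places, radius_km=50.0):
    """Sum of distance weights over places of consultation within the radius.

    The linear decay (1 at the residence, 0 at `radius_km`) is an assumption
    made for illustration only.
    """
    total = 0.0
    for x, y in consultation_places:
        distance = math.hypot(x - residence[0], y - residence[1])
        if distance < radius_km:
            total += 1.0 - distance / radius_km
    return total

# Hypothetical places of consultation (coordinates in km).
places = [(0.0, 0.0), (25.0, 0.0), (0.0, 40.0), (70.0, 0.0)]
supply = weighted_supply((0.0, 0.0), places)
```

A place of consultation next door counts fully, one at 25 kilometres counts for half, and one beyond 50 kilometres is ignored.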
Individual-level adjustment factors: The regression models were adjusted for health, demographic, and
socioeconomic variables that have repeatedly been shown to be associated with healthcare utilisation. Full details
on these individual variables are given in table 1.[48]
Figure 2 Measurement of the socioeconomic status of the context at the municipality level (above), at the broad
area level (middle), and across continuous space (bottom). Measures across continuous space, computed as a
weighted average of contextual information at surrounding points, take into account information in a much larger
space than the municipality of residence. For ease of illustration, only one point every 10 kilometres (rather than
every kilometre) is represented. The point size is a function of the weight value.
[Figure 2 panels: At the level of municipalities; At the level of broad areas; Measure across continuous space]
Table 1 Individual-level variables used as adjustment factors in regression models for healthcare utilisation, France, 1998
and 2000
Variables | Categories
Age | Less than 30*; 30–44; 45–59; 60–74; 75 and older
Sex | Male*; female
Marital status | Married or living with partner*; single; divorced; widowed
Self-rated health (on a scale from 0 to 10) | Low score (from 0 to 6)*; medium-low score (equal to 7); medium-high score (equal to 8); high score (equal to 9 or 10)
Number of diseases† | 0*; 1 or 2; 3 or 4; more than 4
Educational achievement level | Primary school or less*; secondary school; university; student
Occupational status‡ | Unskilled blue-collar worker*; skilled blue-collar worker; lower-level white-collar worker; mid-level position; upper-level white-collar worker
Employment status | Working*; unemployed; other
Household income per capita§ | First quartile*; second quartile; third quartile; fourth quartile
Health insurance status | Basic insurance only*; supplementary insurance; payments waived for medical reasons or due to poverty
*This is the reference category in the models. †A list of diseases was provided to individuals to assist in their reporting.
Physicians from the CREDES completed the list for each individual on the basis of their prescription drug usage and
consultation with health professionals. Dental diseases were excluded. ‡Occupational status was defined according to the
French List of Occupations and Social Categories published by the French National Institute of Statistics and Economic
Studies.[48] §Household income was adjusted for household size.
Statistical analysis
In order to rigorously compare the multilevel and spatial modelling approaches, the two-level multilevel
model (described in appendix 2) was fitted separately in two different ways: first with municipalities as the
second-level units, then with much larger areas, i.e., the broad areas mentioned above, as the second-level units.
In order to accurately estimate random variations between areas, the multilevel models were first estimated with
a Markov Chain Monte Carlo method (MLwiN 1.2, Institute of Education, London). To examine whether there
was spatial autocorrelation unaccounted for by multilevel models, we used Bivand’s R software package[49] to
compute Moran’s I statistics for the area-level residuals.[32, 50] In our case, the Moran’s I indicated whether
adjacent areas (i.e., areas sharing a common boundary) had more similar area-level residuals than would be
expected under spatial randomness. Moran’s I is approximately equal to 0 when there is no spatial
autocorrelation and positive when there is clustering.
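For readers unfamiliar with the statistic, the computation of Moran's I with binary contiguity weights (1 for areas sharing a boundary, 0 otherwise) can be sketched as follows. The study itself used an R package. This Python function, its name `morans_i`, and the toy residuals are hypothetical and only illustrate the formula.

```python
def morans_i(values, neighbours):
    """Moran's I with binary contiguity weights.

    values: list of area-level residuals.
    neighbours: dict mapping each area index to the indices of adjacent areas
        (assumed symmetric: if j is listed for i, then i is listed for j).
    """
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    # Cross-products of deviations over all ordered pairs of adjacent areas.
    cross = sum(deviations[i] * deviations[j]
                for i, adjacent in neighbours.items() for j in adjacent)
    total_weight = sum(len(adjacent) for adjacent in neighbours.values())
    variance_sum = sum(d * d for d in deviations)
    return (n / total_weight) * cross / variance_sum

# Hypothetical chain of six areas, each adjacent to the next one.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
clustered = morans_i([1.0, 1.0, 1.0, -1.0, -1.0, -1.0], chain)
alternating = morans_i([1.0, -1.0, 1.0, -1.0, 1.0, -1.0], chain)
```

When similar residuals sit next to each other the statistic is positive; when neighbouring residuals systematically alternate it is negative.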
In order to model geographic variations across continuous space, we used geostatistical spatial mixed
models that measure the correlation in healthcare utilisation between individuals as a decreasing function of the
spatial distance between them (see appendix 2 for details). Such models were fitted with the SAS macro
GLIMMIX (version 8.02, SAS Institute, Cary, NC, USA). In order to compare the fit of the empty multilevel
and spatial models, we refitted the multilevel models with GLIMMIX. We used the scaled deviance to compare
the different models. After including all individual-level variables, contextual variables were added to the
models, but were only retained if they were significantly associated with the outcomes in spatial mixed or
multilevel models. We successively estimated place effects as measured at the municipality level, at the broad
area level, and across continuous space. To compare the different measures, each indicator was divided into
quartiles.
Table 2 Summary of the different regression models fitted to the data in comparing an investigation
from a spatial perspective to the usual multilevel approach. Each cell of the table corresponds to a
different model, and contains references to locations where results of the model are displayed.
Model | Municipality-level multilevel model | Broad area-level multilevel model | Spatial mixed model
Empty model | Table 3 | Table 3; Figure 5 | Table 3; Figures 3 and 5
Model with individual variables | – | Figure 5 | Figure 5
Model with individual and municipality-level contextual factors | – | Figure 5 | Table 4; Figure 5
Model with individual and broad area-level contextual factors | – | Figure 5 | Table 4; Figure 5
Model with individual and contextual factors across continuous space | Table 5 | Table 5; Figure 5 | Tables 4 and 5; Figure 5
Our spatial perspective comprises two different aspects: i) utilisation of spatial models, and ii) measurement
of contextual factors across continuous space. These two methods need to be tested separately to assess the
specific contribution of each to contextual analysis. We therefore estimated multilevel models with contextual
variables measured within administrative areas, multilevel models with variables measured across continuous
space, and spatial mixed models with these two types of contextual measures. Table 2 provides summary
information on the different models fitted to the data.
Results
Twelve percent of the individuals reported they had no regular PCP, and 23% had had more than 50% of
their one-year consultations with specialists.
Table 3 Results of the empty multilevel and spatial logistic models for healthcare utilisation, France, 1998 and
2000
 | No regular primary care physician | High percentage of specialist consultations
Municipality-level multilevel model† | |
Area-level variance σu² (SE) | 0.382 (0.133)** | 0.175 (0.059)**
Moran’s I for area residuals (SE) | 0.33 (0.02)*** | 0.20 (0.02)***
Scaled deviance | 4738.9 | 8841.2
Broad area-level multilevel model† | |
Area-level variance σu² (SE) | 0.249 (0.068)*** | 0.140 (0.030)***
Moran’s I for area residuals (SE) | 0.24 (0.03)*** | 0.32 (0.03)***
Scaled deviance | 4056.3 | 8625.6
Spatial model‡ | |
σ² (SE) | 0.032 (0.008)*** | 0.033 (0.010)***
σ1² (SE) | 1.084 (0.023)*** | 1.116 (0.018)***
ρ (SE) | 16.40 (9.64)* | 115.5 (64.7)*
Scaled deviance | 3603.2 | 7840.6
*p < 0.05; **p < 0.01; ***p < 0.001 (p-values are two-sided). †The multilevel model parameters were estimated
by the Markov Chain Monte Carlo method (MLwiN). The Wald test was used for the area-level variance. To
compute the Moran’s I, we used the area-level residuals of 67% of the municipalities (n = 2 167) for the variable
regarding regular primary care physicians, and the residuals of 73% of the municipalities (n = 3 227) for the
variable regarding specialty care use (the other municipalities had no adjacent municipality in the sample). All
broad area residuals were used to compute the Moran’s I. Based on the assumption of normality for the
area-level residuals, the Moran’s I is normal under the null hypothesis, with a mean equal to 0 and a known variance.
We computed a two-tailed p-value for the Moran’s I. Scaled deviances come from multilevel models estimated
using the GLIMMIX macro. ‡Spatial model parameters were estimated with GLIMMIX. The Wald Z-test was
used for the covariance parameters. The parameter σ² is the partial sill, σ1² is the nugget effect, and three times
the parameter ρ is the range of the model (the distance beyond which the correlation is less than 5% of the
correlation at distance 0).
Multilevel models indicated significant variations for both outcomes at the municipality level or at the broad
area level (two-sided p-value < 0.001; table 3). The Moran’s I for area-level residuals was significantly positive
in all multilevel models (two-sided p-value < 0.001), indicating spatial autocorrelation between
adjacent areas that the models did not account for (table 3).
From the empty spatial models (table 3), two individuals located in the same place had a correlation equal to
0.028 for not having a regular PCP (figure 3). Such a correlation was 46% lower for individuals 10 kilometres
apart, and 95% lower for individuals 50 kilometres apart. The correlation in having a high percentage of
specialist consultations was of similar magnitude for individuals in the same place, but decreased more gradually
with increasing distance between individuals (correlation was 5% and 23% lower, respectively, for individuals
10 and 50 kilometres apart). For both outcomes, the scaled deviance was markedly lower in the empty spatial
models than in the empty multilevel models, indicating a better fit for the spatial correlation structure to the data
(table 3).
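The link between the covariance parameters and the correlations quoted above can be illustrated with a short computation. The sketch assumes an exponential correlation structure, which is consistent with the footnote stating that three times ρ is the range. It also assumes that the values reported in table 3 for the outcome "no regular PCP" are σ² = 0.032, σ1² = 1.084 and ρ = 16.40. The function name is hypothetical.

```python
import math

def spatial_correlation(distance_km, partial_sill, nugget, rho):
    """Correlation between two individuals under an exponential structure.

    At distance 0 the correlation is partial_sill / (partial_sill + nugget);
    it then decays as exp(-distance / rho).
    """
    at_zero = partial_sill / (partial_sill + nugget)
    return at_zero * math.exp(-distance_km / rho)

# Values assumed to be those of table 3 for the outcome "no regular PCP".
sill, nugget, rho = 0.032, 1.084, 16.40
c0 = spatial_correlation(0.0, sill, nugget, rho)
drop_10km = 1.0 - spatial_correlation(10.0, sill, nugget, rho) / c0
drop_50km = 1.0 - spatial_correlation(50.0, sill, nugget, rho) / c0
```

Under these assumptions the correlation at distance 0 is slightly below 0.03, and it is about 46% lower at 10 kilometres and about 95% lower at 50 kilometres, in line with the figures given for this outcome.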
A spatial mixed model adjusted for individual factors indicated that a lower percentage of minimally
educated inhabitants (i.e., a higher socioeconomic status of the context) predicted higher odds of not having a
regular PCP (table 4). However, the supply of physicians was not associated with this outcome. A higher
socioeconomic status of the context and a greater supply of specialists independently predicted higher odds of
having a high percentage of specialist consultations. Although confidence intervals were wide, there was an
indication of consistently stronger associations between contextual variables and the outcomes when the
variables were measured across continuous space, rather than within administrative areas (table 4).
Figure 3 Correlation between individuals in healthcare utilisation behaviour as a function of the spatial distance
between them, as estimated by empty spatial mixed models, France, 1998 and 2000.
[Figure 3 panels: Outcome: No regular primary care physician; Outcome: High percentage of specialist consultations]
The different approaches to measuring contextual variables illustrated in figure 2 lead to different
geographic representations when identifying places that do not share the same levels of exposure to these
characteristics. We illustrate this aspect in figure 4, where the socioeconomic level of the context, as defined by
the three different approaches, is mapped for a rectangular zone around the city of Paris.
We estimated the area-level variance and the Moran’s I in the consecutive multilevel models (including no
covariates, individual covariates, and finally contextual factors). We represented these indicators at the top of
figure 5 for the model for specialty care use with individuals nested within broad areas. The unexplained
heterogeneity between broad areas (expressed as the area-level variance) decreased when individual and
contextual variables were introduced into the model. Area-level variance was lowest when place indicators were
measured across continuous space. The Moran’s I similarly decreased with the increasing complexity of the
model, and again was lowest when place characteristics were measured across continuous space. The same
pattern was true for not having a regular PCP, and for the multilevel models with municipalities as the second
level (results not shown). In all multilevel models, the Moran’s I remained significant (two-sided p-value < 0.05;
results not shown) after the contextual variables were included, indicating residual spatial autocorrelation
that the models did not account for.
In the different consecutive spatial mixed models, we examined the correlation between the residuals,
which is modelled as a decreasing function of the spatial distance between individuals. The case of the
models for the percentage of specialist consultations is shown at the bottom of figure 5. The residual spatial
autocorrelation was lowest when place indicators were measured across continuous space. The same pattern was
true for not having a regular PCP (results not shown).
Table 4 Place effects on healthcare utilisation from spatial models adjusted for individual-level characteristics (place
indicators were successively measured at the municipality level, at the level of broad areas, and across continuous
space), France, 1998 and 2000
 | Municipality-level effects*: OR (95% CI) | Broad area-level effects*: OR (95% CI) | Effects measured across continuous space*: OR (95% CI)
Outcome: No regular primary care physician† | | |
Percentage of minimally educated inhabitants (vs. “high”, fourth quartile) | | |
Medium-high (third quartile) | 1.03 (0.80, 1.33) | 0.97 (0.75, 1.27) | 1.02 (0.75, 1.37)
Medium-low (second quartile) | 1.27 (0.99, 1.63) | 1.22 (0.93, 1.59) | 1.49 (1.10, 2.00)
Low (first quartile) | 1.79 (1.38, 2.31) | 1.86 (1.40, 2.46) | 2.24 (1.61, 3.13)
Outcome: High percentage of specialist consultations† | | |
Percentage of minimally educated inhabitants (vs. “high”, fourth quartile) | | |
Medium-high (third quartile) | 1.09 (0.92, 1.30) | 1.10 (0.91, 1.34) | 1.18 (0.97, 1.45)
Medium-low (second quartile) | 1.30 (1.08, 1.57) | 1.20 (0.97, 1.49) | 1.38 (1.09, 1.74)
Low (first quartile) | 1.50 (1.23, 1.84) | 1.17 (0.90, 1.53) | 1.62 (1.15, 2.28)
Supply of specialists (vs. “low”, first quartile) | | |
Medium-low (second quartile) | 1.02 (0.86, 1.21) | 1.20 (0.96, 1.50) | 1.40 (1.05, 1.88)
Medium-high (third quartile) | 1.19 (0.93, 1.53) | 0.97 (0.71, 1.33) | 1.70 (1.08, 2.67)
High (fourth quartile) | 1.48 (1.05, 2.09) | 1.95 (1.14, 3.33) | 2.03 (1.13, 3.67)
Supply of primary care physicians (vs. “low”, first quartile) | | |
Medium-low (second quartile) | 0.92 (0.77, 1.09) | 0.89 (0.72, 1.11) | 0.88 (0.66, 1.18)
Medium-high (third quartile) | 0.87 (0.67, 1.12) | 1.05 (0.78, 1.42) | 0.69 (0.44, 1.09)
High (fourth quartile) | 0.82 (0.57, 1.17) | 0.66 (0.39, 1.10) | 0.59 (0.33, 1.07)
*The odds ratios reported here were adjusted for all individual-level factors listed in table 1. †In the model for not
having a regular primary care physician, the only contextual variable retained was the percentage of minimally
educated inhabitants. In the model regarding the percentage of specialist consultations, we retained the three contextual
factors (each effect above has been adjusted for the others).
Finally, we examined whether multilevel models overestimated the significance level of contextual effects
due to ignored spatial autocorrelation. We considered models including place characteristics measured across
continuous space and divided the place effect parameters by their standard errors. The multilevel models
systematically overestimated the level of significance of place effects, as compared to the spatial models (table
5). As an example, the supply of PCPs was significantly associated with specialty care use in the multilevel
models, whereas this was not the case in the spatial model.
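As an illustration of this ratio, the statistic can be approximately recovered from a published odds ratio and its 95% confidence interval. The sketch below assumes a Wald-type interval on the log-odds scale, which is an assumption made here for illustration and is not stated in the text; the function name is hypothetical.

```python
import math

def z_from_or(or_est, ci_low, ci_high, z_crit=1.96):
    """Approximate |beta / SE| from an odds ratio and its 95% CI.

    Assumes (hypothetically) a Wald interval: exp(beta +/- z_crit * SE).
    """
    beta = math.log(or_est)  # coefficient on the log-odds scale
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)  # implied standard error
    return abs(beta / se)

# Lowest quartile of minimally educated inhabitants, outcome "no regular PCP",
# continuous-space measure in the spatial model (values taken from table 4).
z = z_from_or(2.24, 1.61, 3.13)
print(round(z, 1))
```

Because table 4 reports spatial models only, this back-calculation can only approximate the spatial mixed model column of table 5, not the multilevel columns.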
Figure 4 Variations of socioeconomic level in place of residence as measured at the municipality level, at the
level of broad areas, and across continuous space in a rectangular zone around the city of Paris (the boundaries
of Paris appear in bold at the centre of the maps). On the lower map (showing measures across continuous
space), the value plotted in each municipality was obtained by considering contextual information in a circular
space that far exceeds the area of the municipality. Therefore, the smoothed pattern that appears on the map
simply indicates that individuals residing in neighbouring municipalities share common contextual influences.
†Each indicator was divided into quartiles, with cut-offs from the study sample of 5 217 individuals for the
outcome variable regarding regular primary care physicians.
[Figure 4 panels: at the level of municipalities; at the level of broad areas of residence; measure across continuous space]
Figure 5 Explanation of geographic variations in the odds of having a high percentage of specialist
consultations with individual and contextual variables, France, 1998 and 2000. Top: area-level variance
estimated from multilevel models with individuals nested within broad areas, and Moran’s I statistic computed
from broad area-level residuals (bars: 95% confidence interval). Bottom: residual correlation between
individuals by spatial distance between them from spatial mixed models.
[Figure 5 panels: top, multilevel model with individuals nested within broad areas; bottom, from the spatial mixed model]
Table 5 Significance level (defined by dividing the parameters by their standard errors) of place
effects measured across continuous space on healthcare utilisation, estimated from multilevel models
with individuals nested within municipalities or broad areas, and spatial mixed models, France, 1998
and 2000

                                                Municipality-level    Broad area-level      Spatial mixed
                                                multilevel model      multilevel model      model

Outcome: No regular primary care physician
  Percentage of minimally educated
  inhabitants, lowest quartile                  7.1                   6.0                   4.8

Outcome: High percentage of specialist consultations
  Percentage of minimally educated
  inhabitants, lowest quartile                  5.7                   4.7                   2.8
  Supply of specialists, highest quartile       3.4                   3.3                   2.4
  Supply of primary care physicians,
  highest quartile                              3.3                   3.0                   1.7
Discussion
We proposed a spatial perspective of investigation based on a continuous notion of space. Following the
seminal distinction of Merlo between measures of variation and measures of association with contextual
factors,[11, 12] we found in our case that both types of measures provided more relevant epidemiological
information when embedded in the spatial perspective than in the multilevel framework.
Limitations of the illustrative example
First, we had no locational information more precise than municipality affiliation. However, this
lack of precision may not be of critical importance in our study. Indeed, the 36 500 French municipalities
constitute more local areas than the municipalities in many other countries. Moreover, the interest of the spatial
approach over the multilevel approach was to better describe geographic variations in healthcare utilisation,
which operate at a much broader scale than the municipality level.[40] Therefore, in our case, it would certainly
be more undesirable to neglect geographic correlation between neighbouring municipalities than to ignore
geographic variations within municipalities.
Second, sample sizes could have been larger, especially when conducting municipality-level
multilevel analyses. However, our samples were sufficient to quantify geographic variations between
municipalities which, as expected, were of greater magnitude than broad area-level variations.
Investigating the magnitude and shape of spatial variations
The spatial correlation of outcomes, rather than a nuisance, is of direct interest, and needs to be modelled
properly to obtain relevant information. In neglecting spatial relationships between areas and only considering
the correlation of outcomes within areas, multilevel models were unable to provide complete information on the
spatial distribution of healthcare utilisation. Contrary to the assumption that areas of high risk and areas of low
risk were randomly distributed in space, the Moran’s I indicated that individuals residing in adjacent areas
exhibited more similarity of behaviour than would be expected under spatial randomness. Due to unaccounted
spatial autocorrelation, multilevel models overestimated the significance level of contextual variables and
resulted in incorrect inferences.
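To make the notion of spatial randomness concrete, the sketch below computes Moran's I for area-level residuals using its standard definition. The binary adjacency weights and the four-area toy layout are assumptions chosen for illustration; they are not the weighting scheme used in the study.

```python
def morans_i(values, neighbours):
    """Moran's I with binary adjacency weights.

    values: list of area-level residuals.
    neighbours: dict mapping each area index to the set of adjacent area
    indices (assumed symmetric; hypothetical toy structure).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(len(adj) for adj in neighbours.values())  # sum of all weights
    cross = sum(dev[i] * dev[j] for i, adj in neighbours.items() for j in adj)
    return (n / s0) * cross / sum(d * d for d in dev)

# Four areas in a row: 0 - 1 - 2 - 3
chain = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
clustered = morans_i([1.0, 1.0, 0.0, 0.0], chain)    # similar neighbours
alternating = morans_i([1.0, 0.0, 1.0, 0.0], chain)  # dissimilar neighbours
print(clustered, alternating)
```

A positive value indicates that adjacent areas resemble each other more than expected under spatial randomness, and a negative value indicates the opposite.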
Viewing space as a continuum, spatial mixed models captured the spatial autocorrelation unaccounted for by
multilevel models. Modelling the correlation between individuals as a decreasing function of the spatial distance
between them, spatial models not only captured the magnitude of spatial variations but also the shape of spatial
variations (with information on the range of correlation in space), thus indicating geographic coherence in
healthcare utilisation at a much larger scale than the municipality level.
Measuring contextual factors across continuous space
Measures across continuous space allowed us to better explain spatial variations in healthcare utilisation
than municipality-level or broad area-level factors. Regarding municipality-level measures, individuals may be
affected not only by the characteristics of their municipality, but also by surrounding municipalities. Indeed,
residing in a deprived municipality may have a different impact on healthcare utilisation if the municipality
belongs to a globally affluent area than to a socially disadvantaged area. Obviously, such effects may be more
efficiently captured by measures that consider contextual influences in a space that exceeds municipality
boundaries than by municipality-level factors. On the other hand, the broad administrative areas considered in
our study are not centred on the individual residence, and may therefore not have allowed us to adequately
capture contextual effects. Measures across continuous space surrounding individuals were more appropriate in
reflecting contextual influences on healthcare utilisation that operate on a larger scale than the municipality
level.
Conclusions
Our study shows that the conceptualization of space used during analysis influences the understanding of
place effects on health. In our investigation of healthcare utilisation, both measures of variation and measures of
association between contextual factors and health were found to provide more relevant information when
viewing space as a continuum rather than as fragmented into disconnected areas. We are aware that the
multilevel approach may be appropriate when the context is defined in a way that is not strictly geographic
(e.g., workplaces or schools);[17] when investigating processes operating at the scale of administrative areas
(e.g., related to public policies); or when spatial correlation can be reduced to the correlation within areas.
However, in many social epidemiological studies, investigating geographic variations across continuous space
using spatial modelling techniques and place indicators that capture space as a continuous dimension may be
more appropriate in describing and explaining spatial variability of health outcomes.
Appendix 1: Contextual measures across continuous space
We used a geographical information system (GIS) with municipalities georeferenced as polygons.
Individuals were positioned at the centroid of their municipality when computing contextual factors. Indeed,
there is no reason to attribute a different location, and accordingly a different contextual value, to individuals
from the same municipality, since we have no other locational information than municipality affiliation.
As figure 2 indicates, our approach for the socioeconomic contextual factor takes into account contextual
information at geographical points located in a circular space around individuals, which space far exceeds the
boundaries of the municipality of residence. The percentage of minimally educated inhabitants was available at
the municipality level and needed to be attributed to the geographical points. However, municipalities differ in
size, and only considering one point per municipality located around an individual’s residence would result in
overestimating the impact of smaller vs. larger municipalities on that individual. We, therefore, regularly
positioned points on every kilometre of French territory (resulting in a regular grid of 540 000 points), and
attributed to each point the socioeconomic characteristic of the municipality in which it was located, thereby
ensuring that all neighbouring locations were equally represented when computing the contextual factor.
Weights were needed to indicate the extent to which points that were situated further from individuals had
less of an impact on them than points that were closer.[34] We defined such weights by assuming that the extent
to which individuals at a given location were affected by surrounding locations was a function of the overall
movement of individuals regularly travelling between locations. Thus, we aimed to approximate the global
movement between locations as a function of the distance between locations in the territory. For the sake of
simplicity, we estimated a mean function for the whole of France, rather than employing a place-specific
weighting function. We approximately quantified the regular movement between locations by considering
distances covered by individuals in going to work. We used the 1999 French census that provided municipality
of residence and workplace municipality for the 22 million individuals employed in mainland France. The
straight-line distance between the centroids of the municipalities of residence and work (set to 0 for individuals
working in their residential municipality) followed an exponential distribution, with a probability density
approximately equal to w(d) = 0.0799 × exp(–0.0799 × d), where d is the distance in kilometres. We used values
of this decreasing function of the distance as weights in the computation of contextual factors.
The socioeconomic contextual factor Si for individuals located at the centroid of a municipality i was
computed as a weighted average of the socioeconomic values at surrounding points j:
\[ S_i = \frac{\sum_j w_{ij} s_j}{\sum_j w_{ij}}, \quad \text{with } w_{ij} = w(d), \text{ and } w_{ij} = 0 \text{ for } w(d) < 0.05 \times w(0) \]
Equation 1
where sj is the socioeconomic value attributed to point j of the one-kilometer grid, and wij is the weight of point j
on individuals from municipality i. Weights were defined with the decreasing function of the distance described
above, but were set to 0 when less than 5% of the weight for a point at distance 0. As indicated in figure 2,
in practice this means that our approach considers contextual information in a circular space with a radius of
37.5 kilometres around individuals (exp(-0.0799 × d) falls below 0.05 for d greater than about 37.5), a space
markedly larger than any municipality. Our approach is an adaptation of the
spatial filters used in disease mapping to obtain smoothed maps of disease incidence.[20, 31]
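The computation in Equation 1 can be sketched as follows. The rate 0.0799 and the 5% cutoff come from the text; planar coordinates in kilometres, the function names, and the toy grid values are assumptions made for illustration.

```python
import math

RATE = 0.0799  # per kilometre, from the fitted commuting-distance density

def w(d):
    """Weight of a point at distance d (km), using the density given in the text."""
    return RATE * math.exp(-RATE * d)

CUTOFF = 0.05 * w(0.0)               # weights below 5% of w(0) are set to 0
RADIUS = math.log(1 / 0.05) / RATE   # distance at which w(d) reaches the cutoff

def contextual_factor(centroid, grid_points):
    """Equation 1: weighted mean of s_j over grid points around a centroid.

    centroid: (x, y) in km; grid_points: iterable of (x, y, s_j).
    Planar coordinates are an assumption of this sketch.
    """
    num = den = 0.0
    for x, y, s in grid_points:
        d = math.hypot(x - centroid[0], y - centroid[1])
        wij = w(d)
        if wij < CUTOFF:  # point lies beyond the effective radius
            continue
        num += wij * s
        den += wij
    return num / den

# Hypothetical toy grid: points at 0, 10 and 100 km from the centroid
toy = [(0.0, 0.0, 20.0), (10.0, 0.0, 40.0), (100.0, 0.0, 90.0)]
s_i = contextual_factor((0.0, 0.0), toy)
print(round(RADIUS, 1), s_i)
```

In this toy case the point at 100 km receives a weight of 0, so the result is a weighted mean of the two nearer values only.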
For measuring the supply of physicians across continuous space, each place of consultation was randomly
located within its municipality (exact locations were not available). For individuals in municipality i, we
determined the weighted number of places of consultation j within a radius of 50 kilometres as:
\[ P_i = \sum_j w_{ij}, \quad \text{with } w_{ij} = w(d) \text{ for } d < 50 \text{ kilometres; otherwise } w_{ij} = 0 \]
Equation 2
For example, for an individual having two physicians within 50 kilometres, at distances 10 and 30
kilometres, Pi would be equal to w(10) + w(30). Pi would be greater if more physicians were present, and if they
were closer to the individual’s residence. This indicator was computed separately for PCPs and specialists.
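The worked example above can be reproduced with a short sketch of Equation 2. The weighting function and the 50-kilometre limit come from the text; the function names and the extra place of consultation at 60 kilometres are hypothetical.

```python
import math

RATE = 0.0799  # per kilometre, from the text

def w(d):
    """Exponential weighting function w(d) = 0.0799 * exp(-0.0799 * d)."""
    return RATE * math.exp(-RATE * d)

def weighted_supply(distances_km, max_km=50.0):
    """Equation 2: sum of w(d) over places of consultation closer than max_km."""
    return sum(w(d) for d in distances_km if d < max_km)

# Two physicians at 10 and 30 km, plus a hypothetical one at 60 km that is ignored
p_i = weighted_supply([10.0, 30.0, 60.0])
print(p_i)
```

The indicator increases with the number of places of consultation and decreases with their distance, as described in the text.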
Our approach consists in measuring contextual factors across continuous space, while allowing for the more
significant impact of nearer locations.[34] Slightly different approaches and refinements of the method may be
suggested to implement this general idea. For example, rather than directly attributing municipality
characteristics to the one kilometre grid points, a smoothed surface of the socioeconomic characteristic obtained
through the kriging approach might be considered. Similarly, other options exist to define the weighting
function, and different functions may be needed in different parts of the territory or for the different contextual
factors, but investigating these aspects will require sensitivity analyses.
Appendix 2: The multilevel and the spatial mixed models
Let yij be the value of the binary outcomes for individual i in area j. We first fitted empty multilevel logistic
models for these outcomes:[6]
\[ y_{ij} = \pi_j + e_{ij} \]
\[ \operatorname{logit}(\pi_j) = \beta_0 + u_j \]
\[ u_j \sim N(0, \sigma_u^2) \]
Equation 3
where uj is the random deviation of intercept β0 for area j. To account for the hierarchical structure of the data,
the multilevel model includes area-level residuals uj of variance σu².
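The data-generating process implied by the empty multilevel logistic model can be sketched by simulation. The parameter values, the function name, and the equal number of individuals per area are hypothetical choices made for illustration; this is not the estimation procedure used in the study.

```python
import math
import random

def simulate_empty_multilevel(n_areas, n_per_area, beta0, sigma_u, seed=0):
    """Simulate binary outcomes under logit(pi_j) = beta0 + u_j, u_j ~ N(0, sigma_u^2)."""
    rng = random.Random(seed)
    data = []
    for j in range(n_areas):
        u_j = rng.gauss(0.0, sigma_u)                      # area-level residual
        pi_j = 1.0 / (1.0 + math.exp(-(beta0 + u_j)))      # inverse logit
        for _ in range(n_per_area):
            y_ij = 1 if rng.random() < pi_j else 0         # Bernoulli draw
            data.append((j, y_ij))
    return data

sample = simulate_empty_multilevel(n_areas=50, n_per_area=20, beta0=-1.0, sigma_u=0.5)
print(len(sample))
```

Individuals in the same area share the same u_j, which is what induces correlation within areas; no correlation is generated between different areas, however close they are.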
We also modelled geographic variations across continuous space. As figure 1 shows, we did not have
individual information for all French municipalities. Accordingly, spatial lattice models,[24, 51] which usually
consider correlation between adjacent areas on the territory, were not suited to our case. The use of a
geostatistical model considering locations on the territory proved more appropriate.
The spatial mixed models considered are not dependent on a space fragmented into areas.[45] Individuals
were randomly located in their municipality so that, in estimating the spatial correlation function, the distance
between individuals from the same municipality would not be set to 0. Results remained unchanged when
randomly relocating individuals within municipalities.
Let us consider a logistic model:
\[ y_i = \pi_i + e_i \]
\[ \operatorname{logit}(\pi_i) = \beta_0 \]
Equation 4
Let dij be the spatial distance between places of residence for individual i and individual j. Spatial mixed
models do not take into account geographic correlation with area-level random effects, but specify a spatial
correlation structure for the individual residuals, assuming that the correlation between residuals ei and ej for
individuals i and j is a decreasing function of the distance dij:
\[ \operatorname{Corr}(e_i, e_j) = \frac{\sigma^2 \exp(-d_{ij}/\rho)}{\sigma^2 + \sigma_1^2} \]
Equation 5
The term exp(-dij/ρ) indicates that the correlation decreases exponentially with the distance between individuals. In
this model, two individuals located at the same place may have an estimated correlation below 1 (at dij = 0, the
correlation equals σ² / (σ² + σ1²)), since they do not necessarily have identical healthcare utilisation behaviour.
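The behaviour of the correlation function in Equation 5 can be illustrated numerically. The symbols follow the equation; the parameter values are hypothetical, and reading σ1² as a non-spatial variance component is an interpretation made here, since the text does not define it.

```python
import math

def residual_correlation(d, sigma2, sigma1_2, rho):
    """Equation 5: correlation between residuals of two individuals d km apart."""
    return sigma2 * math.exp(-d / rho) / (sigma2 + sigma1_2)

# Hypothetical parameter values, chosen only to show the shape of the function
params = dict(sigma2=1.0, sigma1_2=1.0, rho=20.0)
for d in (0.0, 10.0, 20.0, 50.0):
    print(d, round(residual_correlation(d, **params), 3))
```

With these values the correlation at distance 0 is 0.5 rather than 1, and it declines steadily as the distance increases.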
REFERENCES
1 Diez-Roux AV. Multilevel analysis in public health research. Annu Rev Public Health 2000;21:171-92.
2 Diez-Roux AV. Bringing context back into epidemiology: variables and fallacies in multilevel analysis. Am
J Public Health 1998;88:216-22.
3 Pickett KE, Pearl M. Multilevel analyses of neighbourhood socioeconomic context and health outcomes: a
critical review. J Epidemiol Community Health 2001;55:111-22.
4 Merlo J, Asplund K, Lynch JW, et al. Population effects on individual systolic blood pressure - a
multilevel analysis of WHO MONICA project. Am J Epidemiol 2004;159:1168-79.
5 Goldstein H, Browne W, Rasbash J. Multilevel modelling of medical data. Stat Med 2002;21:3291-315.
6 Leyland AH, Goldstein H. Multilevel modelling of health statistics. Chichester, England: Wiley, 2001.
7 Bobashev GV, Anthony JC. Clusters of marijuana use in the United States. Am J Epidemiol
1998;148:1168-74.
8 Preisser JS, Arcury TA, Quandt SA. Detecting patterns of occupational illness clustering with alternating
logistic regressions applied to longitudinal data. Am J Epidemiol 2003;158:495-501.
9 Snijders T, Bosker R. Multilevel Analysis. An introduction to basic and advanced multilevel modelling.
London, England: Sage Publications, 1999.
10 Merlo J, Östergren PO, Hagberg O, et al. Diastolic blood pressure and area of residence: multilevel versus
ecological analysis of social inequity. J Epidemiol Community Health 2001;55:791-8.
11 Merlo J, Chaix B, Yang M, et al. A brief conceptual tutorial on multilevel analysis in social epidemiology
- linking the statistical concept of clustering to the idea of contextual phenomenon. J Epidemiol Community
Health 2004; in press.
12 Merlo J, Yang M, Chaix B, et al. A brief conceptual tutorial on multilevel analysis in social epidemiology
- investigating contextual phenomena in different groups of individuals. J Epidemiol Community Health 2004; in
press.
13 Fotheringham AS, Wong DWS. The modifiable areal unit problem in multivariate statistical analysis.
Environ Plan A 1991;23:1025-44.
14 Amrhein CG. Searching for the elusive aggregation effect: evidence from statistical simulations. Environ
Plan A 1995;27:105-19.
15 Martin D. An assessment of surface and zonal models of population. Int J Geographical Information
Systems 1996;10:973-89.
16 Holt D, Steel DG, Tranmer M. Area homogeneity and the modifiable areal unit problem. Geographical
Systems 1996;3:181-200.
17 Mitchell R. Multilevel modeling might not be the answer. Environ Plan A 2001;33:1357-60.
18 Carrat F, Valleron AJ. Epidemiologic mapping using the "kriging" method: application to an influenza-like
illness epidemic in France. Am J Epidemiol 1992;135:1293-300.
19 Green C, Hoppa RD, Young TK, et al. Geographic analysis of diabetes prevalence in an urban area. Soc
Sci Med 2003;57:551-60.
20 Rushton G. Public health, GIS, and spatial analytic tools. Annu Rev Public Health 2003;24:43-56.
21 Sabel CE, Boyle PJ, Loytonen M, et al. Spatial clustering of amyotrophic lateral sclerosis in Finland at
place of birth and place of death. Am J Epidemiol 2003;157:898-905.
22 Kleinschmidt I, Sharp BL, Clarke GP, et al. Use of generalized linear mixed models in the spatial analysis
of small-area malaria incidence rates in Kwazulu Natal, South Africa. Am J Epidemiol 2001;153:1213-21.
23 English PB, Kharrazi M, Davies S, et al. Changes in the spatial pattern of low birth weight in a southern
California county: the role of individual and neighborhood level factors. Soc Sci Med 2003;56:2073-88.
24 Kleinschmidt I, Sharp B, Mueller I, et al. Rise in malaria incidence rates in South Africa: a small-area
spatial analysis of variation in time trends. Am J Epidemiol 2002;155:257-64.
25 Joines JD, Hertz-Picciotto I, Carey TS, et al. A spatial analysis of county-level variation in hospitalization
rates for low back problems in North Carolina. Soc Sci Med 2003;56:2541-53.
26 Werneck GL, Maguire JH. Spatial modeling using mixed models: an ecologic study of visceral
leishmaniasis in Teresina, Piaui State, Brazil. Cad Saude Publica 2002;18:633-7.
27 Leyland AH, Langford IH, Rasbash J, et al. Multivariate spatial models for event data. Stat Med
2000;19:2469-78.
28 Langford IH, Bentham G, McDonald AL. Multilevel modelling of geographically aggregated health data:
a case study on malignant melanoma mortality and UV exposure in the European Community. Stat Med
1998;17:41-57.
29 Gemperli A, Vounatsou P, Kleinschmidt I, et al. Spatial patterns of infant mortality in Mali: the effect of
malaria endemicity. Am J Epidemiol 2004;159:64-72.
30 Banerjee S, Wall MM, Carlin BP. Frailty modeling for spatially correlated survival data, with application
to infant mortality in Minnesota. Biostatistics 2003;4:123-42.
31 Rushton G, Peleg I, Banerjee A, et al. Analyzing geographic patterns of disease incidence: rates of late-stage
colorectal cancer in Iowa. J Med Syst 2004;28:223-36.
32 Burnett R, Ma R, Jerrett M, et al. The spatial association between community air pollution and mortality:
a new method of analyzing correlated geographic cohort data. Environ Health Perspect 2001;109 Suppl 3:375-80.
33 Cakmak S, Burnett RT, Jerrett M, et al. Spatial regression models for large-cohort studies linking
community air pollution and health. J Toxicol Environ Health A 2003;66:1811-23.
34 Fotheringham AS, Charlton ME, Brunsdon C. Spatial variations in school performance: a local analysis
using geographically weighted regression. Geographical & Environmental Modelling 2001;5:43-66.
35 Morenoff JD. Neighborhood mechanisms and the spatial dynamics of birth weight. AJS 2003;108:976-1017.
36 Treno AJ, Gruenewald PJ, Johnson FW. Alcohol availability and injury: the role of local outlet densities.
Alcohol Clin Exp Res 2001;25:1467-71.
37 Liu GC, Cunningham C, Downs SM, et al. A spatial analysis of obesogenic environments for children.
Proc AMIA Symp 2002:459-63.
38 Ali M, Emch M, Tofail F, et al. Implications of health care provision on acute lower respiratory infection
mortality in Bangladeshi children. Soc Sci Med 2001;52:267-77.
39 Grasland C, Mathian H, Vincent JM. Multiscalar analysis and map generalisation of discrete social
phenomena: Statistical problems and political consequences. Stat J UN Econ Comm Eur 2000;17:1-32.
40 Chaix B, Boëlle PY, Guilbert P, et al. Area level determinants of specialty care utilisation in France: a
multilevel analysis. Public Health 2004; in press.
41 Soloway B. Primary care and specialty care in the age of HAART. AIDS Clin Care 1997;9:37-9.
42 Baker DW, Hayes RP, Massie BM, et al. Variations in family physicians' and cardiologists' care for
patients with heart failure. Am Heart J 1999;138:826-34.
43 Grumbach K, Selby JV, Damberg C, et al. Resolving the gatekeeper conundrum: what patients value in
primary care and referrals to specialists. JAMA 1999;282:261-6.
44 Bodenheimer T, Lo B, Casalino L. Primary care physicians should be coordinators, not gatekeepers.
JAMA 1999;281:2045-9.
45 Littel RC, Milliken GA, Stroup WW, et al. SAS System for Mixed Models. Cary, North Carolina, USA:
SAS Institute, 1996.
46 Auvray L, Dumesnil S, Le Fur P. Santé, soins et protection sociale en 2000 [Health, healthcare and
insurance in 2000] (in French). Paris, France: CREDES, 2001.
47 Zonage d'Etudes [Geographic subdivisions of the territory] (in French). Paris, France: Institut National
de la Statistique et des Etudes Economiques
(http://www.insee.fr/fr/nom_def_met/nomenclatures/zonages_etudes/index.htm).
48 Nomenclature des professions et catégories socioprofessionnelles [List of professions and social
categories] (in French). Paris, France: Institut National de la Statistique et des Etudes Economiques, 1994.
49 Bivand R. Spatial dependence: weighting schemes, statistics and models.
(http://cran.r-project.org/src/contrib/PACKAGES.html#spdep).
50 Walter SD. The analysis of regional patterns in health data. II. The power to detect environmental effects.
Am J Epidemiol 1992;136:742-59.
51 Richardson S, Thomson A, Best N, et al. Interpreting posterior relative risk estimates in disease-mapping
studies. Environ Health Perspect 2004;112:1016-25.
Comparison between a spatial perspective and the multilevel analytic approach in neighborhood studies:
the example of mental and behavioral disorders due to psycho-active substance use in Malmö, Sweden,
2001
B. Chaix, J. Merlo, S.V. Subramanian, J. Lynch, P. Chauvin
B. Chaix, P. Chauvin, Research Team on the Social Determinants of Health and Healthcare (INSERM U444),
National Institute of Health and Medical Research, Paris, France
J. Merlo, Department of Community Medicine (Preventive Medicine), Malmö University Hospital, Lund
University, Malmö, Sweden
S.V. Subramanian, Harvard School of Public Health, Boston MA, USA
J. Lynch, Department of Epidemiology, University of Michigan, Ann Arbor MI, USA
Abstract
Almost all studies of neighborhood effects on health have followed the multilevel analytic approach. However,
in such an approach, measures of variation and measures of association between contextual factors and health
may not provide optimal epidemiological information, due to the dependence on a space fragmented into
neighborhoods. Using data on all individuals aged 40-59 in the city of Malmö, Sweden, geolocated at their exact
residence to investigate the spatial distribution of mental and behavioral disorders due to psychoactive
substances, the authors compare a spatial perspective of investigation, which builds on a continuous conception
of space, to the multilevel analytic approach. A geoadditive model based on individual-level locational
information was used to obtain visual information on spatial risk variations independent of administrative
neighborhood boundaries. The multilevel model showed significant neighborhood-level variations in the risk of
substance-related disorders. The hierarchical geostatistical model provided information not only on the
magnitude but also on the shape of neighborhood variations, indicating significant correlation between
neighborhoods close to each other. After individual-level adjustment, the prevalence of substance-related
disorders increased with contextual deprivation, with much stronger associations when measuring contextual
deprivation in spatially adaptive areas of smaller size than the administrative neighborhoods centered on the
exact place of residence. In many neighborhood studies, viewing space in a continuous way may yield more
complete information on the spatial distribution of health outcomes.
During the past decade, there has been growing research interest in the impact of the neighborhood of residence
on health (1, 2). Most of the studies have followed the multilevel analytic approach (3, 4). Indeed, in multilevel
models, measures of associations between neighborhood factors and health have their standard errors corrected
for the nonindependence of individuals within neighborhoods (5, 6). Furthermore, as emphasized by Merlo,
multilevel models provide measures of variation based on random effects (such as the area-level variance or
variance partition coefficient) that provide information on the distribution of health phenomena across
neighborhoods (7, 8). However, as part of this project, we aim to show that in many neighborhood studies the
multilevel analytic approach may fail to provide optimal epidemiological information for both measures of
association and measures of variation, due to the notion of space on which it is grounded.
Indeed, measures of variation in the multilevel approach are affected by the so-called modifiable areal unit
problem (9-11): they are dependent on the particular size and shape of the administrative areas (12-15). More
importantly, even if appropriate scale and zoning are used, multilevel models neglect spatial connections
between neighborhoods and assume that all spatial correlation can be reduced to within-neighborhood
correlation. Accordingly, they only provide partial parametric information on the spatial distribution of
outcomes. They quantify the magnitude of neighborhood variations but provide no indication of their shape,
i.e., on the extent to which the neighborhood variability follows spatially structured trends or alternatively
consists of unstructured random variations (16). Such descriptive information is interesting in an epidemiological
perspective, since the existence of spatial clusters of elevated risk exceeding administrative neighborhood
boundaries indicates that public health intervention efforts should be coordinated on a larger scale than the
neighborhood scale.
Mapping the neighborhood-level residuals of the multilevel model provides an indication of the shape of
neighborhood variations (16), but the existence of statistically significant clusters of neighborhoods at risk
cannot be assessed with an approximate judgment based on visual information. The spatial scan statistic
approach has been proposed to identify clusters of areas with a higher risk of disease (17, 18). However, this
ecologic approach based on aggregated rates does not allow one to simultaneously investigate individual and
contextual effects on the outcome.
Beyond multilevel models, geographic variations of health at the individual level have recently been
investigated with geoadditive models that capture spatial variations with a two-dimensional (longitude / latitude)
smooth term (19). Whereas parametric approaches are often computationally unable to deal with information on
the exact spatial coordinates of the individuals, we illustrate in the example below that geoadditive models
allow the use of this precise locational information to produce smoothed maps of risk independent of neighborhood
boundaries (20-22). However, this approach only provides visual information, but no parametric information on
the magnitude and shape of spatial variations. Furthermore, it is never entirely clear whether the similar risk
level observed for surrounding neighborhoods in the estimated spatial surface of risk corresponds to the real
pattern of variations or simply results from the (over-) smoothing of data.
Many regression analyses based on aggregated data have emphasized the interest of modeling the spatial
autocorrelation of outcomes with parametric approaches (23-28). However, there has been much less effort to do
so with individual-level data (16, 29-31). In order to obtain epidemiologically relevant information on the shape
of spatial variations, we used a hierarchical geostatistical model (29), which allows the neighborhood
variability to be split into a spatially structured component and an unstructured component (16). This model includes
parameters that allow one to make statistical inferences not only on the magnitude of correlation within
neighborhoods, but also on the range of correlation in space (30, 31). Quantifying the extent to which the spatial
range of correlation of outcomes exceeds neighborhood boundaries provides parametric support to interpret
visual evidence of spatial clustering in the neighborhood risk level.
Regarding measures of association between contextual factors and health, measuring contextual variables
within administrative neighborhoods may be restrictive when individual-level locational information is available.
Indeed, the administrative scale of the neighborhoods may be too broad to capture the contextual effects at play
(32). Moreover, regardless of the scale, such measures made in fixed boundary areas may not really capture
contextual information in surrounding space for individuals residing on the margins of the administrative
neighborhoods. Therefore, we propose to measure contextual factors within small-size areas centered on the
exact place of residence of the individuals (i.e., within moving-window areas).
In the present study, we used data on all individuals aged 40-59 in the Swedish city of Malmö geocoded at
their exact place of residence to investigate the spatial distribution of mental or behavioral disorders due to
psychoactive substance use. Using these data, 1) we examined whether multilevel models properly took into account
the spatial correlation in the risk of substance-related disorders; 2) we compared geoadditive models, multilevel
models, and hierarchical geostatistical models for gaining information on the spatial distribution of substance-related disorders; and 3) we examined whether contextual factors measured within small-size moving-window
areas centered on the exact place of residence of the individuals allowed us to distinguish low-risk and high-risk
places better than contextual factors measured within administrative neighborhoods.
Methods
Data and measures
We used data from the Swedish database ‘Resource Allocation 2001’. It was formed by Statistics Sweden
after approval of the Data Safety Committee, by merging the Population Register and the Patient Administrative
Register with the unique individual identification number attributed to every Swedish resident. We considered
information on all 65830 individuals aged 40-59 in 2001 residing in the city of Malmö.
The original database comprises information on all inpatient or outpatient contacts with public or private
healthcare providers in 2001 including diagnoses made during these contacts. Based on the first three diagnoses
made at each contact, the Statistics Office at the County of Scania predefined variables indicating whether
individuals had had a diagnosis within different groups of diagnoses. In the present study, the binary outcome
investigated indicates whether a mental or behavioral disorder due to psychoactive substances (ICD-10 code:
F10-F19) had been diagnosed in 2001. We had more detailed information on the diagnoses in a separate
database, which could not be linked to the main database for reasons of confidentiality.
Regarding individual-level variables, we took into account the age, gender, marital status, educational level,
and individual income of the individuals. Age was divided into two categories (40-49, 50-59). Marital status was
coded as married or cohabiting individuals and others (single, divorced, widowed individuals). The educational
level was dichotomized (9 years of education or less, more than 9 years). The household income was not
available in the data; instead, we used the individual income, as a proxy for the socioeconomic position of the
individuals. The individual income was dichotomized, with the median value as a cut-off.
The city of Malmö is divided into 100 administrative neighborhoods. We were able to locate every
individual at her/his place of residence with the exact coordinates of the street address in meters. Figure 1
indicates the spatial distribution of the 65830 individuals aged 40-59 years across 13730 different locations in
the city of Malmö (these locations may correspond to houses, buildings, or groups of buildings with a similar
street address). Figure 1 also provides basic information on the neighborhood structure.
As a contextual factor, we considered the mean income of individuals aged 25 years or over (as a proxy for
the mean socioeconomic position of the economically independent population). We first defined this variable at
the scale of the administrative neighborhoods. Second, in order to define it at a more local scale and avoid the
problem of individuals residing on the margins of the areas, we computed the mean income within areas of
smaller size than the neighborhoods centered on the exact place of residence of the individuals. We could have
defined these areas as a circular space of small radius centered on the individuals’ places of residence. However,
due to the uneven distribution of individuals in the city (figure 1), such an approach results in measurements
based on little information for individuals residing in sparsely populated areas. Therefore, we defined indicators
based on surrounding population rather than surrounding space: we computed the mean income in circular areas
centered on each individual, which areas comprised a fixed number of inhabitants aged 25 years or over. This
approach results in spatially adaptive areas, i.e., areas of greater size for individuals residing in sparsely
populated areas. This is an adaptation to our context of the spatially adaptive filters used in health geography to
obtain smoothed maps of disease incidence (33-35), which consist in computing local incidence rates in
different-sized areas of constant population size. We successively computed the mean income for the 100, 200,
500, 1000, and 1500 closest inhabitants aged 25 years or over, obtaining different contextual measures for each
of the 13730 locations in the city. The different contextual variables were divided into quartiles, to allow for the
comparison of the different measurement strategies.
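The spatially adaptive measurement described above can be sketched as follows. This is a minimal illustration, not the code used in the study: the function name, the tuple layout of `locations`, and the rule of taking all inhabitants of the last location reached (so that an area may hold slightly more than k inhabitants) are assumptions.

```python
import math

def adaptive_mean_income(target, locations, k):
    """Mean income of roughly the k closest inhabitants around `target`.

    target:    (x, y) coordinates in meters
    locations: list of (x, y, n_inhabitants, total_income) tuples
               (hypothetical layout, one tuple per residential location)
    Returns (mean_income, radius), where radius is the distance to the
    farthest location included in the area.
    """
    def distance(loc):
        return math.hypot(loc[0] - target[0], loc[1] - target[1])

    n = 0
    total = 0.0
    radius = 0.0
    # Visit locations from nearest to farthest until k inhabitants are reached.
    for loc in sorted(locations, key=distance):
        n += loc[2]
        total += loc[3]
        radius = distance(loc)
        if n >= k:
            break
    return total / n, radius
```

The returned radius makes the adaptive nature of the areas visible: for a fixed k, it is larger for locations in sparsely populated parts of the city.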
FIGURE 1. Spatial distribution of the 65,830 individuals aged 40-59 years residing at 13,730 different locations
in the city of Malmö, as divided in 100 administrative neighborhoods, and neighborhood income. Each point
indicates the exact place of residence of individuals aged 40-59 years.
Median area of the neighborhoods: 0.5 km²
Median distance between neighborhoods’ centroids:
-between first-order neighbors: 913 m
-between second-order neighbors: 1759 m
-between third-order neighbors: 2703 m
Median number of inhabitants (all ages) in a neighborhood: 2046
Median number of individuals aged 40-59 in a neighborhood: 510
Statistical analyses
In order to produce a precise smoothed map of risk based on individual-level locational information rather
than on neighborhood locations, we first estimated a semiparametric geoadditive model (19-22), with a nonparametric two-dimensional (latitude/longitude) smooth term for the spatial effect, and parametric effects for
individual-level and contextual factors (see the appendix for details). To obtain easily interpretable information
on the magnitude of spatial variations, we propose an indicator on the odds ratio scale, the interquartile odds
ratio, which approximately quantifies the odds ratio between an individual in the first quartile and an individual
in the fourth quartile of spatial risk. The median odds for individuals in the first and individuals in the last
quartiles of spatial risk are equal to exp(Σβ + t12.5) and exp(Σβ + t87.5) where t12.5 and t87.5 are the 12.5th and 87.5th
quantiles in the distribution of the spatial smooth term for the 65830 individuals of the sample. Therefore, the
interquartile odds ratio was computed as exp(t87.5 – t12.5).
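The interquartile odds ratio can be computed from the fitted values of the spatial smooth term as sketched below. The function name is hypothetical, and the quantile interpolation rule (Python's "inclusive" method) is an assumption; other interpolation rules would give slightly different values.

```python
import math
import statistics

def interquartile_odds_ratio(smooth_values):
    """exp(t87.5 - t12.5) for the spatial smooth term of all individuals."""
    # With n=8, the cut points are the 12.5th, 25th, ..., 87.5th percentiles.
    cuts = statistics.quantiles(smooth_values, n=8, method="inclusive")
    t_low, t_high = cuts[0], cuts[-1]
    return math.exp(t_high - t_low)
```

Because the parametric part Σβ is common to both median odds, it cancels in the ratio, which is why only the smooth term enters the computation.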
In order to make statistical inferences on the magnitude of spatial variations, we then estimated a multilevel
logistic model (5, 6), with individuals nested within the 100 administrative neighborhoods (see the appendix for
details on the model). The neighborhood-level variance σu² was used to assess the amount of variability
between neighborhoods in substance-related disorders (8).
Two neighborhoods are first-order, second-order, or n-order neighbors if at least one, two, or n boundaries
need to be crossed to go from one neighborhood to the other. We used the Moran’s I statistic (described in the
appendix) to assess whether there was spatial autocorrelation in the neighborhood residuals of the multilevel
model (36, 37). To assess whether spatial correlation decreased with increasing distance, we computed the
Moran’s I separately for first-order neighbors, for second-order neighbors, etc.
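The definition of n-order neighbors amounts to a shortest-path count on the adjacency graph of the neighborhoods. A minimal sketch, assuming the adjacency is available as a dictionary keyed by neighborhood indices 0..N-1 (a hypothetical data layout), is:

```python
from collections import deque

def neighbor_orders(adjacency, start):
    """Minimum number of boundaries to cross from `start` to each neighborhood."""
    order = {start: 0}
    queue = deque([start])
    while queue:  # breadth-first search over shared boundaries
        current = queue.popleft()
        for nxt in adjacency[current]:
            if nxt not in order:
                order[nxt] = order[current] + 1
                queue.append(nxt)
    return order

def order_weights(adjacency, n):
    """Weight matrix with w[i][j] = 1 if j is an n-order neighbor of i, else 0."""
    size = len(adjacency)
    weights = [[0] * size for _ in range(size)]
    for i in range(size):
        for j, o in neighbor_orders(adjacency, i).items():
            if o == n:
                weights[i][j] = 1
    return weights
```

Calling `order_weights` with n = 1, 2, and 3 would yield the three weight matrices needed to compute the Moran's I separately by neighbor order.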
To gain further insight on the spatial distribution of the outcome, we estimated a hierarchical geostatistical
model (30, 31) with two sets of neighborhood-level random effects, including the usual set of unstructured
effects of variance σu², and an additional set of spatially correlated random effects of variance σs² (see the
appendix for details) (16, 29, 38). To assess the extent to which neighborhood variability was spatially
structured, we computed the proportion of total neighborhood-level variance attributable to the spatially
structured component of variability as σs² / (σu² + σs²) (38). To describe the spatial structure, we were interested
in the parameter φ, which quantifies the rate of correlation decay with increasing distance between
neighborhoods (with distance measured between the centroids of the neighborhoods). We computed the range of
spatial correlation (3/φ), defined as the distance beyond which the correlation is below 5 percent (see the
appendix for details) (30, 31).
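Since the models were estimated by Markov chain Monte Carlo simulation, derived quantities such as σs² / (σu² + σs²) and 3/φ can be summarized from the sampled parameter values. The sketch below assumes that they are computed draw by draw and then summarized by the posterior median and the 2.5th and 97.5th quantiles; the function names and the list-based input are hypothetical.

```python
import statistics

def summarize(draws):
    """Posterior median and 95% credible interval of a list of draws."""
    # With n=40, the first and last cut points are the 2.5th and 97.5th percentiles.
    cuts = statistics.quantiles(draws, n=40, method="inclusive")
    return statistics.median(draws), cuts[0], cuts[-1]

def derived_summaries(sigma_u2, sigma_s2, phi):
    """Summaries of the spatially structured proportion and the spatial range."""
    proportion = [s / (u + s) for u, s in zip(sigma_u2, sigma_s2)]
    spatial_range = [3.0 / p for p in phi]  # in meters if phi is per meter
    return summarize(proportion), summarize(spatial_range)
```

Under this approach, the median of a ratio of variances need not equal the ratio of the median variances, so summaries of derived quantities should be read from the draws rather than recomputed from rounded parameter estimates.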
As detailed in the appendix, we performed a simulation to investigate whether the hierarchical geostatistical
model was really able to disentangle spatially structured from unstructured neighborhood variations. We
disorganized the spatial structure of the data without modifying the multilevel structure of the database (i.e.,
spatial connections between neighborhoods were experimentally modified, but the same individuals were still
grouped together within neighborhoods), and examined the resulting changes in the neighborhood variance
parameters. The simulation indicated that the hierarchical geostatistical model was able to distinguish between
spatially structured and unstructured neighborhood variations, but showed that the percentage of spatially
structured variations [σs² / (σu² + σs²)] needs to be interpreted jointly with the spatial range of correlation (3/φ)
(see the appendix).
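The disorganization of the spatial structure can be sketched as a relabeling of neighborhood locations. The sketch below is an illustration under stated assumptions, not the procedure actually coded for the study: it moves each selected group of inhabitants to a different selected neighborhood by a random cyclic shift, and the function name and seed argument are hypothetical.

```python
import random

def reassign_neighborhoods(n_neighborhoods, n_selected, seed=0):
    """Map each neighborhood's group of inhabitants to a (possibly new) location.

    Non-selected neighborhoods keep their location. Each selected group is
    moved, as a whole, to another selected neighborhood, so the multilevel
    structure (who is grouped with whom) is unchanged.
    """
    rng = random.Random(seed)
    selected = rng.sample(range(n_neighborhoods), n_selected)  # random order
    mapping = {j: j for j in range(n_neighborhoods)}
    for pos, j in enumerate(selected):
        # Cyclic shift: guarantees a new location when n_selected >= 2.
        mapping[j] = selected[(pos + 1) % n_selected]
    return mapping
```

The centroid coordinates used in the spatial correlation function would then be looked up through this mapping, while the individual-level records and their grouping remain untouched.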
Multilevel models and hierarchical geostatistical models were estimated with Markov chain Monte Carlo
simulation (see the appendix for details) (39). We used the deviance information criterion (DIC) to compare the
different models (the smaller the DIC, the better the fit of the model) (40).
For each of the three modeling options (multilevel model, hierarchical geostatistical model, geoadditive
model), we first estimated an empty model (with no explanatory variables). We then introduced the individual-level covariates and, in a third step, the contextual variable.
Results
In our sample, 1.45 percent of the individuals sought care in 2001 for mental or behavioral disorders due to
psychoactive substance use. Alcohol was involved in the diagnoses for 80 percent of the individuals, opioids for
12 percent, and sedatives or hypnotics for 10 percent (multiple substances were implicated for 14 percent of the
individuals). Clinical conditions comprised a dependence syndrome for 89 percent of the individuals, a harmful
use for 13 percent, and a psychotic disorder for 3 percent of them.
FIGURE 2. Smoothed map of risk of having a psychoactive substance-related disorder (top part) and associated
standard errors (bottom part), estimated from the empty geoadditive model. The quantiles used to draw the maps
are derived from the distributions of the spatial smooth term and standard error for the 65,830 individuals of the
dataset
Spatial smooth term in the empty model
Standard error of the spatial smooth term
An empty geoadditive model based on individual-level locational information provided a precise smoothed
map of risk of substance-related disorders, represented in figure 2 with associated standard errors. The map
showed an increased prevalence in a large area in the center and north of the city, and allowed us to identify two
local sub-areas with a particularly high risk. Considering the spatial smooth term estimated in the empty model
for the 65830 individuals, the interquartile odds ratio was equal to 3.96, which approximately quantifies the odds
ratio between an individual in the lowest quartile and an individual in the highest quartile of spatial risk.
The empty multilevel model showed important variations between neighborhoods in the risk of substance-related disorders (table 1). The Moran’s I computed from the neighborhood-level residuals was significantly
positive for first-order neighbors, and to a lesser extent for second-order neighbors (figure 3), indicating spatial
correlation between neighborhoods unaccounted for by the multilevel model, a correlation that decreased with
increasing distance between the neighborhoods.
TABLE 1. Results of the multilevel models for mental and behavioral disorders related to psychoactive substance
use, Malmö, Sweden, 2001
                                  Empty model             Model with individual   Model with neighborhood
                                                          factors                 income
                                  Index   95% CI          Index   95% CI          Index   95% CI
Age: 50-59 vs. 40-49                                      1.10    0.96, 1.25      1.10    0.97, 1.25
Gender: male vs. female                                   2.31    2.01, 2.67      2.29    1.99, 2.64
Marital status: alone vs. other                           4.27    3.61, 5.08      4.15    3.51, 4.93
Education: low vs. high                                   1.40    1.22, 1.61      1.38    1.20, 1.58
Individual income: low vs. high                           3.60    3.06, 4.25      3.42    2.91, 4.05
Neighborhood mean income
  Third quartile                                                                  1.40    1.00, 1.99
  Second quartile                                                                 2.07    1.51, 2.86
  First quartile                                                                  2.11    1.54, 2.93
Area-level variance σu²           0.646   0.433, 0.976    0.173   0.095, 0.298    0.111   0.052, 0.208
Deviance Information Criterion    3894                    3138                    3125
95% CI: 95% credible interval
FIGURE 3. Moran’s I statistic for the neighborhood-level residuals of the multilevel models, computed
separately for first-order neighbors, second-order neighbors, and third-order neighbors
The empty hierarchical geostatistical model fitted to the data indicated that 89 percent of the neighborhood
variability was spatially structured [σs² / (σu² + σs²)] (table 2). Figure 4 displays the estimated neighborhood
variations, as split into the spatially structured and unstructured components of variability. Regarding the
spatially structured component, figure 5A indicates how the correlation in the neighborhood risk level decreases
with increasing distance between neighborhoods (solid line). As plotted on figure 5A, the range of spatial
correlation (3/φ) estimated from the empty model was equal to 3471 meters (a quarter of the maximum north-south distance in Malmö), a distance that far exceeds the median distance between third-order neighbors
(2703 meters). The DIC was 10 points lower in the empty hierarchical geostatistical model than in the
multilevel model (tables 1 and 2), indicating a better fit to the data of the model that split neighborhood
variability into spatially structured and unstructured components.
TABLE 2. Results of the hierarchical geostatistical models for mental and behavioral disorders related to psychoactive
substance use, Malmö, Sweden, 2001
                                      Empty model             Model with individual   Model with neighborhood
                                                              factors                 income
                                      Index   95% CI          Index   95% CI          Index   95% CI
Neighborhood mean income
  Third quartile (OR)                                                                 1.38    0.97, 1.97
  Second quartile (OR)                                                                2.10    1.52, 2.93
  First quartile (OR)                                                                 2.14    1.53, 3.04
σu² (unstructured component)          0.075   <0.001, 0.346   0.045   <0.001, 0.176   0.030   <0.001, 0.125
σs² (structured component)            0.565   0.257, 1.142    0.127   0.036, 0.272    0.081   0.021, 0.194
φ (rate of correlation decay)         0.0009  0.0003, 0.0026  0.0032  0.0008, 0.0086  0.0037  0.0009, 0.0087
σs² / (σu² + σs²)                     0.89    0.47, 0.99      0.75    0.23, 0.99      0.74    0.22, 0.99
3/φ (range of correlation in meters)  3471    1157, 10973     946     349, 3763       817     347, 3349
Deviance Information Criterion        3884                    3131                    3119
FIGURE 4. Neighborhood-level variations in the risk of having a psychoactive substance-related disorder, split
into a spatially structured component (top part) and an unstructured component (bottom part), as estimated from
the empty hierarchical geostatistical model. The quartiles used to draw the maps are derived from the distribution
of the random effects for the 65,830 individuals of the dataset.
Spatially structured component of neighborhood variability
Unstructured component of neighborhood variability
FIGURE 5. Spatially structured neighborhood variations in substance-related disorders estimated in the empty
hierarchical geostatistical model: correlation (figure 5A) and covariance (figure 5B) in the neighborhood risk
level. The stars on figure 5A indicate the spatial range of correlation (3/φ).
Correlation in the neighborhood level of risk
Covariance in the neighborhood level of risk
Both the individual variables and the neighborhood socioeconomic level allowed us to explain some part of
the spatially structured neighborhood variations in substance-related disorders, as indicated by a decrease of the
area-level variance and Moran’s I statistics in the multilevel model (table 1 and figure 3) and a decrease of the
spatially structured variance and spatial range of correlation in the hierarchical geostatistical model (table 2,
figure 5A). We clearly illustrate this aspect in figure 5B, which reports the covariance function of the spatially
structured component of variance of the hierarchical geostatistical model. In the geoadditive model, the
interquartile odds ratio dropped from 3.96 to 1.92 when including the individual variables, and to 1.67 after
inclusion of the neighborhood socioeconomic level.
Regarding the impact of the neighborhood mean income, we found that individuals residing in deprived
administrative neighborhoods had higher risks of substance-related disorders, beyond the impact associated with
individual-level effects (tables 1, 2 and 3). Measuring the mean income in spatially adaptive areas of smaller size
than the administrative neighborhoods indicated that the strength of association between contextual deprivation
and substance-related disorders markedly increased with decreasing size of the areas taken into account (table 3).
Geoadditive models indicated that the risk of substance-related disorders was 1.97 times higher (95% confidence
interval: 1.39, 2.79) in the highest vs. lowest quartiles of contextual deprivation when measuring the contextual
factor at the neighborhood level, but 4.12 times higher (95% confidence interval: 3.01, 5.64) when measuring it
on the 100 closest inhabitants aged 25 years or over. Considering an area in the center of Malmö, figure 6
indicates whether each individual-level location belonged or not to the lowest income quartile, as measured with
the different approaches. It shows that measuring contextual income within spatially adaptive areas allowed us to
identify highly deprived places located in non-deprived neighborhoods, places that exhibited a particularly
increased prevalence of disorders.
Discussion
In the present study, we investigated mental or behavioral substance-related disorders in the Swedish city of
Malmö to compare a spatial perspective to the usual multilevel analytic approach. When viewing space as a
continuum rather than as fragmented into areas disconnected from each other, measures of variation or
correlation provided a more accurate description of the spatial distribution, and measures of association better
identified areas with a high prevalence of disorders. Such a spatial perspective may be more appropriate than the
multilevel analytic approach in many neighborhood studies.
TABLE 3. Contextual effect of the mean income successively measured within neighborhoods and small-size areas,
estimated in geoadditive models adjusted for the individual-level covariates, Malmö, Sweden, 2001
                                          OR     95% CI
Neighborhood income*
  Third quartile                          1.08   0.78, 1.51
  Second quartile                         1.92   1.40, 2.63
  First quartile                          1.97   1.39, 2.79
Income of the 1500 closest individuals
  Third quartile                          1.12   0.85, 1.46
  Second quartile                         1.69   1.26, 2.26
  First quartile                          2.02   1.47, 2.77
Income of the 1000 closest individuals
  Third quartile                          1.06   0.80, 1.0
  Second quartile                         2.04   1.53, 2.72
  First quartile                          2.19   1.60, 2.98
Income of the 500 closest individuals
  Third quartile                          1.00   0.75, 1.33
  Second quartile                         1.84   1.39, 2.43
  First quartile                          2.29   1.70, 3.08
Income of the 200 closest individuals
  Third quartile                          0.97   0.72, 1.31
  Second quartile                         2.16   1.63, 2.85
  First quartile                          2.83   2.12, 3.80
Income of the 100 closest individuals
  Third quartile                          1.45   1.06, 2.00
  Second quartile                         2.73   2.02, 3.70
  First quartile                          4.12   3.01, 5.64
* The median number of inhabitants aged 25 years or over in a neighborhood was equal to 1484
Investigating the magnitude and shape of spatial variations of health outcomes
Regarding the spatial distribution of outcomes, it is epidemiologically relevant to obtain information on the
magnitude of neighborhood variations, which informs one on the need to include a contextual dimension in
public health programs. Moreover, we emphasized the interest of assessing whether neighborhoods close to each
other share a similar level of risk, which indicates whether public health efforts should be coordinated at a larger
scale than the one of the administrative neighborhoods.
A convenient way to use data on the exact residential locations of the individuals to gain information on the
spatial distribution of outcome was to fit a geoadditive model, with a nonparametric smooth term for the spatial
effect (19, 22). This approach allowed us to produce a smoothed map of risk independent of neighborhood
boundaries (20, 21), which was much more precise than the maps obtained using poor locational information at
the neighborhood level. This method can also be used to derive smoothed maps of risk adjusted for a given set of
factors. However, a drawback of this approach for epidemiologists is that it only provides visual information, but
no parametric effect that would allow one to make statistical inferences on the spatial distribution of the
outcome. We only obtained quantitative information on the magnitude of spatial variations, expressed on the
odds ratio scale with an interquartile odds ratio based on risk estimates at the 13730 individual locations.
Therefore, we considered parametric options such as the multilevel model (5) or the hierarchical
geostatistical model (29) to make inferences on the spatial distribution of the outcome. Since it is
computationally intractable to estimate a parametric spatial correlation structure for a large number of
locations, these analyses were based on the 100 neighborhood locations rather than on the 13730 individual-level
locations, which constitutes a considerable loss of information.
Using multilevel models (5, 6) in neighborhood studies is based on the assumption that all spatial correlation
can be reduced to within-neighborhood correlation. The multilevel analytic approach considers individuals’
neighborhood affiliation but neglects spatial relationships between neighborhoods, and implicitly assumes that
neighborhood variations are spatially unstructured, with neighborhoods of high risk and neighborhoods of low
risk completely distributed at random in space (16). Therefore, besides their dependence on the scale and zoning used,
measures of variation in multilevel models (such as the area-level variance) only provide partial insight into the
spatial distribution of health outcomes: they allow one to make statistical inferences on the magnitude but not on
the shape of neighborhood variations.
FIGURE 6. Contextual income at each individuals’ place of residence in an area in the center of Malmö, as
measured at the neighborhood level (top part) and in spatially adaptive areas including the 100 closest
inhabitants aged 25 years or over (bottom part). Measurements made in spatially adaptive areas allowed us to
identify particularly deprived locations located in non-deprived neighborhoods.
Contextual income measured within administrative neighborhoods
Contextual income measured within spatially adaptive areas
In order to obtain more complete parametric information on the spatial distribution of the outcome, we used
the hierarchical geostatistical model (30, 31). Using two sets of random effects, our specific model splits the
neighborhood variability into a spatially structured component and an unstructured component, and provides
information on the spatially structured pattern of variability (16, 29, 38). We found it more informative to use a
geostatistical formulation rather than a lattice formulation of the spatial correlation structure: rather than only
estimating a parameter for the correlation between neighborhoods with a common boundary, we expressed the
correlation between neighborhoods as a decreasing function of the spatial distance between them, which allows
one to estimate the spatial range of correlation (31).
First, the hierarchical geostatistical model has a heuristic interest. Disentangling the spatially structured
variability from other more chaotic sources of neighborhood variations, it may allow researchers to generate
hypotheses on causal mechanisms, since many contextual factors follow a strong spatial structure (16). We used
this approach when comparing the spatially structured variations of substance-related disorders plotted in figure
4 to the geographic distribution of contextual income reported in figure 1.
More importantly, the hierarchical geostatistical model allowed us to make inferences not only on the
magnitude of spatial variations but also on the shape of spatial variations, which provides useful indications for
public health planning. Information on the spatial range of correlation allowed us to confirm hypotheses based
on visual information that spatial variations in the prevalence of substance-related disorders occurred at a quite
large scale. Therefore, coordinating public health interventions between administrative neighborhoods close to
each other may be an efficient strategy.
Measuring contextual factors across continuous space around the individual place of residence
It may be a limitation to rely on administrative boundaries to define contextual factors. In our study, we
found much stronger associations between contextual deprivation and substance-related disorders when
measuring the contextual factor in local areas of much smaller size than the administrative neighborhoods. We
measured the contextual income within spatially adaptive areas (i.e., overlapping circles of variable width,
containing a fixed number of inhabitants, and centered on the exact place of residence of the individuals), which
are an adaptation to our context of the spatially adaptive filters used in disease mapping (33-35). This
approach seems theoretically appropriate, since we consider a populational factor related to the characteristics of
the inhabitants rather than to the features of the physical environment. Moreover, due to the uneven distribution
of individuals through the city, spatially adaptive areas have a technical advantage over fixed-width areas:
computing the contextual income within fixed-size circular areas of small radius would result in missing values
or unreliable measurements for individuals residing in sparsely populated areas; on the other hand, considering
fixed-width areas wide enough to avoid this problem of unreliable measures would prevent one from investigating
effects of the contextual income at a very local scale (35).
In our cross-sectional study, the causal relationships underlying the association between contextual deprivation and
substance-related disorders are likely to operate in both directions. On the one hand, deprivation may have a negative
impact on mental well-being and, on the other hand, selective migration processes may contribute to the
clustering of individuals with substance-related disorders in the most deprived places of the city. Despite this
uncertainty, our finding indicates that the hypothesis of a causal contextual effect of deprivation on the incidence
of substance-related disorders deserves to be investigated in a longitudinal study; furthermore, it shows
that interventions focused on individuals with substance-related disorders may be particularly useful in the hot
spots of contextual deprivation identified at a finer scale than that of the administrative neighborhoods.
Conclusion
Our spatial perspective indicated that spatial autocorrelation in the risk of substance-related
disorders operated at a larger scale than the administrative neighborhood scale. However, despite these
large-scale variations due to the spatial clustering of low socioeconomic status individuals in the center and north of
the city, we also found more local variations in the prevalence level that were attributable to differences in the
intensity of deprivation.
We are aware that multilevel models may be appropriate when the context is defined in a way that is not
strictly geographical (e.g., as workplaces or schools) (15) or when spatial correlation can be reduced to within-area correlation. Similarly, it is certainly adequate to measure contextual factors within administrative
boundaries when investigating effects operating at these scales (e.g., related to public policies). However, in
many neighborhood studies, both measures of variation and measures of association may yield more complete
information on the spatial distribution of health outcomes when viewing space in a more continuous way.
Appendix
Geoadditive logistic model. Based on the work of Simon Wood (19, 22), our geoadditive logistic model can
be defined as logit(pi) = β0 + Xiβ + t(xi, yi) where pi is the probability of having a substance-related disorder for
individual i, Xiβ refers to the strictly parametric part of the linear predictor, and t(xi, yi) is a two-dimensional
smooth function of the exact latitude and longitude (xi, yi) of the individuals. The two-dimensional smooth term
was defined using a thin plate regression spline. Because of the radially symmetric nature of the basis function
used, the thin plate regression spline is an isotropic smoother. One of its key advantages is to avoid the problem
of knot placement that arises with conventional regression splines. To avoid overfitting of the spatial term, the
model was estimated by a penalized maximum likelihood approach (41), with a smoothing parameter controlling
the tradeoff between the goodness of fit and the smoothness of the final surface of risk. Rather than arbitrarily
choosing the number of degrees of freedom of the smooth term, it was selected by the minimization of an
Unbiased Risk Estimator (UBRE minimization method). All geoadditive models were fitted to the data with the
“mgcv” R package (42).
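The isotropy of the smoother follows from the fact that a thin plate spline basis function depends only on the distance to its center. The sketch below illustrates this with the standard two-dimensional radial basis η(r) = r² log(r), with the constant factor omitted. It is only an illustration of radial symmetry, not a reimplementation of the reduced-rank thin plate regression spline fitted by mgcv, and the function names are hypothetical.

```python
import math

def tps_basis(r):
    """Two-dimensional thin plate spline radial basis, up to a constant factor."""
    return 0.0 if r == 0 else r * r * math.log(r)

def tps_basis_at(point, center):
    """Basis value at `point` for a basis function centered on `center`."""
    r = math.hypot(point[0] - center[0], point[1] - center[1])
    return tps_basis(r)
```

Two locations at the same distance from a center receive the same basis value whatever their direction, so the fitted surface treats east-west and north-south displacements identically.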
Multilevel logistic model. In order to model variations in the probability pij of having substance-related
disorders for individual i in neighborhood j, we fitted a multilevel logistic model to the data as
logit(pij) = β0 + Xijβ + uj, where Xij is the vector of explanatory variables, and uj is a normally distributed random
intercept of variance σu² (5, 6).
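For illustration, the linear predictor of the multilevel logistic model can be turned into a probability as sketched below. The function names and the list-based representation of β and Xij are assumptions made for the example.

```python
import math

def inverse_logit(eta):
    """Probability corresponding to a value on the logit scale."""
    return 1.0 / (1.0 + math.exp(-eta))

def predicted_probability(beta0, beta, x, u_j):
    """p_ij for an individual with covariates x in a neighborhood with intercept u_j."""
    eta = beta0 + sum(b * v for b, v in zip(beta, x)) + u_j
    return inverse_logit(eta)
```

Two individuals with the same covariates but residing in neighborhoods with different random intercepts uj thus have different predicted probabilities, and the variance σu² governs how far these intercepts spread around zero.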
In order to assess spatial autocorrelation in the neighborhood residuals, we computed the Moran’s I statistic
(36, 37):
\[
I = \frac{N \sum_{i=1}^{N} \sum_{j=1,\, j \neq i}^{N} w_{ij}\, u_i u_j}{S_0 \sum_{i=1}^{N} u_i^2}
\]
where N equals the number of neighborhoods (100), wij is a weight related to the spatial relation between
neighborhoods i and j (see below), ui and uj are the residuals for neighborhoods i and j, and S0 is the sum of the
weights wij:
\[
S_0 = \sum_{i=1}^{N} \sum_{j=1,\, j \neq i}^{N} w_{ij}
\]
We computed the Moran’s I separately for first-order neighbors (with wij equal to 1 for pairs of adjacent
neighborhoods, and to 0 otherwise), second-order neighbors (with wij only equal to 1 for second-order neighbors),
and third-order neighbors.
Multilevel models were estimated with a Markov chain Monte Carlo simulation (see below) (39). In this
Bayesian setting, we computed the Moran’s I for each set of sampled values of the neighborhood-level residuals
retained for final analysis. We report the median of the posterior distribution of the Moran’s I, and use the 2.5th
and 97.5th quantiles of the distribution to define a 95% Bayesian credible interval. The Moran’s I has a small
negative expectation when applied to regression residuals (37), and is significantly positive in case of spatial
autocorrelation.
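The Moran's I statistic defined above can be computed for one set of neighborhood residuals as sketched below. The function name and the nested-list weight matrix are assumptions. As in the formula, the residuals are used as they are, without re-centering.

```python
def morans_i(residuals, weights):
    """Moran's I for residuals u_i and weights w_ij, excluding the terms i == j."""
    n = len(residuals)
    cross = 0.0  # sum of w_ij * u_i * u_j over i != j
    s0 = 0.0     # sum of the weights w_ij over i != j
    for i in range(n):
        for j in range(n):
            if i != j:
                cross += weights[i][j] * residuals[i] * residuals[j]
                s0 += weights[i][j]
    return n * cross / (s0 * sum(u * u for u in residuals))
```

In the Bayesian setting described above, this function would be applied to each retained set of sampled residuals, and the resulting values summarized by their median and their 2.5th and 97.5th quantiles.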
Hierarchical geostatistical logistic model. We used a logistic model including independent neighborhood-level random effects uj, and neighborhood-level spatially correlated random effects sj (16, 29, 38). For an
individual i in the neighborhood j, the model was defined as logit(pij) = β0 + Xijβ + uj + sj. The random effects uj are
mutually independent and Gaussian, with mean zero and variance σu². Let S = (s1, s2, ..., s100) be the vector of
spatial effects for the 100 neighborhoods. The distribution of S may be expressed as: S ~ N(0, V), with Vkl
defined as a parametric function of distance dkl in meters between the centroids of neighborhoods k and l. In our
case, assuming an isotropic spatial process (in which the strength of spatial correlation does not depend on the
direction), Vkl was defined as Vkl = σs²ρkl with an exponential correlation function ρkl = exp(-φdkl) (30). The range
of correlation (beyond which correlation is below 5 percent) can be defined as 3/φ. The proportion of
neighborhood variance that is spatially structured is computed as σs² / (σu² + σs²).
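The quantities defined in this paragraph can be illustrated with a short sketch. The function names, centroid coordinates, and variance values below are hypothetical; only the value of φ is chosen so that the range matches the 3471 meters reported for the real data.

```python
import math


def exponential_covariance(centroids, sigma_s2, phi):
    """V[k][l] = sigma_s2 * exp(-phi * d_kl), with d_kl the distance between centroids."""
    return [
        [sigma_s2 * math.exp(-phi * math.hypot(xk - xl, yk - yl))
         for (xl, yl) in centroids]
        for (xk, yk) in centroids
    ]


def correlation_range(phi):
    """Distance 3/phi at which the correlation exp(-phi * d) equals exp(-3), about 0.05."""
    return 3.0 / phi


def spatial_fraction(sigma_u2, sigma_s2):
    """Proportion of neighborhood variance that is spatially structured."""
    return sigma_s2 / (sigma_u2 + sigma_s2)


phi = 3.0 / 3471.0                                          # range of 3471 meters
centroids = [(0.0, 0.0), (1000.0, 0.0), (0.0, 3471.0)]      # hypothetical, in meters
V = exponential_covariance(centroids, sigma_s2=0.8, phi=phi)
```

The last centroid lies exactly one range away from the first, so the corresponding correlation V[0][2] / σs² equals exp(-3), which is just under 5 percent.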
We examined whether the hierarchical geostatistical model was really able to disentangle spatially
structured variations from the neighborhood unstructured variability. In five successive experiments, we
randomly selected 10, 25, 50, 75, and 90 neighborhoods out of 100, and randomly assigned all individuals
together from each of these neighborhoods to one other selected neighborhood, while no changes were made for
the other non-selected neighborhoods. We therefore did not modify the multilevel structure of the data since the
same individuals were still grouped together within neighborhoods, but progressively disorganized the spatial
structure of the neighborhoods. Fitting a hierarchical geostatistical model to each of these datasets, we observed
that the proportion of neighborhood variations attributable to the spatially structured component [σs² / (σu² +
σs²)] decreased with increasing number of neighborhoods selected for random reassignment of inhabitants (table
4). However, spatially structured variations still constituted an important part of neighborhood variability when
the spatial structure of the data was almost completely disorganized (i.e., 59 percent when random reassignment of
inhabitants was performed between 90 neighborhoods). It is worth noting, though, that the spatial range (3/φ)
also considerably and regularly decreased with increasing disorganization of the neighborhood spatial structure
(table 4). In the model in which random reassignment of inhabitants was performed between 90 neighborhoods,
the spatial range of correlation was equal to 528 meters (vs. 3471 in the real data), indicating that the spatially
structured and unstructured components of variability were no longer intrinsically different. This experiment
therefore confirms a certain ability of the hierarchical geostatistical model to disentangle spatially structured
variations from unstructured neighborhood variations, and indicates that the proportion of spatially structured
variations σs² / (σu² + σs²) and the spatial range of correlation 3/φ need to be interpreted jointly.
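The reassignment step of this experiment can be sketched as follows. The text does not specify how the destination neighborhoods were drawn, so this sketch assumes a random cyclic permutation among the selected neighborhoods, which guarantees that each selected group of inhabitants moves to a different selected location. The function name is hypothetical.

```python
import random


def reassign_neighborhoods(n_neighborhoods, n_selected, seed=0):
    """Map each neighborhood's inhabitants to a location.

    Non-selected neighborhoods stay in place. Selected ones are moved as a block
    to another selected neighborhood (assumption: random cyclic permutation).
    """
    rng = random.Random(seed)
    selected = rng.sample(range(n_neighborhoods), n_selected)  # random order
    mapping = {j: j for j in range(n_neighborhoods)}
    for pos, j in enumerate(selected):
        mapping[j] = selected[(pos + 1) % n_selected]
    return mapping


# One mapping per experiment: 10, 25, 50, 75, and 90 neighborhoods out of 100.
experiments = {k: reassign_neighborhoods(100, k, seed=k) for k in (10, 25, 50, 75, 90)}
```

Because inhabitants always move together, the grouping of individuals within neighborhoods is unchanged; only the centroid attached to each group changes, which is what disorganizes the spatial structure.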
TABLE 4. Results of the hierarchical geostatistical models when randomly reassigning
all individuals together from one neighborhood to another for 10, 25, 50, 75, and 90
neighborhoods out of 100.

                                Proportion of spatially          Range of correlation
                                structured variations            3/φ (meters)
                                σs² / (σu² + σs²)
Reassignment was made for:      Median   95% credible interval   Median   95% credible interval
- 0 neighborhood (real data)    0.89     0.47, 0.99              3471     1157, 10 973
- 10 neighborhoods              0.88     0.39, 0.99              2488     732, 7721
- 25 neighborhoods              0.74     0.23, 0.99              927      351, 3037
- 50 neighborhoods              0.64     0.21, 0.99              564      341, 1751
- 75 neighborhoods              0.64     0.21, 0.99              544      341, 1459
- 90 neighborhoods              0.59     0.21, 0.99              528      340, 1525

Multilevel models and hierarchical geostatistical models were estimated in a Bayesian setting using
WinBUGS 1.4 (39). For all parameters, including variance parameters, we used noninformative uniform priors.
Using WinBUGS’s adaptive rejection sampler, we ran a single chain, with a burn-in period of 100,000 iterations.
After ensuring that the chain had converged, we retained every 10th iteration until a sample size of 10,000 had
been attained. For each parameter, we report the median of the distribution and provide a 95% credible interval.
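The sampling scheme implies a fixed total chain length, which the following sketch makes explicit. It assumes that retention starts with the 10th iteration after burn-in; the function name is hypothetical.

```python
def retained_iterations(burn_in, thin, n_keep):
    """Iteration numbers kept after burn-in when every `thin`-th draw is retained."""
    return [burn_in + thin * k for k in range(1, n_keep + 1)]


kept = retained_iterations(burn_in=100_000, thin=10, n_keep=10_000)
total_iterations = kept[-1]  # length of the chain needed to reach the sample size
```

Under these assumptions the chain runs for 200,000 iterations in total: 100,000 of burn-in followed by 100,000 iterations thinned to 10,000 retained draws.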
References
1. Kawachi I, Berkman LF. Neighborhoods and Health. New York: Oxford University Press, 2003.
2. Pickett KE, Pearl M. Multilevel analyses of neighbourhood socioeconomic context and health
outcomes: a critical review. J Epidemiol Community Health 2001;55:111-22.
3. Diez-Roux AV. Multilevel analysis in public health research. Annu Rev Public Health 2000;21:171-92.
4. Bingenheimer JB, Raudenbush SW. Statistical and substantive inferences in public health: issues in the
application of multilevel models. Annu Rev Public Health 2004;25:53-77.
5. Leyland AH, Goldstein H. Multilevel modelling of health statistics. Chichester, England: Wiley, 2001.
6. Snijders T, Bosker R. Multilevel Analysis. An introduction to basic and advanced multilevel modelling.
London, England: Sage Publications, 1999.
7. Merlo J. Multilevel analytical approaches in social epidemiology: measures of health variation
compared with traditional measures of association. J Epidemiol Community Health 2003;57:550-2.
8. Merlo J, Chaix B, Yang M, Lynch JW, Rastam L. A brief conceptual tutorial on multilevel analysis in
social epidemiology - linking the statistical concept of clustering to the idea of contextual phenomenon.
J Epidemiol Community Health 2004; in press.
9. Fotheringham AS, Wong DWS. The modifiable areal unit problem in multivariate statistical analysis.
Environ Plan A 1991;23:1025-1044.
10. Amrhein CG. Searching for the elusive aggregation effect: evidence from statistical simulations.
Environ Plan A 1995;27:105-119.
11. Holt D, Steel DG, Tranmer M. Area homogeneity and the modifiable areal unit problem. Geographical
Systems 1996;3:181-200.
12. Reijneveld SA, Verheij RA, de Bakker DH. The impact of area deprivation on differences in health:
does the choice of the geographical classification matter? J Epidemiol Community Health 2000;54:306-313.
13. O'Campo P. Invited commentary: Advancing theory and methods for multilevel models of residential
neighborhoods and health. Am J Epidemiol 2003;157:9-13.
14. Boyle MH, Willms JD. Place effects for areas defined by administrative boundaries. Am J Epidemiol
1999;149:577-85.
15. Mitchell R. Multilevel modeling might not be the answer. Environ Plan A 2001;33:1357-1360.
16. Borgoni R, Billari FC. Bayesian spatial analysis of demographic survey data: An application to
contraceptive use at first sexual intercourse. Demogr Res 2003;8:online journal.
17. Sabel CE, Boyle PJ, Loytonen M, et al. Spatial clustering of amyotrophic lateral sclerosis in Finland at
place of birth and place of death. Am J Epidemiol 2003;157:898-905.
18. Green C, Hoppa RD, Young TK, Blanchard JF. Geographic analysis of diabetes prevalence in an urban
area. Soc Sci Med 2003;57:551-60.
19. Wood SN. Thin plate regression splines. J R Stat Soc Ser B Stat Methodol 2003;65:95-114.
20. Cakmak S, Burnett RT, Jerrett M, et al. Spatial regression models for large-cohort studies linking
community air pollution and health. J Toxicol Environ Health A 2003;66:1811-23.
21. Burnett R, Ma R, Jerrett M, et al. The spatial association between community air pollution and
mortality: a new method of analyzing correlated geographic cohort data. Environ Health Perspect
2001;109 Suppl 3:375-80.
22. Wood SN. Stable and efficient multiple smoothing parameter estimation for generalized additive model.
J Am Stat Assoc 2004; in press.
23. Joines JD, Hertz-Picciotto I, Carey TS, Gesler W, Suchindran C. A spatial analysis of county-level
variation in hospitalization rates for low back problems in North Carolina. Soc Sci Med 2003;56:2541-53.
24. Leyland AH, Langford IH, Rasbash J, Goldstein H. Multivariate spatial models for event data. Stat Med
2000;19:2469-78.
25. Langford IH, Leyland AH, Rasbash J, Goldstein H. Multilevel modelling of the geographical
distributions of diseases. J R Stat Soc Ser C Appl Stat 1999;48:253-68.
26. Kleinschmidt I, Sharp BL, Clarke GP, Curtis B, Fraser C. Use of generalized linear mixed models in the
spatial analysis of small-area malaria incidence rates in Kwazulu Natal, South Africa. Am J Epidemiol
2001;153:1213-21.
27. Kleinschmidt I, Sharp B, Mueller I, Vounatsou P. Rise in malaria incidence rates in South Africa: a
small-area spatial analysis of variation in time trends. Am J Epidemiol 2002;155:257-64.
28. English PB, Kharrazi M, Davies S, Scalf R, Waller L, Neutra R. Changes in the spatial pattern of low
birth weight in a southern California county: the role of individual and neighborhood level factors. Soc
Sci Med 2003;56:2073-88.
29. Diggle P, Moyeed R, Rowlingson B, Thomson M. Childhood Malaria in the Gambia: A Case–Study in
Model–Based Geostatistics. J R Stat Soc Ser C Appl Stat 2002;51:493-506.
30. Gemperli A, Vounatsou P, Kleinschmidt I, Bagayoko M, Lengeler C, Smith T. Spatial patterns of infant
mortality in Mali: the effect of malaria endemicity. Am J Epidemiol 2004;159:64-72.
31. Banerjee S, Wall MM, Carlin BP. Frailty modeling for spatially correlated survival data, with
application to infant mortality in Minnesota. Biostatistics 2003;4:123-42.
32. Diez Roux AV, Merkin SS, Hannan P, Jacobs DR, Kiefe CI. Area characteristics, individual-level
socioeconomic indicators, and smoking in young adults: the coronary artery disease risk development in
young adults study. Am J Epidemiol 2003;157:315-26.
33. Bithell JF. An application of density estimation to geographical epidemiology. Stat Med 1990;9:691-701.
34. Talbot TO, Kulldorff M, Forand SP, Haley VB. Evaluation of spatial filters to create smoothed maps of
health data. Stat Med 2000;19:2399-408.
35. Tiwari C, Rushton G. Using spatially adaptive filters to map late stage colorectal cancer incidence in
Iowa. In: Fisher P, ed. Developments in spatial data handling. Germany: Springer-Verlag, 2005:665-676.
36. Walter SD. The analysis of regional patterns in health data. II. The power to detect environmental
effects. Am J Epidemiol 1992;136:742-59.
37. Congdon P. Applied Bayesian Modelling. Chichester, England: Wiley & Sons Ltd, 2003.
38. Banerjee S, Gelfand AE, Carlin BP. Hierarchical Modeling and Analysis for Spatial Data. Boca Raton,
FL, USA, 2003.
39. Smith AFM, Roberts GO. Bayesian computation via the Gibbs sampler and related Markov chain
Monte Carlo methods. J R Stat Soc Ser B 1993;55:3-23.
40. Spiegelhalter DJ, Best N, Carlin BP, Linde AVD. Bayesian measures of model complexity and fit. J R
Stat Soc Ser B 2002;64:583-639.
41. Wood SN. Modelling and smoothing parameter estimation with multiple quadratic penalties. J R Stat
Soc Ser B Stat Methodol 2000;62:413-428.
42. Spatial dependence: weighting schemes, statistics and models:
(http://cran.rproject.org/src/contrib/PACKAGES.html#spdep).
General conclusion and perspectives
The analysis of contextual effects on health has developed considerably over the
last ten years (4, 127) and is now regarded as one of the main
avenues for advancing the understanding of social disparities in health (5). The
first generations of contextual analyses confirmed that there is real value in taking
the effects of the residential context into account when studying a number of
health phenomena. However, contextual analysis as it is practiced today
has certainly reached its limits, and must now revise its analytical
frameworks to allow further progress in understanding the effects of context
on health (10). The limitations of the methodology used relate to: 1) the use of an
overly rigid distinction between individual and contextual effects (30); 2) a perhaps
too exclusive interest in a small number of contextual factors (often socioeconomic
ones), without sufficient effort having yet been made to identify
more precisely the mechanisms through which context influences health (38); 3) the use
of cross-sectional data, which, together with the problem of confounding bias,
definitively compromises the identification of causal effects of context (128); and finally 4) the use of
the multilevel approach, which perhaps does not allow inter-area variations in
health phenomena to be described and accounted for efficiently.
The period of enthusiasm for the topic of contextual effects on health, which
perhaps corresponds to the infancy of this field of analysis, must now give way to a
period that is both critical and constructive, during which the limitations mentioned
above will have to be discussed and resolved. Only by genuinely striving to
overcome these difficulties will it be possible to increase the value of contextual analyses
for public health. In the course of our thesis work, we became
aware of the need to move in this direction, and began to propose some
solutions. We were particularly interested in methods for
analyzing spatial variations and methods for measuring contextual factors.
The aim of the series of articles written at the request of the Journal of Epidemiology and
Community Health under the direction of Juan Merlo (13, 68) is to emphasize that modeling
inter-area variance itself provides information of interest for public health, beyond
the modeling of associations between contextual factors and health phenomena (69, 85).
Such a message had probably not been delivered as clearly or as completely
in the social epidemiology literature as in the series of articles that we
wrote collaboratively.
This distinction between modeling associations and modeling variance served
as the starting point for the rest of our work, since we seek to show that,
in many analyses, neither measures of association nor measures of
variation provide the maximum amount of information useful for public health when they
are implemented through the multilevel approach. In many cases,
both types of indicators make it possible to go further in describing and
understanding the spatial distribution of health phenomena when one relies on
a more continuous view of space. Through two studies, we therefore sought to
propose a spatial perspective for analyzing geographic variations in health
phenomena.
However, rather than providing definitive answers, such work
merely suggests that the contextual processes that influence individuals' health are
distributed in a complex way across space. During a postdoctoral fellowship in Sweden at the
Department of Community Medicine of Malmö University Hospital, and then on our
return to France in the Equipe Avenir « Déterminants Sociaux de la Santé et du Recours aux
Soins » of INSERM unit 444, we will seek to go further in studying the
spatial distributions of health phenomena and the contextual mechanisms that
underlie them.
Research perspectives
Our future research project will lead us in particular to examine
which analytical methods to implement and which data to use in order to advance the
understanding of contextual effects on health. The aim of these methodological advances
is to foster new approaches to intervention in public health.
Our research project will be organized around the following questions:
1) Which approaches should be used to describe the spatial distribution of
health phenomena?
In our first studies, we used various spatial regression methods
to obtain information on the spatial distribution of phenomena (79, 111, 112, 114, 122).
Other statistical approaches exist that we have not yet tested, and
some analytical methods are still being developed. We plan to undertake a
general comparison of the different analytical approaches. The main objective will be
to assess the relative value, from a public health standpoint, of the results they provide.
To determine whether the different methods capture the specific features of the
spatial distributions of phenomena, it will be useful to compare health
phenomena whose spatial distributions differ widely. The question will then be
whether the approaches considered provide sufficiently
specific information on each spatial distribution to be useful when setting
up public health programs tailored to each health problem.
2) Which approaches should be used to measure the contextual factors
that influence health?
In our future research, we will work toward a better
identification of the processes through which context is likely to influence health.
We will examine whether the way contextual effects operate can be clarified
by distinguishing between different types of factors, and in particular between different economic
and social dimensions of the residential context. However, an essential condition for progress in
this direction is to refine the methods used to measure contextual factors, both in
space and in time:
2A – Measurement in space: A second part of our work consisted in proposing
different methods for measuring contextual factors in a continuous space around
individuals' place of residence. Such an approach is entirely new and will need to be
refined. Sensitivity analyses will have to be conducted each time, to
examine whether the added complexity of the measurement approaches appears justified in
light of the results obtained. For example, is it justified to introduce weights into the
calculation of contextual indicators, weights that account for the fact that
individuals are affected primarily by the nearest locations and to a lesser
extent by more distant ones? Along these lines, one can even imagine
developing estimation procedures that select the most appropriate
weighting, which would indicate to what extent a nearby
location has a greater impact on individuals than a location twice
as far away.
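The idea of distance weighting can be illustrated with a minimal sketch. The thesis does not specify a weighting function at this point, so the exponential decay used below, the `decay` parameter, and all names and values are assumptions made purely for illustration.

```python
import math


def distance_weighted_indicator(home, locations, values, decay):
    """Weighted mean of a contextual variable around a place of residence.

    Weights decline with distance as exp(-decay * d); this kernel is an
    assumption, not a choice stated in the text.
    """
    weights = [math.exp(-decay * math.hypot(x - home[0], y - home[1]))
               for x, y in locations]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


# Hypothetical example: one location at the residence, one 1000 meters away.
home = (0.0, 0.0)
locations = [(0.0, 0.0), (1000.0, 0.0)]
values = [1.0, 0.0]
unweighted = distance_weighted_indicator(home, locations, values, decay=0.0)
# Decay chosen so that a location 1000 meters away counts half as much.
weighted = distance_weighted_indicator(home, locations, values,
                                       decay=math.log(2) / 1000.0)
```

With a decay of zero the indicator reduces to the ordinary unweighted mean, so comparing models fitted with different decay values is one way to frame the sensitivity analyses mentioned above.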
Our approaches to measuring contextual factors take into account a continuous space
around individuals' place of residence. This is an improvement over
conventional measures, which ignore the space beyond the administrative boundaries of
individuals' area of residence. However, physical and social boundaries exist in space, and
the assumption of perfect spatial continuity is certainly not adequate. In our
forthcoming studies of variations in health phenomena in the city of Malmö, Sweden, we
propose to consider the city's entire road network, which will allow us to
divide space into city blocks within which we will compute the contextual
indicators. We will examine whether these contextual measures, which reintroduce
discontinuities into a continuous space, account even better for
spatial variations in health phenomena.
2B – Measurement in time: An important limitation of the literature is that associations
between contextual factors and health variables have often been identified from
cross-sectional data (128). In our subsequent work, having access to both French and
Swedish longitudinal data, we will study the impact that
contextual factors may have on the subsequent incidence of health problems. From
this perspective, we will examine whether the effects of the residential environment operate
cumulatively over time: does prolonged exposure to a given contextual factor
produce a greater effect on individuals than shorter exposure? To
this end, knowing individuals' residential histories, it will be necessary to
take into account not only the characteristics of their successive places of residence, but
also changes in the characteristics of those places over
time. The aim will be to determine whether this cumulative measure of exposure to context
accounts for variations in health phenomena better than the conventional approach, which
captures exposure at a single point in time.
3) To what extent do the lower completeness and lower spatial precision of
some of the data sources used lead to a loss of information
relevant to public health?
In the coming years, the Equipe Avenir on « Déterminants Sociaux de la Santé et
du Recours aux Soins » will strengthen its cooperation with the Department of
Community Medicine of Malmö University Hospital, through my
collaboration with Juan Merlo. For our future work, we will therefore have access to Swedish
data, characterized by complete coverage of the population, very high
spatial precision, and a longitudinal dimension. Regarding France, we
intend in particular to work with data from the Echantillon Démographique
Permanent of INSEE, which, rich as they are, do not reach comparable levels
of completeness and spatial precision. In comparing various methods
for analyzing spatial variations and measuring contextual factors, we will examine in
each case whether the lower precision of the French data results in a loss of information
that would be useful from a public health standpoint.
By pursuing these research objectives, our future work should help to
overcome limitations inherent in contextual analysis as it is practiced today in
social epidemiology, and thereby affirm the importance of its role in public
health.
List of publications
PUBLICATIONS IN PEER-REVIEWED JOURNALS
Chaix B, Bobashev G, Merlo J, Chauvin P. Re: “Detecting patterns of occupational illness
clustering with alternating logistic regressions applied to longitudinal data”. Am J
Epidemiol 2004;160(5):505-506 (letter).
Merlo J, Chaix B, Yang M, Lynch J, Råstam L. A brief conceptual tutorial on multilevel
analysis in social epidemiology: linking the statistical concept of clustering to the idea of
contextual phenomenon. J Epidemiol Community Health 2004, in press.
Merlo J, Yang M, Chaix B, Lynch J, Råstam L. A brief conceptual tutorial on multilevel
analysis in social epidemiology: investigating contextual phenomena in different groups
of individuals. J Epidemiol Community Health 2004, in press.
Chaix B, Boëlle PY, Guilbert P, Chauvin P. Area level determinants of specialty care
utilisation in France: a multilevel analysis. Public Health 2004, in press.
Chaix B, Veugelers PJ, Boëlle P-Y, Chauvin P. Access to general practitioner services: the
disabled elderly lag behind in underserved areas. Eur J Public Health 2004, in press.
Chaix B, Guilbert P, Chauvin P. A multilevel analysis of tobacco use and tobacco
consumption levels in France: are there any combination risk groups? Eur J Public Health
2004;14:186-190.
Chaix B, Chauvin P. Tobacco and alcohol consumption, sedentary lifestyle and
overweightness in France: a multilevel analysis of individual and area-level determinants.
Eur J Epidemiol 2003;18(6):531-8.
Chaix B, Chauvin P. L’apport des modèles multiniveau dans l’analyse contextuelle en
épidémiologie sociale : une revue de littérature. Rev Epidemiol Santé Publ 2002;50:489-499.
BOOK CHAPTERS
Chaix B. L’apport des modèles multiniveaux en analyse contextuelle : intérêt et limites. In:
La mesure des évolutions dans les enquêtes de santé. Paris: Editions INPES, 2004:
243p.
Chaix B, Merlo J, Gaignard J, Lithman T, Boalt A, Chauvin P. The social and spatial
distribution of mental and behavioral disorders related to psychoactive substance use in
the city of Malmö, Sweden, 2001. In: Colombus F, ed. Focus on Lifestyle and Health
Research. New York, USA: Nova Science Publishers, in press.
CONFERENCE PRESENTATIONS
Chaix B, Merlo J, Chauvin P. Investigating place effects on health: a spatial approach vs. a
conventional contextual approach. 132nd Annual Meeting of the American Public Health
Association (APHA), Washington, 6-10 November 2004, abstract book: no. 78941.
Chaix B, Merlo J. Using measures of clustering in logistic regression to investigate contextual
effects. 132nd Annual Meeting of the American Public Health Association (APHA),
Washington, 6-10 November 2004, abstract book: no. 78944.
Chaix B, Chauvin P, Merlo J. Using measures of clustering in logistic regression to
investigate contextual effects – an example on healthcare utilisation in Sweden. 12th
European Public Health Conference, Oslo, 7-9 October 2004, Eur J Public Health,
2004;14(Suppl):p29.
Chaix B, Chauvin P, Merlo J. Using measures of clustering in logistic regression to
investigate contextual effects – an example on healthcare utilisation in Sweden. Congress
of the European Epidemiology Federation (IEA), Porto, 8-11 September 2004, J
Epidemiol Community Health, 2004;58(Suppl 1):A14.
Chaix B, Merlo J, Chauvin P. Investigating place effects on health: a spatial approach vs. the
multilevel approach. Congress of the European Epidemiology Federation (IEA), Porto,
8-11 September 2004, J Epidemiol Community Health, 2004;58(Suppl 1):A14.
Chaix B, Merlo J, Diez-Roux AV, Chauvin P. Comparaison d’une approche spatiale à
l’approche multiniveau dans l’analyse des effets du contexte sur la santé : un exemple sur
les modes d’utilisation des soins en France. Congrès de l’Association Des
Epidémiologistes de Langue Française, Bordeaux, 15-17 September 2004, Rev Epidemiol
Santé Publ 2004;52:1S124 (poster).
Chaix B, Merlo J, Chauvin P. Investigating place effects on health: a spatial approach vs. the
multilevel approach. International Conference on Statistics in Health Sciences, Nantes,
France, 23-25 June 2004, abstract book p.279.
Chaix B, Merlo J, Diez-Roux AV, Chauvin P. Mesure et explication des variations
géographiques des modes de recours aux soins : comparaison de l’approche multiniveau
et de l’approche spatiale. XXVIIème Journées des économistes français de la santé, Paris,
17-18 June 2004.
Chaix B, Chauvin P. People living with sick co-residents are at increased risks of
underconsultation. Congress of the European Epidemiology Federation (IEA), Toledo,
Spain, 1-4 October 2003, Gaceta Sanitaria 2003;17(Supl. 2): abstract no. 483.
Chaix B, Veugelers P, Boëlle PY, Chauvin P. Access to general practitioner services: disabled
elderly individuals lag behind in underserved areas. Congress of the European
Epidemiology Federation (IEA), Toledo, Spain, 1-4 October 2003, Gaceta Sanitaria
2003;17(Supl. 2): abstract no. 204.
Chaix B, Chauvin P. Tobacco and alcohol consumption, and overweightness in France: a
multilevel analysis of individual and area-level determinants. Congress of the European
Epidemiology Federation (IEA), Toledo, Spain, 1-4 October 2003, Gaceta Sanitaria
2003;17(Supl. 2): abstract no. 484.
Chaix B, Chauvin P. Mesure du degré de similitude des comportements de recours aux
médecins spécialistes au sein du ménage : une utilisation des modèles multiniveau et
ALR. Congrès Biométrie et Epidémiologie, Lille, 15-16 September 2003, proceedings p.70.
Chaix B, Chauvin P. Tobacco and alcohol consumption in France: a comparative analysis of
risk of consumption and level of consumption. 16th World Congress of Epidemiology,
Montreal, Canada, 18-22 August 2002, abstract book: MP152 (poster).
INVITED PRESENTATIONS
Chaix B, Merlo J, Chauvin P. Mesure de la tendance des phénomènes à survenir en grappe à
partir de la régression logistique en épidémiologie sociale : un exemple sur l’utilisation
des soins en Suède. Doctoral course « Modélisation des observations corrélées », Ecole
doctorale « Epidémiologie, Sciences Sociales, et Santé Publique », Villejuif, 17-18 May
2004.
Chaix B, Merlo J, Diez-Roux AV, Chauvin P. Comparison of a spatial approach with the
multilevel approach for investigating place effects on health: the example of healthcare
utilization in France. International workshop on multilevel models in public health
research, Paris, Credes, 1 May 2004.
Chaix B. Comparison of a spatial approach with the multilevel approach for investigating
place effects on health: an example on healthcare utilization in France. Seminar at the
Department of Community Medicine, Malmö University Hospital, Malmö, Sweden, 17
February 2004.
Chaix B. L’analyse des variations géographiques des modes de recours aux soins :
comparaison de l’approche contextuelle et de l’approche spatiale. Ateliers de l’INED,
Paris, 16 December 2003.
Chaix B. Stratégie d’utilisation des modèles multiniveaux en épidémiologie sociale.
Seminar of IFR 69 (INSERM) « Interface Biostatistique / Epidémiologie », Villejuif,
26 November 2002.
Chaix B. L’apport des modèles multiniveaux dans l’analyse contextuelle en épidémiologie
sociale. Seminar of the Société Française de Statistiques, Paris, 24 October 2002.
Bibiographie
1
Diez-Roux AV. Bringing context back into epidemiology: variables and fallacies in
multilevel analysis. Am J Public Health 1998;88:216-22.
2
Duncan C, Jones K, Moon G. Health-related behaviour in context: a multilevel modelling
approach. Soc Sci Med 1996;42:817-30.
3
Diez-Roux AV. Multilevel analysis in public health research. Annu Rev Public Health
2000;21:171-92.
4
Pickett KE, Pearl M. Multilevel analyses of neighbourhood socioeconomic context and
health outcomes: a critical review. J Epidemiol Community Health 2001;55:111-22.
5
Kawachi I, Berkman LF. Neighborhoods and Health. New York: Oxford University
Press, 2003.
6
Mooij T. Pupil-class determinants of aggressive and victim behaviour in pupils. Br J Educ
Psychol 1998;68:373-85.
7
Palmer RF, Graham JW, White EL, et al. Applying multilevel analytic strategies in
adolescent substance use prevention research. Prev Med 1998;27:328-36.
8
Johnson RA, Hoffmann JP. Adolescent cigarette smoking in U.S. racial/ethnic subgroups:
findings from the National Education Longitudinal Study. J Health Soc Behav 2000;41:392407.
9
Kivimaki M, Vahtera J, Pentti J, et al. Factors underlying the effect of organisational
downsizing on health of employees: longitudinal cohort study. BMJ 2000;320:971-5.
10
O'Campo P. Invited commentary: Advancing theory and methods for multilevel models
of residential neighborhoods and health. Am J Epidemiol 2003;157:9-13.
11
Merlo J, Östergren PO, Hagberg O, et al. Diastolic blood pressure and area of residence:
multilevel versus ecological analysis of social inequity. J Epidemiol Community Health
2001;55:791-8.
12
Merlo J, Lynch JW, Yang M, et al. Effect of neighborhood social participation on
individual use of hormone replacement therapy and antihypertensive medication: a multilevel
analysis. Am J Epidemiol 2003;157:774-83.
13
Merlo J, Chaix B, Yang M, et al. A brief conceptual tutorial on multilevel analysis in
social epidemiology - linking the statistical concept of clustering to the idea of contextual
phenomenon. J Epidemiol Community Health 2004; in press.
49
14
Fotheringham AS, Charlton ME, Brunsdon C. Spatial variations in school performance:
a local analysis using geographically weighted regression. Geographical & Environmental
Modelling 2001;5:43-66.
15
Diez-Roux AV. Investigating neighborhood and area effects on health. Am J Public
Health 2001;91:1783-9.
16
Green C, Hoppa RD, Young TK, et al. Geographic analysis of diabetes prevalence in an
urban area. Soc Sci Med 2003;57:551-60.
17
Pikhart H, Prikazsky V, Bobak M, et al. Association between ambient air concentrations
of nitrogen dioxide and respiratory symptoms in children in Prague, Czech Republic.
Preliminary results from the Czech part of the SAVIAH Study. Small Area Variation in Air
Pollution and Health. Cent Eur J Public Health 1997;5:82-5.
18
Mohr CD, Armeli S, Tennen H, et al. Daily interpersonal experiences, context, and
alcohol consumption: crying in your beer and toasting good times. J Pers Soc Psychol
2001;80:489-500.
19
Von Korff M, Koepsell T, Curry S, et al. Multi-level analysis in epidemiologic research
on health behaviors and outcomes. Am J Epidemiol 1992;135:1077-82.
20
Fiscella K, Franks P. Poverty or income inequality as predictor of mortality: longitudinal
cohort study. BMJ 1997;314:1724-7.
21
Scribner RA, Cohen DA, Fisher W. Evidence of a structural effect for alcohol outlet
density: a multilevel analysis. Alcohol Clin Exp Res 2000;24:188-95.
22
O'Campo P, Rao RP, Gielen AC, et al. Injury-producing events among children in
low-income communities: the role of community characteristics. J Urban Health 2000;77:34-49.
23
Pampalon R, Duncan C, Subramanian SV, et al. Geographies of health perception in
Quebec: a multilevel perspective. Soc Sci Med 1999;48:1483-90.
24
Reijneveld SA. The impact of individual and area characteristics on urban
socioeconomic differences in health and smoking. Int J Epidemiol 1998;27:33-40.
25
Reijneveld SA, Schene AH. Higher prevalence of mental disorders in socioeconomically
deprived urban areas in The Netherlands: community or personal disadvantage? J Epidemiol
Community Health 1998;52:2-7.
26
Finch BK, Vega WA, Kolody B. Substance use during pregnancy in the state of
California, USA. Soc Sci Med 2001;52:571-83.
27
Diez-Roux AV, Nieto FJ, Muntaner C, et al. Neighborhood environments and coronary
heart disease: a multilevel analysis. Am J Epidemiol 1997;146:48-63.
28
Driessen G, Gunther N, Van Os J. Shared social environment and psychiatric disorder: a
multilevel analysis of individual and ecological effects. Soc Psychiatry Psychiatr Epidemiol
1998;33:606-12.
29
Diez Roux AV. Estimating neighborhood health effects: the challenges of causal
inference in a complex world. Soc Sci Med 2004;58:1953-60.
30
Oakes JM. The (mis)estimation of neighborhood effects: causal inference for a
practicable social epidemiology. Soc Sci Med 2004;58:1929-52.
31
Shouls S, Congdon P, Curtis S. Modelling inequality in reported long term illness in the
UK: combining individual and area characteristics. J Epidemiol Community Health
1996;50:366-76.
32
Humphreys K, Carr-Hill R. Area variations in health outcomes: artefact or ecology. Int J
Epidemiol 1991;20:251-8.
33
Haynes R, Bentham G, Lovett A, et al. Effect of labour market conditions on reporting
of limiting long-term illness and permanent sickness in England and Wales. J Epidemiol
Community Health 1997;51:283-8.
34
Merlo J, Östergren PO, Broms K, et al. Survival after initial hospitalisation for heart
failure: a multilevel analysis of patients in Swedish acute care hospitals. J Epidemiol
Community Health 2001;55:323-9.
35
Yen IH, Kaplan GA. Neighborhood social environment and risk of death: multilevel
evidence from the Alameda County Study. Am J Epidemiol 1999;149:898-907.
36
Blakely TA, Lochner K, Kawachi I. Metropolitan area income inequality and self-rated
health - a multi-level study. Soc Sci Med 2002;54:65-77.
37
Diez-Roux AV, Link BG, Northridge ME. A multilevel analysis of income inequality
and cardiovascular disease risk factors. Soc Sci Med 2000;50:673-87.
38
Morenoff JD. Neighborhood mechanisms and the spatial dynamics of birth weight. Am J
Sociol 2003;108:976-1017.
39
O'Campo P, Xue X, Wang MC, et al. Neighborhood risk factors for low birthweight in
Baltimore: a multilevel analysis. Am J Public Health 1997;87:1113-8.
40
Macintyre S, Ellaway A, Cummins S. Place effects on health: how can we conceptualise,
operationalise and measure them? Soc Sci Med 2002;55:125-39.
41
Kalff AC, Kroes M, Vles JS, et al. Neighbourhood level and individual level SES effects
on child problem behaviour: a multilevel analysis. J Epidemiol Community Health
2001;55:246-50.
42
Reading R, Langford IH, Haynes R, et al. Accidents to preschool children: comparing
family and neighbourhood risk factors. Soc Sci Med 1999;48:321-30.
43
Duncan C, Jones K, Moon G. Smoking and deprivation: are there neighbourhood
effects? Soc Sci Med 1999;48:497-505.
44
Kleinschmidt I, Hills M, Elliott P. Smoking behaviour can be predicted by
neighbourhood deprivation measures. J Epidemiol Community Health 1995;49 Suppl 2:S72-S7.
45
Sundquist J, Malmstrom M, Johansson SE. Cardiovascular risk factors and the
neighbourhood environment: a multilevel analysis. Int J Epidemiol 1999;28:841-5.
46
Karvonen S, Rimpela A. Socio-regional context as a determinant of adolescents' health
behaviour in Finland. Soc Sci Med 1996;43:1467-74.
47
Tuinstra J, Groothoff JW, van den Heuvel WJ, et al. Socio-economic differences in
health risk behavior in adolescence: do they exist? Soc Sci Med 1998;47:67-74.
48
Karvonen S, Rimpela AH. Urban small area variation in adolescents' health behaviour.
Soc Sci Med 1997;45:1089-98.
49
Huston SL, Evenson KR, Bors P, et al. Neighborhood environment, access to places for
activity, and leisure-time physical activity in a diverse North Carolina population. Am J
Health Promot 2003;18:58-69.
50
Snijders T, Bosker R. Multilevel Analysis. An introduction to basic and advanced
multilevel modelling. London, England: Sage Publications, 1999.
51
Goldstein H, Browne W, Rasbash J. Multilevel modelling of medical data. Stat Med
2002;21:3291-315.
52
Goldstein H. Multilevel Statistical Models. 2nd ed. London: Edward Arnold, 1995.
53
Leyland AH, Goldstein H. Multilevel modelling of health statistics. Chichester, England:
Wiley, 2001.
54
Bobashev GV, Anthony JC. Clusters of marijuana use in the United States. Am J
Epidemiol 1998;148:1168-74.
55
Bobashev GV, Anthony JC. Use of alternating logistic regression in studies of drug-use
clustering. Subst Use Misuse 2000;35:1051-73.
56
Preisser JS, Arcury TA, Quandt SA. Detecting patterns of occupational illness clustering
with alternating logistic regressions applied to longitudinal data. Am J Epidemiol
2003;158:495-501.
57
Petronis KR, Anthony JC. A different kind of contextual effect: geographical clustering
of cocaine incidence in the USA. J Epidemiol Community Health 2003;57:893-900.
58
Searle SR, Casella G, McCulloch CE. Variance components. New York: Wiley, 1992.
59
Goldstein H. Multilevel mixed linear model analysis using iterative generalized least
squares. Biometrika 1986;73:43-56.
60
Raudenbush SW, Bryk AS. A hierarchical model for studying school effects. Sociology
of Education 1986;59:1-17.
61
Duncan C, Jones K, Moon G. Psychiatric morbidity: a multilevel approach to regional
variations in the UK. J Epidemiol Community Health 1995;49:290-5.
62
O'Campo P, Gielen AC, Faden RR, et al. Violence by male partners against women
during the childbearing year: a contextual analysis. Am J Public Health 1995;85:1092-7.
63
Carr-Hill RA, Rice N, Roland M. Socioeconomic determinants of rates of consultation in
general practice based on fourth national morbidity survey of general practices. BMJ
1996;312:1008-12.
64
Diez Roux AV, Merkin SS, Hannan P, et al. Area characteristics, individual-level
socioeconomic indicators, and smoking in young adults: the coronary artery disease risk
development in young adults study. Am J Epidemiol 2003;157:315-26.
65
Kennedy BP, Kawachi I, Glass R, et al. Income distribution, socioeconomic status, and
self rated health in the United States: multilevel analysis. BMJ 1998;317:917-21.
66
Chaix B, Chauvin P. L'apport des modèles multiniveau dans l'analyse contextuelle en
épidémiologie sociale : une revue de la littérature. Rev Epidemiol Sante Publique
2002;50:489-99.
67
Wang J, Siegal HA, Falck RS, et al. Needle transfer among injection drug users: a
multilevel analysis. Am J Drug Alcohol Abuse 1998;24:225-37.
68
Merlo J, Yang M, Chaix B, et al. A brief conceptual tutorial on multilevel analysis in
social epidemiology - investigating contextual phenomena in different groups of individuals. J
Epidemiol Community Health 2004; in press.
69
Merlo J. Multilevel analytical approaches in social epidemiology: measures of health
variation compared with traditional measures of association. J Epidemiol Community Health
2003;57:550-2.
70
Fotheringham AS, Wong DWS. The modifiable areal unit problem in multivariate
statistical analysis. Environ Plan A 1991;23:1025-44.
71
Amrhein CG. Searching for the elusive aggregation effect: evidence from statistical
simulations. Environ Plan A 1995;27:105-19.
72
Martin D. An assessment of surface and zonal models of population. Int J Geographical
Information Systems 1996;10:973-89.
73
Holt D, Steel DG, Tranmer M. Area homogeneity and the modifiable areal unit problem.
Geographical Systems 1996;3:181-200.
74
Mitchell R. Multilevel modeling might not be the answer. Environ Plan A 2001;33:1357-60.
75
Kleinschmidt I, Sharp BL, Clarke GP, et al. Use of generalized linear mixed models in
the spatial analysis of small-area malaria incidence rates in Kwazulu Natal, South Africa. Am
J Epidemiol 2001;153:1213-21.
76
English PB, Kharrazi M, Davies S, et al. Changes in the spatial pattern of low birth
weight in a southern California county: the role of individual and neighborhood level factors.
Soc Sci Med 2003;56:2073-88.
77
Kleinschmidt I, Sharp B, Mueller I, et al. Rise in malaria incidence rates in South Africa:
a small-area spatial analysis of variation in time trends. Am J Epidemiol 2002;155:257-64.
78
Joines JD, Hertz-Picciotto I, Carey TS, et al. A spatial analysis of county-level variation
in hospitalization rates for low back problems in North Carolina. Soc Sci Med 2003;56:2541-53.
79
Werneck GL, Maguire JH. Spatial modeling using mixed models: an ecologic study of
visceral leishmaniasis in Teresina, Piaui State, Brazil. Cad Saude Publica 2002;18:633-7.
80
Leyland AH, Langford IH, Rasbash J, et al. Multivariate spatial models for event data.
Stat Med 2000;19:2469-78.
81
Langford IH, Leyland AH, Rasbash J, et al. Multilevel modelling of the geographical
distributions of diseases. J R Stat Soc Ser C Appl Stat 1999;48:253-68.
82
Treno AJ, Gruenewald PJ, Johnson FW. Alcohol availability and injury: the role of local
outlet densities. Alcohol Clin Exp Res 2001;25:1467-71.
83
Liu GC, Cunningham C, Downs SM, et al. A spatial analysis of obesogenic
environments for children. Proc AMIA Symp 2002:459-63.
84
Ali M, Emch M, Tofail F, et al. Implications of health care provision on acute lower
respiratory infection mortality in Bangladeshi children. Soc Sci Med 2001;52:267-77.
85
Chaix B, Bobashev GV, Merlo J, et al. Re: "Detecting patterns of occupational illness
clustering with alternating logistic regressions applied to longitudinal data". Am J Epidemiol
(in press).
86
Chaix B, Guilbert P, Chauvin P. A multilevel analysis of tobacco use and tobacco
consumption levels in France: are there any combination risk groups? Eur J Public Health
2003; in press.
87
Chaix B, Chauvin P. Tobacco and alcohol consumption, sedentary lifestyle and
overweightness in France: a multilevel analysis of individual and area-level determinants. Eur
J Epidemiol 2003;18:531-8.
88
Chaix B, Boëlle PY, Guilbert P, et al. Area level determinants of specialty care
utilisation in France: a multilevel analysis. Public Health 2004; in press.
89
Chaix B, Veugelers PJ, Boëlle PY, et al. Access to general practitioner services: the
disabled elderly lag behind in underserved areas. Eur J Public Health in press.
90
Guilbert P, Baudier F, Gautier A. Baromètre Santé 2000. Résultats. Vanves: Editions
CFES, 2001.
91
Guilbert P, Baudier F, Gautier A, et al. Baromètre Santé 2000. Méthodes. Vanves:
Editions CFES, 2001.
92
Auvray L, Dumesnil S, Le Fur P. Santé, soins et protection sociale en 2000 [Health,
healthcare and insurance in 2000] (in French). Paris, France: CREDES, 2001.
93
Hosseini M, Carpenter RG, Mohammad K. Growth of children in Iran. Ann Hum Biol
1998;25:249-61.
94
Rice N, Carr-Hill R, Dixon P, et al. The influence of households on drinking behaviour:
a multilevel analysis. Soc Sci Med 1998;46:971-9.
95
Merlo J, Asplund K, Lynch JW, et al. Population effects on individual systolic blood
pressure - a multilevel analysis of WHO MONICA project. Am J Epidemiol 2004;159:1168-79.
96
Goldstein H, Browne W, Rasbash J. Partitioning variation in generalised linear
multilevel models. Understanding Statistics 2002;1:223-32.
97
Larsen K, Petersen JH, Budtz-Jorgensen E, et al. Interpreting parameters in the logistic
regression model with random effects. Biometrics 2000;56:909-14.
98
Larsen K, Merlo J. Appropriate assessment of neighborhood effects on individual health
- integrating random and fixed effects in multilevel logistic regression. Am J Epidemiol in
press.
99
Katz J, Carey VJ, Zeger SL, et al. Estimation of design effects and diarrhea clustering
within households and villages. Am J Epidemiol 1993;138:994-1006.
100
Katz J, Zeger SL, West KP, Jr., et al. Clustering of xerophthalmia within households
and villages. Int J Epidemiol 1993;22:709-15.
101
Rosenthal TC, Fox C. Access to health care for the rural elderly. JAMA 2000;284:2034-6.
102
Lambert D, Agger MS. Access of rural AFDC Medicaid beneficiaries to mental health
services. Health Care Financ Rev 1995;17:133-45.
103
Halldorsson M, Kunst AE, Kohler L, et al. Socioeconomic differences in children's use
of physician services in the Nordic countries. J Epidemiol Community Health 2002;56:200-4.
104
Casey MM, Thiede Call K, Klingner JM. Are rural residents less likely to obtain
recommended preventive healthcare services? Am J Prev Med 2001;21:182-8.
105
Shannon GW, Bashshur RL, Lovett JE. Distance and the use of mental health services.
Milbank Q 1986;64:302-30.
106
Duncan TE, Duncan SC, Hops H. Latent variable modeling of longitudinal and
multilevel alcohol use data. J Stud Alcohol 1998;59:399-408.
107
Tak YR, McCubbin M. Family stress, perceived social support and coping following
the diagnosis of a child's congenital heart disease. J Adv Nurs 2002;39:190-8.
108
Pearlin LI, Mullan JT, Semple SJ, et al. Caregiving and the stress process: an overview
of concepts and their measures. Gerontologist 1990;30:583-94.
109
Weitzner MA, Haley WE, Chen H. The family caregiver of the older cancer patient.
Hematol Oncol Clin North Am 2000;14:269-81.
110
Covinsky KE, Goldman L, Cook EF, et al. The impact of serious illness on patients'
families. SUPPORT Investigators. Study to Understand Prognoses and Preferences for
Outcomes and Risks of Treatment. JAMA 1994;272:1839-44.
111
Gemperli A, Vounatsou P, Kleinschmidt I, et al. Spatial patterns of infant mortality in
Mali: the effect of malaria endemicity. Am J Epidemiol 2004;159:64-72.
112
Diggle P, Moyeed R, Rowlingson B, et al. Childhood malaria in the Gambia: a case-study
in model-based geostatistics. J R Stat Soc Ser C Appl Stat 2002;51:493-506.
113
Banerjee S, Gelfand AE, Carlin BP. Hierarchical Modeling and Analysis for Spatial
Data. Boca Raton, FL, USA: Chapman & Hall/CRC, 2003.
114
Banerjee S, Wall MM, Carlin BP. Frailty modeling for spatially correlated survival
data, with application to infant mortality in Minnesota. Biostatistics 2003;4:123-42.
115
Rushton G, Lolonis P. Exploratory spatial analysis of birth defect rates in an urban
population. Stat Med 1996;15:717-26.
116
Carrat F, Valleron AJ. Epidemiologic mapping using the "kriging" method: application
to an influenza-like illness epidemic in France. Am J Epidemiol 1992;135:1293-300.
117
Burton P, Gurrin L, Sly P. Extending the simple linear regression model to account for
correlated responses: an introduction to generalized estimating equations and multi-level
mixed modelling. Stat Med 1998;17:1261-91.
118
Wood SN. Thin plate regression splines. J R Stat Soc Ser B Stat Methodol
2003;65:95-114.
119
Burnett R, Ma R, Jerrett M, et al. The spatial association between community air
pollution and mortality: a new method of analyzing correlated geographic cohort data.
Environ Health Perspect 2001;109 Suppl 3:375-80.
120
Cakmak S, Burnett RT, Jerrett M, et al. Spatial regression models for large-cohort
studies linking community air pollution and health. J Toxicol Environ Health A
2003;66:1811-23.
121
Boyle MH, Willms JD. Place effects for areas defined by administrative boundaries. Am
J Epidemiol 1999;149:577-85.
122
Littell RC, Milliken GA, Stroup WW, et al. SAS System for Mixed Models. Cary, North
Carolina, USA: SAS Institute, 1996.
123
Borgoni R, Billari FC. Bayesian spatial analysis of demographic survey data: An
application to contraceptive use at first sexual intercourse. Demogr Res 2003;8:online journal.
124
Zonage d'Etudes [Geographic subdivisions of the territory] (in French). Paris, France:
Institut National de la Statistique et des Etudes Economiques
(http://www.insee.fr/fr/nom_def_met/nomenclatures/zonages_etudes/index.htm).
125
Lucas-Gabrielli V, Tonnellier F, Vigneron E. Une typologie des paysages
socio-sanitaires en France. Paris: CREDES, 1998.
126
Rushton G. Public health, GIS, and spatial analytic tools. Annu Rev Public Health
2003;24:43-56.
127
Bingenheimer JB, Raudenbush SW. Statistical and substantive inferences in public
health: issues in the application of multilevel models. Annu Rev Public Health 2004;25:53-77.
128
Curtis S, Southall H, Congdon P, et al. Area effects on health variation over the
life-course: analysis of the longitudinal study sample in England using new data on area of
residence in childhood. Soc Sci Med 2004;58:57-74.