The Relative Impact of Interviewer Effects and Sample Design Effects... Author(s): Colm O'Muircheartaigh and Pamela Campanelli
The Relative Impact of Interviewer Effects and Sample Design Effects on Survey Precision
Author(s): Colm O'Muircheartaigh and Pamela Campanelli
Source: Journal of the Royal Statistical Society. Series A (Statistics in Society), Vol. 161, No. 1 (1998), pp. 63-77
Published by: Wiley for the Royal Statistical Society
Stable URL: http://www.jstor.org/stable/2983554

Colm O'Muircheartaigh†
London School of Economics and Political Science, UK

and Pamela Campanelli
Social and Community Planning Research, London, UK

[Received May 1996. Revised February 1997]

Summary. One of the principal sources of error in data collected from structured face-to-face interviews is the interviewer. The other major component of imprecision in survey estimates is sampling variance. It is rare, however, to find studies in which the complex sampling variance and the complex interviewer variance are both computed.
This paper compares the relative impact of interviewer effects and sample design effects on survey precision by making use of an interpenetrated primary sampling unit-interviewer experiment which was designed by the authors for implementation in the second wave of the British Household Panel Study as part of its scientific programme. It also illustrates the use of a multilevel (hierarchical) approach in which the interviewer and sample design effects are estimated simultaneously while being incorporated in a substantive model of interest.

Keywords: Interviewer effect; Interviewer variance; Multilevel models; Response variance

1. Introduction
The interviewer is seen as one of the principal sources of error in data collected from structured face-to-face interviews. Survey statisticians have expressed the effect in formal statistical models of two kinds. In the analysis-of-variance (ANOVA) framework the errors are seen as net biases for the individual interviewers and the effect is seen as the increase in variance due to the variability among these biases. The alternative approach is to consider the interviewer effect to arise from the creation of positive correlations between the response deviations contained in (almost all) survey data; the increase in the variance of a mean is due to the positive covariance among these deviations. Studies of interviewer variability date from the 1940s (see, for example, Mahalanobis (1946)). The ANOVA model in this context was expounded by Kish (1962) and developed by Hartley and Rao (1978) and others; the correlation model (the Census Bureau model) was first presented by Hansen et al. (1961) and extended by Fellegi (1964, 1974).

The other major component of imprecision in survey estimates is sampling variance. It is known that for most complex sample survey designs the precision of estimators is low compared with that of simple random sample designs of the same size. Area clusters typically form the sampling units for complex sample designs and the loss of precision is due to positive correlations between people belonging to the same area clusters.
†Address for correspondence: Methodology Institute, London School of Economics and Political Science, Houghton Street, London, WC2A 2AE, UK. E-mail: [email protected]
© 1998 Royal Statistical Society 0964-1998/98/161063

There are many other sources of measurement error in surveys. Some (e.g. coder variance) are relatively straightforward to estimate through either replication or interpenetration. Others (e.g. question wording effects) require special interventions in the survey process for their investigation. A comprehensive review may be found in Biemer et al. (1989).

Though there are some studies in which the complex sampling variance and the complex interviewer variance are both computed (Bailey et al. (1978) for the US National Crime Survey and O'Muircheartaigh (1984a,b) for the World Fertility Survey in Lesotho and Peru are examples), such studies are rare. This is due to a combination of design and analytic challenges. The norm for face-to-face interview surveys in both the USA and the UK is to have the workload from a given primary sampling unit (PSU) assigned to a single interviewer and, moreover, to have each interviewer work in only one PSU. This confounds the sampling and non-sampling variances. Such confounding is removed by an interpenetrated design in which respondents are assigned at random to interviewers. Owing to cost considerations, these designs are rarely employed in face-to-face surveys. Even for telephone surveys, where the practical problems are less severe, though non-trivial (see Groves and Magilavy (1986)), such studies are uncommon.
Whereas it has been possible to carry out a simultaneous analysis of interviewer and cluster effects for sample means and other simple statistics, it is only recently that software has become available to estimate interviewer and cluster effects simultaneously while incorporating these effects directly into a substantive model of interest. This is possible through the use of a cross-classified multilevel model using the software package MLn (Rasbash et al., 1995); alternative programs for multilevel analysis are VARCL (Longford, 1988) and HLM (Bryk et al., 1986). (Note that, technically, means and proportions estimated from survey data are ratio estimates as there is uncontrolled variation in the sample size. For the British Household Panel Study (BHPS), with the selection of PSUs with probability proportional to size and equal probabilities overall, this variation is fairly tightly controlled.)

This paper compares the relative impact of interviewer effects and sample design effects on survey precision by making use of an interpenetrated PSU-interviewer experiment which was designed by the authors for implementation in the second wave of the BHPS. Section 2 describes in detail the data and methods used. Section 3 explores the results over all BHPS variables and illustrates the use of a multilevel (hierarchical) approach in which the interviewer and sample design effects are estimated simultaneously while being incorporated in a substantive model of interest. Finally, Section 4 summarizes and discusses our findings and their implications for survey research practice.

2. Data and methods
2.1. The British Household Panel Study and the Interpenetrated Design
The data source for this project is the BHPS which is conducted by the Economic and Social Research Council (ESRC) Centre for Micro-social Change at the University of Essex, UK. Interviewing on the BHPS began in 1991 and is scheduled to continue in annual waves until at least 1998. The survey used a multistage stratified cluster design covering all of Great Britain. The wave 2 survey instrument comprised a short household level questionnaire followed by a face-to-face 45-minute interview and short self-completion schedule with every adult in the household. Topics covered included household organization, income and wealth, labour market experience, housing costs and conditions, health issues, consumption behaviour, education and training, socioeconomic values and marriage and fertility.

An interpenetrated design was implemented in a sample of PSUs in wave 2 of the survey. Owing to field requirements and travel costs, a constrained form of randomization was adopted in which addresses were allocated to interviewers at random within geographic 'pools'; these pools are sets of two or three PSUs. Every PSU whose centroid was no more than 10 km from the centroid of at least one other PSU was eligible for inclusion in the design. 153 of the 250 PSUs in the BHPS sample were eligible. Mutually exclusive and exhaustive combinations of these 153 eligible PSUs were formed; this process resulted in 70 pools of PSUs, most with two, and some with three, PSUs each. A systematic sample of 35 pools was then selected for inclusion in the interpenetrating sample design. Great Britain was partitioned for the sample design into 18 regions; only two of these did not include at least one selected geographic pool.

Of the 35 geographic pools formed, four proved to be ineligible as the same interviewer was needed to cover all the PSUs in the pool and one proved to be effectively ineligible for analysis as one interviewer was needed to cover three-quarters of the geographic pool. An examination of the 30 areas in which the design was implemented does not indicate any systematic abnormality. To the extent that an abnormality did exist, it would affect our results only if it were to interact with the effect of interviewers or with the design effect.
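The constrained randomization described above can be illustrated with a minimal Python sketch. It is not the authors' field procedure: the function name `assign_within_pool`, the household and interviewer labels, and the rule of dealing shuffled households out in turn (so that workloads within a PSU differ by at most one) are all assumptions made for illustration. The source states only that allocation to interviewers was random within geographic pools.

```python
import random


def assign_within_pool(households_by_psu, interviewers, seed=None):
    """Randomly allocate the households of one geographic pool to its interviewers.

    households_by_psu: dict mapping a PSU label to a list of household labels.
    interviewers: list of the interviewers working in this pool.
    Returns a dict mapping each household label to an interviewer.
    """
    rng = random.Random(seed)
    assignment = {}
    for psu, households in households_by_psu.items():
        shuffled = list(households)
        rng.shuffle(shuffled)  # random order within the PSU
        # Assumption: deal households out in turn so that every interviewer
        # works in every PSU of the pool and workloads stay nearly equal.
        for position, household in enumerate(shuffled):
            assignment[household] = interviewers[position % len(interviewers)]
    return assignment


if __name__ == "__main__":
    # Hypothetical 2 x 2 pool: two PSUs, two interviewers.
    pool = {
        "psu_a": ["a1", "a2", "a3", "a4", "a5", "a6"],
        "psu_b": ["b1", "b2", "b3", "b4", "b5"],
    }
    for household, interviewer in sorted(assign_within_pool(pool, ["int_1", "int_2"], seed=1).items()):
        print(household, interviewer)
```

Because every interviewer in the pool receives households from every PSU in it, interviewer and PSU are crossed rather than confounded within the pool, which is what allows the two sources of variance to be separated.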
25 of the 30 usable geographic pools included two interviewers and two PSUs and five included three interviewers and three PSUs. Within PSUs in a given pool, households were randomly assigned to the interviewers working in those PSUs. The sample size for analysis of the 30 geographic pools was 1282 households and 2433 individual respondents.

2.2. Analytic methods
Our initial focus was on the calculation of intraclass correlation coefficients $\rho$ for each of the components from the interpenetrated design. These included the interviewer ($\rho_i$) and the PSU ($\rho_s$). These coefficients were estimated for all variables in the data set for which there were 700 or more responses. (In general, the multivariate ANOVA (MANOVA) analyses which were used required 74 degrees of freedom. A rough rule of thumb to ensure sufficiently stable estimates is to set n greater than or equal to the degrees of freedom times 10. Applying this rule to the current models suggests an n of approximately 740.) Categorical and most ordinal variables were transformed into binary variables before the analyses; ordinal attitude scales (Likert scales) were, however, treated as continuous. Hierarchical ANOVAs were then carried out for each of these variables using the SPSS MANOVA option. The use of SPSS allowed us to explore this large number of variables more quickly and efficiently than would have been feasible with MLn. These hierarchical ANOVAs were restricted to cases from the 2 × 2 geographic pools as the program would not handle the simultaneous calculation of 2 × 2 and 3 × 3 geographic pools (note, however, that this is feasible with MLn). The elimination of the 3 × 3 geographic pools resulted in a reduction in sample size of 21% at the household level (to 1010 households) and 22% at the individual level (to 1903 individuals).

The sums of squares were partitioned using a 'regression approach' in which each term is corrected for every other term in the model. This makes sense substantively and also facilitates comparison with MLn.
It also means that the values for $\rho_i$ and $\rho_s$ which are reported are conditional on each other. (As our design is not balanced, the sums of squares for the various components of the model will not add up to the total sum of squares. Also hierarchical ANOVA assumes a continuous dependent variable. For proportions between 0.20 and 0.80, however, the approximation should be fairly close.) Data from the hierarchical ANOVA runs were then assembled to create a meta data set of $\rho$-estimates constructed from the results of the 820 separate analyses of the original data. Other information was added to this data set such as question type (attitudes, facts, quasi-facts and interviewer checks) and topic area of the questionnaire.

2.3. Cross-classified multilevel models
An alternative conceptualization of the analysis is as a multilevel (hierarchical) model in which the interviewer, PSU and geographic pool are hierarchical partitions and the terms corresponding to them in the model are considered to be random effects. It is only recently that cross-classified multilevel analysis has become feasible (see Goldstein (1995) and Rasbash et al. (1995)); the design is implemented in MLn by viewing one member of the cross-classification as an additional level above the other. A basic multilevel variance components model to capture the interviewer by PSU cross-classification within geographic pool can be defined as

$$y_{i(jk)l} = \alpha + \beta x_{i(jk)l} + u_j + u_k + u_l + e_{i(jk)l} \qquad (1)$$

for the $i$th survey element, within the $j$th PSU crossed by the $k$th interviewer, within the $l$th geographic pool, where $y_{i(jk)l}$ is a function of an appropriate constant $\alpha$, explanatory variable(s) $x$ and associated coefficients $\beta$, and an individual error term $e_{i(jk)l}$. Here $u_j$ is a random departure due to PSU $j$, $u_k$ is a random departure due to interviewer $k$, and $u_l$ is the random departure due to geographic pool $l$.
Each of these terms and $e_{i(jk)l}$ are random quantities whose means are assumed to be equal to 0. In cases where the dependent variable is a dichotomy, $y_{i(jk)l}$ would be replaced in equation (1) by $\log\{\pi_{i(jk)l}/(1 - \pi_{i(jk)l})\}$, where

$$\pi_{i(jk)l} = \frac{\exp(\alpha + \beta x_{i(jk)l} + u_j + u_k + u_l)}{1 + \exp(\alpha + \beta x_{i(jk)l} + u_j + u_k + u_l)}.$$

When the dependent variable is continuous, $\rho$ can be calculated directly from the variance estimates in a variance components model (e.g. interviewer variance divided by total variance). When the dependent variable is dichotomous, the variance components are given on the logistic scale and a more complex computation is required. We generate random normal deviates with variance given by the component estimate. These deviates are then transformed (taking the anti-logit) and the variance of these transformed values is calculated directly to give the numerator for $\rho$.

The treatment of the interviewer and PSU effects as random effects rather than as fixed effects (which is more common in the survey sampling literature) postulates a 'superpopulation' of interviewers from which the interviewers used in the study were drawn and an infinitely large population of PSUs. In the case of interviewers we can consider the inference as being made to the population of potential interviewers from whom the survey interviewers were drawn. For the PSUs the assumption involves essentially ignoring a small finite population correction (see, for example, Kalton (1979)). As we are interested in the relative magnitudes of the components of variance due to the interviewers and the sample design under the same essential survey conditions this treatment will not affect our conclusions materially.

An added advantage of multilevel modelling in general, as recently demonstrated (see Hox et al. (1991) and Wiggins et al. (1992)), is the facility to incorporate covariates directly into the analysis. For our work we are able to examine such factors as the interviewer's age, gender, length of service, status and whether the same interviewer was present for both wave 1 and wave 2 of the panel survey. We can also include characteristics of the respondents. We plan to add area level characteristics based on a match to census small area statistics in due course. Single-level linear models have of course been used to analyse survey data. Such non-hierarchical models ignore the way in which the clustering in the sample design and the clustering of responses generated by the interviewers may affect the variance-covariance structure of the observations.

3. Results
3.1. Findings from hierarchical analysis of variance
The design effect is the most commonly used measure of the effect of within-PSU homogeneity on survey results; this is

$$\mathrm{deff} = 1 + \rho_s (b - 1)$$

where $s$ denotes the clustering in the sampling frame, $\rho_s$ is the intracluster correlation and $b$ is the average number of elements selected from a cluster (the cluster take). We present the results of this analysis in terms of the intraclass correlation coefficients for interviewers and PSUs. Both measure the within-unit (interviewer or PSU) homogeneity of the observations. Within-PSU homogeneity is a characteristic of the true values of the elements in the population. Within interviewer workloads the homogeneity results from the interaction between the interviewer and his or her respondents; the effect on the variance of an estimate may, however, be expressed in a form that is identical with that for the design effect.
The interviewer effect is

$$\mathrm{inteff} = 1 + \rho_i (m - 1)$$

where $i$ denotes the interviewer, $\rho_i$ is the intra-interviewer correlation and $m$ is the average interviewer workload. The cluster take and the interviewer workload arise as a result of decisions by the designer of the survey; $\rho_s$ and $\rho_i$ are quantities that are intrinsic to the population structure and to the quality of interviewers. As such the latter are more portable than the variance components themselves; the variance components themselves can of course be calculated once the $\rho$-values are known.

During the past 30 years or so evidence has accumulated about the order of magnitude of both the intracluster correlation coefficient and the intra-interviewer correlation coefficient in sample surveys in the USA and elsewhere. Though it is impossible to generalize with confidence, the evidence suggests that values of $\rho_i$ greater than 0.1 are uncommon. (There is difficulty in comparing across studies as each involves different numbers of interviewers, different sample sizes and different types of variables. In addition, some researchers report the negative values of $\rho_i$ which occur and others set these to 0.) Also, as indicated by the means in Table 1, the majority of values tend to be less than 0.02 (all these values are estimates, which accounts for the negative values in Table 1). There is also some evidence, although this is mixed, that different types of variables are affected by interviewers in different ways; attitude items and complex factual items are considered more sensitive to an interviewer effect than simple factual items are (see, for example, Collins and Butcher (1982), Feather (1973), Fellegi (1964), Gray (1956) and Hansen et al. (1961)).

The range of values reported in the literature for $\rho_s$ is similar to that for $\rho_i$, though we would expect $\rho_i$ to have more values near 0.
Again, the evidence suggests that values greater than 0.1 are uncommon and that positive values are almost universal. The large values tend to be for certain types of demographic variables, notably tenure and ethnic origin. This is to be expected since adjacent groups of houses in a small area will tend to be of similar type and tenure, and people of similar ethnic origin often live close to each other (Lynn and Lievesley, 1991). Other demographic variables such as sex and marital status tend to show very low values. It is typically found that behavioural and attitudinal variables have $\rho_s$-values that are somewhere between these extremes, with attitudinal variables showing slightly lower values than behavioural variables. In the World Fertility Survey (see Verma et al. (1980)), the median $\rho_s$ across various countries was 0.02 for various nuptiality and fertility variables. The median was much higher (around 0.08) for variables concerning contraceptive knowledge.

Table 1. Summary of other interviewer variance investigations

Study                                                        Values of $\rho_i$        Mean
Neighbour noise and illness (UK) (Gray, 1956)                -0.018 to 0.10†           0.015†
Television habits (UK) (Gales and Kendall, 1957)             (0.00) to 0.05, 0.19‡     §
Census (USA) (Hanson and Marks, 1958)                        -0.00 to 0.061‡           0.011‡
Blue-collar workers (USA) (Kish, 1962)
  First study                                                -0.031 to 0.092           0.020
  Second study: interview                                    -0.005 to 0.044           0.014
  Second study: self-completion                              -0.024 to 0.040           0.009
Census (Canada) (Fellegi, 1964)                              (0.00) to 0.026           0.008
Health survey (Canada) (Feather, 1973)                       -0.007 to 0.033           0.006
Mental retardation (USA) (Freeman and Butler, 1976)          -0.296 to 0.216           0.036
Aircraft noise (UK) (O'Muircheartaigh and Wiggins, 1981)     (0.00) to 0.09            0.020
Consumer attitude survey (UK) (Collins and Butcher, 1982)    -0.039 to 0.119           0.013
9 telephone surveys (USA) (Groves and Magilavy, 1986)        -0.042 to 0.171           0.009

†Calculated from F-ratios by using the formula supplied by Kish (1962).
‡Numbers available through Kish (1962).
§Mean cannot be computed: Gales and Kendall (1957) did not report all the variables analysed.

In comparing these two sources of variability, Hansen et al. (1961) found that the interviewer variance was often larger than the sampling variance. Bailey et al. (1978), in contrast, found response variance components that were at least 50% of their sampling variance for only a quarter of their statistics.

We included in the analysis 820 variables, some representing subcategories taken from BHPS items. Of these, 98 were attitude questions, 574 were factual, 88 were interviewer checks (items completed by the interviewers without a formal question) and 60 were quasi-facts (mostly on a self-completion form). Fig. 1 shows the cumulative frequency distributions for $\rho_s$ and $\rho_i$. The orders of magnitude for the two coefficients were strikingly similar. As these values are themselves estimates they are subject to imprecision; using a test of significance at the 5% level four in 10 of the values of $\rho_s$ and three in 10 of the values of $\rho_i$ were significantly greater than 0. In the case of $\rho_s$ this is not surprising as positive values are expected for most survey variables. What is somewhat surprising is that, within the study, $\rho_i$ is of the same order of magnitude. For these data, because of the way that the investigation was designed, the average interviewer workload and the average cluster take were the same; thus our estimates of $\rho_s$ and $\rho_i$ imply that the effects of the sample design and the interviewers were also about the same.

[Fig. 1. Intra-interviewer and intracluster correlations: cumulative distribution of $\rho_i$ and $\rho_s$ (x-axis: value of $\rho$; y-axis: cumulative %)]

All types of questions show some significant values of $\rho_i$. For attitude questions, 28% of the values of $\rho_i$ were significantly greater than 0; for factual questions it was 26%; for interviewer checks, a staggering 58%; for the quasi-factual questions, 25% (with the exclusion of the self-completion items). What is interesting is the similarity of the findings for the attitudinal and factual items, which is in contrast with the findings of some studies. There is some variation between types of attitudinal item. Among those items based on Likert scales, 33% showed significant values of $\rho_i$; this compares with 25% of the other attitude items.

We also looked for differences by source of the question. For example, 32% of the items in the individual schedule had $\rho_i$-values which were significantly greater than 0. The same was true for 17% of the self-completion items, 27% of the cover sheet items, 28% of the derived variables from the individual's questionnaire, 32% of the household questionnaire items and 34% of the derived variables from the household questionnaire. The notable difference here in susceptibility to interviewer effects is between the self-completion items and those that are interviewer administered. The fact that there is an interviewer effect at all on the self-completion form is interesting. Kish (1962), for example, found little evidence to suggest such an effect on the written questionnaires that he examined. O'Muircheartaigh and Wiggins (1981), however, did find an effect for a health supplement completed in the presence of the interviewer (as were the BHPS self-completion items).

There was also basically no difference in the proportion of significant $\rho_i$-values between the different sections of the questionnaire: demographics, health, marriage and fertility, employment, employment history, values and income and household allocation (with the percentage significant ranging from 22% to 35%). In contrast the section at the end of the questionnaire for interviewers to record their observations was highly susceptible to interviewer effects. 76% of the items in the interviewer observation section showed significant values of $\rho_i$. There was also a difference between dummy and continuous variables, with a higher proportion of effects being noted for the continuous variables.

Furthermore, there was a clear positive correlation of 0.35 between $\rho_i$ and $\rho_s$. A positive correlation between $\rho_s$ and $\rho_i$ implies that variables that show large intracluster homogeneity (show relatively substantial clustering among true values) are also sensitive to differential effects from interviewers. Such a correlation has not, to our knowledge, been observed before. As the elements in the computation of this correlation are themselves variables, the absence of such evidence in the literature may be because it is necessary to have a large number of variables to estimate such a correlation coefficient with any precision. In our analysis the correlation shows remarkable consistency across types of variables.

Homogeneous clusters contain individuals who are similar to one another; it is reasonable to suggest that individuals with similar values for the variable in question may respond in a similar way to whatever qualities the interviewer brings to bear in the interviewer-respondent interaction. This would mean that variables that manifested intracluster homogeneity would on balance be more likely than other variables to display intra-interviewer homogeneity.

An alternative explanation may be found in some of the early work on interviewers (see Hyman (1954)). Expectations of interviewers are known to influence the responses obtained by interviewers. For a variable to have a relatively large value of $\rho_s$ the individuals within a cluster will have relatively homogeneous values; it is possible that this consistency will affect the interviewers' expectations as the interviewer's workload progresses, leading to enhanced correlations within interviewer workloads.
These explanations are consistent with the technical interpretation of the correlation between the response deviation and the sampling deviation for a single variable postulated in the Census Bureau model and included in Hansen et al. (1961), Fellegi (1964) and Bailey et al. (1978). It is not possible to estimate this correlation directly for a single variable without at least two waves of data collection, though it is included in the standard model estimate of $\rho_i$. Hansen et al. (1961) gave an example of how this correlation may arise for a single variable.

3.2. Findings from multilevel models
For illustration, we include three MLn models, one for each of the main types of variables: interviewer check items, facts and attitudes. These are shown in Tables 2-4 respectively. We have also shown the corresponding non-hierarchical (single-level) model to discover whether our substantive conclusions will be affected when we incorporate the data structure appropriately in the analysis.

The variable modelled in Table 2 is a binary subcategory indicating whether children were present during the demographics section of the interview, as noted by the interviewer. From the hierarchical analyses of variance, the estimated $\rho$-values for this children present subcategory were $\rho_i = 0.171$ and $\rho_s = 0.062$ (n = 725).

Table 2. Multilevel logistic regression model of the interviewer check item: children present†
(NH, non-hierarchical version; H, hierarchical version; -, term not in the model)

                                    Model 1                       Model 2                       Model 3
                                    NH             H              NH             H              NH             H
Fixed effects
  Grand mean                        -1.05 (0.08)   -1.05 (0.14)   -3.24 (0.37)   -3.30 (0.41)   -5.42 (0.94)   -5.49 (1.27)
  No. of children in household      -              -              1.20 (0.10)    1.23 (0.11)    1.23 (0.10)    1.25 (0.11)
  Respondent's gender (female)      -              -              0.62 (0.21)    0.59 (0.22)    0.62 (0.21)    0.60 (0.22)
  Interviewer's gender (female)     -              -              -              -              1.11 (0.43)    1.14 (0.62)
Random effects: variance components
  Respondent                        -              1              -              1              -              1
  PSU                               -              0.09 (0.12)    -              0.08 (0.17)    -              0.08 (0.17)
  Interviewer                       -              0.49 (0.20)‡   -              0.89 (0.32)‡   -              0.81 (0.31)‡

†Standard errors are given in parentheses.
‡Significant random parameters based on a contrast test.

The hierarchical version of model 1 is a basic variance components model showing the cross-classification of PSU and interviewer. Although the estimated standard errors of the random parameters are included in Table 2, the significance of the random parameters is based on a contrast test. (This is recommended as the distribution of the standard errors for the random parameters may depart considerably from normality, especially in small samples.) We found significant variation between interviewers but not between PSUs. In the model the estimate for variation between geographic pools for this variable was 0; this was not of course the case for all variables. Parameters close to 0 are often constrained to 0 by the MLn program; in this case the parameter remains 0 even when employing the 'second-order estimation procedure'. (In the estimation of random parameters in a logistic regression, MLn uses a weighted generalized least squares estimation procedure which requires the quantities to be estimated to be in the linear part of the model. A series expansion is used to approximate a linear form. Simulation and theory have suggested that the first-order estimation procedures can lead to an underestimation of the parameters. In many models the underestimation is negligible. However, in some models where predicted probabilities are extreme, or where there are few level 1 units per level 2 unit, underestimation can be severe. There is an option in MLn which allows the selection of a second-order estimation procedure. This procedure, however, is less computationally robust. See Woodhouse (1995) for a full description of this matter.) In the standard formulation of the model the individual variation is assumed to have a binomial distribution and is constrained to 1.
(The validity of this assumption can be tested in MLn by relaxing this constraint.)

In model 2, we have included the individual level explanatory variable, number of children in household, as it is desirable to control for any systematic differences between interviewers in the composition of their workloads; an interviewer whose interviews take place in households without children would be expected to differ on this item from those interviewers whose workloads contained a large number of households with children. This control variable has a significant coefficient in the hierarchical model. (For fixed effects significance may be judged by comparing the estimate with its standard error in the usual way.)

Also included is the individual level explanatory variable 'respondent's gender'. We expected that the presence of children during the interview would be a function of the respondent's gender, with women respondents being more likely to have children with them than male respondents are. As can be seen by the values in Table 2, this expectation was confirmed.

It is interesting to note that the random coefficient for interviewers in the hierarchical version of model 2 increases in comparison with model 1. This suggests that it is not haphazard variation in interviewer workloads that explains this interviewer variability, but rather that the variation between interviewers in recording the presence of children is greater when opportunity (i.e. children in the household) is taken into account as well as the respondent's gender. The basic conclusion which can be drawn from model 2 is the same for both the hierarchical and the non-hierarchical versions of the model.

We then added several interviewer explanatory variables. These included interviewer age, gender, status (whether a basic interviewer, supervisor or area manager) and years with the company. Also included was a measure of whether the same interviewer had visited the household for the previous year's interview. Of these various characteristics, only interviewer gender is considered in model 3. It was clearly significant in the non-hierarchical model and only approached significance in the hierarchical model. It is interesting that in this case different conclusions might have been reached depending on which model was considered. We also investigated the possibility of an interaction between interviewer gender and respondent gender. This coefficient was not significant under either version of model 3.

There are at least two possible explanations for the correlated interviewer effect in this case. First, it is quite likely that there is a difference in the ability of interviewers to arrange the circumstances of the interview so that the respondent is alone at the time: flexibility in making appointments, the degree to which the interviewer emphasizes the need for an undisturbed setting for the interview, etc. There is also the possibility that most of the between-interviewer variability is due to differences in the extent to which, or the circumstances in which, interviewers record the presence of children; one source of variation could be in the definition of others being 'present'.

The key contrast here is between the message that we would obtain from $\rho_i$ and $\rho_s$ and the message from the multilevel analysis. With the former we would be concerned that the standard analysis would give spurious significance to the relationships estimated. In this case at least, however, an interviewer effect, though present for the dependent variable, does not affect the substantive analysis.

Table 3 deals with one of the respondent level factual items, newspaper readership. The variable modelled is a binary subcategory indicating whether or not the respondent typically reads the Independent. From the hierarchical ANOVAs, the estimated $\rho$-values for this readership subcategory were $\rho_i = 0.129$ and $\rho_s = 0.106$ (n = 1268).
Unlike the variance components model shown for the interviewer check item (see model 1), the basic variance components model given in model 4 shows a significant variation between PSUs as well as between interviewers. For this also there was no significant variation between geographic pools.

In model 5, we have included the individual level explanatory variable 'respondent's age'. Several other explanatory variables had also been explored in both the hierarchical and the non-hierarchical versions of the model (e.g. gender, social class, identification with a political party and income) but only respondent's age was significant. With this addition, the interviewer random variation is reduced slightly and the PSU random variation remains essentially the same.

Table 3. Multilevel logistic regression model of newspaper readership: whether the respondent reads the Independent†

                                              Model 4                             Model 5                             Model 6
                                              Non-hierarchical  Hierarchical      Non-hierarchical  Hierarchical      Non-hierarchical  Hierarchical
Fixed effects
  Grand mean                                  -3.04 (0.13)      -2.99 (0.30)      -1.70 (0.35)      -1.94 (0.45)      -2.99 (0.67)      -3.19 (0.90)
  Respondent's age                            -                 -                 -0.03 (0.01)      -0.03 (0.01)      -0.04 (0.01)      -0.03 (0.01)
  Whether same interviewer as previous year   -                 -                 -                 -                 0.21 (0.28)       0.63 (0.34)
  Interviewer status (compared with area manager)
    Whether regular interviewer               -                 -                 -                 -                 1.35 (0.60)       1.06 (0.84)
    Whether supervisor interviewer            -                 -                 -                 -                 2.25 (0.76)       2.23 (1.25)
Variance components (random effects: source)
  Respondent                                  -                 1                 -                 1                 -                 1
  PSU                                         -                 1.55 (0.64)‡      -                 1.48 (0.63)‡      -                 1.59 (0.66)‡
  Interviewer                                 -                 1.97 (0.71)‡      -                 1.78 (0.68)‡      -                 1.67 (0.67)‡

† Standard errors are given in parentheses.
‡ Significant random parameters based on a contrast test.

Of the various interviewer explanatory variables we considered, two approached significance in the hierarchical version of model 6.
These were the binary variable for whether the same interviewer had visited the household for the previous year's interview (interviewer continuity) and one of the two dummy variables modelling the three-category interviewer status variable (regular interviewer, supervisor or area manager). Here we can see that the interviewer variance component is again slightly reduced.

Interestingly we would have had a very different interpretation of which characteristics of the interviewer are having a significant effect if we had only run the non-hierarchical model. With the non-hierarchical model, the interviewer continuity variable was clearly not significant and the two interviewer status variables were clearly significant. In addition (although not shown in Table 3), the interviewer age variable approached significance. Middle-aged interviewers were more likely than elderly interviewers to record respondents as readers of the Independent.

Table 4 presents a behavioural intention item looking at whether or not the respondent expects to have any more children. As this is a subjective assessment, the question has been classified in the attitude category for our analysis. From the hierarchical analyses of variance, the estimated ρ-values for this item were ρi = 0.075 and ρs = 0.048 (n = 1177). As was the case for the variance components model under model 1, model 7 shows a significant variation between interviewers and possible variation between PSUs but not among geographic pools.

In model 8, we have included the three individual level explanatory variables number of children in the household, respondent's gender and respondent's age. Each of these is highly significant in both the hierarchical and the non-hierarchical versions of the model. With the addition of these explanatory variables in the hierarchical model, random variation due to PSUs goes to 0 and random variation due to interviewers increases. The disappearance of the PSU effect may mean that the characteristics that led to the possible PSU effect have been adequately specified in the substantive model. Again, this suggests that it is not haphazard variation in interviewer workloads that is contributing to interviewer variability, but rather that there is variation between interviewers in their measurement of people's intentions to have more children.

Table 4. Multilevel logistic regression model: whether the respondent is likely to have more children†

                                    Model 7                             Model 8                             Model 9
                                    Non-hierarchical  Hierarchical      Non-hierarchical  Hierarchical      Non-hierarchical  Hierarchical
Fixed effects
  Grand mean                        -0.39 (0.06)      -0.44 (0.11)      7.73 (0.46)       7.59 (0.46)       8.81 (0.60)       7.39 (0.48)
  No. of children in household      -                 -                 -0.85 (0.09)      -0.83 (0.10)      -0.86 (0.10)      -0.84 (0.10)
  Respondent's gender (female)      -                 -                 -0.65 (0.19)      -0.63 (0.19)      -0.64 (0.19)      -0.62 (0.19)
  Respondent's age                  -                 -                 -0.24 (0.01)      -0.23 (0.01)      -0.24 (0.01)      -0.24 (0.01)
  Interviewer's years with company  -                 -                 -                 -                 0.042 (0.020)     0.043 (0.027)
Variance components (random effects: source)
  Respondent                        -                 1                 -                 1                 -                 1
  PSU                               -                 0.15 (0.09)       -                 0.00 (0.00)       -                 0.00 (0.00)
  Interviewer                       -                 0.22 (0.10)‡      -                 0.38 (0.16)‡      -                 0.34 (0.15)‡

† Standard errors are given in parentheses.
‡ Significant random parameters based on a contrast test.

In the non-hierarchical version of model 9, interviewer experience is a significant predictor, with more experienced interviewers being more likely to record a 'yes' to the more children question than inexperienced interviewers are. Although not shown, in the non-hierarchical model, the interviewer continuity variable approached statistical significance. When the same interviewer returned on the second wave of the survey he or she was less likely to record yes to the more children question than a different interviewer was. These findings, however, do not hold for the hierarchical model.
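The rule used throughout this section, judging a fixed effect by comparing the estimate with its standard error, can be sketched in a few lines. This is only a minimal illustration: it assumes a standard normal approximation for the ratio and a conventional 5% two-sided cut-off, neither of which is stated explicitly in the text, and the function names are ours. The inputs are the model 9 estimates for interviewer's years with the company from Table 4.

```python
import math


def wald_ratio(estimate: float, std_error: float) -> float:
    """Ratio of a fixed-effect estimate to its standard error."""
    if std_error <= 0:
        raise ValueError("standard error must be positive")
    return estimate / std_error


def two_sided_p(z: float) -> float:
    """Two-sided p-value under a standard normal approximation (an assumption)."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Model 9, interviewer's years with company: (estimate, standard error) from Table 4.
MODEL_9_EXPERIENCE = {
    "non-hierarchical": (0.042, 0.020),
    "hierarchical": (0.043, 0.027),
}

for version, (est, se) in MODEL_9_EXPERIENCE.items():
    z = wald_ratio(est, se)
    p = two_sided_p(z)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"{version:>16}: ratio = {z:.2f}, p = {p:.3f} ({verdict} at 5%)")
```

The point estimates are almost identical in the two versions; what changes is the standard error, which is larger once the interviewer variance term is included, so the ratio falls from 2.1 to about 1.6.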
Perhaps the most important point here is that, despite the strong interviewer effect, the substantive description represented by the substantive fixed part of the model is unaffected by the interviewers (at least not affected differentially). However, there are differences in the conclusions about the effect of interviewer characteristics depending on whether an interviewer variance term is explicitly included.

In addition to these examples, we conducted a further exploration of the effect of the extra-role characteristics of the interviewers (Sudman and Bradburn, 1974) on model conclusions. For each of the different types of item (attitudes, facts, quasi-facts and interviewer checks), a sample of variables was drawn from among those shown to have highly significant interviewer variability. Across the four categories, 26 items were drawn from 84. A cross-classified multilevel analysis (interviewer by PSU) was conducted on each of these with the interviewer characteristics as the explanatory variables. These included interviewer age, gender, status, years with the company and an indicator of interviewer continuity over time.

Of the 26 models considered, interviewer age was significant in seven of the 26 cases (27%). The comparable percentages of significant effects that were found for the other interviewer characteristics were as follows: interviewer continuity, 12%; gender, 8%; interviewer status, 8%; years with the company, 4%. Although such data should be treated with caution, they may indicate that interviewer age is a general predictor of some of the interviewer variability on the high variability items. Freeman and Butler (1976), for example, found age and gender to be significant predictors of interviewer variance. Collins and Butcher (1982) also investigated the explanatory power of several characteristics of interviewers. Their strongest evidence was for an age effect.

Again we saw differences depending on whether a hierarchical or non-hierarchical model was used. The comparable figures for the non-hierarchical models were age significant in 27% of cases, interviewer continuity in 15%, gender in 12%, interviewer status in 35% and years with the company in 15%. In 11 of the 26 models, different conclusions about the effects of interviewer characteristics on substantive results would have been reached, depending on whether an interviewer variance term was explicitly included in the model.

4. Summarizing remarks and discussion

The assumption underlying most statistical software, that the observations are independent and identically distributed (IID), is certainly not appropriate for most sample survey data. Variances computed on this assumption do not take into account the effects of survey design (e.g. inflation due to clustering) and execution (e.g. inflation due to correlated interviewer effects).

There are two different reasons why we might be interested in interviewer effects and sample design effects. The first is to establish whether the sample design (typically clustering in the design) and/or the interviewer (because many respondents are interviewed by each interviewer) have an effect on the variance-covariance structure of the observations. This is the traditional sample survey approach and includes a consideration of the design effect and the interviewer effect following the ANOVA and Census Bureau models. The emphasis is on the estimation of means or proportions and on the standard errors of these estimates; variance components models do not add anything to these analyses.
Our work with a specially designed study in wave 2 of the BHPS permitted us to assess both these inflation components. Across the 820 variables in the study, there was evidence of a significant effect of both the population clustering and the clustering of individuals in interviewer workloads. The intraclass correlation coefficient ρ was used as the measure of homogeneity. We found that sample design effects and interviewer effects were comparable in impact, with overall inflation of the variance as great as five times the unadjusted estimate. The median effect across the 820 variables was an 80% increase in the variance. The magnitude of the intra-interviewer correlation coefficients was comparable across these types, though the most sensitive items tended to be the interviewer check items. There was a tendency for variables that were subject to large design effects to be sensitive also to large interviewer effects and we offer a possible interpretation of this correlation in Section 3.1.

The large values of ρi on particular items and the fact that ρi is of the same order of magnitude as ρs suggest that survey organizations should incorporate the measurement of ρi in their designs. If the necessary modifications of the survey design are too expensive to allow this, organizations should at least try to minimize its effect; this could be accomplished by reducing interviewers' workloads. Current practice tends to favour smaller dedicated interviewer forces with large assignments; in the presence of substantial interviewer effects this is a misguided policy.
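To make the workload argument concrete, the sketch below evaluates the familiar approximation 1 + ρ(m − 1) for the inflation of the variance of a mean when observations fall in groups of average size m with intraclass correlation ρ. The formula is assumed here as the standard one rather than quoted from this section, and the workload sizes are hypothetical; they are not taken from the BHPS study. Only ρi = 0.129 (the newspaper readership item) comes from the text.

```python
def variance_inflation(rho: float, mean_cluster_size: float) -> float:
    """Approximate variance inflation 1 + rho * (m - 1) for groups of mean size m."""
    if mean_cluster_size < 1:
        raise ValueError("mean cluster size must be at least 1")
    return 1.0 + rho * (mean_cluster_size - 1.0)


# Intra-interviewer correlation reported in the text for the readership item.
RHO_I = 0.129

# Hypothetical average workloads (interviews per interviewer), for illustration only.
HYPOTHETICAL_WORKLOADS = (40, 20, 10)

for m in HYPOTHETICAL_WORKLOADS:
    factor = variance_inflation(RHO_I, m)
    print(f"workload {m:>2}: interviewer inflation factor = {factor:.2f}")
```

Under this approximation the excess over 1 is proportional to m − 1, so halving the average workload roughly halves the inflation attributable to a given ρi. That is the sense in which smaller assignments limit the damage.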
The second reason is to ensure that effects on the univariate distributions do not contaminate our estimates of relationships between variables in the population; in this case our objective is to control the effects or to eliminate them from the analysis. The standard approach of the survey sampler is to estimate the parameters assuming that they are IID and to produce design-based variance estimates using resampling methods such as the jackknife or bootstrap; this, however, is only an approximate solution. The explicit modelling of effects is both more precise and more informative. In this situation there are two aspects of interest: whether explicitly including the sample clustering and the interviewer workloads in the model changes the estimates of the relationships (the contamination issue) and whether the clustering and interviewers have an effect on the distribution of values obtained for the dependent variable.

Using software developed for multilevel analysis (hierarchical modelling) we have presented an alternative framework within which to consider the sample design and interviewer effects by incorporating them directly into substantive models of interest. For illustration we chose three binary items: an interviewer check item on whether children were present during the interview, a behavioural item, readership of the Independent, and a subjective item, whether respondents thought it was likely that they would have another child. For each of these items, we found a significant interviewer effect, which persisted when we controlled for inequalities in interviewers' workloads and various extra-role characteristics of the interviewers. For other items not presented here we found situations where the interviewer characteristics did help to explain the interviewer effects. In addition, we found that conclusions about the influence of the various extra-role characteristics would have differed in many cases if we had used only the standard non-hierarchical model rather than a hierarchical model.

In later work we hope to explore further the factors that might provide an explanation of the variance components. From a modelling standpoint the issue is of specifying appropriately the underlying factors in the substantive models of interest. From a sample survey standpoint the issue is that of incorporating in the analysis a recognition of the special features of the sample design and survey execution that make a particular data set deviate from IID data. Multilevel models have a natural congruence with many important aspects of the survey situation; both the sample design and the fieldwork implementation can be described appropriately as introducing hierarchical levels into the data and thus multilevel analysis provides a framework that makes it possible to include both substantive and design factors in the same analysis.

Acknowledgements

The data were originally collected by the ESRC Research Centre for Micro-social Change at the University of Essex. The data file for this paper was made available through the ESRC Data Archive; the Archive bears no responsibility for the analyses or interpretations presented here.

References

Bailey, L., Moore, T. F. and Bailar, B. A. (1978) An interviewer variance study for the eight impact cities of the National Crime Survey cities sample. J. Am. Statist. Ass., 73, 16-23.
Biemer, P., Groves, R., Lyberg, L., Mathiowetz, N. and Sudman, S. (eds) (1989) Measurement Errors in Surveys. New York: Wiley.
Bryk, A. S., Raudenbush, S. W., Congdon, R. and Seltzer, M. (1986) An Introduction to HLM: Computer Program and User's Guide. Chicago: University of Chicago.
Collins, M. and Butcher, B. (1982) Interviewer and clustering effects in an attitude survey. J. Markt Res. Soc., 25, no. 1, 39-58.
Feather, J. (1973) A study of interviewer variance. Report. Department of Social and Preventive Medicine, University of Saskatchewan, Saskatoon.
Fellegi, I. P. (1964) Response variance and its estimation. J. Am. Statist. Ass., 59, 1016-1041.
Fellegi, I. P. (1974) An improved method of estimating the correlated response variance. J. Am. Statist. Ass., 69, 496-501.
Freeman, J. and Butler, E. W. (1976) Some sources of interviewer variance in surveys. Publ. Opin. Q., 40, 79-91.
Gales, K. and Kendall, M. G. (1957) An inquiry concerning interviewer variability (with discussion). J. R. Statist. Soc. A, 120, 121-147.
Goldstein, H. (1995) Multilevel Statistical Models, 2nd edn. London: Arnold.
Gray, P. G. (1956) Examples of interviewer variability taken from two sample surveys. Appl. Statist., 5, 73-85.
Groves, R. M. and Magilavy, L. J. (1986) Measuring and explaining interviewer effects in centralized telephone surveys. Publ. Opin. Q., 50, 251-256.
Hansen, M. H., Hurwitz, W. N. and Bershad, M. A. (1961) Measurement errors in censuses and surveys. Bull. Int. Statist. Inst., 38, 359-374.
Hanson, R. H. and Marks, E. S. (1958) Influence of the interviewer on the accuracy of survey results. J. Am. Statist. Ass., 53, 635-655.
Hartley, H. O. and Rao, J. N. K. (1978) Estimation of nonsampling variance components in sample surveys. In Survey Sampling and Measurement (ed. N. K. Namboodiri), pp. 35-43. New York: Academic Press.
Hox, J. J., de Leeuw, E. D. and Kreft, I. G. G. (1991) The effect of interviewer and respondent characteristics on the quality of survey data: a multilevel model. In Measurement Errors in Surveys (eds P. P. Biemer, R. M. Groves, L. E. Lyberg, N. A. Mathiowetz and S. Sudman). New York: Wiley.
Hyman, H. (1954) Interviewing in Social Research. Chicago: University of Chicago Press.
Kalton, G. (1979) Ultimate cluster sampling. J. R. Statist. Soc. A, 142, 210-222.
Kish, L. (1962) Studies of interviewer variance for attitudinal variables. J. Am. Statist. Ass., 57, 92-115.
Longford, N. T. (1988) VARCL Manual. Princeton: Educational Testing Service.
Lynn, P. and Lievesley, D. (1991) Drawing General Population Samples in Great Britain. London: Social and Community Planning Research.
Mahalanobis, P. C. (1946) Recent experiments in statistical sampling in the Indian Statistical Institute. J. R. Statist. Soc., 109, 325-370.
O'Muircheartaigh, C. A. (1984a) The magnitude and pattern of response variance in the Peru Fertility Survey. World Fertility Survey Scientific Report 45. International Statistical Institute, the Hague.
O'Muircheartaigh, C. A. (1984b) The magnitude and pattern of response variance in the Lesotho Fertility Survey. World Fertility Survey Scientific Report 70. International Statistical Institute, the Hague.
O'Muircheartaigh, C. A. and Wiggins, R. D. (1981) The impact of interviewer variability in an epidemiological survey. Psychol. Med., 11, 817-824.
Rasbash, J., Woodhouse, G., Goldstein, H., Yang, M., Howarth, J. and Plewis, I. (1995) MLn Software. London: Institute of Education.
Sudman, S. and Bradburn, N. (1974) Response Effects in Surveys. Chicago: Aldine.
Verma, V., Scott, C. and O'Muircheartaigh, C. (1980) Sample designs and sampling errors for the World Fertility Survey (with discussion). J. R. Statist. Soc. A, 143, 431-473.
Wiggins, R. D., Longford, N. and O'Muircheartaigh, C. A. (1992) A variance components approach to interviewer effects. In Survey and Statistical Computing (eds A. Westlake, R. Banks, C. Payne and T. Orchard). Amsterdam: North-Holland.
Woodhouse, G. (1995) A Guide to MLn for New Users. London: Institute of Education.