© 2001 Oxford University Press
Human Molecular Genetics, 2001, Vol. 10, No. 26 3037–3048

Genome-wide scan in a nationwide study sample of schizophrenia families in Finland reveals susceptibility loci on chromosomes 2q and 5q

Tiina Paunio1,2,*, Jesper Ekelund1,2, Teppo Varilo1,3, Alex Parker4, Iiris Hovatta1, Joni A. Turunen1,3, Kate Rinard4, Alessandro Foti4, Joseph D. Terwilliger5,6, Hannu Juvonen2, Jaana Suvisaari2, Ritva Arajärvi2, Jaana Suokas2, Timo Partonen2, Jouko Lönnqvist2, Joanne Meyer4 and Leena Peltonen1,3,7

1Department of Molecular Medicine and 2Mental Health and Alcohol Research, National Public Health Institute, Helsinki, Finland, 3Department of Medical Genetics, University of Helsinki, Finland, 4Millennium Pharmaceuticals, Inc., Cambridge, MA, USA, 5Department of Psychiatry and Columbia Genome Center, Columbia University, 6Division of Medical Genetics, New York State Psychiatric Institute, New York, NY, USA and 7Department of Human Genetics, UCLA, Los Angeles, CA, USA

Received August 22, 2001; Revised and Accepted October 23, 2001

We have previously carried out two genome-wide scans in samples of Finns ascertained for schizophrenia from national epidemiological registers. Here, we report data from a third genome scan in a nationwide Finnish schizophrenia study sample of 238 pedigrees with 591 affected individuals. Of the 238 pedigrees, 53 originated from a small internal isolate (IS) on the eastern border of Finland with a well-established genealogical history and a small number of founders, who settled in the community 300 years ago. The total study sample of over 1200 individuals was genotyped using 315 markers. In addition to the previously identified chromosome 1 locus, two new loci were identified on chromosomes 2q and 5q. The highest LOD scores were found in the IS families with marker D2S427 (Zmax = 4.43) and in the families originating from the late settlement region with marker D5S414 (Zmax = 3.56).
In addition to 1q, 2q and 5q, some evidence for linkage emerged at 4q, 9q and Xp, the regions also suggested by our previous genome scans, whereas, in the nationwide study sample, the region at 7q failed to show further evidence of linkage. The chromosome 5q finding is of particular interest, since several other studies have also shown evidence for linkage in the vicinity of this locus.

INTRODUCTION

Schizophrenia is assumed to be an etiologically heterogeneous psychiatric disorder characterized by periodic psychotic symptoms, and by problems in reality testing. The disease affects ∼1% of people throughout the world (1), although a number of recent studies have suggested lower prevalence rates (2). Moreover, a higher prevalence, potentially affected by genetic drift, has been observed in certain populations (3). In addition to environmental factors, such as infection during the second trimester of pregnancy (4), obstetric complications (5) and the early rearing environment (6,7), multiple lines of evidence imply that genetic factors contribute to the pathogenesis. Relatives of schizophrenia probands are at increased risk of developing schizophrenia or other schizophrenia spectrum disorders (8), as are adoptees with schizophrenic biological parents (9). Perhaps the most convincing evidence for genetic factors influencing schizophrenia comes from twin studies. In a Finnish study, the proband-wise concordance was shown to be around five times higher (46%) among monozygotic than among dizygotic twins (9%) (10), and similar findings have been made in studies of other populations (11).

On the basis of population-wide health care registers, the lifetime prevalence of schizophrenia in Finland has been estimated to be 1.2%.
Interestingly, the prevalence was found to be higher in the rural eastern and northeastern parts of the country than in the more densely inhabited urban population in southern Finland. One of the areas with a particularly high age-corrected lifetime risk (3.2%) was an isolated municipality in northeastern Finland [internal isolate (IS)] (3). Genetic analyses of schizophrenia families in this isolate provided some evidence for a putative predisposing locus on chromosome 1q (3). Another study of a population-wide collection of affected sib pairs in Finland showed some evidence of two loci influencing the risk of schizophrenia on chromosomes 7q and 1q (12).

*To whom correspondence should be addressed at: Department of Molecular Medicine, National Public Health Institute, Biomedicum, PL104, 00251 Helsinki, Finland. Tel: +358 9 47448751; Fax: +358 9 47448480; Email: [email protected]

Table 1. Study sample

                                  Nuclear   Extended   Genotyped     Affected individuals
                                  families  pedigrees  individuals   LC1   LC2   LC3   LC4
IS                                90        53         389           96    +21   +6    +13
AF                                192       185        876           340   +54   +40   +21
Total (Com)                       282       238        1265 (1241a)  436   +75   +46   +34
Overlap with Hovatta et al. (3)   30        18 (16a)   137 (29a)     37    +8    +2    +8
Overlap with Ekelund et al. (12)  116       115 (51a)  447 (98a)     212   +29   +14   +8

aNumbers of individuals who were genotyped genome-wide. There was an overlap of three families in Hovatta et al. (3) and Ekelund et al. (12), including three individuals in the genome-wide stage of scan.

Despite the efforts of several groups worldwide, no major locus for schizophrenia has yet been identified and even replication of previously implicated loci has been rare (13). This may be due to the etiological complexity of the disorder or to genetic, environmental and sociocultural variation among the populations studied, as well as to inconsistent diagnostic schemes and samples of insufficient sizes.
Here, a population-wide study sample of more than 1200 individuals from 238 Finnish pedigrees including multiple cases of schizophrenia was genotyped, using a genome-wide collection of microsatellite markers. In addition to linkage findings on chromosome 1q, the data provide evidence for two additional schizophrenia loci on chromosomes 2q and 5q.

RESULTS

Genome-wide scan (stages 1 and 2)

Our nationwide schizophrenia study sample of more than 1200 individuals included two categories of families: those originating from an isolated municipality in northeastern Finland with a particularly high age-corrected lifetime risk for schizophrenia (IS) and those from elsewhere in Finland [all-Finland sample (AF)] (Table 1 and Fig. 1). These samples partially overlapped with those analyzed in two previous studies (3,12). However, at the genome-wide stage of the search, the overlap for the individual samples was only 8% for the IS sample and 11% for the AF sample (Table 1).

In the initial analyses with a marker set of 315 markers equally spaced across all 22 autosomes and the X chromosome, the IS and AF study samples were treated separately: two-point linkage analyses were performed in each subsample, using both liability classes 1 and 2 (LC1 and LC2). In order to evaluate the statistical significance of these findings, which is complicated by the multiple testing, we performed simulations using estimated allele frequencies, marker maps and existing pedigree structures. The markers giving LOD scores ≥1 were then chosen for a second-stage linkage analysis that also included the broader liability classes 3 and 4 (LC3 and LC4) in the IS and AF samples, and all four classes (LC1–LC4) in the combined sample (Com).

Figure 1. Stratification of the study sample according to genealogical data. When collecting the study sample, families were categorized into two classes: those in which at least one of the parents originated from the internal isolate (IS) and those in which both parents originated from elsewhere in Finland (AF). The AF sample could be further divided by choosing families with all known ancestors from the late-settlement region of Finland (AF/LS). The population in this region arose from a second wave of inhabitation that occurred in Finland in the 16th century.

Isolate sample. In the initial (stage 1) pairwise analyses of the 53 IS pedigrees, the best evidence for linkage was obtained with the marker D2S427 on chromosome 2q37. Using the recessive model of inheritance in nuclear families without genealogical links and for the most restricted diagnostic class (LC1), the LOD score (Zmax) reached for this marker was 4.43. When the full pedigree information was incorporated in the analyses, a Zmax of 2.94 was obtained. For markers on chromosomes 1 (D1S1728), 3 (D3S1311), 5 (GATA81C06) and 18 (D18S877), we obtained Zmax values ≥2. Altogether 37 autosomal markers and two of the X-chromosomal markers produced LOD scores with Zmax ≥ 1 in the stage 1 analysis of the IS families (Fig. 2A). These 39 markers were then chosen for two-point linkage analysis, also using the broader diagnostic classes LC3 and LC4 (stage 2 of the analyses). Typically, with the exception of a few markers, evidence for linkage remained the same or decreased, and none of the markers resulted in a LOD score ≥3. LOD scores ≥2 were obtained for markers on chromosomes 1, 3, 5 and 19, specifically D1S1728, D3S2460, D5S395, GATA81C06 and D19S418. Table 2 summarizes the results of the analyses of stages 1 and 2.

Table 2. Results of the two-point linkage analysis of the genome-wide scan

Material  Marker     Chromosome  Location (cM)  Zmax  LC  Pedigree typea      Mode of inheritance  Zmax in Com
IS        D1S1728    1           97.17          2.40  2   Extended pedigrees  Dominant             2.68
IS        D2S427     2           224.70         4.43  1   Nuclear families    Recessive            3.30
IS        D3S2460    3           127.52         2.17  3   Nuclear families    Recessive            0.89
IS        D3S1311    3           217.76         2.79  1   Extended pedigrees  Recessive            0.36
IS        D5S395     5           52.55          2.33  4   Nuclear families    Dominant             1.41
IS        GATA81C06  5           57.50          2.36  4   Nuclear families    Dominant             0.44
IS        D18S877    18          54.40          2.37  2   Nuclear families    Dominant             0.01
IS        D19S418    19          82.72          2.04  4   Extended pedigrees  Recessive            0.73
AF        D5S820     5           159.77         3.16  1   Nuclear families    Dominant             3.55
AF        D12S85     12          54.92          2.11  3   Nuclear families    Recessive            1.60
AF/LS     D5S1480    5           143.40         2.05  1   Nuclear families    Dominant             1.07

aRefers to the pedigrees used in linkage analyses producing the Zmax (Materials and Methods).

All-Finland sample. In the two-point linkage analysis of the 185 pedigrees from the nationwide sample, excluding the IS pedigrees, 27 of the markers of the basic scan set gave a Zmax ≥ 1 using the disease status of LC1 or LC2. The strongest evidence for linkage, a LOD score of 3.16, was obtained with the marker D5S820, using the diagnostic class of LC1 and the dominant inheritance model in the nuclear families. When the complete pedigree information was used, the evidence for linkage was slightly lower (Zmax = 2.86). No other marker provided a LOD score ≥2.0 in the first stage of the analysis (Fig. 2B and Table 2). In the stage 2 analyses with broader diagnostic classes (LC3 and LC4), only one of the markers, D12S85, gave a LOD score ≥2 (2.11) (Table 2). Thus, the strongest evidence for linkage was obtained with the marker D5S820 on the long arm of chromosome 5.

Families from the late settlement region (AF/LS). The AF sample could be stratified genealogically by grouping the families with ancestors from the late settlement region of Finland (AF/LS) (Fig. 1). A total of 118 families originating from the northern and eastern parts of the country were analyzed with the 27 markers that had produced a LOD score ≥1 in the AF sample.
In an analysis using all four LCs, only four markers (D1S431, D4S2394, D5S1480 and D12S1294) provided increased evidence for linkage as compared with the AF sample, and only the marker D5S1480 showed a Zmax ≥ 2 (Zmax = 2.05 in LC1) (Table 2). Combined study sample (Com). When all the study families were pooled together, the total study sample consisted of 238 pedigrees and more than 1200 genotyped individuals. This sample was analyzed with the 62 markers that had given a Zmax ≥ 1 in at least one analysis in either the IS or the AF sample, and all four diagnostic classes were used. The best evidence of linkage was obtained for chromosome 5q, in which a Zmax of 3.55 was obtained with the marker D5S820 in LC1 and a recessive inheritance model in the nuclear families. When LC1 was used with full pedigree information, the marker D2S427 also yielded a LOD score ≥3 (3.30) (Fig. 2C and Table 2). Simulation. Approximate solutions to the genome-wide significance of point-wise LOD scores have been derived on the assumption that meiotic information is complete (13). However, in reality, the situation is much more complicated by the presence of pedigrees of varying size and structure, often with a significant amount of missing genotype data. The proposed criteria of Lander and Kruglyak (13) are not entirely appropriate, and only a simulation based on the real markers and data structures used in a given study can provide realistic inferential guidance (14), especially when multiple analytical techniques have been used for data analysis. In this experiment, we estimated the distribution of the LOD scores maximized over all models within the AF and IS data sets separately and then maximized over both data sets. The resulting distribution functions are shown in Figure 3A. Figure 3B presents the density functions for the same LOD scores maximized over the models. 
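The idea of estimating a genome-wide threshold by simulation can be illustrated with a deliberately simplified sketch. It is not the simulation used in this study, which was based on the real markers, allele frequencies and pedigree structures. The sketch assumes that every marker and every analysis model yields an independent null LOD score, and that a null LOD score behaves like a one-sided normal deviate squared and divided by 2 ln 10. The function names, the choice of four models and the number of simulated scans are all hypothetical.

```python
import math
import random

LN10 = math.log(10.0)


def null_lod(rng):
    """Draw one pointwise LOD score under no linkage (toy one-sided model)."""
    z = rng.gauss(0.0, 1.0)
    # LOD = chi-square / (2 ln 10); negative deviates count as zero evidence.
    return max(z, 0.0) ** 2 / (2.0 * LN10)


def critical_lod(n_markers, n_models, alpha, n_scans, rng):
    """Empirical LOD threshold that the scan-wide maximum exceeds in roughly
    a fraction alpha of simulated null genome scans."""
    maxima = []
    for _ in range(n_scans):
        # Assumption: all marker-by-model tests are independent.
        best = max(null_lod(rng) for _ in range(n_markers * n_models))
        maxima.append(best)
    maxima.sort()
    index = min(n_scans - 1, math.ceil((1.0 - alpha) * n_scans))
    return maxima[index]


rng = random.Random(2001)  # fixed seed so the sketch is reproducible
single = critical_lod(315, 1, 0.05, 1000, rng)  # one model, 315 markers
multi = critical_lod(315, 4, 0.05, 1000, rng)   # hypothetical four models
penalty = multi - single
print(f"one model: {single:.2f}, four models: {multi:.2f}, penalty: {penalty:.2f}")
```

Because real analysis models are strongly correlated, treating them as independent overstates the cost of using several models. This is one reason a simulation on the actual data structures is needed.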
For a range of false positive rates, the equivalent LOD score value is indicated in Table 3, first for an average model within a given data set, then maximized over all models in each data set, and finally, maximized over both models and data sets. The highest LOD score observed in this study has a genome-wide significance of ∼0.01, which is more significant than the 0.05 cut-off recommended by Lander and Kruglyak (13). In other words, only one in 100 genome scans conducted on these data sets with the analysis models applied would be expected to lead to such a significant LOD score if there were no linkage to any marker in the genome scan. The data also show that the penalty for performing such a large number of analyses and maximizing over two independent data sets was only ∼0.8 LOD score units. This could be subtracted from any of the observed LOD scores, to get a statistic which is somewhat more comparable to traditional single-model LOD scores. After doing this, our best LOD score is still comparable in significance to a single-analysis LOD score of roughly 3.6.

Figure 2. Pairwise linkage analysis of the IS families (A) and the AF families (B) with the markers of the genome-wide set in LC1 and LC2 (stage 1 of the analyses). (C) The Com sample was analyzed with 62 markers that had given a Zmax ≥ 1 in stage 1, using all four diagnostic classes in the analyses.

Multipoint analysis of chromosomes 2q and 5q

According to the pairwise linkage analyses of stages 1 and 2 in the genome-wide search, two chromosomal regions appeared most interesting: chromosomes 2q and 5q. We conducted multipoint non-parametric analyses on these chromosomal regions using Simwalk2 software. We used the LC and study sample combination that had shown the best evidence for linkage in the two-point analyses in the corresponding region.

Chromosome 2q. Multipoint analyses were performed on chromosome 2q in the IS and AF samples as well as in the Com sample.
The best evidence for sharing of a marker allele was obtained for the marker D2S427 in the pedigrees of the Com sample with statistic A (P = 0.0032). Analyses of other data sets also yielded the best evidence for allele sharing at D2S427 (Table 4 and Fig. 4A).

Figure 3. Distribution (A) and density (B) functions estimated from the simulation study based on the MLE assuming that all the models had the same numbers of equivalent analyses in each data set (IS and AF); LOD scores maximized over models for each sample separately (IS-MOM and AF-MOM) and maximized over the two samples together (ALL). A shift of the order of magnitude of 1 LOD score can be observed in the simulated values when multiple models are used in the analyses.

Chromosome 5q. The Simwalk2 analyses were first conducted on chromosome 5q, using markers of the basic scan set in the Com, AF and AF/LS samples. In all of them, the best evidence for marker allele sharing was obtained with statistic B for the marker D5S1480, located 13 cM centromeric of D5S820, with the statistically most significant result obtained for the AF/LS study sample (P = 0.0016) (Fig. 4B).

Encouraged by these results, especially in view of previously conducted studies of schizophrenia families from other populations (15,16), we pursued fine mapping of chromosome 5q. A total of 30 additional markers in chromosome 5q were genotyped; 29 of these covered a distance of 14 cM with an average intermarker distance of 0.4 cM. In pairwise linkage analysis of the IS families, none of the markers on chromosome 5q gave a LOD score ≥2, whereas, in the AF sample, the best two-point LOD score value was obtained, as described earlier, with the marker D5S820 (Zmax = 3.16). However, in the genealogically stratified AF/LS sample, analyses with the marker D5S414 from the dense map, using LC4 and a recessive inheritance model in pedigrees, resulted in a Zmax of 3.56 (Table 4).
D5S414 is located 4 cM centromeric of the marker D5S1480, which had given the best LOD score (2.05) in stage 2 analyses of the AF/LS families.

Table 3. Critical LOD scores for given levels of genome-wide significance

Genome-wide  One model, one data set  Maximized over models (one data set)  Maximized over data
P-value      IS       AF              IS       AF                           sets and models
0.10         2.56     2.54            3.18     2.99                         3.38
0.05         2.85     2.83            3.48     3.28                         3.68
0.01         3.52     3.50            4.16     3.95                         4.36
0.001        4.48     4.45            5.11     4.91                         5.31

Table 4. Comparison of the results of two-point linkage analyses and Simwalk2 non-parametric multipoint analyses

                    Model                                                Two-point LOD score analysis  Simwalk2
Chromosome  Sample  LC  Type of families analyzed  Mode of inheritance  Marker  Zmax  P-valuea        Marker  Statisticb  P-value
2q          IS      1   Nuclear families           Recessive            D2S427  4.43  0.0000032       D2S427  A           0.013
2q          AF      1   Nuclear families           Dominant             D2S434  1.63  0.0031          D2S427  A           0.013
2q          Com     1   Pedigrees                  Recessive            D2S427  3.3   0.000049        D2S427  A           0.0032
5q          IS      1   Nuclear families           Recessive            D5S498  1.32  0.0069          D5S476  A           0.26
5q          AF      1   Nuclear families           Dominant             D5S820  3.16  0.000069        D5S210  A           0.0079
5q          AF/LS   4   Pedigrees                  Recessive            D5S414  3.56  0.000026        FGF1    A           0.00019
5q          Com     1   Nuclear families           Recessive            D5S820  3.55  0.000027        FGF1    A           0.0079

aAs can be calculated directly from the two-point LOD score under homogeneity.
bStatistic A measures the number of founder alleles contributing to the alleles of the affected individuals.

Finally, multipoint analyses were conducted, using 25 markers over an interval of 70 cM on chromosome 5q with the IS, AF, AF/LS and Com study samples. In every case, the best evidence for shared marker alleles in those affected was obtained between markers D5S414 and D5S1480. Just as in the two-point LOD score analyses, the most significant result was obtained for the families from the late settlement region at marker FGF1, located 2.7 cM centromeric to D5S1480 (P = 0.00019, statistic A).
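The pointwise P-values listed next to the two-point LOD scores in Table 4 appear consistent with the standard large-sample conversion, in which 2 ln(10) × LOD is referred to a chi-square distribution with one degree of freedom and the tail probability is halved because the test is one-sided. The sketch below assumes that this is the conversion meant by the table footnote; the function name is hypothetical.

```python
import math


def lod_to_pointwise_p(lod):
    """Approximate one-sided pointwise P-value for a two-point LOD score.

    Assumes 2 * ln(10) * LOD follows a chi-square distribution with one
    degree of freedom under no linkage, and halves the tail probability
    because linkage is a one-sided alternative.
    """
    chi_square = 2.0 * math.log(10.0) * lod
    # For one degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    return 0.5 * math.erfc(math.sqrt(chi_square / 2.0))


# Two-point Zmax values taken from Table 4.
for marker, zmax in [("D2S427", 4.43), ("D5S820", 3.16), ("D5S414", 3.56)]:
    print(f"{marker}: Zmax = {zmax}, pointwise P = {lod_to_pointwise_p(zmax):.2g}")
```

These are pointwise values only. They do not account for the number of markers and models tested, which is what the simulation-based thresholds in Table 3 address.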
The results of these analyses are summarized in Table 4 and Figure 4C.

DISCUSSION

Studying schizophrenia in the Finnish population: genetic and environmental aspects

Genetic studies of schizophrenia in study samples collected from different populations have given inconsistent results. Almost every autosome has a putative locus for schizophrenia and replication has been rare. This is most likely due to the complexity of the disease phenotype and its inheritance, but also to a host of other factors, such as diagnostic uncertainties, selection bias and variability in the methods of data collection (17). It is possible and even probable that many of the reported findings are false-positives, since the significance levels observed are typically within the range of magnitudes expected from a genome scan in the absence of a real signal. Furthermore, a uniform distribution of schizophrenia loci across the genome is consistent with what might be expected by chance alone.

Studies of large, statistically powerful samples are likely to produce more compelling results than those of more typical small-scale studies (17). Here, we present the results of the largest single study published so far, in which schizophrenia susceptibility loci have been studied in a genome-wide scan. Additionally and perhaps more importantly, all the families participating in this study originate from the same culturally, historically and genetically isolated population of Finland. They were ascertained according to the same study scheme, their genealogical background was carefully evaluated and consensus diagnoses were assessed after elaborate examination of lifetime diagnoses by the same team of experienced psychiatrists. Isolated populations established by a limited number of founders have proven useful for mapping genes of rare monogenic disorders, such as those of the Finnish disease heritage (18,19).
In genetic studies of complex traits, such as schizophrenia in primis, with multiple genes forming the genetic background, the advantage of genetic isolates is probably less distinct (20). However, families with multiple affected individuals arising in a population with a restricted number of founders still offer an advantage as compared with study material from more heterogeneous populations. Further, if environmental factors play an important role in the disease pathogenesis, homogeneity of the environmental components is a distinct advantage for genetic studies. The Finnish population has a relatively uniform culture, educational system, religion and language, and, in subpopulations, social and environmental similarities extend even further beyond the national homogeneity. The family structures can be determined reliably and common ancestors identified in subisolates, using complete population registers, church records and archived information dating from 1634 (3). The high quality of the health care system and uniform education of the doctors in only five medical schools make the registers very reliable, so that most of the serious ascertainment biases can be avoided. Finally, there is relatively good compliance of the population toward medical genetic research in Finland, so that collection of samples for a population-wide study is feasible.

Figure 4. Simwalk2 non-parametric analysis of chromosomes 2q (A) and 5q with the markers of the basic scan set (B) and with the dense map (C). In (A) and (B), the genetic distances used were based on information from the Marshfield meiotic map. In (C), data from the Human Genome Browser (http://genome.cse.ucsc.edu/index.html) were applied, presuming the equivalence of 1 × 10^6 bp to 1 cM in the analyses.

Table 5. Comparison of chromosomal regions identified in Finnish schizophrenia families

                                            Chr 1q         Chr 4q         Chr 7q        Chr 9q         Chr Xp
Material                                    Marker   Zmax  Marker   Zmax  Marker  Zmax  Marker   Zmax  Marker   Zmax
IS                                          None     >1    D4S2361  1.3   None    >1    None     >1    DXS1214  1.85
AF                                          D1S1595  1.51  D4S2394  1.04  None    >1    D9S1825  1.28  None     >1
IS without families in Hovatta et al. (3)   D1S1609  1.8   None     >1    None    >1    None     >1    DXS7108  1.17
AF without families in Ekelund et al. (12)  None     >1    D4S2394  1.8   None    >1    D9S934   1.16  None     >1

Even in genetic isolates, the issue of locus heterogeneity in complex traits is far from simple. During the past years, this matter has typically been tackled by studying linkage disequilibrium (LD) in selected chromosomal regions on the assumption that, in inbred populations, increased LD between adjacent markers should be identifiable. This feature has facilitated the LD-based gene mapping and cloning of numerous rare recessive diseases (19). The intervals of LD around common alleles have been less striking than around rare alleles, but evidence exists for wider LD intervals in the chromosomes of Finns in general as compared with those of more mixed populations (21–24). Furthermore, a recent study demonstrates that, even within expanding populations, subisolates with dramatically increased LD can exist (25).

Replication of earlier findings in the Finnish family material

We have carried out two earlier genome scans in Finnish schizophrenia families (3,12). In the first one, using only 20 families from the internal subisolate, the best evidence for linkage was obtained for the markers on chromosome 1q with a Zmax of 3.8 and a potential 6 cM haplotype in some families. Other chromosomal regions with markers showing some evidence for linkage included 4q, 9q and Xp (3). In the second study, which was performed using 134 sib-pair families from the whole of Finland, some evidence emerged for linkage on chromosome 1q (Zmax = 2.62), whereas markers on 7q showed the best evidence for linkage (Zmax = 3.18) (12).

In the present study, we found some evidence for linkage with markers on chromosomes 1q, 4q and 9q, LOD scores ≥1 being obtained in the initial (stage 1) analyses in either IS or AF families (Table 5). However, no evidence was obtained for linkage with the markers of the basic scan set on chromosome 7q. Since this sample overlaps with those in the two previously conducted studies, we also performed pairwise linkage analyses with markers on the above-mentioned chromosomal regions, excluding all previously analyzed families. In these analyses also, evidence for linkage on 1q, 4q, 9q and Xp remained, with LOD scores ≥1 obtained for the markers in those regions (Table 5).

Among the chromosomal regions linked to schizophrenia in Finland, the long arm of chromosome 1 (1q31–1q42) emerges as particularly promising. In addition to the above-mentioned findings in Finnish families, there has been a linkage finding in families with bipolar disorder and the observation of a translocation in a Scottish pedigree with schizophrenia spectrum conditions (3,12,26,27). Consequently, we performed fine mapping of this wide chromosomal region and the results are described elsewhere (28).

Chromosomes 2q and 5q as newly implicated putative loci for schizophrenia susceptibility genes in Finnish families

The present study revealed two interesting loci on chromosomes 2q and 5q, potentially containing genes predisposing to schizophrenia. For chromosome 2q, the best evidence of linkage was obtained in the IS families with a Zmax of 4.43 for marker D2S427. Non-parametric multipoint analysis with statistic A of Simwalk2 software gave P = 0.013. When the Com study sample was analyzed, a maximum LOD score of 3.3 was obtained in pairwise linkage analyses and the multipoint analysis using statistic A of Simwalk2 showed some evidence for allele sharing with P = 0.0032.
Findings of linkage to 2q have been reported by other investigators (29,30), but the markers giving the statistically most significant signals for linkage to schizophrenia were located >100 cM centromeric from D2S427.

For chromosome 5q, the best evidence of linkage was obtained with marker D5S820 in the initial (stage 1) analyses of the AF study sample. With analysis of the Com study sample, the maximum LOD score increased to 3.55. When markers of the dense map were included, the best two-point LOD score value (3.56) was obtained for the AF/LS families with the marker D5S414, located 15–20 cM centromeric to the marker D5S820, but only 4 cM from the marker D5S1480, which had given the best evidence for linkage in the initial multipoint analyses. The most significant evidence for allele sharing in the Simwalk2 analyses with the markers of the denser map was obtained, using statistic A, for the AF/LS families at marker FGF1, located in the vicinity of D5S414, with P = 0.00019. Thus, the chromosomal interval between markers D5S479 and D5S636, covering 17 cM, remains a potentially interesting region for predisposing genes in Finnish schizophrenia families.

The first finding of linkage of schizophrenia to 5q was published more than a decade ago by Sherrington et al. (31) in Icelandic and British families, but the report was later retracted on the basis of updated information (32). However, a recently conducted genome-wide scan with 13 pedigrees from Iceland and the UK, using schizophrenia spectrum diagnosis, gave a three-point LOD score of 3.6 at D5S422 (33). This marker is located ∼20 cM telomeric from D5S414. In a study of 265 Irish pedigrees with schizophrenia, a multipoint heterogeneity LOD score of 3.04 was found with marker D5S804, located 8 cM centromeric to marker D5S414, under a narrow phenotypic definition and with a recessive genetic model (16).
In German and Israeli families, additional support was obtained at marker D5S399 located only 1.1 cM from D5S414, with a Zmax of 1.8 from multipoint analysis (15). In a pedigree with eight cases of schizophrenia from Palau, Micronesia, a pairwise LOD score of 2.66 was obtained at the marker D5S1480 (34). Finally, in a reanalysis, using Simwalk2, of previously published genome screen data from a Costa Rican pedigree with bipolar disorder, a new locus on chromosome 5q was suggested, with best evidence for allele sharing at D5S436 (P = 0.04 with allele sharing statistic D of Simwalk2) (35). Thus, multiple lines of evidence now imply the possible involvement of 5q as a region containing genes for schizophrenia or psychosis susceptibility (Fig. 5), although a recent multicenter study failed to add to the existing evidence for linkage to 5q (36). In addition to the above-mentioned findings in other linkage studies, a de novo interstitial deletion of paternal origin, extending from band q22 to q23.2, has been reported. The deleted region contained the marker D5S804, but not the marker D5S399 (37).

In conclusion, as implied by numerous other studies, schizophrenia, even in an isolated population, appears to be a genetically complex trait. Four of the five chromosomal loci previously identified in Finnish schizophrenia families (3,12) also remain interesting in the genome scan of the extensive study sample described here. The major finding is the evidence obtained for linkage of schizophrenia or schizophrenia spectrum conditions to two novel loci on chromosomes 2 and 5. The implication of the chromosome 5q finding is particularly interesting, since at least five other independent studies have shown evidence for linkage in the same region.
MATERIALS AND METHODS

Collection of the study sample

By using nationwide registers of hospitalization for schizophrenia (hospital discharge registers), disability pensions and free medication (Social Insurance Institution), we identified 30 339 individuals born between 1940 and 1969 who had a diagnosis of schizophrenia between 1969 and 1991. The data were combined with those from the National Population Register, using each individual’s unique identification code, to construct core families. This ascertainment strategy, procedures for acquiring permission to access the registers and issues concerning informed consent have been addressed previously (3,38). Each of the probands was contacted by his or her own clinician. The probands were asked for written informed consent and permission to contact their first-degree family members. Collection of blood samples was carried out as recommended in the Helsinki Declaration.

Figure 5. Multiple lines of evidence imply the possible involvement of chromosome 5q as a region containing susceptibility loci for schizophrenia or other types of psychosis. Positive findings in linkage analyses have been obtained from various populations, namely, from Iceland and Britain (33), Ireland (16), Germany and Israel (15), Palau, Micronesia (34) and Costa Rica (35). The intervals between the markers giving the best signal in each study are indicated in units based on sequence data from the Human Genome Project (left) and on the Marshfield genetic map (right).

Assessment of diagnoses

All available inpatient and outpatient records were collected from the probands and from those of their relatives who had had any psychiatric diagnosis, according to the registers. Registration of schizophrenia has been shown, in several studies, to be highly reliable in Finland (10,39–41).
Consensus diagnosis was made by two psychiatrists blind to family structure to give a best-estimate lifetime diagnosis according to the criteria of the Diagnostic and Statistical Manual of Mental Disorders, 4th edition (DSM-IV). According to DSM-IV, the typical symptoms of schizophrenia include psychotic symptoms (hallucinations or delusions, disorganized speech, disorganized or catatonic behavior) and negative symptoms, such as poverty of speech. The Operational Criteria (OPCRIT) checklist (42) was also completed for the affected individuals. In the event of disagreement, diagnosis was based on the opinion of a third reviewer. Using this diagnostic approach, agreement between different psychiatrists on lifetime diagnoses has been shown to be good (3,12).

Affected individuals were divided into four LCs according to the consensus diagnosis. The narrowest was LC1, consisting only of individuals with schizophrenia. LC2 also included individuals with schizoaffective disorder, LC3 added those with schizophrenia spectrum conditions (schizoid, schizotypal and paranoid personality disorder, schizophreniform, delusional and brief psychotic disorder, as well as psychosis NUD) and LC4 was the broadest, also including all individuals with severe major affective disorders. Of the 1265 individuals participating in the study, 511 (41%) were affected according to LC2 and 80 (6%) more were affected according to LC4 (Table 1). Males were slightly over-represented (59%) among the affected individuals included.

Families participating in the study

We categorized families into two classes: those in which at least one of the parents of the proband originated from the IS and those in which both parents came from elsewhere in Finland (AF). The AF sample was further stratified genealogically by identifying families originating from the late settlement region of Finland (AF/LS).
The population of this region arose from a second wave of inhabitation that occurred in Finland in the 16th century (24). At that time, a small population in South-Savo inhabited the northern and eastern parts of the country, including the isolate (Fig. 1). The names, dates and places of birth of the ancestors were traced via local church registers going as far back as 1850. For earlier periods, microfilm and microfiche copies of all local church records, available in the National Archives of Finland, were used for genealogical studies.

Altogether 1265 individuals from 238 pedigrees participated in the study; 53 pedigrees originated from the IS. The largest pedigree (a2) in the IS sample consisted of 151 individuals, including 19 sibships with DNA available from 88 individuals, 31 of whom were affected. Of the 185 pedigrees in the AF sample, 118 originated from the late settlement region of Finland (AF/LS) with all known ancestors born in that region. In 27 families, all known ancestors originated from the early settlement region in the southern part of Finland, and in the remaining 40 families, a varying proportion of the ancestors came from the early and late settlement regions, respectively. Table 1 gives the numbers of nuclear families, pedigrees, and affected individuals per pedigree that participated in the study. In the majority (88%) of the pedigrees, DNA was available from at least two affected individuals and, in 75 pedigrees (32%), DNA was available from three or more affected individuals.

The present study sample partially overlaps with those analyzed in two previous studies (3,12). Approximately one-half of the individuals (46%) had been included in the previous analyses. However, only a fraction of these individuals were included in the initial, genome-wide stage of the scan in the earlier studies, so that the overlap in the genome-wide search is 8% for the IS sample and 11% for the AF sample (Table 1).
Genotyping

A genome-wide scan was carried out, using a marker set of 315 markers spaced at ∼10–20 cM intervals on average across all the human chromosomes (analyzed in stages 1 and 2). The markers, di-, tri- and tetranucleotide repeats, were selected from the CHLC-6 set and supplemental markers were added from the Généthon map. An additional 30 markers were mapped on chromosome 5q, 29 of them covering a distance of 14 cM. The maps we used incorporated information from the Marshfield meiotic map and Stanford University G3 RH data. In addition, for estimation of the marker order and intermarker distances on chromosome 5, we used the Human Genome Browser (http://genome.cse.ucsc.edu/index.html; draft assembly issued on January 9, 2001), which is based on sequence data from the Human Genome Project. In the dense map region, the genetic distance was estimated from the physical distance by presuming the equivalence of 1 × 10⁶ bp to 1 cM. Polymerase chain reactions (PCR) were set up with 20 ng genomic DNA, and PCR cycling consisted of denaturation at 95°C for 5 min, followed by 30 cycles at 95°C for 30 s, five cycles at 5°C for 30 s, and at 72°C for 60 s, and concluded with a final extension at 72°C for 10 min. The gels were run on an Applied Biosystems (ABI) 377 DNA sequencer, using ABI Prism 377 data collection software. The data were analyzed with ABI Prism GeneScan 2.0.2 and Genotyper 1.1.1.

Statistical analyses

Two-point LOD score analysis. The analyses were done using two affecteds-only models, one for dominant and the other for recessive gene inheritance, each assuming an absence of phenocopies and a very rare disease allele for the reasons outlined by Göring and Terwilliger (43), who demonstrated that these models lead to LOD score tests conceptually equivalent to model-free analysis, but with superior statistical properties under the null hypothesis.
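As a reminder of what a two-point LOD score and its maximum (Zmax) measure, the following minimal sketch evaluates the textbook LOD score for phase-known, fully informative meioses. This is a deliberately simplified illustration and is not the affecteds-only pedigree likelihood used in this study; the function names `lod` and `zmax` and the grid search over the recombination fraction are assumptions made for the example.

```python
import math

def lod(theta, recombinants, meioses):
    """Textbook two-point LOD score for phase-known meioses.

    Z(theta) = log10[ theta^k * (1 - theta)^(n - k) / 0.5^n ],
    where k = recombinants and n = meioses. Illustration only.
    """
    k, n = recombinants, meioses
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    if not 0 <= k <= n:
        raise ValueError("recombinants must lie in [0, meioses]")
    if k > 0 and theta == 0.0:
        return float("-inf")  # recombinants are impossible at theta = 0
    log_like = 0.0
    if k > 0:
        log_like += k * math.log10(theta)
    if n - k > 0:
        log_like += (n - k) * math.log10(1.0 - theta)
    # Subtract the log10 likelihood under free recombination (theta = 0.5).
    return log_like - n * math.log10(0.5)

def zmax(recombinants, meioses, steps=500):
    """Maximize the LOD score over an evenly spaced grid of theta values."""
    grid = [0.5 * i / steps for i in range(steps + 1)]
    best_theta = max(grid, key=lambda t: lod(t, recombinants, meioses))
    return best_theta, lod(best_theta, recombinants, meioses)
```

With one recombinant in ten meioses, the grid maximum falls at a recombination fraction of 0.1; with no recombinants, the LOD score at zero recombination is simply the number of meioses multiplied by log10(2).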
The analyses were done with the MLINK program of the LINKAGE package (44), using the ANALYZE package to streamline the analysis (45). Marker locus allele frequencies were estimated by gene counting (46), applying the DOWNFREQ program on the conservative assumption that all the individuals in a pedigree are a random sample of the population, an approximation that is mandated by the paucity of founders available for genotyping and the complexity of some of the pedigree and population structures (47). Because the overwhelming majority of families were small nuclear pedigrees, there was no power to detect locus heterogeneity or epistatic interactions, owing to insufficient degrees of freedom in the data. For this reason, such analyses were not done systematically (48). The analyses were done according to the following strategy: at stage 1, we used LC1 and LC2 as criteria for the disease status and analyzed the IS and AF samples separately. At stage 2, markers that had given LOD scores of at least 1 were included, so that the criteria of LC3 and LC4 were applied in the analyses of the IS and AF samples. In addition, the AF/LS sample and the Com sample were analyzed for these selected markers in all the diagnostic models, LC1–LC4.

Simulation. To evaluate the genome-wide statistical significance of the highest LOD score obtained in this experiment, we performed a simulation study with our data set. Genotypes were simulated for all the markers used in the scan, the marker allele frequencies and intermarker distances being fixed at the estimates assumed from our analysis. At each marker locus, genotypes were assigned only to those individuals who were genotyped for that marker in the actual study. One such simulation was performed for each of the 22 autosomes independently, the maximum LOD score (Zmax) over that chromosome being stored separately for the AF and IS data sets for each of the 100 replicates.
At the end, an array of maximum LOD scores was available and one replicate of each chromosome, model, diagnosis and data set, respectively, was randomly selected to represent a simulated whole genome scan. The LOD score was maximized over the 22 autosomes in order to get a genome-wide first-order statistic separately for each model, diagnosis and data set and, subsequently, over all the models and diagnoses. Finally, the distribution of such genome-wide first-order statistics was estimated from the resulting empirical distributions. It was not computationally feasible to simulate and analyze the thousands of replicates of the entire genome scan in the two data sets (IS and AF) needed to accurately estimate the small P-values about which we wish to make inferences. Therefore, we performed a maximum-likelihood estimation (MLE) of the equivalent number of independent tests performed in these scans. The simulated and bootstrapped histogram of the Zmax values was fitted to the density function of the first-order statistic of N independent χ² random variables, on the assumption that a given two-point LOD score is distributed according to 2Z ln(10) ∼ 0.5χ²(1) pointwise. When the equivalent number of independent tests was considered to be equal for all models, the estimated numbers were 353 for IS and 338 for AF. This difference is due to the fact that, in the AF sample, the pedigrees are generally smaller and the autocorrelation of the LOD score between linked markers was therefore greater than in the larger pedigrees with many more ungenotyped family members characteristic of the IS sample. As expected, the differences between the models were statistically significant when whole pedigree analyses were compared with nuclear families only (P < 0.001), most strikingly in the isolate sample (P < 0.00001). Additionally, there was concern about the effect of maximizing the LOD score on eight models in each of the samples separately.
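The first-order-statistic assumption described above can be sketched numerically. The following is a minimal illustration, not the fitting procedure used in this study: it takes the equivalent number of independent tests as given rather than estimating it by MLE, and it assumes that 2Z ln(10) ∼ 0.5χ²(1) means that, under the null hypothesis, a LOD score is zero with probability 0.5 and 2Z ln(10) otherwise follows a χ² distribution with 1 degree of freedom. The function names are hypothetical.

```python
import math

def pointwise_p(z):
    """Null probability that a single two-point LOD score is >= z."""
    if z <= 0.0:
        return 1.0
    # For 1 degree of freedom, P(chi2 >= x) = erfc(sqrt(x / 2)),
    # and here x = 2 * z * ln(10), so x / 2 = z * ln(10).
    return 0.5 * math.erfc(math.sqrt(z * math.log(10.0)))

def genomewide_p(z, n_tests):
    """P(maximum of n_tests independent LOD scores >= z) under the null."""
    p = pointwise_p(z)
    if p >= 1.0:
        return 1.0
    # 1 - (1 - p)^n, computed stably for very small p.
    return -math.expm1(n_tests * math.log1p(-p))

def critical_lod(alpha, n_tests, hi=50.0, iterations=200):
    """LOD threshold whose genome-wide P-value equals alpha (bisection)."""
    lo = 0.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if genomewide_p(mid, n_tests) > alpha:
            lo = mid  # threshold too lenient, move up
        else:
            hi = mid  # threshold strict enough, move down
    return hi
```

Calling `critical_lod(0.05, n_tests)` with an equivalent number of tests, such as the 353 estimated for the IS sample, returns the threshold implied by this simplified model; it should not be expected to reproduce the critical values of Table 3 exactly.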
The ‘equivalent number of independent tests’ estimated for the LOD scores maximized over the models was 1008 in the AF sample and 1654 in the IS sample, the difference being statistically significant (P < 10⁻¹⁰). In the AF sample, the eight models applied were equivalent stochastically to three independent analyses, whereas in the isolate sample, they were equivalent to ∼4.7 independent analyses. The difference is largely due to the many more degrees of freedom in the larger pedigrees of the IS sample (14), making the effect of the different models on the predicted genotypes more variable than in the simpler family structures of the AF sample. Since the two study samples were independent, the effect of maximizing the LOD scores for both studies with eight models in each is roughly equivalent to 2662 independent χ² tests. Based on the equivalent number of independent χ² tests, we determined the appropriate critical values for inference shown in Table 3 for this experiment.

Multipoint analysis. For the regions showing the strongest evidence of linkage in the two-point linkage analysis, multipoint non-parametric analysis was performed, using SIMWALK 2.60 (49). This program slides an imaginary trait locus across the marker map. It calculates several statistics at each position of the map by using Markov chain Monte Carlo methods to sample patterns from the complete distribution of underlying inheritance, in proportion to their likelihood, which is calculated from the genotype data observed. In this study, two of the statistics, A and B, were followed systematically. Statistic A measures the total number of different founder alleles contributing to the alleles of the affected individuals, whereas statistic B measures the maximum number of alleles among the affected individuals identical by descent from any founder.
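The definitions of statistics A and B can be illustrated on a single, fully known inheritance pattern. The sketch below is a toy example only: it is not the SIMWALK implementation, which evaluates such statistics over many sampled inheritance patterns, and the function name and the founder-allele labels are hypothetical.

```python
from collections import Counter

def sharing_statistics(founder_labels):
    """Toy versions of statistics A and B for one inheritance pattern.

    founder_labels holds one entry per allele carried by the affected
    individuals at a given map position; each entry names the founder
    allele from which that allele descends.
    """
    if not founder_labels:
        raise ValueError("at least one allele is required")
    counts = Counter(founder_labels)
    stat_a = len(counts)           # number of distinct founder alleles
    stat_b = max(counts.values())  # most copies traced to one founder allele
    return stat_a, stat_b

# Three affected individuals (six alleles), three of which descend
# from the same founder allele "f1".
shared = sharing_statistics(["f1", "f1", "f1", "f2", "f3", "f4"])

# The same six alleles with no sharing at all.
unshared = sharing_statistics(["f1", "f2", "f3", "f4", "f5", "f6"])
```

In this toy setting, excess allele sharing among affected individuals shows up as a smaller value of A and a larger value of B than in the unshared case.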
In the analyses we chose the LC and family type that had yielded the strongest evidence for linkage in the two-point analyses of the markers in the studied region.

ACKNOWLEDGEMENTS

We would like to thank Ms M. Schreck and Messrs S. Maruti, T. Perheentupa, P. Haimi and A. Tanskanen for their part in the computational issues of the paper. The authors also want to warmly thank the participating patients and their families as well as all the individuals who participated in collecting the samples. The work was supported by Millennium Pharmaceuticals, Inc.

REFERENCES

1. Schultz, S. and Andreasen, N. (1999) Schizophrenia. Lancet, 353, 1425–1430.
2. Kendler, K., Gallagher, T., Abelson, J. and Kessler, R. (1996) Lifetime prevalence, demographic risk factors, and diagnostic validity of nonaffective psychosis as assessed in a US community sample. Arch. Gen. Psychiatry, 53, 1022–1031.
3. Hovatta, I., Varilo, T., Suvisaari, J., Terwilliger, J.D., Ollikainen, V., Arajärvi, R., Juvonen, H., Kokko-Sahin, M.L., Väisänen, L., Mannila, H. et al. (1999) A genomewide screen for schizophrenia genes in an isolated Finnish subpopulation, suggesting multiple susceptibility loci. Am. J. Hum. Genet., 65, 1114–1124.
4. Mednick, S.A., Machon, R.A., Huttunen, M.O. and Bonnett, D. (1988) Adult schizophrenia following prenatal exposure to an influenza epidemic. Arch. Gen. Psychiatry, 45, 189–192.
5. Sacker, A., Done, D., Crow, T. and Golding, J. (1995) Antecedents of schizophrenia and affective illness. Obstetric complications. Br. J. Psychiatry, 166, 734–741.
6. Alanen, Y. (1990) Need-adapted treatment of schizophrenia and other psychoses: notes on the theoretical background and practical issues. Psychiatr. Fennica, 21, 31–43.
7. Cannon, T. and Mednick, S. (1993) The schizophrenia high-risk project in Copenhagen: three decades of progress. Acta Psychiatr. Scand., 370 (suppl.), 33–47.
8. Kendler, K. and Diehl, S. (1993) The genetics of schizophrenia: a current, genetic–epidemiologic perspective. Schizophr. Bull., 19, 2.
9. Tienari, P., Wynne, L.C., Moring, J., Laksy, K., Nieminen, P., Sorri, A., Lahti, I., Wahlberg, K.E., Naarala, M., Kurki-Suonio, K. et al. (2000) Finnish adoptive family study: sample selection and adoptee DSM-III-R diagnoses. Acta Psychiatr. Scand., 101, 433–443.
10. Cannon, T.D., Kaprio, J., Lonnqvist, J., Huttunen, M. and Koskenvuo, M. (1998) The genetic epidemiology of schizophrenia in a Finnish twin cohort. A population-based modeling study. Arch. Gen. Psychiatry, 55, 67–74.
11. Cardno, A.G., Marshall, E.J., Coid, B., Macdonald, A.M., Ribchester, T.R., Davies, N.J., Venturi, P., Jones, L.A., Lewis, S.W., Sham, P.C. et al. (1999) Heritability estimates for psychotic disorders: the Maudsley twin psychosis series. Arch. Gen. Psychiatry, 56, 162–168.
12. Ekelund, J., Lichtermann, D., Hovatta, I., Ellonen, P., Suvisaari, J., Terwilliger, J.D., Juvonen, H., Varilo, T., Arajärvi, R., Kokko-Sahin, M.L. et al. (2000) Genome-wide scan for schizophrenia in the Finnish population: evidence for a locus on chromosome 7q22. Hum. Mol. Genet., 12, 1049–1057.
13. Lander, E. and Kruglyak, L. (1995) Genetic dissection of complex traits: guidelines for interpreting and reporting linkage results. Nat. Genet., 11, 241–247.
14. Sawcer, S., Jones, H.B., Judge, D., Visser, F., Compston, A., Goodfellow, P.N. and Clayton, D. (1997) Empirical genomewide significance levels established by whole genome simulations. Genet. Epidemiol., 14, 223–229.
15. Schwab, S.G., Eckstein, G.N., Hallmayer, J., Lerer, B., Albus, M., Borrmann, M., Lichtermann, D., Ertl, M.A., Maier, W. and Wildenauer, D.B. (1997) Evidence suggestive of a locus on chromosome 5q31 contributing to susceptibility for schizophrenia in German and Israeli families by multipoint affected sib-pair linkage analysis. Mol. Psychiatry, 2, 156–160.
16. Straub, R.E., MacLean, C.J., O’Neill, F.A., Walsh, D. and Kendler, K.S. (1997) Support for a possible schizophrenia vulnerability locus in region 5q22–31 in Irish families. Mol. Psychiatry, 2, 148–155.
17. Baron, M. (2001) Genetics of schizophrenia and the new millennium: progress and pitfalls. Am. J. Hum. Genet., 68, 299–312.
18. Norio, R., Nevanlinna, H.R. and Perheentupa, J. (1973) Hereditary diseases in Finland; rare flora in rare soul. Ann. Clin. Res., 5, 109–141.
19. Peltonen, L., Jalanko, A. and Varilo, T. (1999) Molecular genetics of the Finnish disease heritage. Hum. Mol. Genet., 8, 1913–1923.
20. Peltonen, L., Palotie, A. and Lange, K. (2000) Use of population isolates for mapping complex traits. Nat. Rev. Genet., 1, 182–190.
21. Eaves, I.A., Merriman, T.R., Barber, R.A., Nutland, S., Tuomilehto-Wolf, E., Tuomilehto, J., Cucca, F. and Todd, J.A. (2000) The genetically isolated populations of Finland and Sardinia may not be a panacea for linkage disequilibrium mapping of common disease genes. Nat. Genet., 25, 320–323.
22. Taillon-Miller, P., Bauer-Sardina, I., Saccone, N.L., Putzel, J., Laitinen, T., Cao, A., Kere, J., Pilia, G., Rice, J.P. and Kwok, P.Y. (2000) Juxtaposed regions of extensive and minimal linkage disequilibrium in human Xq25 and Xq28. Nat. Genet., 25, 324–328.
23. Boehnke, M. (2000) A look at linkage disequilibrium. Nat. Genet., 25, 246–247.
24. Varilo, T., Laan, M., Hovatta, I., Wiebe, V., Terwilliger, J. and Peltonen, L. (2000) Linkage disequilibrium and the demographic history of isolated populations: Finland and a young subpopulation of Kuusamo. Eur. J. Hum. Genet., 8, 604–612.
25. Zavattari, P., Deidda, E., Whalen, M., Lampis, R., Mulargia, A., Loddo, M., Eaves, I., Mastio, G., Todd, J.A. and Cucca, F. (2000) Major factors influencing linkage disequilibrium by analysis of different chromosome regions in distinct populations: demography, chromosome recombination frequency and selection. Hum. Mol. Genet., 9, 2947–2957.
26. Detera-Wadleigh, S.D., Badner, J.A., Berrettini, W.H., Yoshikawa, T., Goldin, L.R., Turner, G., Rollins, D.Y., Moses, T., Sanders, A.R., Karkera, J.D. et al. (1999) A high-density genome scan detects evidence for a bipolar-disorder susceptibility locus on 13q32 and other potential loci on 1q32 and 18p11.2. Proc. Natl Acad. Sci. USA, 96, 5604–5609.
27. Millar, J.K., Wilson-Annan, J.C., Anderson, S., Christie, S., Taylor, M.S., Semple, C.A., Devon, R.S., Clair, D.M., Muir, W.J., Blackwood, D.H. et al. (2000) Disruption of two novel genes by a translocation co-segregating with schizophrenia. Hum. Mol. Genet., 22, 1415–1423.
28. Ekelund, J., Hovatta, I., Parker, A., Paunio, T., Varilo, T., Martin, R., Suhonen, J., Ellonen, P., Chan, G., Sinsheimer, J.S. et al. (2001) Chromosome 1 loci in Finnish Schizophrenia Families. Hum. Mol. Genet., 10, 1611–1617.
29. Faraone, S.V., Matise, T., Svrakic, D., Pepple, J., Malaspina, D., Suarez, B., Hampe, C., Zambuto, C.T., Schmitt, K., Meyer, J. et al. (1998) Genome scan of European–American schizophrenia pedigrees: results of the NIMH Genetics Initiative and Millennium Consortium. Am. J. Med. Genet., 81, 290–295.
30. Levinson, D.F., Mahtani, M.M., Nancarrow, D.J., Brown, D.M., Kruglyak, L., Kirby, A., Hayward, N.K., Crowe, R.R., Andreasen, N.C., Black, D.W. et al. (1998) Genome scan of schizophrenia. Am. J. Psychiatry, 155, 741–750.
31. Sherrington, R., Brynjolfsson, J., Petursson, H., Potter, M., Dudleston, D., Barraclough, B., Wasmuth, J., Dobbs, M. and Gurling, H. (1988) Localization of a susceptibility locus for schizophrenia on chromosome 5. Nature, 336, 164–167.
32. Kalsi, G., Mankoo, B., Curtis, D., Sherrington, R., Melmer, G., Brynjolfsson, J., Sigmundsson, T., Read, T., Murphy, P., Petursson, H. et al. (1999) New DNA markers with increased informativeness show diminished support for a chromosome 5q11–13 schizophrenia susceptibility locus and exclude linkage in two new cohorts of British and Icelandic families. Ann. Hum. Genet., 63, 235–247.
33. Gurling, H.M., Kalsi, G., Brynjolfson, J., Sigmundsson, T., Sherrington, R., Mankoo, B.S., Read, T., Murphy, P., Blaveri, E., McQuillin, A. et al. (2001) Genomewide genetic linkage analysis confirms the presence of susceptibility loci for schizophrenia, on chromosomes 1q32.2, 5q33.2, and 8p21–22 and provides support for linkage to schizophrenia, on chromosomes 11q23.3–24 and 20q12.1–11.23. Am. J. Hum. Genet., 68, 661–673.
34. Byerley, W., Tiobech, S., Blakis, A., Zuo, J., Zhao, M., Hoff, M., Bennet, P., Caleb, O. and Myles-Worsley, M. (1999) Evidence for a 5q31 schizophrenia locus in a large multiplex kindred from Palau, Micronesia. Mol. Psychiatry, 4, 4.
35. Garner, C., McInnes, L.A., Service, S.K., Spesny, M., Fournier, E., Leon, P. and Freimer, N.B. (2001) Linkage analysis of a complex pedigree with severe bipolar disorder, using a Markov chain Monte Carlo method. Am. J. Hum. Genet., 68, 1061–1064.
36. Levinson, D.F., Holmans, P., Straub, R.E., Owen, M.J., Wildenauer, D.B., Gejman, P.V., Pulver, A.E., Laurent, C., Kendler, K.S., Walsh, D. et al. (2000) Multicenter linkage study of schizophrenia candidate regions on chromosomes 5q, 6q, 10p, and 13q: schizophrenia linkage collaborative group III. Am. J. Hum. Genet., 67, 652–663.
37. Bennett, R.L., Karayiorgou, M., Sobin, C.A., Norwood, T.H. and Kay, M.A. (1997) Identification of an interstitial deletion in an adult female with schizophrenia, mental retardation, and dysmorphic features: further support for a putative schizophrenia-susceptibility locus at 5q21–23.1. Am. J. Hum. Genet., 61, 1450–1454.
38. Suvisaari, J.M., Haukka, J.K., Tanskanen, A.J. and Lonnqvist, J.K. (1999) Decline in the incidence of schizophrenia in Finnish cohorts born from 1954 to 1965. Arch. Gen. Psychiatry, 56, 733–740.
39. Pakaslahti, A. (1987) On the diagnosis of schizophrenic psychoses in clinical practice. Psychiatr. Fennica, 18, 63–72.
40. Isohanni, M., Makikyro, T., Moring, J., Rasanen, P., Hakko, H., Partanen, U., Koiranen, M. and Jones, P. (1997) A comparison of clinical and research DSM-III-R diagnoses of schizophrenia in a Finnish national birth cohort. Clinical and research diagnoses of schizophrenia. Soc. Psychiatry Psychiatr. Epidemiol., 32, 303–308.
41. Mäkikyrö, T., Isohanni, M., Moring, J., Hakko, H., Hovatta, I. and Lönnqvist, J. (1998) Accuracy of register-based schizophrenia diagnoses in a genetic study. Eur. Psychiatry, 13, 57–62.
42. McGuffin, P., Farmer, A. and Harvey, I. (1991) A polydiagnostic application of operational criteria in studies of psychotic illness. Development and reliability of the OPCRIT system [news]. Arch. Gen. Psychiatry, 48, 764–770.
43. Göring, H.H. and Terwilliger, J.D. (2000) Linkage analysis in the presence of errors IV: joint pseudomarker analysis of linkage and/or linkage disequilibrium on a mixture of pedigrees and singletons when the mode of inheritance cannot be accurately specified. Am. J. Hum. Genet., 66, 1310–1327.
44. Lathrop, G.M., Lalouel, J.M., Julier, C. and Ott, J. (1985) Multilocus linkage analysis in humans: detection of linkage and estimation of recombination. Am. J. Hum. Genet., 37, 482–498.
45. Terwilliger, J.D. and Göring, H.H. (2000) Gene mapping in the 20th and 21st centuries: statistical methods, data analysis, and experimental design. Hum. Biol., 72, 63–132.
46. Smith, C. (1957) Counting methods in genetical statistics. Ann. Hum. Genet., 21, 254–276.
47. Göring, H.H. and Terwilliger, J.D. (2000) Linkage analysis in the presence of errors III: marker loci and their map as nuisance parameters. Am. J. Hum. Genet., 66, 1298–1309.
48. Terwilliger, J.D. (2000) A likelihood-based extended admixture model of oligogenic inheritance in ‘model-based’ and ‘model-free’ analysis. Eur. J. Hum. Genet., 8, 399–406.
49. Sobel, E. and Lange, K. (1996) Descent graphs in pedigree analysis: applications to haplotyping, location scores, and marker-sharing statistics. Am. J. Hum. Genet., 58, 1323–1337.