MHC polymorphism - CBS
CENTER FOR BIOLOGICAL SEQUENCE ANALYSIS
Technical University of Denmark - DTU, Department of Systems Biology
Immunological Bioinformatics, 2010

MHC polymorphism
Functional clustering of MHC molecules: the concept of supertypes

Polymorphism of MHC
• Within a host: a limited number of loci (genes)
  • only 6 different class I molecules (two each of A, B and C)
  • only 12 different class II molecules
• Within a population: > 100 alleles per locus
• The IMGT/HLA Sequence Database currently encompasses more than 1500 HLA class I proteins
Source: http://www.anthonynolan.com/HIG/index.html

HLA polymorphism
• ~1% probability that an MHC molecule binds a peptide
• Different hosts sample different peptides from the same pathogen
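A rough way to see what the ~1% binding probability and the six class I molecules per host imply: if each molecule binds a given peptide with probability p, and the molecules are treated as binding independently (a simplifying assumption that is not stated on the slides), the chance that at least one of n molecules binds is 1 - (1 - p)^n. The function name below is hypothetical and only for illustration.

```python
def prob_any_binds(p: float, n: int) -> float:
    """Probability that at least one of n MHC molecules binds a peptide.

    Assumes every molecule binds independently with the same probability p.
    Independence is an assumption made for this sketch only.
    """
    return 1.0 - (1.0 - p) ** n


# p = 0.01 is the ~1% figure from the slide; 6 and 12 are the class I and
# class II molecule counts per host quoted above.
for n in (1, 6, 12):
    print(f"{n:2d} molecules: {prob_any_binds(0.01, n):.3f}")
```

Under these assumptions a host with six class I molecules presents roughly 6% of peptides, so every additional distinct molecule widens the set of peptides that can be presented.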
More MHC molecules: more diversity in the presented peptides
[Figure: HLA-A*02:01 sequence logo]
Seq2Logo: http://www.cbs.dtu.dk/biotools/Seq2Logo

HLA specificity overlap
[Figure: sequence logos for A0101, A0201, A6802 and B0702]
Seq2Logo: http://www.cbs.dtu.dk/biotools/Seq2Logo

MHC Supertypes
• Many of the different HLA molecules have similar specificities
• HLA molecules with similar specificities can be grouped together
• Methods to define supertypes
  • Structural similarities
    • Primary (sequence)
    • Tertiary (structure)
  • Shared peptide binding motifs
  • Identification of cross-reacting peptides

HLA polymorphism – the supertype hypothesis
Each HLA molecule within a supertype binds essentially the same peptides
[Figure: sequence logos for A0201 and A6802]
Sette et al, Immunogenetics (1999) 50:201-212

LOGOS OF HLA-A ALLELES
[Figure: logos of HLA-A alleles, with the A2 supertype marked]
O Lund et al., Immunogenetics.
2004 55:797-810

Clustering
[Figure: clustering of HLA alleles]
O Lund et al., Immunogenetics. 2004 55:797-810

HLA polymorphism - frequencies
Phenotypic frequencies
Supertypes         Caucasian  African  Japanese  Chinese  Hispanic  Average
A2, A3, B7         83%        86%      88%       88%      86%       86%
+ A1, A24, B44     100%       98%      100%      100%     99%       99%
+ B27, B58, B62    100%       100%     100%      100%     100%      100%
Sette et al, Immunogenetics (1999) 50:201-212

HLA polymorphism - supertypes
=> A CTL based vaccine must consist of 6-9 HLA class I epitopes
Sette et al, Immunogenetics (1999) 50:201-212

The truth about supertypes!
[Figure: panels for A1, A2, A3, A24 and A26]

HLA polymorphism!

Data
            HLA-A  HLA-B  HLA-C
Proteins    681    1165   569
SYFPEITHI   27     59     4
IEDB        34     28     0
• Alleles characterized with 5 or more data points
• 3% covered

How to fill this gap?
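The size of the gap can be sanity-checked from the counts on the data slide. The sketch below assumes that 681, 1165 and 569 are the numbers of known proteins per locus and that the SYFPEITHI and IEDB rows count alleles with at least 5 data points; this reading of the flattened chart is an assumption, and the names are illustrative.

```python
# Counts as read from the data slide (the row assignment is an assumption).
proteins = {"HLA-A": 681, "HLA-B": 1165, "HLA-C": 569}
characterized = {
    "SYFPEITHI": {"HLA-A": 27, "HLA-B": 59, "HLA-C": 4},
    "IEDB": {"HLA-A": 34, "HLA-B": 28, "HLA-C": 0},
}


def coverage(counts: dict, totals: dict) -> float:
    """Fraction of all known proteins that have binding data."""
    return sum(counts.values()) / sum(totals.values())


for database, counts in characterized.items():
    print(f"{database}: {coverage(counts, proteins):.1%} of proteins covered")
```

Read this way, either database covers only a few percent of the known molecules (about 3.7% and 2.6%), which is in line with the "3% covered" quoted on the slide.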
Using alignment

Align A68:01 (365) versus A68:02 (365). Aln score 2454.000 Aln len 365 Id 0.9863

A68:01   0 MAVMAPRTLVLLLSGALALTQTWAGSHSMRYFYTSVSRPGRGEPRFIAVGYVDDTQFVRFDSDAA
           ::::::::::::::::::::::::::::::::::: :::::::::::::::::::::::::::::
A68:02   0 MAVMAPRTLVLLLSGALALTQTWAGSHSMRYFYTSMSRPGRGEPRFIAVGYVDDTQFVRFDSDAA

A68:01  65 SQRMEPRAPWIEQEGPEYWDRNTRNVKAQSQTDRVDLGTLRGYYNQSEAGSHTIQMMYGCDVGSD
           ::::::::::::::::::::::::::::::::::::::::::::::::::::::: ::::::: :
A68:02  65 SQRMEPRAPWIEQEGPEYWDRNTRNVKAQSQTDRVDLGTLRGYYNQSEAGSHTIQRMYGCDVGPD

A68:01 130 GRFLRGYRQDAYDGKDYIALKEDLRSWTAADMAAQTTKHKWEAAHVAEQWRAYLEGTCVEWLRRY
           ::::::: : :::::::::::::::::::::::::::::::::::::::::::::::::::::::
A68:02 130 GRFLRGYHQYAYDGKDYIALKEDLRSWTAADMAAQTTKHKWEAAHVAEQWRAYLEGTCVEWLRRY

A68:01 195 LENGKETLQRTDAPKTHMTHHAVSDHEATLRCWALSFYPAEITLTWQRDGEDQTQDTELVETRPA
           :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
A68:02 195 LENGKETLQRTDAPKTHMTHHAVSDHEATLRCWALSFYPAEITLTWQRDGEDQTQDTELVETRPA

A68:01 260 GDGTFQKWVAVVVPSGQEQRYTCHVQHEGLPKPLTLRWEPSSQPTIPIVGIIAGLVLFGAVITGA
           :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
A68:02 260 GDGTFQKWVAVVVPSGQEQRYTCHVQHEGLPKPLTLRWEPSSQPTIPIVGIIAGLVLFGAVITGA

A68:01 325 VVAAVMWRRKSSDRKGGSYSQAASSDSAQGSDVSLTACKV
           ::::::::::::::::::::::::::::::::::::::::
A68:02 325 VVAAVMWRRKSSDRKGGSYSQAASSDSAQGSDVSLTACKV

Sequence based clustering
[Figure: sequence-based tree (scale bar 0.01) of B15_01, B58_01, A24_02, A26_01, A02_01, A68_01, A68_02, B40_01, B39_01, B07_02, B08_01, A03_01, A01_01 and B27_05]

Sequence logos
Seq2Logo: http://www.cbs.dtu.dk/biotools/Seq2Logo
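The identity reported for the A68:01 versus A68:02 alignment can be recomputed by counting matching columns. The sketch below is a minimal illustration for two ungapped, equal-length strings; the function name is hypothetical, and the 12-residue segment is the start of the block at position 130 in the alignment above.

```python
def fraction_identity(a: str, b: str) -> float:
    """Fraction of aligned positions that carry the same residue.

    Expects two aligned sequences of equal length with no gap handling.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


# Residues 130-141 of the two alleles, copied from the alignment.
a6801_segment = "GRFLRGYRQDAY"
a6802_segment = "GRFLRGYHQYAY"
print(fraction_identity(a6801_segment, a6802_segment))

# The full alignment has 5 mismatched columns out of 365,
# which reproduces the reported "Id 0.9863".
print(round((365 - 5) / 365, 4))
```

Five differing residues are enough to separate the two alleles, although the overall identity is above 98%. Global sequence identity is therefore a blunt measure of how similar two binding specificities are.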
[Figure: sequence logos for HLA-A*6801 and HLA-A*6802]

Pan-specific method
• Include polymorphic residues in potential contact with the bound peptide
• The contact residues are defined as being within 4.0 Å of the peptide in any of a representative set of HLA-A and -B structures with nonamer peptides
• Only polymorphic residues from A, B, and C alleles are included
• Pseudo-sequence consisting of 34 amino acid residues

Example
Peptide    Amino acids of HLA pockets  HLA    Aff
VVLQQHSIA  YFAVLTWYGEKVHTHVDTLVRYHY    A0201  0.131751
SQVSFQQPL  YFAVLTWYGEKVHTHVDTLVRYHY    A0201  0.487500
SQCQAIHNV  YFAVLTWYGEKVHTHVDTLVRYHY    A0201  0.364186
LQQSTYQLV  YFAVLTWYGEKVHTHVDTLVRYHY    A0201  0.582749
LQPFLQPQL  YFAVLTWYGEKVHTHVDTLVRYHY    A0201  0.206700
VLAGLLGNV  YFAVLTWYGEKVHTHVDTLVRYHY    A0201  0.727865
VLAGLLGNV  YFAVWTWYGEKVHTHVDTLLRYHY    A0202  0.706274
VLAGLLGNV  YFAEWTWYGEKVHTHVDTLVRYHY    A0203  1.000000
VLAGLLGNV  YYAVLTWYGEKVHTHVDTLVRYHY    A0206  0.682619
VLAGLLGNV  YYAVWTWYRNNVQTDVDTLIRYHY    A6802  0.407855

Predictions for novel HLA alleles

How good are the predictions?

Leave-one-out validation
Close neighbors can help you!
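One simple way to make "close neighbors" concrete is to count the positions at which two pseudo-sequences differ. The sketch below uses the pocket residues from the example table. The Hamming distance is a stand-in chosen for illustration and is not the pan-specific method itself; the function names are hypothetical.

```python
# Pocket residues per allele, copied from the example table above.
pseudo_sequences = {
    "A0201": "YFAVLTWYGEKVHTHVDTLVRYHY",
    "A0202": "YFAVWTWYGEKVHTHVDTLLRYHY",
    "A0203": "YFAEWTWYGEKVHTHVDTLVRYHY",
    "A0206": "YYAVLTWYGEKVHTHVDTLVRYHY",
    "A6802": "YYAVWTWYRNNVQTDVDTLIRYHY",
}


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("pseudo-sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def nearest_neighbor(query: str, sequences: dict) -> str:
    """Name of the allele whose pseudo-sequence is closest to the query's."""
    others = (name for name in sequences if name != query)
    return min(others, key=lambda name: hamming(sequences[query], sequences[name]))


for allele in ("A0202", "A0203", "A0206", "A6802"):
    distance = hamming(pseudo_sequences["A0201"], pseudo_sequences[allele])
    print(f"A0201 vs {allele}: {distance} differing positions")

print("closest to A0206:", nearest_neighbor("A0206", pseudo_sequences))
```

An allele without binding data that sits one or two substitutions away from a well-characterized molecule is the easy case for leave-one-out validation; a distant allele has no such neighbor to borrow from.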
Leave One out performance
[Figure: leave-one-out performance]

HLA-A02:01 versus HLA-A68:02
PCC: 0.61
[Figure: scatter plot of predicted scores, axes from 0 to 0.8; axis labels HLA-A0201 and HLA-A6801]

Peptide    A0201  A6802
ISCDEGRFK  0.022  0.032
TDRAAQTRE  0.013  0.019
IAPLRMSAT  0.065  0.118
KPAFKTGEE  0.019  0.025
GVERHIHIF  0.060  0.038
TYGWAWLLK  0.036  0.028
AEDIAKTVA  0.034  0.021
MSGNEIYDH  0.025  0.038
EDVERGQVV  0.028  0.117
ILVEHARVE  0.066  0.039
QKPTLTVML  0.055  0.140
AQKTIEWAQ  0.037  0.026
VEHPNVYKM  0.060  0.048
EERASSSKN  0.013  0.017
EDRKGHDRR  0.014  0.020
LQGTTDVTP  0.032  0.025
NIGVILLLT  0.171  0.154
MRLAHDPDA  0.055  0.035
GEYLKEKIR  0.019  0.011
IPRCSPPPP  0.015  0.023

HLA-A02:01 versus HLA-A68:01
PCC: 0.09

Heatmaps and binding motifs
[Figure: heat map and binding motifs for HLA-A02:01, HLA-A68:02, HLA-A68:01 and HLA-A03:01]
d = 1 − PCC(A, B)

The MHCcluster server
[Figure: MHCcluster server screenshots]

Specificity-based clustering
Clustering of the 12 HLA supertypes
[Figure: specificity-based tree and heat map of representative HLA-A and HLA-B alleles]
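The distance d = 1 − PCC(A, B) can be sketched in a few lines: predict scores for the same peptides with both molecules, correlate the two score vectors, and subtract the correlation from one. This is a minimal illustration under stated assumptions, not the MHCcluster implementation. The function names are hypothetical, and the five score pairs are taken from the first rows of the peptide table, so their correlation need not match the PCC quoted for the full peptide set.

```python
from math import sqrt


def pearson(x: list, y: list) -> float:
    """Pearson correlation coefficient between two equal-length score vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / sqrt(var_x * var_y)


def specificity_distance(scores_a: list, scores_b: list) -> float:
    """Distance between two MHC molecules, d = 1 - PCC(A, B)."""
    return 1.0 - pearson(scores_a, scores_b)


# Predicted scores for the first five peptides of the table.
a0201 = [0.022, 0.013, 0.065, 0.019, 0.060]
a6802 = [0.032, 0.019, 0.118, 0.025, 0.038]
print("distance on five peptides:", specificity_distance(a0201, a6802))

# Distances implied by the PCC values quoted on the slides.
print("A02:01 vs A68:02:", round(1 - 0.61, 2))
print("A02:01 vs A68:01:", round(1 - 0.09, 2))
```

With this definition, two molecules with perfectly correlated predictions are at distance 0, uncorrelated ones at 1, and anti-correlated ones approach 2. A pairwise matrix of such distances is the kind of input a tree or heat map can be built from.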
Specificity-based clustering (w logos)
[Figure: specificity-based tree with sequence logos for HLA-A01_01, HLA-A02_01, HLA-A03_01, HLA-A24_02, HLA-A26_01, HLA-A68_01, HLA-A68_02, HLA-B07_02, HLA-B08_01, HLA-B15_01, HLA-B27_05, HLA-B39_01, HLA-B40_01 and HLA-B58_01]

Mice and men
[Figure: tree and heat map of the HLA class I molecules together with the mouse molecules H-2-Kb, H-2-Kd, H-2-Kk, H-2-Db, H-2-Dd and H-2-Ld; branch labels range from 0.80 to 1.00]

Zhang et al.
Bioinformatics 2008
Other pan-specific methods
Work by Ilka Hoff

And now to animals

Are chimpanzees like humans?
• Can we predict binding specificities of non-human primate MHC molecules using the NetMHCpan method trained on human specificity data only?

Predicting Primate MHC
Non-human primates
[Figure: panels for Patr A*0101 and Patr B*0101]
Sidney et al. (2006)

Pig

Known BoLA class I epitopes
Phil Toye and Vish Nene, ILRI
Epitope      Protein  MHC    N     Prediction  Alternative  Prediction
VGYPKVKEEML  Tp1      HD6    2138  0.070       VGYPKVKEEML  0.070
SHEELKKLGML  Tp2      T2b    662   0.133       EELKKLGML    0.009
DGFDRDALF    Tp2      T2b    662   0.328       LEGDGFDRDAL  0.012
KSSHGMGKVGK  Tp2      T2a    662   0.017       KSSHGMGKVGK  0.017
FAQSLVCVL    Tp2      T2c    662   0.036       FAQSLVCVL    0.036
QSLVCVLMK    Tp2      T2a    662   0.036       QSLVCVLMK    0.036
KTSIPNPCKW   Tp2      T2a    662   0.104       KTSIPNPCK    0.018
TGASIQTTL    Tp4      W10    2282  0.086       ATGASIQTTL   0.023
SKADVIAKY    Tp5      T5     586   0.017       SKADVIAKY    0.017
EFISFPISL    Tp7      T7     2850  0.058       FISFPISL     0.002
CGAELNHFL    Tp8      AW10   1726  0.168       GAELNHFLTL   0.009
AKFPGMKKSK   Tp9      D18.4  1302  0.113       AKFPGMKKSK   0.113
Ave                                0.097                    0.030
Average predicted rank of 12 CTL BoLA restricted epitopes is 3%
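The two averages in the BoLA table can be reproduced directly from the per-epitope values. The sketch below copies the 12 predictions for the published epitopes and for the alternative peptides; the variable names are illustrative, and reading the values as fractional ranks (0.030 corresponding to the quoted 3%) is an interpretation of the slide.

```python
# Predicted values for the 12 known BoLA class I epitopes, as listed.
published = [0.070, 0.133, 0.328, 0.017, 0.036, 0.036,
             0.104, 0.086, 0.017, 0.058, 0.168, 0.113]

# Predicted values for the alternative (shifted or trimmed) peptides.
alternative = [0.070, 0.009, 0.012, 0.017, 0.036, 0.036,
               0.018, 0.023, 0.017, 0.002, 0.009, 0.113]


def mean(values: list) -> float:
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


print("published epitopes:  ", round(mean(published), 3))
print("alternative peptides:", round(mean(alternative), 3))

# Count how many alternative peptides score better (lower) than the
# published epitope on the same row.
improved = sum(alt < pub for pub, alt in zip(published, alternative))
print("rows where the alternative ranks better:", improved)
```

The averages come out at 0.097 and 0.030, matching the "Ave" row. The drop comes from the six rows where the alternative peptide differs from the published epitope, each of which gets a lower predicted value.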
MHCMotifViewer
[Figure: MHCMotifViewer screenshots]
MHCMotifViewer. Rapin et al. Immunogenetics. 2008
ECCB/ISMB-2009 - Immunological Bioinformatics Tutorial

Unexpected similarities

Conclusions
• Supertypes give a simple but often wrong approximation of MHC specificities
• Specificity clustering provides a more precise clustering of MHC molecules
• Pan-specific prediction methods can capture the subtle differences in binding specificity between MHC molecules
• MHCcluster is an easy-to-use method for accurate MHC clustering
  • Reproduces and reveals differences within the known HLA class I and class II supertypes
  • Demonstrates MHC specificity overlap between species (some mice are like men)