Identification of somatically acquired mutations in chordoma exomes
Transcription
Patrick Tarpey
Cancer Genome Project, Wellcome Trust Sanger Institute

Somatic Mutation Discovery
• somatic / germline
• drivers / passengers

Somatic Mutation Discovery: Drivers
• implicated in oncogenesis
• reside in cancer genes
• inform biology
• aid diagnosis/prognosis
• targets for therapy

Known cancer genes
• type of variant/gene
• variant known (COSMIC)

Novel cancer genes
• recurrent variants
• recurrently mutated genes
• seen in other cancers

Somatic Mutation Discovery: Passengers
• biologically inert
• reflect DNA exposures

Signature
• spectrum of variants
• sequence context

Exomes: good first analysis for drivers (Cancer Genes)

Frequently mutated cancer genes
• PBRM1 (renal cancer): 4/7 tumours, truncating
• SF3B1 (myelodysplasia): 4/9 tumours, missense (K700E)

Infrequently mutated cancer genes

Passengers (mutational signatures)
[Figure: per-sample substitution spectra for PD4595a, PD4106a, PD4100a, PD4200a, PD4120a, PD4937a, PD4203a, PD4119a, PD4123a, PD4137a and PD4125a. Classes: T>G/A>C, T>C/A>G, T>A/A>T, C>T/G>A, C>G/G>C, C>A/G>T. Axis: Number of substitutions. Groups: ER +ve, ER -ve.]
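The six paired classes on the spectrum plots (for example C>T/G>A) arise because a substitution and its reverse complement cannot be told apart on double-stranded DNA. A minimal sketch of how such a spectrum could be tallied is given here. It is not the pipeline used in the talk; the function names `substitution_class` and `spectrum` are hypothetical, and the input is assumed to be a list of (reference base, mutant base) pairs.

```python
from collections import Counter

# Watson-Crick complements, used to fold each substitution onto one strand.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# The six classes, in the order they appear on the slides.
CLASSES = ["T>G/A>C", "T>C/A>G", "T>A/A>T", "C>T/G>A", "C>G/G>C", "C>A/G>T"]


def substitution_class(ref, alt):
    """Return the strand-collapsed class label for a single-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in COMPLEMENT or alt not in COMPLEMENT or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    # Report the change from the pyrimidine (C or T) reference strand.
    if ref in ("A", "G"):
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}/{COMPLEMENT[ref]}>{COMPLEMENT[alt]}"


def spectrum(substitutions):
    """Count substitutions per class; classes with no calls are reported as 0."""
    counts = Counter(substitution_class(ref, alt) for ref, alt in substitutions)
    return {label: counts.get(label, 0) for label in CLASSES}


if __name__ == "__main__":
    # Made-up calls for illustration only, not data from the talk.
    example = [("C", "T"), ("G", "A"), ("T", "A"), ("A", "C")]
    for label, n in spectrum(example).items():
        print(label, n)
```

Collapsing onto the pyrimidine strand is one common convention; any consistent choice of strand gives the same six counts.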
[Figure: substitution spectra for PD4127a and PD4601a (same six classes; axis: Number of substitutions; groups: ER +ve, ER -ve), and sequence-context plots showing base counts (A, C, G, T) at positions -10 to +10, including sample PD4120a.]

Study design
23 Chordoma exomes

Drivers
• Contribution of known cancer genes
• Evidence for novel cancer genes

Passengers
• Mutation burden
• Mutation spectrum

Strategy
• Sample collection and review
• Variant discovery (validation)
• Extension study (validation)
• Data analysis
Variant discovery
• Exome Sequencing: SureSelect, HiSeq
• Variant Detection: Subs (CaVEMan), Indels (Pindel)
• Post-Processing Filters

[Figure: per-sample sequencing yield (Gb, 0 to 25) and coverage (%, 0% to 100%) for samples PD3905a_2 through PD4092b_2.]

‘Exomes’ do not cover all coding sequence
Variant detection is imprecise

Validation
• Tumour / Normal
[Table: advantages and disadvantages of validation approaches (PCR, sequence capture; 454, MiSeq). Entries in slide order: Orthogonal; mononuc; dedicated; 454; quick; analysis pipelines; non-ortho; no-pcr; performance; all variants; capacity; analysis pipelines; lead time; non-ortho.]

Data analysis
Somatic mutations

Known cancer genes
• Chordoma
• Inactivating mutations in tumour suppressor genes
• Previously identified variants (COSMIC)
• PTEN, PIK3CA
• Driver (35%), No Driver (65%)
• Clinically actionable?

Novel cancer genes

Recurrently mutated genes
• Missense: non-recurrent, not-clustered

LYST (Tumour / Normal)
Lysosomal trafficking regulator
• transport protein associated with proper lysosome function
• Chédiak-Higashi syndrome
• enlarged lysosomes
• recurrent infections
• albinism
• neuropathy

ITGA10 (Tumour / Normal)
Integrin: transmembrane glycoprotein receptors that mediate cell-matrix and cell-cell interactions.

Recurrently mutated genes
Follow up investigation…?
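Flagging "recurrently mutated genes" amounts to counting, for each gene, how many distinct tumours carry at least one somatic mutation in it. The sketch below illustrates that tally only. It is an assumption about the general approach, not the analysis code behind these slides; the function name `recurrently_mutated`, the `min_samples` threshold, and the sample and gene labels are all hypothetical.

```python
from collections import defaultdict


def recurrently_mutated(mutations, min_samples=2):
    """Map each gene to the number of distinct samples in which it is mutated.

    `mutations` is an iterable of (sample_id, gene) pairs. Several mutations
    in the same gene within one sample count once, so one heavily mutated
    tumour cannot make a gene look recurrent on its own.
    """
    samples_by_gene = defaultdict(set)
    for sample_id, gene in mutations:
        samples_by_gene[gene].add(sample_id)
    return {
        gene: len(samples)
        for gene, samples in samples_by_gene.items()
        if len(samples) >= min_samples
    }


if __name__ == "__main__":
    # Placeholder labels for illustration only, not calls from the study.
    calls = [
        ("tumour_1", "GENE_A"),
        ("tumour_1", "GENE_A"),
        ("tumour_2", "GENE_A"),
        ("tumour_3", "GENE_B"),
    ]
    print(recurrently_mutated(calls))
```

A raw count like this does not correct for gene length or background mutation rate, which is one reason recurrent non-clustered missense hits need follow-up before being called drivers.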
Summary
• Known cancer genes
  PI3K signalling (13%): PIK3CA (2), PTEN (1)
  Chromatin remodelling (9%): ARID1A (1), PBRM1 (1)
  PTPRD (1), CDKN2A (40%)
• In most tumours we failed to identify a driver
• No frequently mutated cancer gene

Summary
• Variant/s not called
  • not recognised (missense)
  • coverage
  • algorithm
• Variant/s not targeted
  • rearrangements
  • copy number changes
  • non-coding variants
  • epigenetic variants
• 5 whole genomes in pipeline (5 pending)

Acknowledgements
Sam Behjati, Peter Campbell, Adrienne Flanagan, Susanna Cooke, Mike Stratton, Josh Sommer, Peter Van Loo, David C Wedge, Nischalan Pilay, John Marshall, Sarah O’Meara, Helen Davies, Serena Nik-Zainal, David Beare, Adam Butler, John Gamble, Claire Hardy, Jonathon Hinton, Ming Ming Jia, Alagu Jayakumar, David Jones, Calli Latimer, Mark Maddison, Sancha Martin, Stuart McLaren, Andrew Menzies, Laura Mudie, Keiran Raine, Jon Teague, Jose Tubio, Dina Halai