Targeted resequencing analysis of 25 genes

Published Ahead of Print on July 5, 2013, as doi:10.3324/haematol.2013.086686.
Copyright 2013 Ferrata Storti Foundation.
Early Release Paper
Targeted resequencing analysis of 25 genes commonly mutated in
myeloid disorders in del(5q) myelodysplastic syndromes
by Marta Fernandez-Mercado, Adam Burns, Andrea Pellagatti,
Aristoteles Giagounidis, Ulrich Germing, Xabier Agirre, Felipe Prosper, Carlo Aul,
Sally Killick, James S. Wainscoat, Anna Schuh, and Jacqueline Boultwood
Haematologica 2013 [Epub ahead of print]
Citation: Fernandez-Mercado M, Burns A, Pellagatti A, Giagounidis A, Germing U, Agirre X,
Prosper F, Aul C, Killick S, Wainscoat JS, Schuh A, and Boultwood J. Targeted resequencing
analysis of 25 genes commonly mutated in myeloid disorders in del(5q) myelodysplastic
syndromes. Haematologica. 2013; 98:xxx
doi:10.3324/haematol.2013.086686
Publisher's Disclaimer.
E-publishing ahead of print is increasingly important for the rapid dissemination of science.
Haematologica is, therefore, E-publishing PDF files of an early version of manuscripts that
have completed a regular peer review and have been accepted for publication. E-publishing
of this PDF file has been approved by the authors. After having E-published Ahead of Print,
manuscripts will then undergo technical and English editing, typesetting, proof correction and
be presented for the authors' final approval; the final version of the manuscript will then
appear in print on a regular issue of the journal. All legal disclaimers that apply to the
journal also pertain to this production process.
Haematologica (pISSN: 0390-6078, eISSN: 1592-8721, NLM ID: 0417435, www.haematologica.org) publishes peer-reviewed papers across all areas of experimental and clinical
hematology. The journal is owned by the Ferrata Storti Foundation, a non-profit organization, and serves the scientific community with strict adherence to the principles of open
access publishing (www.doaj.org). In addition, the journal makes every paper published
immediately available in PubMed Central (PMC), the US National Institutes of Health (NIH)
free digital archive of biomedical and life sciences journal literature.
Official Organ of the European Hematology Association
Published by the Ferrata Storti Foundation, Pavia, Italy
www.haematologica.org
Running heads: Targeted resequencing of 25 genes in del(5q) MDS
Marta Fernandez-Mercado,1* Adam Burns,2* Andrea Pellagatti,1 Aristoteles Giagounidis,3
Ulrich Germing,4 Xabier Agirre,5 Felipe Prosper,5 Carlo Aul,3 Sally Killick,6 James S. Wainscoat,1
Anna Schuh2¥ and Jacqueline Boultwood1¥
1LLR Molecular Haematology Unit, NDCLS, RDM, John Radcliffe Hospital, Oxford, UK;
2NIHR Biomedical Research Centre, Oxford, UK; 3Medizinische Klinik II, St Johannes Hospital,
Duisburg, Germany; 4Department of Hematology, Oncology and Clinical Immunology,
Heinrich-Heine-Universität, Düsseldorf, Germany; 5Division of Cancer and Area of Cell Therapy
and Haematology Service, Foundation for Applied Medical Research, Clínica Universitaria,
Universidad de Navarra, Pamplona, Spain; and 6Department of Haematology, Royal
Bournemouth Hospital, Bournemouth, UK
Statement of equal authors’ contribution:
*MFM and AB contributed equally to this manuscript; ¥JB and AS were co-senior authors
Correspondence
Professor Jacqueline Boultwood, LLR Molecular Haematology Unit, Nuffield Division of Clinical
Laboratory Sciences, Radcliffe Department of Medicine, University of Oxford, John Radcliffe
Hospital, Oxford OX3 9DU, UK. E-mail: [email protected]
Acknowledgments
The authors would like to thank Leukaemia and Lymphoma Research of the United Kingdom
and the Oxford Partnership Comprehensive Biomedical Research Centre, supported by the
Department of Health's NIHR Biomedical Research Centres funding scheme, for funding this
work. The views expressed in this publication are those of the authors and not necessarily those
of the Department of Health. The authors would like to thank the patients who agreed to
participate in this study. The authors would also like to thank all co-workers in their laboratories
for their technical assistance as well as all physicians for referring patient material.
Abstract
Interstitial deletion of chromosome 5q is the most common chromosomal abnormality in
myelodysplastic syndromes. The catalogue of genes involved in the molecular pathogenesis of
myelodysplastic syndromes is rapidly expanding and next-generation sequencing technology
allows detection of these mutations at high depth. Here we describe the design, validation and
application of a targeted next-generation sequencing approach to simultaneously screen 25
genes mutated in myeloid malignancies. We used this method alongside single nucleotide
polymorphism-array technology to characterize the mutational and cytogenetic profile of 43 early
or advanced del(5q) myelodysplastic syndrome cases. A total of 29 mutations were detected in
our cohort. Overall, 45% of early and 66.7% of advanced cases presented at least one
mutation. Genes with the highest mutation frequency among advanced cases were TP53 and
ASXL1 (25% of patients each). These showed a lower mutation frequency in 5q- syndrome
cases (4.5% and 13.6%, respectively), suggesting a role in disease progression in del(5q)
myelodysplastic syndromes. 52% of mutations identified were in genes involved in epigenetic
regulation (ASXL1, TET2, DNMT3A and JAK2). Six mutations showed allele frequencies <20%,
likely below the detection limit of traditional sequencing methods. Genomic array data showed
that advanced del(5q) myelodysplastic syndrome cases displayed a complex background of
cytogenetic aberrations, often encompassing genes involved in myeloid disorders. Our study is
the first to investigate the molecular pathogenesis of early and advanced del(5q)
myelodysplastic syndromes using next-generation sequencing technology on a large panel of
genes frequently mutated in myeloid malignancies, further illuminating the molecular landscape
of del(5q) myelodysplastic syndromes.
Introduction
The myelodysplastic syndromes (MDS) represent a heterogeneous group of clonal
hematopoietic stem cell (HSC) malignancies that are characterized by ineffective hematopoiesis
resulting in peripheral cytopenias, and typically a hypercellular bone marrow. The MDS are preleukemic conditions showing frequent progression (approximately 40% of patients) to acute
myeloid leukemia (AML). In the early stages of the disease, apoptosis of the bone marrow
precursor cells prevails, but in more advanced disease increased proliferation of immature
blasts occurs.1 About 50% of MDS exhibit acquired genomic abnormalities detected by
conventional cytogenetic banding techniques. Recent molecular investigations have revealed
additional genetic abnormalities in MDS, including micro-deletions and loss of heterozygosity
(LOH) due to acquired uniparental disomy (UPD).2
Interstitial deletion within the long arm of chromosome 5 [del(5q)] is one of the most frequent
cytogenetic abnormalities observed in myeloid malignancies, occurring in approximately 10-20% of patients with de novo MDS3 and in a similar proportion of patients with de novo AML.4 In
de novo MDS the del(5q) occurs either in isolation or together with other karyotypic
abnormalities. Although the 5q- is a good prognostic indicator when found in isolation,5 this is
not the case when the 5q- is part of a complex karyotype.6 In a large MDS database, del(5q)
was reported as an isolated abnormality in 14% of patients with clonal abnormalities, in 5% with
one other abnormality, and in 11% with a complex karyotype.6 The median overall survival in
these groups was 80, 47 and 7 months respectively.6 These findings are consistent with the
general notion that the total number of cytogenetic changes found represents an independent
factor that can allow for the stratification of patient cohorts into prognostic subgroups.
The 5q- syndrome is the most distinct of all the MDS and is characterized by isolated del(5q),
severe macrocytic anemia, frequent thrombocytosis, female predominance, and a lower risk of
progression to AML.7 Patients with the 5q- syndrome have one of the best outcomes of any
MDS subgroup,8 with relatively long survival, often of several years' duration.7,8 Whilst a small
number of gene mutations have been reported in the 5q- syndrome, including mutation of TP53
and JAK2,9,10 the molecular landscape of this disease remains to be fully determined.
Approximately 10% of patients with the 5q- syndrome show transformation to AML,7 but the
genetic aberrations that drive this process are not fully determined.
The International Prognostic Scoring System (IPSS)11 and its revised version8 are based upon
karyotypic abnormalities as well as on morphological data. Recently, a new and more
comprehensive cytogenetic scoring system has been developed, which allows for a refined
cytogenetic risk prediction.12 However, the heterogeneous clinical outcome observed within the
karyotypically and morphologically-defined groups in the IPSS indicates that it may be possible
to refine the cytogenetic classification by using additional markers. The catalogue of genes that
play a role in the molecular pathogenesis of MDS is rapidly expanding, and includes TET2,
SF3B1, EZH2 and ASXL1.13-16 Unraveling the genetic complexity of MDS promises to elucidate
the pathophysiology of this disease, refine the taxonomy and prognostic scoring systems, and
provide novel therapeutic targets.
Technological advances in DNA sequencing provide an important tool to analyze
heterogeneous cancer samples. Massively parallel sequencing enables the analysis of
independent, clonal, DNA molecules17 and offers the opportunity to adjust the balance between
breadth and depth of such assays to identify a wide variety of potentially critical DNA changes in
tumors. Broad approaches, such as whole genome and whole exome sequencing have been
used to discover new cancer gene mutations 15,18,19 or to study clonal evolution.20 In particular,
several such studies in MDS have identified recurrently mutated genes and novel pathways
involved in pathogenesis, such as those encoding splicing factors.15,19 However, these genome-wide approaches are still expensive and have relatively low sensitivity. In contrast, a more
targeted sequencing approach aimed at detecting selected recurrent mutations in MDS allows
for cost-effective and fast sequencing at the high depth required for accurate characterization of
heterogeneous cancer samples.
Here, we describe the design, validation and application of a targeted next generation
sequencing (NGS) approach using a bench-top platform to simultaneously screen 25 genes
mutated in a range of myeloid malignancies. We used this method to characterize the
mutational profile of a cohort of 43 MDS cases with del(5q).
Methods
Patient samples
Test Cohort
Nine MDS samples, with known mutations detected by Sanger sequencing, pyrosequencing or
amplification refractory mutation system PCR (ARMS-PCR) in at least one of our target genes,
were selected to validate our gene panel. The nine samples chosen contained a total of 13
variants across 8 genes and included missense (n=4), nonsense (n=1) and frameshift (n=8)
mutations (Table 1).
MDS del(5q) Cohort
Samples from 43 untreated MDS cases harboring a del(5q) were selected for mutational
screening (mean age 66.0, range 24-88). These included 22 patients with 5q- syndrome, 9
cases with refractory anemia (RA) with additional karyotypic abnormalities, and 12 cases with
advanced MDS (defined as having an increased number of blasts and including 11 RA with
excess of blasts, RAEB, and 1 CMML in transformation). All karyotypes were determined by
conventional G-banding. This study was approved by the ethics committees of the institutes
involved and informed consent was obtained.
DNA Extraction
Genomic DNA was isolated by phenol-chloroform extraction from peripheral blood neutrophils
isolated using Histopaque (Sigma-Aldrich) and pelleted after hypotonic lysis of erythrocytes.
The purity of the neutrophil populations was high, >95%, as assessed by standard morphology
on Wright-Giemsa-stained cytospin preparations.
Targeted re-sequencing
We designed a TruSeq Custom Amplicon panel (TSCA, Illumina), targeting 25 genes mutated in
various myeloid malignancies (Table 2). The panel was developed using the online
DesignStudio pipeline (http://designstudio.illumina.com, Illumina), and covers a total of
46,604bp with 322 amplicons. In genes with well-defined mutational hotspots only these regions
were targeted; otherwise the entire coding sequence of the gene was sequenced. Libraries
prepared from 250ng DNA were subjected to 250bp paired-end sequencing.
Protein sequences resulting from detected DNA-sequence changes were predicted using the
insilico.ehu.es online tool21 and Alamut software (Interactive Biosoftware, San Diego, CA,
USA). The PolyPhen-2 v2.2.2 online tool22 (Polymorphism Phenotyping v2,
http://genetics.bwh.harvard.edu/pph2/) was used to predict the functional effect of variant calls.
FLT3 ITD fragment analysis
Thirty-three samples in the MDS del(5q) cohort with sufficient DNA were screened for internal tandem
duplications in FLT3 gene (FLT3-ITD) using conventional fragment analysis.23 These included
20 5q- syndrome cases, 5 del(5q) RA with additional karyotypic abnormalities and 8 advanced
del(5q) MDS cases.
Genomic array profiling
Single nucleotide polymorphism (SNP) array data were available from a previously published
study2 for 33 of the 43 samples included in the targeted sequencing analysis. Those data
allowed us to identify cryptic copy number changes and UPD regions.
Results
Quality of MDS del(5q) MiSeq sample run
The number of clusters that passed the quality filter was over 100,000 for the majority of the
samples (40/43, 93%) (Figure S1). Paired-end MiSeq sequencing produced more than 2.2Gb of
sequence data with 91% of reads higher than the quality threshold of Q30, exceeding the
expected minimums of 2Gb and 75%, respectively. The average depth of coverage across all
samples was >390x, with 98% of cases (42/43) over 250x, 91% (39/43) over 300x and 49%
(21/43) ≥400x. The overall sensitivity of the assay and its background noise were estimated at
1-3% (Supplementary Information and Table S1).
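The depth-of-coverage summary above amounts to counting samples whose mean depth passes a threshold. A minimal sketch of that arithmetic follows; the function name and the depth values are hypothetical placeholders for illustration and are not data from this study.

```python
from typing import Sequence, Tuple


def fraction_at_depth(depths: Sequence[float], threshold: float,
                      inclusive: bool = False) -> Tuple[int, int, float]:
    """Count samples whose mean depth passes `threshold`.

    Returns (number passing, number of samples, percentage passing).
    `inclusive=True` uses >= (as for the ">=400x" figure), otherwise >.
    """
    if not depths:
        raise ValueError("no samples supplied")
    if inclusive:
        hits = sum(1 for d in depths if d >= threshold)
    else:
        hits = sum(1 for d in depths if d > threshold)
    return hits, len(depths), 100.0 * hits / len(depths)


# Hypothetical per-sample mean depths (NOT the study data).
example_depths = [240.0, 310.0, 405.0, 400.0]
over_250 = fraction_at_depth(example_depths, 250)
at_least_400 = fraction_at_depth(example_depths, 400, inclusive=True)
```

Keeping the strict and inclusive comparisons separate matters here because the text reports some thresholds as "over" and one as "≥".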
Validation of the Myeloid Gene Panel
In order to compare the accuracy and sensitivity of our TSCA assay against standard methods
of mutation screening (Sanger sequencing, pyrosequencing, fragment analysis), we rescreened 9 test samples (Table 1a), containing 17 variants across 11 genes (ASXL1, DNMT3A,
EZH2, FLT3, IDH1, IDH2, KIT, NPM1, NRAS, RUNX1 and TP53).
Using the BaseSpace data analysis pipeline, we were able to successfully identify 5 missense,
3 nonsense and 7 frameshift mutations in our validation cohort (15 out of 17, 88.2%). In
particular, short indel (insertions/deletions) mutations in both ASXL1 and NPM1 (1bp and 5bp
respectively) were correctly identified by the BaseSpace analysis software. Analysis of the
TEST009 aligned reads in the Integrative Genomics Viewer (IGV, Broad Institute), revealed a
dramatically reduced read depth of 30x across TP53, compared to >1000x in other samples,
suggesting that there was a failure to align the reads to the reference sequence. The TEST009
sequence data was therefore submitted to a second alignment and variant calling pipeline
(Stampy/Platypus24,25), which successfully identified a 19bp deletion (Figure S2). Two 109bp
and 64bp FLT3 internal tandem duplications (ITD) in samples TEST003 and TEST004
respectively were only called after visual inspection of the un-aligned data for reads matching
part of the FLT3 target sequence. The presence of FLT3-ITDs was subsequently confirmed by
fragment analysis.
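The manual step described above, looking through un-aligned reads for those that match part of the FLT3 target sequence, could in principle be approximated by a simple k-mer screen. The sketch below is an illustration only and is not the procedure used in this study; the function names, the value of k, the `min_shared` cut-off and the toy sequences are all assumptions.

```python
from typing import Iterable, List, Set


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def reads_matching_target(unaligned_reads: Iterable[str], target_seq: str,
                          k: int = 20, min_shared: int = 2) -> List[str]:
    """Keep reads sharing at least `min_shared` k-mers with the target.

    Reads that failed alignment but still contain stretches of the target
    (as a read spanning a tandem duplication would) are returned for
    manual review. Reads shorter than k can never match.
    """
    target_kmers = kmers(target_seq.upper(), k)
    hits = []
    for read in unaligned_reads:
        r = read.upper()
        shared = sum(1 for i in range(len(r) - k + 1)
                     if r[i:i + k] in target_kmers)
        if shared >= min_shared:
            hits.append(read)
    return hits
```

A screen of this kind only shortlists candidate reads; confirming an ITD would still require inspection of the reads and an orthogonal method such as fragment analysis, as was done here.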
In addition to the known control mutations, we identified 6 mutations affecting 5 genes in 5
samples (Table 1b). One of these mutations, the C1464X variant in TEST001, was visible in
earlier Sanger sequencing traces; however at the time, the variant had not been called as it was
within the range of background noise (Figure S3). All other additional mutations were confirmed
by Sanger sequencing and fragment analysis (Figure S4 and Table S2).
Mutations detected in del(5q) cases
Highly purified peripheral blood neutrophil DNA samples from 43 MDS cases harboring a
del(5q) were subjected to mutational screening using the 25-gene panel described above. A
total of 4036 variant calls were detected by a combination of BaseSpace, Stampy/Platypus and
visual inspection of the FLT3 locus. Of these, all non-synonymous variant calls with a COSMIC
ID (i.e. recorded in Catalogue of Somatic Mutations in Cancer26) were considered relevant. We
also included in the analysis all non-synonymous variant calls not found in either COSMIC or
the dbSNP database (build 135). A total of 29 non-synonymous variants were called in 10
different genes: 7 affecting TP53, 6 ASXL1, 5 TET2, 2 CBL, 2 DNMT3A, 2 SF3B1, 2 JAK2, 1
U2AF1, 1 RUNX1 and 1 WT1 (Table 3, Figure 1, Table S3). In addition, 21 synonymous
variants with a COSMIC ID were found in 5 different genes (10 PDGFRA, 5 IDH1, 3 cKIT, 2
FLT3 and 1 TP53) (Table S4).
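The selection rule described above can be written as a small predicate: a non-synonymous call is kept if it has a COSMIC ID, or if it is absent from both COSMIC and dbSNP. The sketch below is one possible encoding under that reading; the `VariantCall` record, its field names and the example identifiers are hypothetical and do not reflect the output format of BaseSpace or Stampy/Platypus.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class VariantCall:
    """Hypothetical minimal record for one annotated variant call."""
    gene: str
    synonymous: bool
    cosmic_id: Optional[str] = None   # None means not recorded in COSMIC
    dbsnp_id: Optional[str] = None    # None means not recorded in dbSNP


def is_relevant(v: VariantCall) -> bool:
    """Non-synonymous AND (in COSMIC OR absent from both COSMIC and dbSNP)."""
    if v.synonymous:
        return False
    if v.cosmic_id is not None:
        return True
    return v.dbsnp_id is None


def filter_relevant(calls: Iterable[VariantCall]) -> List[VariantCall]:
    """Return the calls that pass is_relevant, preserving input order."""
    return [v for v in calls if is_relevant(v)]
```

Under this rule a non-synonymous call found only in dbSNP is treated as a likely polymorphism and dropped, whereas synonymous calls with a COSMIC ID are excluded here and would be tallied separately.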
Distribution of the non-synonymous mutations among disease subgroups
A total of 29 mutations were detected in our cohort of 43 del(5q) MDS cases. Twelve of 29
mutations were found in 9 of the 22 5q- syndrome cases (45.0%) (Table 3). Five mutations
affected 4 of the 9 del(5q) RA cases (44.0%) with additional cytogenetic aberrations (Table 3).
The more advanced del(5q) MDS cases presented a higher proportion of sequence changes:
12 variant calls were found in 8 of the 12 advanced del(5q) cases (66.7%) (Table 3). The genes
with the highest mutation frequency in this cohort were TP53 (3/12 patients, 25%; 5 mutations in
total as two patients had two TP53 mutations) and ASXL1 (3/12, 25%). The mutation frequency
for these two genes was lower in 5q- syndrome cases (TP53 1/22, 4.5%; ASXL1 3/22, 13.6%).
Other mutations were identified in 5q- syndrome cases (3 TET2, 2 SF3B1, 1 DNMT3A, 1
RUNX1 and 1 WT1), in del(5q) RA with additional cytogenetic abnormalities (1 additional TP53,
1 CBL, 1 DNMT3A, 1 U2AF1 and 1 JAK2) and in advanced del(5q) MDS cases (2 TET2, 1 CBL
and 1 JAK2). It is of note that six of the mutations detected in this study present variant
frequencies lower than 20%, which are likely to be below the level of detection of Sanger
sequencing.27,28 These low frequency mutations were found in the following genes: 2 TET2, 1
ASXL1, 1 DNMT3A, 1 JAK2 and 1 SF3B1 (Figure 1, Table S3).
These data show that a number of different gene mutations occur in patients with the 5q- syndrome and that advanced del(5q) MDS cases display a greater mutation frequency than
early del(5q) MDS cases, with mutation of TP53 and ASXL1 genes being the most frequent.
Co-occurring mutations: analysis of clonality and timing of mutation acquisition
Clonal evolution has been documented as MDS transforms to AML,29 and when de novo AML
relapses after initial chemotherapy.30 The proportion of sequencing reads reporting a given
mutation can be used to estimate the fraction of tumour cells carrying that mutation, and to
identify whether mutations are clonal (in all tumor cells) or subclonal (in a fraction of tumor
cells).31 This estimation needs to take into account copy number and loss of heterozygosity
(LOH) data. Five cases in our cohort showed mutations in more than one gene. Whole genome
array data was available for all of them.2 The genes with co-occurring mutations were ASXL1,
WT1, SF3B1, TET2, DNMT3A, JAK2 and CBL (Figure 1, Figure 2).
In two cases (one 5q- syndrome case, MDS08, and one CMML case, MDS42) two mutations were present at
similar allele frequency: ASXL1 (44.7%) and WT1 (49.0%) in the 5q- syndrome case, and
ASXL1 (45.4%) and CBL (96.0%, the latter within a UPD region) in the CMML case. This is
suggestive of a dominant clonal population of cells. In this scenario, it is not possible to
determine the temporal order of mutations. A third case (MDS29, a del(5q) RA with additional
karyotypic abnormalities) had a DNMT3A mutation at variant allele frequency of ~44%, and a
JAK2 mutation at ~7%. Since the copy number showed these to have occurred in diploid
regions without any LOH, the fraction of cells carrying the mutations would be ~88% and ~14%
respectively. On this basis, we could not infer if the JAK2 mutation was subclonal to the cells
carrying the DNMT3A mutation or if, on the contrary, it represented an independent clone.
However, assuming that each mutation occurred only once during tumour evolution, it is
possible to suggest that DNMT3A mutation occurred earlier than JAK2 in the disease course.
Similarly, the fourth case (MDS12, a 5q- syndrome case) had ~80% of cells carrying a SF3B1
mutation, ~20% with ASXL1 and ~10% with TET2. We can therefore suggest that the SF3B1
mutation occurred before ASXL1 or TET2. The variant allele fractions for ASXL1 and TET2
could be consistent with either TET2 being subclonal to ASXL1 or on a separate branch of the
phylogenetic tree, so we cannot establish the timing of those two mutations to each other.
Finally, the fifth case (MDS43, a del(5q) RAEB case) presented ~86% of cells with ASXL1 and
~55% with TET2. In this case, it was clear that TET2 was subclonal to ASXL1 and must have
occurred later.
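The reasoning in this section can be sketched in two steps: convert a variant allele frequency into an estimated fraction of mutated cells, then ask whether two fractions are too large to belong to separate clones. The code below is a simplified illustration, not the analysis performed in this study. It assumes an essentially pure tumour cell population and that each mutation arose only once; the function names and the 5% tolerance for measurement noise are assumptions.

```python
def cell_fraction(vaf: float, total_copies: int = 2,
                  mutated_copies: int = 1) -> float:
    """Estimate the fraction of cells carrying a mutation from its VAF.

    Diploid locus without LOH (defaults): fraction = 2 x VAF.
    Copy-neutral LOH / UPD (total_copies=2, mutated_copies=2):
    fraction = VAF. The result is capped at 1.0.
    """
    if not 0.0 <= vaf <= 1.0:
        raise ValueError("vaf must be between 0 and 1")
    if not 1 <= mutated_copies <= total_copies:
        raise ValueError("need 1 <= mutated_copies <= total_copies")
    return min(1.0, vaf * total_copies / mutated_copies)


def relationship(fraction_a: float, fraction_b: float,
                 tolerance: float = 0.05) -> str:
    """Classify two mutations by their cell fractions.

    If the fractions sum to clearly more than 100% of cells, some cells
    must carry both, so the smaller clone lies within the larger one
    ("nested"). Otherwise nesting and independent clones are both
    possible ("undetermined").
    """
    if fraction_a + fraction_b > 1.0 + tolerance:
        return "nested"
    return "undetermined"
```

With the approximate values quoted above, `relationship(0.86, 0.55)` gives "nested" (MDS43), whereas `relationship(0.88, 0.14)` (MDS29) and `relationship(0.20, 0.10)` (ASXL1 and TET2 in MDS12) give "undetermined", in line with the interpretation in the text.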
Copy number changes and uniparental disomy analysis
Thirty-three (18 5q- syndrome, 6 del(5q) RA with additional cytogenetic abnormalities, and 9
cases of advanced del(5q) MDS) of the 43 del(5q) MDS samples included in the targeted
sequencing analysis had been previously analysed by SNP-arrays to identify cryptic copy
number changes and regions of UPD (defined as continuous stretches of homozygous SNP
calls >2 Mb without copy number loss).2
The results of the analysis are listed in Table S5. The del(5q) was characterized in all 33 cases.
Copy number changes in addition to the del(5q) were observed in 6 of 9 advanced MDS cases
(66.7%) and 4 of 6 del(5q) RA with additional cytogenetic abnormalities cases (66.7%), but in
only 4 of 18 5q- syndrome cases (22.2%). In the 5q- syndrome group, 31 regions of UPD were
identified in 17 of 18 patients. All other cases included in this study showed regions of UPD: 6
regions in all 6 del(5q) RA with additional cytogenetic aberrations, and 17 in all 9 del(5q)
advanced cases.2
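The working definition of UPD used here (a continuous stretch of homozygous SNP calls longer than 2 Mb without copy number loss) can be illustrated with a simple run-finding routine. This is a toy sketch under stated assumptions, not the SNP-array analysis from the cited study: the input format (position, genotype, copy number per SNP, sorted along one chromosome), the "AA"/"AB"/"BB" genotype coding and the function name are hypothetical.

```python
from typing import Iterable, List, Tuple

Snp = Tuple[int, str, int]  # (position in bp, genotype call, copy number)


def find_upd_regions(snps: Iterable[Snp],
                     min_span_bp: int = 2_000_000) -> List[Tuple[int, int]]:
    """Return (start, end) of homozygous runs spanning more than min_span_bp.

    A run is extended while calls are homozygous ("AA" or "BB") and the
    copy number is at least 2; a heterozygous call or a copy number
    loss ends the run. SNPs must be sorted by position.
    """
    regions = []
    start = end = None
    for pos, genotype, copy_number in snps:
        if genotype in ("AA", "BB") and copy_number >= 2:
            if start is None:
                start = pos
            end = pos
        else:
            if start is not None and end - start > min_span_bp:
                regions.append((start, end))
            start = end = None
    # Close a run that reaches the last SNP.
    if start is not None and end - start > min_span_bp:
        regions.append((start, end))
    return regions
```

Requiring a copy number of at least 2 is what separates copy-neutral homozygosity from the homozygous-looking calls produced by a hemizygous deletion such as the del(5q) itself.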
A proportion of the regions affected by copy number loss encompassed genes that are part of
our TSCA gene panel. In advanced del(5q) MDS cases, these were EZH2, NPM1, ETV6,
ASXL1 and TP53 (Figure 1, Table S6). Additional regions of cytogenetic loss encompassed
CBL and ETV6 in two different del(5q) RA with additional cytogenetic aberration cases (Figure
1, Table S6). The only DNMT3A loss was seen in a 5q- syndrome case. In the one case (a
del(5q) RAEB case) presenting cytogenetic loss encompassing TP53, the remaining copy
presented a missense mutation (R273H), predicted to be damaging to the function of the protein
(Table S3).
These results show that advanced del(5q) MDS cases display a more complex landscape of
cytogenetic aberrations, both karyotypically evident and cryptic. These regions often contain
genes involved in myeloid disease.
Discussion
In this study, we sought to validate an Illumina-based targeted NGS platform to simultaneously
screen 25 genes relevant to myeloid malignancies for mutations. Once validated, we aimed to
use this gene panel to characterize the mutational profile of a cohort of 43 MDS cases with
del(5q), in the context of additional molecular and high-density genomic array data. With
Sanger sequencing, mutations in complex DNA samples can typically be detected only when
present at a prevalence of approximately 20% or more.27,28 The development of specific mutation
enrichment or detection strategies has greatly increased this sensitivity.32,33 In keeping with the
improved power of mutation detection of NGS over traditional sequencing techniques, we
identified previously undetected mutations in the validation cohort (that comprised 9 test
samples containing 17 variants across 11 genes) in addition to the previously known mutations
in these samples.
Using the BaseSpace and Stampy/Platypus24,25 analysis software, we were able to successfully
identify all point mutations, short indels and deletions included in the validation cohort. However,
FLT3-ITD variants that consist of patient-specific sequence duplications were amplified and
sequenced, but were not identified using either bio-informatics pipeline and therefore had to be
visually identified. This highlights the need for further refinements to commercially available
analysis pipelines before their use in routine clinical practice. The non-alignment of these reads
is largely a function of the comparative size of the insertion or deletion compared to the absolute
read length. We are hopeful that in the future longer read lengths, in combination with
improvements to the alignment algorithms, will greatly increase the ability to detect these
important mutations.
Once our panel was successfully validated, we applied it to study the mutational profile of a
series of MDS cases with the del(5q). Gene mutation screening in del(5q) MDS has been
performed in previous studies, but most of these studies focused on a limited number of
genes, and have mainly employed traditional sequencing methods.9,34-40 Other studies have
investigated larger numbers of genes41-43 but did not specifically focus on MDS cases with
del(5q). To our knowledge, the present work is the first attempt to screen a large number of
genes using a targeted NGS approach in both early and advanced del(5q) MDS.
A total of 29 mutations were detected in our cohort of 43 del(5q) MDS cases. Overall, 45% of
5q- syndrome and 44% of del(5q) RA with additional cytogenetic aberrations cases presented at
least one mutation. The more advanced del(5q) cases showed a higher proportion of mutated
cases, and 66.7% presented at least one mutation. The genes with the highest mutation
frequency among advanced cases were TP53 and ASXL1 (25% of patients each). The
mutation frequency for these two genes was lower in 5q- syndrome cases (TP53 4.5%, ASXL1
13.6%). We therefore confirmed in our del(5q) cohort the observation made by our group and
others that TP53 mutations occur predominantly in MDS with complex karyotype.9,44,45 The
increased incidence of TP53 and ASXL1 mutations in advanced del(5q) cases in our present
study suggests that these abnormalities may play a role in disease progression in del(5q) MDS.
These data are consistent with a recent report that has shown that TP53 mutations were
associated with disease progression in del(5q) MDS.43
The 5q- syndrome is widely considered to be relatively genetically stable compared to other
MDS subtypes, on the basis of molecular studies (including genomic array data analysis).2 This
is reflected in its relatively good prognosis.8,11 Previous studies have shown mutations in a
limited number of genes, including TP53, JAK2 and ASXL1, in this MDS subtype.9,10,34,46 The
incidence of JAK2 and ASXL1 mutations is ~6%.37 Here, we show that over 40% of patients
with the 5q- syndrome in fact harbor a gene mutation, including TET2, SF3B1, RUNX1, WT1
and ASXL1.
The SF3B1 mutations detected in this study were identified in 5q- syndrome cases, and not in
the other two del(5q) patient groups, which are karyotypically or morphologically defined by
more advanced disease. SF3B1 mutations have been associated with a relatively benign
disease course.15,43,47 It has been suggested that multipotent hematopoietic stem cells initially
attain a splicing factor mutation as founding genetic lesion, and subsequently acquire additional
mutations that drive their malignant transformation.43,48 This is consistent with our finding in one
5q- syndrome case with a high SF3B1 mutant allele frequency and two other mutations (ASXL1
and TET2) with lower mutant allele frequencies.
The present study shows that a high proportion of genes involved in the epigenetic regulation of
the cell (TET2, ASXL1, DNMT3A and JAK2) are affected by either mutations or cytogenetic
losses in del(5q) MDS cases: 15 of the 29 non-synonymous mutations (51.7%) affected epigenetic
regulators, and 4 of the 10 genes in regions affected by cytogenetic loss (40%) were epigenetic regulators. This
observation is consistent with a recent report of mutations in a large cohort of MDS (n=117),
where the authors also found 80 mutations in genes predicted to affect the epigenetic regulation
of the cell in half of the cohort (52% of cases).43 Genome-wide methylation analysis on a subset
of cases with and without mutations in epigenetic factors did not highlight a specific DNA
methylation profile associated with these mutations (Supplementary Information and Figure S5).
A total of six of the mutations detected in this study present variant frequencies below the level
of detection of Sanger sequencing, which is estimated to be around 15-20%.27,28 For example
one patient with the 5q- syndrome showed a DNMT3A mutant allele frequency of 7.8% and
another case a SF3B1 mutant allele frequency of 11.4%. Sanger Sequencing has been the gold
standard for sequencing for many years, and the vast majority of sequencing studies published
to date have used this technology. It is likely that previous studies underestimated the
prevalence of mutations in MDS. This has recently been illustrated by Jadersten et al.34 who
used NGS to reveal TP53 mutations (median clone size 11%) in nearly 20% of low-risk MDS
patients with del(5q). Our data support the hypothesis that the prevalence of mutations in
del(5q) MDS may have also been underestimated for other genes. Here, we have shown that
genes involved in the epigenetic regulation of the cell frequently harbor low-frequency mutations
in del(5q) MDS, not detectable by means of Sanger sequencing. This has previously been
demonstrated for TET2 in MDS and CMML.49
The proportion of variant reads can be used to determine the order of occurrence of multiple
mutations and therefore to infer the clonal evolution from early stages of the disease.
Interestingly, ASXL1 was one of the genes involved in four of the five cases with two or more
mutations, with a lower variant frequency than the other co-mutated genes in three of the cases,
suggesting that mutation of ASXL1 represented a later event in the disease course in these
cases. Our analysis of clonality was based on a small number of single cases with multiple
mutations. Studies with a similar sensitivity involving larger MDS cohorts will certainly help to
establish the phylogenetic structure of tumor evolution.
In summary, we have successfully developed and validated a panel that allows for the
screening of 25 genes frequently mutated in myeloid malignancies. The present study on
del(5q) MDS has shown that a number of gene mutations occur in patients with the 5q- syndrome, and that >40% of patients with this low-risk MDS subtype harbor at least one gene
mutation. A higher percentage of mutations was found among the more advanced del(5q) MDS
cases, with TP53 and ASXL1 being the most frequently mutated genes. Our study is the first to
investigate and compare the molecular pathogenesis of early and advanced del(5q) MDS using
targeted NGS technology on a large panel of genes frequently mutated in myeloid
malignancies.
Authorship and Disclosures:
Conceived and designed the experiments: JB, AS, JSW. Performed the experiments: MFM, AB.
Analyzed the data: MFM, AB, AP, XA, FP. Contributed reagents/materials/analysis tools: SK,
AG, CA, UG. Wrote the paper: MFM, AB, AP, JSW, AS, JB.
References
1. Heaney ML, Golde DW. Myelodysplasia. N Engl J Med. 1999;340(21):1649-60.
2. Wang L, Fidler C, Nadig N, Giagounidis A, Della Porta MG, Malcovati L, et al. Genome-wide
analysis of copy number changes and loss of heterozygosity in myelodysplastic
syndrome with del(5q) using high-density single nucleotide polymorphism arrays.
Haematologica. 2008;93(7):994-1000.
3. Bernasconi P, Klersy C, Boni M, Cavigliano PM, Calatroni S, Giardini I, et al. Incidence and
prognostic significance of karyotype abnormalities in de novo primary myelodysplastic
syndromes: a study on 331 patients from a single institution. Leukemia. 2005;19(8):1424-31.
4. Johansson B, Harrison C. Acute Myeloid Leukemia. In: Heim S, Mitelman F, eds. Cancer
Cytogenetics (ed 3rd). Hoboken, NJ, 2009:45-139.
5. Giagounidis AA, Germing U, Haase S, Hildebrandt B, Schlegelberger B, Schoch C, et al.
Clinical, morphological, cytogenetic, and prognostic features of patients with
myelodysplastic syndromes and del(5q) including band q31. Leukemia. 2004;18(1):113-9.
6. Haase D, Germing U, Schanz J, Pfeilstocker M, Nosslinger T, Hildebrandt B, et al. New
insights into the prognostic impact of the karyotype in MDS and correlation with subtypes:
evidence from a core dataset of 2124 patients. Blood. 2007;110(13):4385-95.
7. Boultwood J, Pellagatti A, McKenzie AN, Wainscoat JS. Advances in the 5q- syndrome.
Blood. 2010;116(26):5803-11.
8. Greenberg PL, Tuechler H, Schanz J, Sanz G, Garcia-Manero G, Sole F, et al. Revised
international prognostic scoring system for myelodysplastic syndromes. Blood.
2012;120(12):2454-65.
9. Fidler C, Watkins F, Bowen DT, Littlewood TJ, Wainscoat JS, Boultwood J. NRAS, FLT3
and TP53 mutations in patients with myelodysplastic syndrome and a del(5q).
Haematologica. 2004;89(7):865-6.
10. Wong KF, Wong WS, Siu LL, Lau TC, Chan NP. JAK2 V617F mutation is associated with
5q- syndrome in Chinese. Leuk Lymphoma. 2009;50(8):1333-5.
11. Greenberg P, Cox C, LeBeau MM, Fenaux P, Morel P, Sanz G, et al. International scoring
system for evaluating prognosis in myelodysplastic syndromes. Blood. 1997;89(6):2079-88.
12. Schanz J, Tuchler H, Sole F, Mallo M, Luno E, Cervera J, et al. New comprehensive
cytogenetic scoring system for primary myelodysplastic syndromes (MDS) and oligoblastic
acute myeloid leukemia after MDS derived from an international database merge. J Clin
Oncol. 2012;30(8):820-9.
13. Ernst T, Chase AJ, Score J, Hidalgo-Curtis CE, Bryant C, Jones AV, et al. Inactivating
mutations of the histone methyltransferase gene EZH2 in myeloid disorders. Nat Genet.
2010;42(8):722-6.
14. Gelsi-Boyer V, Trouplin V, Adelaide J, Bonansea J, Cervera N, Carbuccia N, et al.
Mutations of polycomb-associated gene ASXL1 in myelodysplastic syndromes and chronic
myelomonocytic leukaemia. Br J Haematol. 2009;145(6):788-800.
15. Papaemmanuil E, Cazzola M, Boultwood J, Malcovati L, Vyas P, Bowen D, et al. Somatic
SF3B1 mutation in myelodysplasia with ring sideroblasts. N Engl J Med.
2011;365(15):1384-95.
16. Tefferi A, Lim KH, Abdel-Wahab O, Lasho TL, Patel J, Patnaik MM, et al. Detection of
mutant TET2 in myeloid malignancies other than myeloproliferative neoplasms: CMML,
MDS, MDS/MPN and AML. Leukemia. 2009;23(7):1343-5.
17. Druley TE, Vallania FL, Wegner DJ, Varley KE, Knowles OL, Bonds JA, et al.
Quantification of rare allelic variants from pooled genomic DNA. Nat Methods.
2009;6(4):263-5.
18. Varela I, Tarpey P, Raine K, Huang D, Ong CK, Stephens P, et al. Exome sequencing
identifies frequent mutation of the SWI/SNF complex gene PBRM1 in renal carcinoma.
Nature. 2011;469(7331):539-42.
19. Yoshida K, Sanada M, Shiraishi Y, Nowak D, Nagata Y, Yamamoto R, et al. Frequent
pathway mutations of splicing machinery in myelodysplasia. Nature. 2011;478(7367):64-9.
20. Ding L, Ellis MJ, Li S, Larson DE, Chen K, Wallis JW, et al. Genome remodelling in a
basal-like breast cancer metastasis and xenograft. Nature. 2010;464(7291):999-1005.
21. Bikandi J, San Millan R, Rementeria A, Garaizar J. In silico analysis of complete bacterial
genomes: PCR, AFLP-PCR and endonuclease restriction. Bioinformatics. 2004;20(5):798-9.
22. Adzhubei I, Jordan DM, Sunyaev SR. Predicting Functional Effect of Human Missense
Mutations Using PolyPhen-2. Curr Protoc Hum Genet. 2013;Chapter 7:Unit7 20.
23. Murphy KM, Levis M, Hafez MJ, Geiger T, Cooper LC, Smith BD, et al. Detection of FLT3
internal tandem duplication and D835 mutations by a multiplex polymerase chain reaction
and capillary electrophoresis assay. J Mol Diagn. 2003;5(2):96-102.
24. Lunter G, Goodson M. Stampy: a statistical algorithm for sensitive and fast mapping of
Illumina sequence reads. Genome Res. 2011;21(6):936-9.
25. Rimmer A, Mathieson I, Lunter G, McVean G. Platypus: An Integrated Variant Caller
(www.well.ox.ac.uk/platypus). 2012.
26. Forbes SA, Bindal N, Bamford S, Cole C, Kok CY, Beare D, et al. COSMIC: mining
complete cancer genomes in the Catalogue of Somatic Mutations in Cancer. Nucleic Acids
Res. 2011;39(Database issue):D945-50.
27. Bar-Eli M, Ahuja H, Gonzalez-Cadavid N, Foti A, Cline MJ. Analysis of N-RAS exon-1
mutations in myelodysplastic syndromes by polymerase chain reaction and direct
sequencing. Blood. 1989;73(1):281-3.
28. Collins SJ, Howard M, Andrews DF, Agura E, Radich J. Rare occurrence of N-ras point
mutations in Philadelphia chromosome positive chronic myeloid leukemia. Blood.
1989;73(4):1028-32.
29. Walter MJ, Shen D, Ding L, Shao J, Koboldt DC, Chen K, et al. Clonal architecture of
secondary acute myeloid leukemia. N Engl J Med. 2012;366(12):1090-8.
30. Ding L, Ley TJ, Larson DE, Miller CA, Koboldt DC, Welch JS, et al. Clonal evolution in
relapsed acute myeloid leukaemia revealed by whole-genome sequencing. Nature.
2012;481(7382):506-10.
31. Nik-Zainal S, Van Loo P, Wedge DC, Alexandrov LB, Greenman CD, Lau KW, et al. The
life history of 21 breast cancers. Cell. 2012;149(5):994-1007.
32. Li M, Diehl F, Dressman D, Vogelstein B, Kinzler KW. BEAMing up for detection and
quantification of rare sequence variants. Nat Methods. 2006;3(2):95-7.
33. Su Z, Dias-Santagata D, Duke M, Hutchinson K, Lin YL, Borger DR, et al. A platform for
rapid detection of multiple oncogenic mutations with relevance to targeted therapy in
non-small-cell lung cancer. J Mol Diagn. 2011;13(1):74-84.
34. Jadersten M, Saft L, Smith A, Kulasekararaj A, Pomplun S, Gohring G, et al. TP53
mutations in low-risk myelodysplastic syndromes with del(5q) predict disease progression.
J Clin Oncol. 2011;29(15):1971-9.
35. Jerez A, Gondek LP, Jankowska AM, Makishima H, Przychodzen B, Tiu RV, et al.
Topography, clinical, and genomic correlates of 5q myeloid malignancies revisited. J Clin
Oncol. 2012;30(12):1343-9.
36. Pardanani A, Patnaik MM, Lasho TL, Mai M, Knudson RA, Finke C, et al. Recurrent IDH
mutations in high-risk myelodysplastic syndrome or acute myeloid leukemia with isolated
del(5q). Leukemia. 2010;24(7):1370-2.
37. Patnaik MM, Lasho TL, Finke CM, Gangat N, Caramazza D, Holtan SG, et al. WHO-defined
'myelodysplastic syndrome with isolated del(5q)' in 88 consecutive patients:
survival data, leukemic transformation rates and prevalence of JAK2, MPL and IDH
mutations. Leukemia. 2010;24(7):1283-9.
38. Patnaik MM, Lasho TL, Finke CM, Knudson RA, Ketterling RP, Chen D, et al. Isolated
del(5q) in myeloid malignancies: clinicopathologic and molecular features in 143
consecutive patients. Am J Hematol. 2011;86(5):393-8.
39. Sebaa A, Ades L, Baran-Marzack F, Mozziconacci MJ, Penther D, Dobbelstein S, et al.
Incidence of 17p deletions and TP53 mutation in myelodysplastic syndrome and acute
myeloid leukemia with 5q deletion. Genes Chromosomes Cancer. 2012;51(12):1086-92.
40. Sokol L, Caceres G, Rocha K, Stockero KJ, Dewald DW, List AF. JAK2(V617F) mutation in
myelodysplastic syndrome (MDS) with del(5q) arises in genetically discordant clones. Leuk
Res. 2010;34(6):821-3.
41. Bejar R, Stevenson K, Abdel-Wahab O, Galili N, Nilsson B, Garcia-Manero G, et al. Clinical
effect of point mutations in myelodysplastic syndromes. N Engl J Med. 2011;364(26):2496-506.
42. Damm F, Kosmider O, Gelsi-Boyer V, Renneville A, Carbuccia N, Hidalgo-Curtis C, et al.
Mutations affecting mRNA splicing define distinct clinical phenotypes and correlate with
patient outcome in myelodysplastic syndromes. Blood. 2012;119(14):3211-8.
43. Mian SA, Smith AE, Kulasekararaj AG, Kizilors A, Mohamedali AM, Lea NC, et al.
Spliceosome mutations exhibit specific associations with epigenetic modifiers and
proto-oncogenes mutated in myelodysplastic syndrome. Haematologica. 2013.
44. Jonveaux P, Fenaux P, Quiquandon I, Pignon JM, Lai JL, Loucheux-Lefebvre MH, et al.
Mutations in the p53 gene in myelodysplastic syndromes. Oncogene. 1991;6(12):2243-7.
45. Lai JL, Preudhomme C, Zandecki M, Flactif M, Vanrumbeke M, Lepelley P, et al.
Myelodysplastic syndromes and acute myeloid leukemia with 17p deletion. An entity
characterized by specific dysgranulopoiesis and a high incidence of P53 mutations.
Leukemia. 1995;9(3):370-81.
46. Boultwood J, Perry J, Pellagatti A, Fernandez-Mercado M, Fernandez-Santamaria C,
Calasanz MJ, et al. Frequent mutation of the polycomb-associated gene ASXL1 in the
myelodysplastic syndromes and in acute myeloid leukemia. Leukemia. 2010;24(5):1062-5.
47. Malcovati L, Papaemmanuil E, Bowen DT, Boultwood J, Della Porta MG, Pascutto C, et al.
Clinical significance of SF3B1 mutations in myelodysplastic syndromes and
myelodysplastic/myeloproliferative neoplasms. Blood. 2011;118(24):6239-46.
48. Cazzola M, Rossi M, Malcovati L. Biologic and clinical significance of somatic mutations of
SF3B1 in myeloid and lymphoid neoplasms. Blood. 2013;121(2):260-9.
49. Smith AE, Mohamedali AM, Kulasekararaj A, Lim Z, Gaken J, Lea NC, et al. Next-generation
sequencing of the TET2 gene in 355 MDS and CMML patients reveals low-abundance
mutant clones with early origins, but indicates no definite prognostic value.
Blood. 2010;116(19):3923-32.
Table 1a. Summary of mutations present in test samples used for TSCA panel validation. All Q-score values were generated by GATK through the BaseSpace pipeline, with the exception of
the TEST009 variant, which was generated by Platypus.

Sample ID  Gene     Mutation                        Position           Q-score  Depth of coverage (x)  Frequency
TEST001    ASXL1    c.1925het_insA; p.G643RfsX13    Chr20:31,022,442   99       239                    60%
TEST001    EZH2     p.L98Ifs*28                     Chr7:148529801     99       895                    32%
TEST001    EZH2     p.Q250X                         Chr7:148523705     99       437                    42%
TEST001    NRAS     p.G12D                          Chr1:115258747     99       955                    32%
TEST001    RUNX1    p.S141X                         Chr21:36252940     99       361                    31%
TEST002    ASXL1    c.1748G>GA; p.W583X             Chr20:31,022,263   99       159                    28%
TEST003    IDH2     c.13775G>GA; p.R140Q            Chr15:90,631,934   99       1382                   51%
TEST003    NPM1     ----/TCTG                       Chr5:170,837,548   99       836                    23%
TEST003    FLT3     109bp insertion                 N/A                Detected visually only
TEST004    NPM1     ----/TCTG                       Chr5:170,837,548   99       734                    25%
TEST004    FLT3     64bp insertion                  N/A                Detected visually only
TEST005    NPM1     ----/TCTG                       Chr5:170,837,548   99       527                    15%
TEST006    DNMT3A   c.2648C>CT; p.R882V             Chr2:25,457,242    99       250                    44%
TEST007    KIT      c.2447A>AG; p.D816V             Chr4:55,599,321    99       248                    39%
TEST008    NPM1     ----/TCTG                       Chr5:170,837,548   99       875                    20%
TEST008    IDH1     c.6694C>CT; p.R132C             Chr4:209,113,114   99       1390                   48%
TEST009    TP53     TGTACATGGCCATGGCGCGG / T        Chr17:7,578,441    200      731                    95%
Table 1b. Summary of additional mutations found in the test samples.

Sample ID  Gene     Mutation         Position           Q-score  Depth of coverage (x)  Frequency
TEST001    TET2     p.C1464X         Chr4:106,193,930   99       423                    47%
TEST005    TET2     p.L1258Afs*10    Chr4:106,164,903   99       2968                   25%
TEST006    NPM1     G/GTCTG          Chr5:170,837,547   99       999                    18%
TEST007    RUNX1    p.L71Sfs*24      Chr21:36,259,199   99       395                    51%
TEST007    SF3B1    p.K700E          Chr2:198,266,834   99       2053                   45%
TEST008    FLT3     p.I836del        Chr13:28,592,636   99       2271                   48%
Table 2. List of genes targeted for enrichment in the TSCA library.

Gene       Location   Chromosomal Coordinates       Targeted Exons
ASXL1      20q11.21   chr20:30946147-31027122       12
ATRX       Xq21.1     chrX:76,760,356-77,041,719    8-10 (ADD domain), 17-31 (Helicase domain)
CBL        11q23.3    chr11:119076986-119178859     8-9 (ring finger domain and linker sequence)
CBLB       3q13.11    chr3:105377109-105587887      9, 10
CBLC       19q13.32   chr19:45281126-45303903       9, 10
DNMT3A     2p23.3     chr2:25455830-25564784        23
ETV6/TEL   12p13.2    chr12:11802788-12048325       All 8 exons
EZH2       7q36.1     chr7:148504464-148581441      2-20
FLT3       13q12.2    chr13:28577411-28674729       14, 15 (JM and TK1 domains), 20 (D835)
IDH1       2q34       chr2:209100953-209119806      4
IDH2       15q26.1    chr15:90627212-90645708       4
JAK2       9p24.1     chr9:4985245-5128183          12, 14
KIT        4q12       chr4:55524095-55606881        2, 8-11, 13, 17
MPL        1p34.2     chr1:43803475-43820135        10
NPM1       5q35.1     chr5:170814708-170837888      12
NRAS       1p13.2     chr1:115247085-115259515      2, 3
PDGFRA     4q12       chr4:55095264-55164412        12, 14, 18
RUNX1      21q22.12   chr21:36193574-36260987       3-8
SF3B1      2q33.1     chr2:198256698-198299771      15, 16
SRSF2      17q25.1    chr17:74730197-74733493       1
TET2       4q24       chr4:106067842-106200960      3-11
TP53       17p13.1    chr17:7571720-7590868         4-9
U2AF1      21q22.3    chr21:44513066-44527688       2, 6
WT1        11p13      chr11:32409322-32457081       7, 9 (Cys-His zinc finger domains)
ZRSR2      Xp22.2     chrX:15808574-15841382        All 11 exons
Table 3. Summary of non-synonymous variant calls with a COSMIC ID or not present in dbSNP.
Values are number of mutations (%) in each group.

                               5q- syndrome   RA del(5q) with additional        Advanced del(5q)
                               (n=22)         karyotypic abnormalities (n=9)    cases (n=12)
TP53                           1 (4.5)        1 (11.1)                          5 (41.7)*
ASXL1                          3 (13.6)       0                                 3 (25.0)
TET2                           3 (13.6)       0                                 2 (16.7)
JAK2V617F                      0              1/8 (12.5)                        1/8 (12.5)
CBL                            0              1 (11.1)                          1 (8.3)
DNMT3A                         1 (4.5)        1 (11.1)                          0
U2AF1                          0              1 (11.1)                          0
SF3B1                          2 (9.1)        0                                 0
RUNX1                          1 (4.5)        0                                 0
WT1                            1 (4.5)        0                                 0
TOTAL NUMBER OF MUTATIONS      12             5                                 12
Patients presenting at
least one mutation             9 (45.0)       4 (44.4)                          8 (66.7)

* Two patients had two TP53 mutations.
Figure Legends
Figure 1. Mutations, deletions and loss of heterozygosity in 25 genes analysed in del(5q)
MDS samples. Columns show results for each of the 43 analysed cases. Grey boxes indicate
mutated cases. Black boxes mark samples for which SNP-array data were available. X: double
mutant. Δ: gene encompassed within a region of cytogenetic loss. Θ: gene encompassed within
a region of UPD.
Figure 2. Mutant allele frequencies in individual del(5q) MDS samples. The area of each
coloured circle indicates the allele frequency of the given mutation. The text under the circles
lists the frequency and nature of each mutation in order of decreasing allele frequency.
Supplementary Information
Targeted re-sequencing
We designed a TruSeq Custom Amplicon panel (TSCA, Illumina), targeting 25 genes mutated
in various myeloid malignancies (Table 2). The panel was developed using the online
DesignStudio pipeline (http://designstudio.illumina.com, Illumina), and covers a total of
46,604bp with 322 amplicons. In genes with well-defined mutational hotspots only these
regions were targeted; otherwise the entire coding sequence of the gene was sequenced.
Dual-barcoded TSCA libraries were created from 250ng of genomic DNA, in accordance with
the manufacturer’s instructions, before undergoing 2x150bp paired-end sequencing on the
Illumina MiSeq platform. The initial alignment and variant calling analysis was performed with
the BaseSpace online analysis tool (https://basespace.illumina.com, Illumina). In order to
screen for larger insertions and deletions, the data were also run through the Stampy [1] and
Platypus [2] pipelines, which use a different algorithm to map sequencing reads to a reference
genome. All variants called were visually inspected in IGV.
All candidate sequence variations that passed the internal Illumina integrity filters, and with a
quality score greater than Q60, were taken forward for further analysis. All variations were
confirmed visually and then checked against dbSNP build 135 (NCBI, National Center for
Biotechnology Information, USA) and COSMIC (Catalog of Somatic Mutations In Cancer,
Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK) databases, to assess whether the
variations found were reported polymorphisms or annotated mutations, respectively.
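The filtering logic described above can be sketched as follows. This is an illustrative outline only, not the BaseSpace or Platypus implementation: the `VariantCall` record, its field names and the two placeholder entries (`GENE_X`, `GENE_Y`) are hypothetical, and the reporting rule (non-synonymous, with a COSMIC ID or absent from dbSNP) is taken from the Table 3 caption. Visual confirmation in IGV is a manual step and is not modelled.

```python
from dataclasses import dataclass
from typing import Optional

MIN_QSCORE = 60  # calls must have a quality score greater than Q60


@dataclass(frozen=True)
class VariantCall:
    """Hypothetical record for one candidate sequence variation."""
    gene: str
    protein_change: str
    qscore: int
    passed_integrity_filters: bool  # stands in for the internal Illumina filters
    nonsynonymous: bool
    cosmic_id: Optional[str] = None
    dbsnp_id: Optional[str] = None


def passes_quality(call: VariantCall) -> bool:
    """True if the call passed the integrity filters and exceeds Q60."""
    return call.passed_integrity_filters and call.qscore > MIN_QSCORE


def is_reportable(call: VariantCall) -> bool:
    """Non-synonymous, good quality, and either in COSMIC or absent from dbSNP."""
    if not (passes_quality(call) and call.nonsynonymous):
        return False
    return call.cosmic_id is not None or call.dbsnp_id is None


calls = [
    # Two calls with values as listed in Table S3.
    VariantCall("SF3B1", "K700E", 99, True, True, cosmic_id="COSM84677"),
    VariantCall("DNMT3A", "R882L", 75, True, True),
    # Two placeholder calls (hypothetical) that should be excluded.
    VariantCall("GENE_X", "A1B", 99, True, True, dbsnp_id="rs_hypothetical"),
    VariantCall("GENE_Y", "C2D", 45, True, True, cosmic_id="COSM_hypothetical"),
]

reportable = [f"{c.gene} {c.protein_change}" for c in calls if is_reportable(c)]
print(reportable)
```

The dbSNP-only placeholder is treated as a reported polymorphism and the Q45 placeholder fails the quality threshold, so only the first two calls are retained.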
Assay sensitivity
To evaluate the sensitivity of the assay, we used two different approaches: (1) comparison
with absolute real-time PCR quantification for a specific mutation and (2) definition of general
background noise across all amplicons.
1. Comparison with real-time PCR
We determined the variant allele frequency (VAF) in 7 JAK2 V617F-positive samples by
real-time PCR, and compared them with the VAF values from the targeted sequencing assay.
We performed real-time PCR using the commercially available JAK2 MutaQuant™ kit
(Ipsogen, Luminy Biotech, Marseille, France), which distinguishes between JAK2 wild-type
and V617F alleles through Taqman allelic discrimination. Allele specific probes, labelled with
5’ reporter and 3’ quencher dyes, for both wild-type and V617F alleles are used to amplify the
region of interest. The JAK2 V617F percentage can be calculated from the fluorescent levels of
each assay.
DNA samples were quantified using a BioPhotometer (Eppendorf, Hamburg, Germany) and
normalised to a working concentration of 5ng/µl in nuclease free water. RT-PCR reactions
were set up in a 100-well rotor by a CAS1200 liquid handling instrument (Qiagen, Hilden,
Germany). Each reaction contained 6.25µl 2x Taqman Universal PCR Master Mix (Applied
Biosystems, Life Technologies, Carlsbad, CA), 0.5µl 25x primer/probe mix (Ipsogen, Luminy
Biotech, Marseille, France), 3.25µl nuclease free water and 2.5µl 5ng/µl sample DNA. 4-point
duplicate standard curves were included with each run, amplified from standard plasmids
included in the kit. Positive (>99.9% V617F) and negative (<0.1% V617F) controls were also
included in each run. Each sample was processed in duplicate for both the wild type and
V617F alleles.
The reactions were amplified on a Rotor-Gene 6000 instrument (Qiagen, Hilden, Germany)
with the following PCR conditions: 50°C for 2 minutes, 95°C for 10 minutes followed by 50
cycles of 95°C for 15 seconds and 62°C for 1 minute, with acquisition of FAM fluorescence
during the 62°C step.
Analysis of the raw data was performed using the Rotor-Gene Q software package (Qiagen,
Hilden, Germany). The cycle threshold was set at 0.03 with the slope corrected, as per the
manufacturer’s guidelines (Ipsogen, Luminy Biotech, Marseille, France). Raw data tables for
both Wild-Type and V617F assays were exported into Excel (Microsoft, Redmond, WA) to
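facilitate further analysis. The standard curves were plotted (y = mean Ct, x = log10 CN, where
CN is gene copy number/5µl) for both the wild-type and V617F standard samples, and the Y
and R² values were extracted. The copy number for JAK2 V617F was calculated as:
$(\text{mean Ct}_{\text{JAK2V617F}} - \text{Standard Curve Intercept}_{\text{JAK2V617F}}) / \text{Standard Curve Slope}_{\text{JAK2V617F}}$.
JAK2 wild-type copy number was calculated as:
$(\text{mean Ct}_{\text{JAK2WT}} - \text{Standard Curve Intercept}_{\text{JAK2WT}}) / \text{Standard Curve Slope}_{\text{JAK2WT}}$.
Because the standard curves are plotted against log10 CN, each of these quotients is the log10
of the copy number, and the copy number itself is 10 raised to that value. Final results were
determined as a percentage of JAK2 V617F allele load, calculated by:
$\text{Copy Number}_{\text{JAK2V617F}} / (\text{Copy Number}_{\text{JAK2V617F}} + \text{Copy Number}_{\text{JAK2WT}}) \times 100$.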
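The calculation above can be sketched in Python as follows. This is a minimal illustration rather than the kit's software: the function names and the example slope and intercept are hypothetical. Since the standard curve regresses mean Ct on log10 copy number, solving the line for x gives log10 CN, so the sketch raises 10 to that value before computing the allele load. The copy numbers in the last two lines are those listed for JAK2_A and JAK2_G in Table S1.

```python
def copy_number_from_ct(mean_ct: float, intercept: float, slope: float) -> float:
    """Invert a standard curve of the form Ct = slope * log10(CN) + intercept."""
    log10_cn = (mean_ct - intercept) / slope
    return 10 ** log10_cn


def v617f_allele_load(cn_v617f: float, cn_wt: float) -> float:
    """Percentage of JAK2 V617F alleles among all JAK2 copies."""
    return 100.0 * cn_v617f / (cn_v617f + cn_wt)


# Hypothetical standard curve: slope -3.32, intercept 38.0.
# A mean Ct of 28.04 then corresponds to roughly 1000 copies.
example_cn = copy_number_from_ct(28.04, intercept=38.0, slope=-3.32)
print(round(example_cn))

# Copy numbers from Table S1 (RT-PCR columns).
print(round(v617f_allele_load(1865, 60035), 1))   # JAK2_A
print(round(v617f_allele_load(11804, 36490), 1))  # JAK2_G
```

Rounded to two decimals as a fraction, these allele loads correspond to the RT-PCR VAF values of 0.03 and 0.24 given for the two samples in Table S1.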
The variant allele frequency of the JAK2 V617F-positive samples, as determined by the real-time
PCR assay, ranged from 1-24% (Table S1). All mutations with a VAF >3% (6/7, 86%) were
successfully aligned and called as the JAK2 V617F variant. The remaining mutation (1% VAF)
was present in the sequencing reads, but was below the detection limit of the variant calling
software.
2. Background noise
We determined the background noise level of our assay by investigating the sequencing read
composition at 31 SNP loci over 14 chromosomes in 15 samples. The SNPs were all initially
identified by our data analysis pipeline, are bi-allelic and are all recorded in dbSNP135 as
being non-pathogenic. At each locus (465 total), we measured the level of background noise
by calculating the percentage of sequencing reads containing any of the alternate nucleotides
(3 in the case of homozygous SNPs, 2 in the case of heterozygous SNPs).
The mean level of background noise in our assay was thus determined as 0.31% (range
0.0-0.8%) across all SNP loci in all samples, and was consistently low both between the SNPs
(mean 0.31%, range 0.1-0.8%), and between the samples (mean 0.33%, range 0.25-0.55%).
Interestingly, the background level at heterozygous loci was lower than that at homozygous
loci (0.2% and 0.4% respectively).
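The per-locus noise measurement described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' script: the function name and the read counts are hypothetical, and each locus is represented simply as a mapping from nucleotide to read count together with the set of alleles in the sample's genotype (one allele for a homozygous SNP, two for a heterozygous SNP).

```python
from statistics import mean
from typing import Dict, Set


def background_noise_percent(base_counts: Dict[str, int],
                             genotype_alleles: Set[str]) -> float:
    """Percentage of reads carrying a nucleotide outside the sample's genotype."""
    total = sum(base_counts.values())
    if total == 0:
        raise ValueError("no reads at this locus")
    noise_reads = sum(n for base, n in base_counts.items()
                      if base not in genotype_alleles)
    return 100.0 * noise_reads / total


# Hypothetical read counts at two SNP loci.
loci = [
    ({"A": 995, "C": 2, "G": 2, "T": 1}, {"A"}),         # homozygous: 3 alternates
    ({"A": 498, "C": 1, "G": 500, "T": 1}, {"A", "G"}),  # heterozygous: 2 alternates
]

per_locus = [background_noise_percent(counts, alleles) for counts, alleles in loci]
print(per_locus, mean(per_locus))
```

Averaging `per_locus` by SNP or by sample would give summary values of the kind reported above.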
Taken together, we therefore defined the sensitivity of the panel at 1-3% depending on the
locus examined and the variant caller software.
JAK2 V617F pyrosequencing
The JAK2 V617F (c.1849G>T) mutation was analysed using primers as previously described [3]. In
brief, DNA was amplified in 25μl reactions, containing 2x Qiagen Multiplex PCR Master Mix
(Qiagen), 5x Q Solution (Qiagen) and 5mM each of reverse and biotinylated forward primers.
Cycling conditions consisted of an initial denaturation step of 97°C for 15 minutes followed by
35 cycles of 30 seconds at 97°C, 90 seconds at 62°C and 2 minutes at 72°C. The resulting
biotinylated PCR product was subjected to pyrosequencing using a Pyromark Q24 System
(Qiagen). Pyromark Q24 allele quantification (AQ) software was used to quantify the level (if
any) of JAK2 V617F variant present in each sample.
FLT3-ITD ARMS-PCR
FLT3-ITD mutations were analysed using primers as previously described [4], modified with
WellRED fluorescent dyes [4,5]. In brief, DNA was amplified in 25μl reactions, containing 2x
Qiagen Multiplex PCR Master Mix (Qiagen), 5x Q Solution (Qiagen) and 5mM each of forward
and reverse primers. Cycling conditions consisted of an initial denaturation step of 95°C for 15
minutes followed by 35 cycles of 30 seconds at 95°C, 1 minute at 56°C and 2 minutes at
72°C, with a final extension step of 10 minutes at 72°C. The resulting PCR product was
diluted 1:10. 2μl of diluted PCR product was mixed with 40μl Sample Loading Solution
(Beckman Coulter) and 0.5μl GenomeLab DNA Size Standard 600 (Beckman Coulter) and
subjected to capillary electrophoresis on a CEQ8000 Genetic Analysis System (Beckman
Coulter). Data analysis was performed using CEQ analysis software version 9.0.25.
NPM1 fragment analysis
Validation of the NPM1 mutation was performed by fragment analysis, using primers as
previously described [6]. DNA was amplified in 25μl reactions containing 2x Qiagen Master Mix
(Qiagen), 10pmol of forward and reverse primers and sterile water up to the final 25μl volume.
Cycling conditions consisted of an initial denaturation step of 95°C for 15 minutes followed by
40 cycles of 30 seconds at 92°C, 30 seconds at 58°C and 20 seconds at 72°C, with a final
extension step of 10 minutes at 72°C. The resulting PCR product was diluted 1:10. 2μl of
diluted PCR product was mixed with 40μl Sample Loading Solution (Beckman Coulter) and
0.5μl GenomeLab DNA Size Standard 600 (Beckman Coulter) and subjected to capillary
electrophoresis on a CEQ8000 Genetic Analysis System (Beckman Coulter). Data analysis
was performed using CEQ analysis software version 9.0.25.
Sanger Sequencing
Mutations discovered in the validation cohort in TET2, RUNX1, SF3B1 and FLT3 were
confirmed by Sanger sequencing. DNA was amplified in 25μl reactions containing 2x Qiagen
Master Mix (Qiagen) and 5mM of forward and reverse primers. 5x Q Solution (Qiagen) was
used where indicated (Table S2). Cycling conditions for all targets consisted of an initial
denaturation step of 97°C for 15 minutes followed by 35 cycles of 30 seconds at 92°C, 30
seconds at 55°C (RUNX1 and FLT3) or 60°C (TET2 and SF3B1) and 20 seconds at 72°C,
with a final extension step of 10 minutes at 72°C. The PCR products were purified using
MicroClean (Cambio) and 1μl of purified PCR product was used for sequencing with the Big
Dye terminator v3.1 chemistry (Applied Biosystems) with either the forward or reverse primer.
After ethanol/EDTA precipitation, the samples underwent electrophoresis on an ABI 3130
Genetic Analyzer (Applied Biosystems).
Genome-wide DNA-methylation
The DNA methylation profiles of 14 cases were analysed using Illumina HumanMethylation 27
BeadChip (Illumina, Inc., San Diego, CA, USA). Those 14 cases included 11 5q- syndrome, 1
del(5q) RA with additional cytogenetic aberrations and 2 advanced del(5q) cases. To ensure
karyotypic homogeneity, only the DNA methylation profiles of the 11 5q- syndrome cases were
further analysed based on the mutational status of the genes involved in epigenetic regulation
included in our TSCA. Within these 11 5q- syndrome cases 1 had a DNMT3A mutation, 2 had
an ASXL1 mutation, and 1 had concomitant ASXL1 and TET2 mutations.
Data analysis was carried out using R/Bioconductor. Before selection of differentially
methylated probes a filtering process based on the mean β-values for each gene mutated
under study (DNMT3A, ASXL1, ASXL1 and TET2, ASXL1 or TET2) was performed to focus
the analysis on genes with large differences in their methylation status. Briefly, the obtained
mean value was categorized in three states: unmethylated state (mean value < 0.3), partially
methylated state (mean value > 0.3 and < 0.7) and methylated state (mean value > 0.7). We
assigned a value of 0, 1 or 2 to each probe according to its methylation state and calculated
the difference between states for each comparison. All probes with a difference in methylation
state equal to 0 were filtered out. Finally, the fold-change of mean β-values was used to identify
the probes that showed significant differential methylation patterns. Probes were selected as
significant using a logFC cut off of 1.5.
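The probe filtering steps above can be sketched as follows. This is a rough Python outline for illustration only; the analysis itself was carried out in R/Bioconductor and its code is not given. Several details are assumptions: the base of the logarithm is not stated and log2 is used here, mean β-values of exactly 0.3 or 0.7 are assigned to the partially methylated state, a small offset `EPS` guards against division by zero, and the probe names and β-values are hypothetical.

```python
import math
from typing import Dict, List, Tuple

LOGFC_CUTOFF = 1.5
EPS = 1e-6  # assumption: small offset to avoid division by zero


def methylation_state(mean_beta: float) -> int:
    """0 = unmethylated (< 0.3), 1 = partially methylated, 2 = methylated (> 0.7)."""
    if mean_beta < 0.3:
        return 0
    if mean_beta > 0.7:
        return 2
    return 1


def select_probes(mean_betas: Dict[str, Tuple[float, float]],
                  logfc_cutoff: float = LOGFC_CUTOFF) -> List[str]:
    """Keep probes whose state differs between groups and whose |log2 FC| meets the cutoff.

    mean_betas maps probe name -> (mean beta in mutated group, mean beta in controls).
    """
    selected = []
    for probe, (beta_mut, beta_ctrl) in mean_betas.items():
        # Step 1: drop probes with no difference in methylation state.
        if methylation_state(beta_mut) == methylation_state(beta_ctrl):
            continue
        # Step 2: fold-change of mean beta-values (log2 is an assumption).
        log_fc = math.log2((beta_mut + EPS) / (beta_ctrl + EPS))
        if abs(log_fc) >= logfc_cutoff:
            selected.append(probe)
    return selected


# Hypothetical probes and mean beta-values.
example = {
    "probe_A": (0.85, 0.10),  # methylated vs unmethylated, large fold-change
    "probe_B": (0.50, 0.45),  # same state in both groups, filtered out
    "probe_C": (0.75, 0.60),  # state differs but fold-change is small
    "probe_D": (0.05, 0.35),  # unmethylated vs partially methylated
}
selected = select_probes(example)
print(selected)
```

Filtering on the state difference first restricts the fold-change test to probes that cross a methylation category, which is the stated aim of focusing on large differences.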
In order to investigate the potential effect on DNA methylation of mutations in genes involved
in the epigenetic regulation of the cell, the following comparisons were run:
• 2 ASXL1-mut cases versus 7 cases with no epigenetic gene mutations. Number of
differentially methylated genes (DMG): 422.
• 1 DNMT3A-mut case versus 7 cases with no epigenetic gene mutations. Number of
DMG: 144.
• 1 ASXL1 & TET2 mutant case versus 7 cases with no epigenetic gene mutations.
Number of DMG: 156.
• 3 ASXL1-mut cases versus 7 cases with no epigenetic gene mutations. Number of
DMG: 205.
The lists of DMG were used to generate supervised clusters on all 11 5q- syndrome cases.
None of the analyses managed to cluster the samples based on their mutations in epigenetic
genes. Based on these results, we cannot attribute any specific DNA methylation profile to the
mutations detected in genes involved in the epigenetic regulation of the cell.
SNP mapping assay and data analysis
The SNP mapping assay was performed according to the protocol supplied by the
manufacturer (Affymetrix, Santa Clara, CA, USA). Briefly, 250 ng DNA were digested with
Hind III, ligated to the adaptor, and amplified by polymerase chain reaction (PCR) using a
single primer. PCR products were purified with the DNA amplification clean-up kit (Clontech)
and the amplicons were quantified. The 40 μg of purified amplicons were fragmented,
end-labeled and hybridized to a Genechip Mapping 50K Hind III array at 48°C for 16–18 hours in a
Hybridization Oven 640 (Affymetrix). After washing and staining in a Fluidics Station 450
(Affymetrix), the arrays were scanned with a GeneChip Scanner 3000 (Affymetrix).
Cell intensity calculations and scaling were performed using GeneChip Operating Software
(GCOS). Data were analyzed using GeneChip Genotyping Analysis Software Version 4.0
(Affymetrix) and CNAG software version 2.0. Quality control was performed within the
Genotyping software after scaling the signal intensities of all arrays to a target of 100%. DNA
copy number was analyzed with both the chromosome copy number tool (CNAT) version 3.0
and CNAG version 2.0. CNAT compares obtained SNP hybridization signal intensities with
SNP intensity distributions of a reference set from more than 100 healthy individuals of
different ethnicity. For analysis with CNAG we used a pool of 45 healthy controls as a
reference set [7].
References
1. Lunter G, Goodson M. Stampy: a statistical algorithm for sensitive and fast mapping of
Illumina sequence reads. Genome Res 2011;21(6):936-9.
2. Rimmer A, Mathieson I, Lunter G, McVean G. Platypus: An Integrated Variant Caller
(www.well.ox.ac.uk/platypus). 2012.
3. Jones AV, Kreil S, Zoi K, Waghorn K, Curtis C, Zhang L, et al. Widespread occurrence of
the JAK2 V617F mutation in chronic myeloproliferative disorders. Blood
2005;106(6):2162-8.
4. Murphy KM, Levis M, Hafez MJ, Geiger T, Cooper LC, Smith BD, et al. Detection of FLT3
internal tandem duplication and D835 mutations by a multiplex polymerase chain reaction
and capillary electrophoresis assay. J Mol Diagn 2003;5(2):96-102.
5. Kiyoi H, Naoe T, Yokota S, Nakao M, Minami S, Kuriyama K, et al. Internal tandem
duplication of FLT3 associated with leukocytosis in acute promyelocytic leukemia.
Leukemia Study Group of the Ministry of Health and Welfare (Kohseisho). Leukemia
1997;11(9):1447-52.
6. Scholl S, Mugge LO, Landt O, Loncarevic IF, Kunert C, Clement JH, et al. Rapid
screening and sensitive detection of NPM1 (nucleophosmin) exon 12 mutations in acute
myeloid leukaemia. Leuk Res 2007;31(9):1205-11.
7. Wang L, Fidler C, Nadig N, Giagounidis A, Della Porta MG, Malcovati L, et al. Genome-wide
analysis of copy number changes and loss of heterozygosity in myelodysplastic
syndrome with del(5q) using high-density single nucleotide polymorphism arrays.
Haematologica 2008;93(7):994-1000.
Table S1. Summary of JAK2 V617F variant allele frequencies (VAF).

            RT-PCR                                  MiSeq
Sample ID   JAK2 WT       V617F         VAF         Total    Reference   Variant   VAF
            Copy Number   Copy Number               Depth    Depth       Depth
JAK2_A      60035         1865          0.03        8205     7564        628       0.08
JAK2_B      51617         5929          0.10        7924     6540        1366      0.17
JAK2_C      58408         9834          0.14        7883     5859        2015      0.26
JAK2_D      52331         7917          0.13        7828     6342        1472      0.19
JAK2_E      59013         852           0.01        7411     7219        177       0.02
JAK2_F      50564         10411         0.17        8139     6390        1719      0.21
JAK2_G      36490         11804         0.24        7637     5290        2333      0.31
Table S2. Sanger sequencing primers and PCR conditions.

Target   Forward Primer                  Reverse Primer              PCR Conditions       Reference
TET2     AGACTTATGTATCTTTCATCTAGCTCTGG   ACTCTCTTCCTTTCAACCAAAGATT   60°C                 Gelsi-Boyer et al.
RUNX1    GCTGTTTGCAGGGTCCTAA             CCTGTCCTCCCACCACCCTC        5x Q Solution, 55°C
SF3B1    CTGCAGTTTGGCYGAATAGTTG          AAAATTCTGTTAGAACCATGAAACA   60°C                 Papaemmanuil et al.
FLT3     CCGCCAGGAACGTGCTTG              GCAGCCTCACATTGCCCC          5x Q Solution, 55°C  Nakao et al.
Table S3. Detailed description of non-synonymous variants with a COSMIC ID or not reported in dbSNP.
sample
ID
Diagnostic
Gene
Genome
coordinates
DNA
change
Protein
change
Qscore
Variant call
ratio [%
(variant/total)]
COSMIC ID
dbSNP ID
Polyphen2
(score, sensitivity, specificity)
MDS16
RA (5qsyndrome)
RUNX1
chr21:36259324
A>AG
L29S
99
31.9 (23/72)
COSM24756
rs111527738
Probably damaging (0.999 0.14,
0.99)
MDS15
RA (5qsyndrome)
SF3B1
chr2:198266834
T>TC
K700E
99
11.4
(170/1496)
COSM84677
NA
Probably damaging (1.000, 0.00,
1.00)
DNMT3A
chr2:25457242
C>CA
R882L
75
7.8 (92/1176)
NA
NA
Probably damaging (0.982, 0.75,
0.96)
ASXL1
chr20:31022449
insG
G646WfsX12
99
44.7 (174/389)
COSM34210
NA
Truncated protein
WT1
chr11:32413565
C>CT
R462Q
99
49.0 (174/355)
COSM21408
NA
Probably damaging (1.000,
0.00,1.00)
TET2
chr4:106193748
C>CT
R1404X
99
45.1 (309/685)
COSM42037
NA
Truncated protein
ASXL1
chr20:31022449
insG
G646WfsX12
99
10.5 (37/351)
COSM34210
NA
Truncated protein
COSM84677
NA
Probably damaging (1.000, 0.00,
1.00)
MDS07
MDS08
MDS08
MDS14
MDS12
MDS12
MDS12
MDS06
MDS11
MDS10
RA (5qsyndrome)
RA (5qsyndrome)
RA (5qsyndrome)
RA (5qsyndrome)
RA (5qsyndrome)
RA (5qsyndrome)
RA (5qsyndrome)
RA (5qsyndrome)
RA (5qsyndrome)
RA (5qsyndrome)
SF3B1
chr2:198266834
T>TC
K700E
99
40.0
(620/1549)
TET2
chr4:106164896
insA
fs (Y1255X)
99
5.3 (41/771)
COSM110747
NA
Truncated protein
TET2
chr4:106197552
C>CT
P1962L
99
50.3 (303/602)
COSM41894
NA
Probably damaging (0.974, 0.76,
0.96)
ASXL1
chr20:31022902
G>GA
W796X
99
35.8 (144/402)
COSM53207
NA
Truncated protein
TP53
chr17:7578413
C>CG
V173L
99
41.1 (109/265)
COSM43559
NA
Probably damaging (0.979, 0.76,
0.96)
MDS29
RA (5q- syndrome)
JAK2
chr9:5073770
G>GT
V617F
99
7
COSM12600
rs77375493
Probably damaging (0.996,
0.55, 0.98)
MDS34
RA (5q- syndrome)
JAK2
chr9:5073770
G>GT
V617F
99
28
COSM12600
rs77375493
Probably damaging (0.996,
0.55, 0.98)
DNMT3A
chr2:25457176
G>GA
P904L
99
44.0 (198/450)
COSM52989
rs149095705
Probably damaging (0.995, 0.68,
0.97)
U2AF1
chr21:44514777
T>TC
Q157R
99
38.3 (242/632)
COSM144989
NA
Probably damaging (0.997, 0.41,
0.98)
CBL
chr11:119149332
C>CT
A447V
99
43.6 (99/227)
NA
NA
Possibly damaging (0.717, 0.86,
0.92)
MDS29
MDS28
MDS30
Del(5q) RA with
additional
cytogenetic
abnormalities
Del(5q) RA with
additional
cytogenetic
abnormalities
Del(5q) RA with
additional
cytogenetic
MDS26
MDS37
MDS42
MDS42
MDS36
MDS33
MDS43
MDS43
MDS39
MDS39
MDS38
MDS38
abnormalities
Del(5q) RA with
additional
cytogenetic
abnormalities
Advanced del(5q)
MDS (RAEB)
Advanced del(5q)
MDS (CMML)
Advanced del(5q)
MDS (CMML)
Advanced del(5q)
MDS (RAEB)
Advanced del(5q)
MDS (RAEB)
Advanced del(5q)
MDS (RAEB)
Advanced del(5q)
MDS (RAEB)
Advanced del(5q)
MDS (RAEB)
Advanced del(5q)
MDS (RAEB)
Advanced del(5q)
MDS (RAEB)
Advanced del(5q)
MDS (RAEB)
TP53
chr17:7577553
A>AG
M243T
99
28.0
(327/1166)
COSM43726
NA
TP53
chr17:7577120
C>CT
R273H
99
82.2 (620/754)
COSM10660
rs28934576
ASXL1
chr20:31023821
G>GT
E1102D
99
45.4 (366/806)
COSM36205
rs139115934
CBL
chr11:119149004
G>GT
W408C
99
96.0 (267/278)
COSM34072
NA
ASXL1
chr20:31024704
G>GA
G1397S
99
49.9
(875/1755)
COSM133033
rs146464648
TET2
chr4:106196850
insCATG
E1728Dfs*13
99
17.0 (121/713)
COSM211745
NA
Truncated protein
NA
NA
Truncated protein
COSM34210
NA
Truncated protein
27.3
(313/1145)
42.8
(470/1097)
Probably damaging (1.000, 0.00,
1.00)
Possibly damaging (0.831, 0.84,
0.93)
Possibly damaging (0.779, 0.85,
0.93)
Probably damaging (0.996, 0.55,
0.98)
Possibly damaging (0.792, 0.85,
0.93)
TET2
chr4:106164880
G>GT
E1250X
99
ASXL1
chr20:31022449
insG
G646WfsX12
99
TP53
chr17:7578190
T>TC
Y220C
99
38.7 (48/124)
COSM99719
rs121912666
Probably damaging (1.000, 0.00,
1.00)
TP53
chr17:7578275
G>GA
Q192X
99
49.3 (99/201)
COSM117949
NA
Truncated protein
COSM6549
rs11540652
COSM11059
NA
TP53
chr17:7577538
C>CA
R248L
99
TP53
chr17:7577568
C>CT
C238Y
99
44.1
(1168/2648)
37.5
(998/2664)
Probably damaging (1.000, 0.00,
1.00)
Probably damaging (1.000, 0.00,
1.00)
Table S4. Detailed description of synonymous variants with a COSMIC ID.
Sample ID | Diagnosis | Gene | Genome coordinates | DNA change | Protein change | Qscore | Variant call ratio [% (variant/total)] | COSMIC ID | dbSNP ID
MDS04 | RA (5q- syndrome) | IDH1 | chr2:209113192 | G>GA | G105G | 99 | 49.2 (445/904) | COSM253316 | rs11554137
MDS13 | RA (5q- syndrome) | IDH1 | chr2:209113192 | G>GA | G105G | 99 | 49.3 (465/943) | COSM253316 | rs11554137
MDS01 | RA (5q- syndrome) | IDH1 | chr2:209113192 | G>GA | G105G | 99 | 49.3 (421/854) | COSM253316 | rs11554137
MDS24 | Del(5q) RA with additional cytogenetic abnormalities | IDH1 | chr2:209113192 | G>GA | G105G | 99 | 49.6 (483/973) | COSM253316 | rs11554137
MDS30 | Del(5q) RA with additional cytogenetic abnormalities | IDH1 | chr2:209113192 | G>GA | G105G | 99 | 49.4 (356/721) | COSM253316 | rs11554137
MDS27 | Del(5q) RA with additional cytogenetic abnormalities | FLT3 | chr13:28608459 | T>TC | L561L | 99 | 53.6 (149/278) | COSM19740 | rs34374211
MDS43 | Advanced del(5q) MDS (RAEB) | FLT3 | chr13:28608459 | T>TC | L561L | 99 | 52.1 (173/332) | COSM19740 | rs34374211
MDS42 | Advanced del(5q) MDS (CMML) | KIT | chr4:55599268 | C>CT | I798I | 99 | 55.1 (162/294) | COSM1307 | rs55789615
MDS26 | Del(5q) RA with additional cytogenetic abnormalities | KIT | chr4:55599268 | C>CT | I798I | 99 | 45.5 (150/330) | COSM1307 | rs55789615
MDS02 | RA (5q- syndrome) | KIT | chr4:55599268 | C>CT | I798I | 99 | 50.0 (166/332) | COSM1307 | rs55789615
MDS05 | RA (5q- syndrome) | PDGFRA | chr4:55152040 | C>CT | V824V | 99 | 55.6 (280/504) | COSM22413 | rs2228230
MDS08 | RA (5q- syndrome) | PDGFRA | chr4:55152040 | C>CT | V824V | 99 | 53.8 (271/504) | COSM22413 | rs2228230
MDS09 | RA (5q- syndrome) | PDGFRA | chr4:55152040 | C>CT | V824V | 99 | 49.0 (251/512) | COSM22413 | rs2228230
MDS11 | RA (5q- syndrome) | PDGFRA | chr4:55152040 | C>CT | V824V | 99 | 48.0 (210/437) | COSM22413 | rs2228230
MDS14 | RA (5q- syndrome) | PDGFRA | chr4:55152040 | C>CT | V824V | 99 | 47.5 (308/648) | COSM22413 | rs2228230
MDS16 | RA (5q- syndrome) | PDGFRA | chr4:55152040 | C>CT | V824V | 99 | 52.2 (251/481) | COSM22413 | rs2228230
MDS01 | RA (5q- syndrome) | PDGFRA | chr4:55152040 | C>CT | V824V | 99 | 53.0 (231/436) | COSM22413 | rs2228230
MDS27 | Del(5q) RA with additional cytogenetic abnormalities | PDGFRA | chr4:55152040 | C>CT | V824V | 99 | 51.4 (360/701) | COSM22413 | rs2228230
MDS35 | Advanced del(5q) MDS (RAEB) | PDGFRA | chr4:55152040 | C>CT | V824V | 99 | 45.6 (312/684) | COSM22413 | rs2228230
MDS42 | Advanced del(5q) MDS (CMML) | PDGFRA | chr4:55152040 | C>CT | V824V | 99 | 50.8 (332/654) | COSM22413 | rs2228230
MDS32 | Advanced del(5q) MDS (RAEB) | TP53 | chr17:7578210 | T>TC | R213R | 99 | 44.9 (137/305) | COSM249885 | rs1800372
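The variant call ratios in Tables S3 and S4 are given as a percentage followed by the supporting read counts (variant/total). The following is a minimal sketch of that arithmetic, assuming the percentage is simply variant reads divided by total reads, rounded to one decimal place. The function name is illustrative and is not part of the study's analysis pipeline.

```python
def variant_call_ratio(variant_reads: int, total_reads: int) -> float:
    """Return the percentage of reads supporting the variant, to one decimal.

    Assumption: ratio = 100 * variant_reads / total_reads, as suggested by
    the "% (variant/total)" column header.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= variant_reads <= total_reads:
        raise ValueError("variant_reads must lie between 0 and total_reads")
    return round(100.0 * variant_reads / total_reads, 1)


# Read counts taken from the IDH1 G105G entry for MDS04 in Table S4.
print(variant_call_ratio(445, 904))
```

A ratio close to 50%, as seen for most entries in Table S4, is what would be expected for a heterozygous variant present in essentially all sequenced cells.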
Table S5. Genomic array results for 33 del(5q) cases analysed, including 18 5q- Syndrome cases. Brackets show several metrics of the detected alterations:
coordinates mapping the alteration (start-end); SNPs within it (start-end); length (bp); SNPs contained (number); copy number. It is noted if any of the 25
genes analysed in this study was encompassed in that region. UPD: uniparental disomy. NA: not available.
Sample ID
MDS02
Diagnosis
RA, 5q- Syndrome
Age/Sex
Karyotype
Deletions
2p23.3 (25119937-26655307; 4992-5005; 1535370; 14; 1.17; 0.67)
DNMT3A
NA/F
NA
5q22.1-q33.2 (110998762-154437028; 20461-21565; 43438266;
1105; 1.38; 0.18)
UPD
Gains
13q14.11-q14.13 (42627520-44793601; 43906-43966; 2166081;
61; 1.99; 0.25)
Whole Chr8 (272252-146052174;
29522-33067; 145779922; 3546;
2.28; 0.34)
6q24.1 (139269078-139910603;
25404-25432; 641525; 29; 2.30; 0.38)
10p14-p13 (11264836-13110988;
35711-35761; 1846152; 51; 2.20;
0.31)
MDS03
RA, 5q- Syndrome
48/M
46,XY,del(5)(q13:q33)
5q14.3-q34 (87824784-167184563;
19911-21920; 79359779; 2010; 1.40;
0.18)
12q15 (67170328-69219954; 42039-42099; 2049626; 61; 2.10; 0.36)
12q24.22-q24.31 (125358706;
43078-43146; 9511465; 69; 2.18;
0.34)
16q22.3-q23.1 (72348989-73935434;
50009-50037; 1586445; 29; 2.25;
0.30)
17q23.2-q23.3 (53845988-58047700;
51071-51112; 4201712; 42; 2.24;
0.33)
4q13.1-q13.2 (65578799-67713359;
14910-14985; 2134560; 76; 2.00;
0.23)
MDS04
MDS05
RA, 5q- Syndrome
RA, 5q- Syndrome
88/F
60/F
46,XX,del(5)(q13:q33)
46,XX,del(5)(q13:q33)
5q14.3-q34 (86862506-166939254;
19896-21913; 80076748; 2018; 1.61;
0.20)
5q14.2-q33.3 (81866327-156969197;
19783-21650; 75102870; 1868; 1.41;
0.34)
4q26-q27 (117680512-121421102;
16225-16298; 3740590; 74; 2.02;
0.27)
4q31.21-q31.23 (146661914-149093249; 16855-16925; 2431335;
71; 1.99; 0.24)
1p31.2-p31.1 (68657746-70841176;
975-1035; 2183430; 61; 1.93; 0.35)
5q11.1-q11.2 (50213917-52264199;
ChrX. ATRX, ZRSR2
MDS06
RA, 5q- Syndrome
68/F
46,XX,del(5)(q14-15:q33)
5q14.3-q33.3 (87875023-156072147;
19912-21634; 68197124; 1723;
1.82; 0.25)
MDS07
RA, 5q- Syndrome
NA/F
46,XX,del(5)(q13:q33)
5q14.3-5q34 (89303345-163980289; 19955-21829; 74676944;
1875; 1.54; 0.26)
MDS08
RA, 5q- Syndrome
NA/M
NA
MDS09
RA, 5q- Syndrome
84/F
46,XX,del(5)(q13:q33)
MDS10
RA, 5q- Syndrome
76/F
46,XX,del(5)(q13:q33)
5q14.3-q33.3 (89917534-158840805;
19974-21705; 68923271; 1732; 1.47;
0.31)
5q14.3-q33.2 (85182021-154953129;
19863-21600; 69771108; 1738; 1.44;
0.21)
5q12.3-q13.1 (65535157-67677808;
19411-19484; 2142651; 74; 1.41;
0.14)
5q14.3-q15 (83343740-95777305;
19825-20080; 12433565; 256; 1.40;
0.17)
5q21.1-q34 (97786027-163782378;
20142-21815; 65996351; 1674; 1.39;
0.18)
RA, 5q- Syndrome
81/F
MDS12
RA, 5q- Syndrome
77/F
46,XX,del(5)(q22:q35)
MDS13
RA, 5q- Syndrome
64/F
46,XX,del(5)(q33:q34)
MDS14
RA, 5q- Syndrome
66/F
46,XX,del(5)(q31:q33)[8]
/46,XX[31]
MDS11
46,XX,del(5)(q13:q33)
5q21.1-q34 (98822612-164720069;
20174-21840; 65897457; 1667; 1.52;
0.22)
5q14.3-q33.1 (86463622-151297473;
19893-21476; 64833851; 1584; 1.48;
0.18)
5q32-q34 (148469763-167102662;
21427-21918; 18632899; 492; 1.46;
0.28)
5q31.3-q33.3 (142271912-156074292; 21232-21637; 13802380;
406; 1.42; 0.23)
18981-19036; 2050282; 56; 2.34;
2.02)
6q14.3-q15 (85411634-88478204;
24018-24102; 3066570; 85; 1.97;
0.28)
4q21.21-21.22 (80195990-82802565;
15257-15352; 2606575; 96; 1.98;
0.43)
13q21.2-q21.31 (58895681-61586211; 44290-44371;
2690530; 82; 1.96; 0.30)
6q13-q14.1 (75035581-77537444;
23758-23818; 2501863; 61; 1.99;
0.25)
4q12-q13.1 (58804522-61855787;
14735-14810; 3051265; 76; 2.07;
0.33)
13q21.31-q21.32 (62298667-65258156; 44390-44460; 2959489;
71; 1.97; 0.26)
2q23.3-q24.1 (152916479-155038402; 7616-7694; 2121923; 79;
2.00; 0.30)
7p15.2-p15.1 (25388631-28776965;
26914-27025; 3388334; 112; 1.96;
0.27)
6q13-q14.1 (74620278-77463618;
23753-23813; 2843340; 61; 2.00;
0.24)
1q31.1 (185202380-187449830;
3205-3265; 2247450; 61; 1.95; 0.30)
3p24.1-p23 (29906836-34218933;
10495-10560; 4312097; 66; 1.96;
0.24)
46,XX,del(5)(q13:q33)[6]
/46,XX[4]
5q14.3-q33.3 (84641203-158921205;
19851-21707; 74280002; 1817; 1.69;
0.20)
74/F
NA
5q21.1-q33.3 (102215241-156099317; 20248-21641; 53884076;
1394; 1.34; 0.23)
RA, 5q- Syndrome
66/F
46,XX,del(5)(q14-15:q33)
5q21.3-q34 (106062626-163950485;
20323-21828; 57887789; 1506; 1.35;
0.31)
4q21.3-22.1 (88367780-90815461;
15505-15585; 2447681; 81; 1.97;
0.43)
10q23.1 (83732605-85765313;
37101-37181; 2032708; 81; 2.22;
0.54)
3q25.1-q25.2 (151639927-154140946; 12710-12780; 2501019;
71; 2.00; 0.27)
RA, 5q- Syndrome
24/F
46,XX,del(5)(q31:q33)
5q31.3-q33.3 (141347991-158590590; NA; 17242599; 491;
1.37; 0.17)
7q31.33 (123603987-125743270;
28927-28987; 2139283; 61; 2.04;
0.33)
RA, 5q- Syndrome
72/F
MDS16
RA, 5q- Syndrome
MDS18
MDS20
MDS15
MDS21
MDS23
MDS24
6q22.33-q23.1 (128419190-130577705; 25114-25209; 2158515;
96; 1.92; 0.23)
4q26-q27 (120601325-122789184; 16270-16340; 2187859;
71; 1.93; 0.29)
RA, 5q- Syndrome
RA
RA
70/M
77/F
72/M
46,XY,del(5)(q13:q33)
46,XX,del(5)(q13:q33),d
el(11)(q22)[3]/46,XX[2]
46,Y,der(X)t(X;12)(p22;q
21),del(5)(q14-15;q33-
5q14.3-5q33.3 (19918-21643;
87898589-156124093; 68225504;
1726; 1.65; 0.30)
5q14.3-q34 (82810660-163854743;
19815-21825; 81044083; 2011; 1.78;
0.20)
11q22.3-q25 (106376702-134173875; 40174-40623; 27797173;
450; 1.77; 0.19) (Only CN loss) CBL
5q15-q33 (91919548-158401872;
20009-21692; 66482324;1684; 1.60;
10q21.2-q21.3 (61596872-64250581;
36741-36796; 2653709; 56; 2.04;
0.33)
1p32.3-33 (48755449-54062185;
462-534; 5306736; 73; 2.09; 0.36)
17q23.2-q24.1 (53562730-61202528;
51069-51137;7639798; 69; 2.02;
0.26)
5q11.2-q12.1 (57355239-62126018;
19149-19330; 4770779; 182; 1.94;
0.39)
Multiple small gains
11q14.3-q21 (91659665-94441966;
39765-39825; 2782301; 61; 1.95;
0.38)
19p12-q12 (21633219-33888946;
53124-53185; 12255727; 62; 1.95;
0.27)
6q24.1-q24.2 (141617059-143973425; 25458-25518; 2356366;
34),der(12),del(12)(p11q
13)[7]/46,XY[3]
0.16)
61; 1.96; 0.21)
6q23.2-q23.3 (135131247-138523284; 25324-25394; 3392037;
71; 1.68; 0.15) (Only CN loss)
12p11.23-p13.31 (9809369-27307682; 40761-41192; 17498313;
432; 1.63; 0.13) (Only CN loss) ETV6
12q21.33-q22 (89389338-94118553;
42583-42677; 4729215; 95; 1.69;
0.19) (Only CN loss)
MDS25
RA
78/F
MDS26
RA
85/F
MDS29
RA
78/F
MDS31
RA
73/F
46,XX,del(5)(q14:q34),
t(1,3)(p33:p14)[21]/46,X
X[4]
5q14.3-q33.2 (86226079-154919227;
19885-21598; 68693148; 1714; 1.73;
0.33)
12q21.2-q21.31 (78226836-81842401; 42349-42425; 3615565;
77; 2.05; 0.44)
46,XX,del(5)(13:q33),+8
5q14.3-q33.3 (86607880-157924393;
19894-21673; 71316513; 1780; 1.60;
0.19)
13q21.1 (55328914-58382524;
44214-44273;3053610;60; 2.02; 0.24)
5q14.3-q33.3 (90357044-158432337;
19981-21696; 68075293; 1716; 1.64;
0.19)
9q21.13 (71990110-74653742;
34293-34368; 2663632; 76; 1.98;
0.25)
5q21.3-q34 (104537088-167772186;
20302-21937; 63235098; 1636; 1.47;
0.16)
8q21.11 (75717988-78220362;
31423-31498; 2502374; 76; 1.95;
0.20)
46,XX,del(5)(q13:q33)[1
8]/46,XX,del(5)(q13:q33)
,-7[1]
46,XX,del(5)(q13:q31)[1
8]/48,XX,del(5)(q13:q31)
,idic(21)(q22),+2mar[2]/4
6,XX[1]
6p21.2-p22.1 (length: 11429636; 131;
2.20; 0.36)
22q13.1-q13.31 (length: 8390013; 88;
2.21; 0.42)
Some small CN changes
Whole Chr8 (228574-143783463;
29518-33065; 143554889; 3548;
2.25; 0.29)
13q13.1-14.11 (31506479-39779549;
43528-43816; 8273070; 289; 1.98;
0.27)
MDS32
MDS35
RAEB
RAEB
52/F
58/M
46,XX,del(5)(q13:q33)
92,XXYY,del(5)(q14:q33
)
5q14.3-33.2 (87217489-153708130;
19903-21559; 66490641; 1657; 1.47;
0.21)
5q21.3-q35.3 (107008082-180607628; 20365-22122; 73599546;
1758; 1.51; 0.21) NPM1
14q12 (24397576-26870017; 45947-46026; 2472441; 80; 2.00; 0.23)
21q11.1-q22.3 (10000969-46844296;
54389-55266; 36843327; 878 whole
Chr; 1.99; 0.29) RUNX1, U2AF1
13q21.33-q22.1 (70401435-73818187; 44605-44740; 3416752;
136; 2.02; 0.26)
16p11.1-q24.3 (34953675-88143266;
49577-50361; 53189591; 785; 1.99;
0.27)
13q31.2-q34 (88103652-113215972;
45121-45851; 25112320; 731; 2.45;
0.38)
MDS36
RAEB
NA/F
46,XX,del(5)(q13:q33),d
el(11)(q23)
5q23.1-q33.2 (116859235-155177249; 20656-21605; 38318014;
950; 1.45; 0.20)
6q23.3 (135385192-138089687;
25328-25388; 2704495; 61;2.02;
0.29)
9q12.1 (56241373-59006184; 30908-30988; 2764811; 81; 2.01; 0.25)
15q13.1-q13.2 (26237007-28085050;
47841-47866; 1848043; 26; 2.12;
0.35)
5q21.1-q35.3 (98243608-180607628;
20163-22122; 82364020; 1960; 1.42;
0.20) NPM1
7q11.22-q36.3 (69377470-158624663; 27729-29517; 89247193;
1789; 1.43; 0.20) EZH2
12p12.1-p13.2 (10219902-22122693;
40783-41038; 11902791; 256; 1.46;
0.22) ETV6
MDS37
RAEB
58/M
4345,XY,del(5)(q31),der(7)
t(7;12)(q22;q1?3),-12,13,19,?del(20)(q1?3)[cp4]
13q14.11q14.2 (40036389-46217079;
43817-44001; 6180690; 185; 1.38;
0.18)
13q14.2-q21.1 (47592092-55466888;
44039-44218; 7874796; 180; 1.43;
0.17)
6p22.3 (22218694-23666140; 22688-22735; 1447446; 48)
5q15 (92105855-95229134; 20011-20072; 3123279; 62; 2.12; 0.44)
9q21.31-q21.32 (80782354-82945497; 34556-34600; 2163143;
45)
15q12-q13.2 (24374497-28800086;
47820-47867; 4425589; 48; 1.46;
0.27)
17p11.2-p13.3(450509-19519465;
50362-50609; 19068956; 248; 1.39;
0.18) TP53
MDS39
RAEB
82/F
46,XX,del(5)(q13:q33),t(
6;12)(q13;p12)[2]/45,XX,
-7,22/46,XX,del(5)(q13:q33
),t(6;12)(q13;p12),+mar[
15]/46,XX[3]
20q11.21-q13.13 (29933631-48271268; 53958-54172; 18337637;
215; 1.37; 0.22) ASXL1
5q14.2-q34 (81511479-161557314;
19775-21753; 80045835; 1979; 1.53;
0.44)
7p22.3-p11.2 (250149-56479844;
26082-27645; 56229695; 1564; 1.58;
0.45)
7q21.3-q36.3 (94919442-158624663;
2p22.2-p22.1 (38319370-41333317;
5245-5334; 3013947; 90; 2.29; 0.70)
Multiple gain of copy number
28305-29517; 63705221; 1213; 1.57;
0.47) EZH2
6q13-q15 (72449224-89814330;
23699-24130; 17365106; 432; 1.99;
0.30) SRSF2
MDS40
RAEB
54/F
46,XX,del(5)(q14:q34)
5q14.3-q34 (87425364-161996101;
19904-21761; 74570737; 1858; 1.47;
0.17)
9p21.3-p22.2 (18466830-21763347;
33640-33730; 3296517; 91; 1.98;
0.32)
9p21.1-p21.2 (26316539-30212869;
33860-34045; 3896330; 186; 2.00;
0.24)
12q24.13-q24.21 (111751790-115292641; 43011-43072; 3540851;
62; 2.02; 0.32)
MDS41
MDS42
MDS43
RAEB
CMML
RAEB
56/M
45/M
79/F
46,
XY,del(5)(q14:q34)[2];47
,XY,del(5)(q14:q34),+21[
20]
46,XY,del(5)(q13:q33),d
el(13)(q12:q22)
46,XX,del(5)(q15:q33)
5q22.3-q34 (20585-21899;
115079389-166312668; 51233279;
1315; 1.57; 0.39)
5q14.3 (86007785-88632975; 19881-19936; 2625190; 56; 1.88; 0.40)
5q14.3-q33.3 (85143956-159665477;
19861-21719; 74521521; 1859; 1.45;
0.24)
8q21.11 (75982355-78287079;
31427-31506; 2304724; 80; 2.06;
0.32)
13q13.2-q21.31 (33495418-61943642; 43601-44385; 28448224;
785; 1.44; 0.21)
5q21.1-q33.2 (101389190-154492074; 20226-21568; 53102884;
1343; 1.50; 0.21)
11q22.1-q25 (97063972-134173875;
39904-40623; 37109903; 720; 2.00;
0.31) CBL
4q13.1 (60577199-62928065; 14761-14841; 2350866; 81; 1.89; 0.24)
Multiple gains of copy number
4q21.21 (79349747-81034492;
15226-15290; 1684745; 65; 2.28;
0.44)
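The bracketed entries in Table S5 follow the field order given in its caption: genomic coordinates (start-end); SNP indices (start-end); length (bp); number of SNPs; copy number. Most entries carry a sixth value that the caption does not describe. As a hedged sketch, such an entry could be parsed and checked for internal consistency as below. The names `Alteration`, `parse_alteration` and `is_consistent` are hypothetical, and the consistency rules (length = end - start; SNP count = last index - first index + 1) are assumptions inferred from the tabulated values, not a description of the array software.

```python
import re
from typing import NamedTuple, Optional


class Alteration(NamedTuple):
    band: str               # cytoband label, e.g. "5q14.3-q34"
    start: int              # genomic start coordinate
    end: int                # genomic end coordinate
    snp_start: int          # index of first SNP in the region
    snp_end: int            # index of last SNP in the region
    length: int             # length in bp
    n_snps: int             # number of SNPs contained
    copy_number: float      # estimated copy number
    extra: Optional[float]  # sixth value, not described in the caption


# Cytoband label followed by a parenthesised, semicolon-separated field list.
ENTRY = re.compile(r"^\s*(?P<band>[^\s(]+)\s*\((?P<body>[^)]*)\)")


def parse_alteration(entry: str) -> Alteration:
    """Parse one 'band (start-end; snp-snp; length; n; cn[; x])' entry."""
    match = ENTRY.match(entry)
    if match is None:
        raise ValueError(f"unrecognised entry: {entry!r}")
    fields = [f.strip() for f in match.group("body").split(";")]
    if len(fields) not in (5, 6):
        raise ValueError(f"expected 5 or 6 fields, got {len(fields)}")
    start, end = (int(x) for x in fields[0].split("-"))
    snp_start, snp_end = (int(x) for x in fields[1].split("-"))
    extra = float(fields[5]) if len(fields) == 6 else None
    return Alteration(match.group("band"), start, end, snp_start, snp_end,
                      int(fields[2]), int(fields[3]), float(fields[4]), extra)


def is_consistent(alt: Alteration) -> bool:
    """Check the assumed relations between coordinates, length and SNP count."""
    return (alt.length == alt.end - alt.start
            and alt.n_snps == alt.snp_end - alt.snp_start + 1)


# Deletion reported for MDS03 in Table S5.
mds03 = parse_alteration(
    "5q14.3-q34 (87824784-167184563; 19911-21920; 79359779; 2010; 1.40; 0.18)"
)
print(mds03.band, mds03.copy_number, is_consistent(mds03))
```

Entries with missing fields, or with coordinates that are not separated by a hyphen, raise a `ValueError` rather than being silently guessed.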
Table S6. List of genes affected by cytogenetic loss.
5q- Syndrome
(n=18)
EZH2
NPM1
TP53
ETV6
ASXL1
CBL
DNMT3A
RA del(5q) with
additional
karyotypic
abnormalities
(n=6)
1
1
1
Advanced del(5q)
cases
(n=9)
2
2
1
1
1
Figure S1. Number of clusters generated per amplicon in the panel during the MDS del(5q) cohort MiSeq sequencing run. A total of 96% (308/322) of
all amplicons generated at least 100 clusters during sequencing (average 5,362 clusters/amplicon).
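The coverage summary in Figure S1 (308 of 322 amplicons with at least 100 clusters, reported as 96%) can be reproduced from per-amplicon cluster counts with a few lines of code. The sketch below is illustrative only: the function name and the toy input are assumptions, since the per-amplicon counts themselves are not listed in this supplement.

```python
from typing import Iterable, Tuple


def amplicons_passing(cluster_counts: Iterable[int],
                      min_clusters: int = 100) -> Tuple[int, int, float]:
    """Return (passing, total, percent) for amplicons with >= min_clusters."""
    counts = list(cluster_counts)
    if not counts:
        raise ValueError("no amplicons supplied")
    passing = sum(1 for c in counts if c >= min_clusters)
    return passing, len(counts), 100.0 * passing / len(counts)


# Toy input shaped to match the reported totals: 308 amplicons at or above
# the threshold and 14 below it (the individual values are made up).
toy_counts = [100] * 308 + [99] * 14
passing, total, percent = amplicons_passing(toy_counts)
print(f"{passing}/{total} amplicons ({percent:.0f}%)")
```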
Figure S2. Comparison of read alignments covering the 19bp TP53 deletion in sample TEST009. The initial read alignment and variant calling
(BaseSpace, A) failed to align any reads containing deletions to the reference genome, resulting in a much lower read depth across this locus (~30x). By
comparison, re-analysis of the same data using the Stampy and Platypus pipeline (B) resulted in a greater number of aligned reads, giving a higher read
depth (>700x) and successfully identified the deletion.
Figure S3. Comparison of the TET2 C1464X mutation in sample TEST001 by Sanger and next-generation sequencing. The C1464X variant was
detected and called in the MiSeq data (top) at a frequency of 47% (200/423 reads). The variant can be seen in the Sanger sequencing trace (bottom), but
was not identified by the Mutation Surveyor software due to the relatively high background noise in the data.
Figure S4. Validation of new mutations found by MiSeq in the validation cohort in addition to TET2 C1464X. The remaining new mutations were
confirmed by Sanger sequencing (A-D) or fragment analysis (E).
Figure S5. Supervised clustering using methylation data from 11 5q- syndrome cases. All pictures have been cropped to show the hierarchical
clustering at the top. (A) Clustering using 422 differentially methylated genes between 2 ASXL1-mut cases and 7 cases with no epigenetic gene mutations.
(B) Clustering using 144 differentially methylated genes between 1 DNMT3A-mut case and 7 cases with no epigenetic gene mutations. (C) Clustering using
156 differentially methylated genes between 1 ASXL1 & TET2-mut case and 7 cases with no epigenetic gene mutations. (D) Clustering using 205
differentially methylated genes between 3 ASXL1-mut cases and 7 cases with no epigenetic gene mutations.