View PDF - Pathology Informatics Summit

Post-Analytic Clinical Informatics for
Molecular and Genomic Medicine
Federico A. Monzon M.D.
Pathology Informatics Summit 2014
Disclosures
• Employment
− Cancer Genetics Laboratory, Baylor College of Medicine
o Up to Aug 2013
− Invitae Corporation
o Since Sept 2013
• Advisory Board
− Complete Genomics (April 2012 – June 2013)
Agenda
• What goes into interpretation of NGS data?
• Information challenges for interpretation of NGS results
• Challenges and opportunities for reporting and delivery of NGS results
Pathologists
https://www.invitae.com/en/news/2013/07/03/invitae-blog-jill-hagenkord-are-physicians-ready/
You can ride the Genetics/Genomics wave!!!
And Informatics is your surf board!
http://photofunnypicture.com/hd-nature-desktop-wallpapers/surfing-group-wallpaper-hd/
BCM Mercury pipeline for Cancer Exomes
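[Figure: pipeline diagram ending in a variant call format (.vcf) file. Credit: Jeffrey Reid]
Review and Interpretation Pipeline
[Figure: pipeline diagram with the stages "QC, Interpret and Rank", "Confirmation" (Sanger seq, AmpliSeq, CAST PCR) and "Report". Credit: Jeffrey Reid]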
Tracking test/analysis handoffs
• Excel, email, etc…
• Temporary solution for low volume but is NOT
scalable
Paul Lurix
Bird's Eye View of NGS Data Workflow
Reece Hart
Interpretation
• Determine the significance of the mutation in the context of:
− Quality of sequencing data
− Frequency of variant call
− Patient’s phenotype
− Family history
− Mutated gene(s) (assoc with disease/phenotype)
− Type of mutation (silent vs nonsense/missense/splicing)
− Mutation previously reported in disease/tumor
− Location of mutation in the protein
• Automated vs manual review/curation approaches
• Apply interpretation guidelines (ACMG/CAP/AMP)
Variant Call Format (VCF) File
• Stores variation information for one or more samples based on genomic location
• Important! Version of the genome reference is essential. Currently GRCh37 or hg19 (UCSC) for clinical use, but GRCh38/hg38 is now available
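As a rough illustration of why the reference build matters, the sketch below reads a tiny VCF held in a string, pulls out the `##reference` header line, and parses each record into a dictionary. The file content, the `parse_vcf` helper, and the variant shown are all made up for illustration; a production pipeline would normally use a dedicated VCF library and validate far more than this.

```python
import io

# Illustrative VCF text; header values and the single record are invented.
EXAMPLE_VCF = """##fileformat=VCFv4.1
##reference=GRCh37
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO
1\t12345\t.\tA\tT\t50\tPASS\tDP=120
"""

def parse_vcf(handle):
    """Return (meta, records) from a VCF-formatted text handle (hypothetical helper)."""
    meta, columns, records = {}, [], []
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("##"):
            # Meta-information line, e.g. ##reference=GRCh37
            key, _, value = line[2:].partition("=")
            meta[key] = value
        elif line.startswith("#"):
            # Column header line
            columns = line[1:].split("\t")
        else:
            # Data line: one variant record
            records.append(dict(zip(columns, line.split("\t"))))
    return meta, records

meta, records = parse_vcf(io.StringIO(EXAMPLE_VCF))
if "reference" not in meta:
    # Coordinates are meaningless without knowing the genome build.
    raise ValueError("VCF does not declare a reference build")
print(meta["reference"], records[0]["CHROM"], records[0]["POS"])
```

Refusing to continue when the build is missing is a design choice in this sketch: the same position can point to different bases in GRCh37 and GRCh38.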
External and Internal Databases
6/13/2014 | Copyright © InVitae, Inc. All Rights Reserved | CONFIDENTIAL
Annotation databases
(examples)
• dbSNP: reported yes/no, clinical significance –
broad / less specific
• HGMD: focused on association with inherited
disease
• COSMIC: focused on association with
cancer (mostly somatic)
• BIC: focused on breast cancer genes
• My Cancer Genome: focused on therapeutic
significance
All of these have caveats!!!
Annotated VCF
• Variant frequency and quality, prediction algorithms
• Frequency and association with disease: dbSNP, HGMD, OMIM, etc.
• Other external annotations: COSMIC, ClinVar, etc.
• Internal annotations: Seen before? What did we call it?
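The layering of external and internal annotations onto a variant call could be sketched as follows. This is a minimal illustration, not how any particular laboratory does it: the `EXTERNAL` and `INTERNAL_CALLS` dictionaries are hypothetical in-memory stand-ins for real databases, the `annotate` helper is invented, and every entry is a placeholder.

```python
# Hypothetical in-memory stand-ins for external and internal sources.
# Keys are (chrom, pos, ref, alt); all entries are invented placeholders.
EXTERNAL = {
    "dbSNP": {("1", 12345, "A", "T"): "placeholder rs identifier"},
    "COSMIC": {},
}
INTERNAL_CALLS = {
    ("1", 12345, "A", "T"): "previously called likely benign (placeholder)",
}

def annotate(variant):
    """Return a copy of the variant with external and internal annotations."""
    key = (variant["chrom"], variant["pos"], variant["ref"], variant["alt"])
    annotated = dict(variant)
    # External annotations: keep only the sources that know this variant.
    annotated["external"] = {
        name: table[key] for name, table in EXTERNAL.items() if key in table
    }
    # Internal annotations: seen before? What did we call it?
    annotated["seen_before"] = key in INTERNAL_CALLS
    annotated["previous_call"] = INTERNAL_CALLS.get(key)
    return annotated

known = annotate({"chrom": "1", "pos": 12345, "ref": "A", "alt": "T"})
novel = annotate({"chrom": "2", "pos": 999, "ref": "G", "alt": "C"})
print(known["seen_before"], novel["seen_before"])
```

Keying on chromosome, position, reference allele and alternate allele only works if every source uses the same genome build and the same variant normalization, which is one of the caveats of these databases.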
How to present the
information for
interpretation?
Cancer Exome Dashboard at BCM
Variant Review and Reporting @ Invitae
Alamut
Integrative Genomics Viewer (IGV)
http://www.broadinstitute.org/igv/
Commercial software available
Things to consider
• Normal and Tumor for cancer applications
• Quality Metrics are important
• Calling mutation
− Which transcript to use?
Mapping & Alignment Issues in a Nutshell
➊ Transcript ≠ Reference (gap)
➋ InDel – downstream coordinates shifted
➌ Exon coordinate discrepancies across sources (NCBI, UCSC)
➍ Historical transcripts no longer available
Reece Hart
Example: rs11340767 (RND3)
Source   AC           Reference    Exons
EUtils   NM_005168.3  GRCh37.p10   1146 / 125 / 320 / 1998
         NM_005168.4  NG_008492.1  1398 / 125 / 320 / 1998
seqgene  NM_005168.3  GRCh37.p10   102 / 1046 / 125 / 321 / 143 / 1855
UCSC     NM_005168.4  hg19         1398 / 135 / 244 / 76 / 1997
Reece Hart
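To make the discrepancy in the rs11340767 (RND3) example concrete, the sketch below compares the four exon structures from the table. It assumes the slash-separated numbers are exon lengths, which the slide does not state explicitly; the dictionary labels and the `summarize` helper are invented for this illustration.

```python
# Values copied from the rs11340767 (RND3) example table.
# Assumption: each number is the length of one exon.
EXON_LENGTHS = {
    "NM_005168.3 on GRCh37.p10 (EUtils)": [1146, 125, 320, 1998],
    "NM_005168.4 on NG_008492.1": [1398, 125, 320, 1998],
    "NM_005168.3 on GRCh37.p10 (seqgene)": [102, 1046, 125, 321, 143, 1855],
    "NM_005168.4 on hg19 (UCSC)": [1398, 135, 244, 76, 1997],
}

def summarize(structures):
    """Map each label to (number of exons, summed exon length)."""
    return {label: (len(exons), sum(exons)) for label, exons in structures.items()}

summary = summarize(EXON_LENGTHS)
distinct = {tuple(exons) for exons in EXON_LENGTHS.values()}

for label, (count, total) in summary.items():
    print(f"{label}: {count} exons, {total} bases")
print(f"{len(distinct)} distinct exon structures from {len(EXON_LENGTHS)} rows")
```

No two rows agree, even where the accession and reference are nominally the same, which is why the transcript version and alignment source need to be recorded with every variant call.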
Example: PECAM1 Coordinate Discrepancy
UCSC and NCBI coordinates for exon 1 of PECAM1 differ by 2.5kb
Ensembl says the whole gene is in a patch 16kb away.
Reece Hart
Shared tools to deal with some of these issues
Universal Transcript Archive


An archive of all (recent) versions of transcripts, from
multiple sources, and multiple alignment methods.
http://bitbucket.org/invitae/uta/
HGVS Parser, Mapper, Validator, Formatter


Python tools for manipulating HGVS, including mapping
between transcripts and reference, inferring protein
consequence, and lifting over between transcripts.
http://bitbucket.org/invitae/hgvs/
Reece Hart
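For a feel of what parsing an HGVS name involves, here is a minimal standard-library sketch that handles only simple single-base substitutions. It is not the API of the hgvs package listed above, which covers far more of the nomenclature; the `SimpleVariant` type, the `parse_simple_substitution` function, and the example variant names are all invented for illustration.

```python
import re
from typing import NamedTuple

class SimpleVariant(NamedTuple):
    accession: str
    version: int   # transcript version; .3 and .4 may align differently
    kind: str      # "c" (coding), "g" (genomic) or "n" (non-coding)
    position: int
    ref: str
    alt: str

# Deliberately narrow: accession.version:kind.positionREF>ALT
_PATTERN = re.compile(
    r"^(?P<ac>[A-Z]{2}_\d+)\.(?P<ver>\d+):(?P<kind>[cgn])\."
    r"(?P<pos>\d+)(?P<ref>[ACGT])>(?P<alt>[ACGT])$"
)

def parse_simple_substitution(name):
    """Parse names such as 'NM_005168.4:c.100A>G' (a made-up variant)."""
    m = _PATTERN.match(name)
    if m is None:
        raise ValueError(f"unsupported or malformed HGVS name: {name!r}")
    return SimpleVariant(
        m["ac"], int(m["ver"]), m["kind"], int(m["pos"]), m["ref"], m["alt"]
    )

v3 = parse_simple_substitution("NM_005168.3:c.100A>G")
v4 = parse_simple_substitution("NM_005168.4:c.100A>G")
# Same accession and position, but different transcript versions:
# these must not be treated as the same variant without remapping.
print(v3 == v4, v3.accession == v4.accession)
```

Keeping the version as its own field makes it harder to compare variants named against different transcript versions by accident.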
Interpretation Issues
• Limited evidence on the clinical utility of a specific mutation or gene
− Genes with mild/moderate penetrance: magnitude of risk? What to do?
− Few mutations are listed in consensus management guidelines (NCCN, ASCO, etc.)
− Few institutional and national efforts to gather and curate evidence (e.g. www.mycancergenome.org)
• Clinical significance of well-studied mutations in a different tumor type
− BRAF V600E mutation – key in metastatic melanoma. Significance in breast cancer?
− Requires pathologists to research the evidence of clinical utility in order to issue a clinically relevant interpretation.
• Clinical significance for novel mutations in targetable genes
− Unreported mutation in EGFR
− Do they confer susceptibility to Gefitinib/Erlotinib?
Limitations of Current LIS/EMR structure
Assay Information
• LISs are not ready to deal with genomic information.
− Single analyte vs. multiparametric assays
− Performance characteristics of the assay.
• For example, for a specific reported mutation one should store: sequence information, sequencing depth and quality, location, genome build, gene transcript evaluated, technology used, etc.
• For a negative result, one should store: sequencing depth/quality, regions interrogated (or not)
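The items listed above for positive and negative results can be sketched as structured records rather than free text. This is only one possible layout: the class names, field names, and example values are assumptions made for illustration and do not correspond to any LIS schema or interoperability standard.

```python
import json
from dataclasses import dataclass, asdict, field

@dataclass
class ReportedVariant:
    """What to keep alongside a reported mutation (hypothetical field names)."""
    gene: str
    transcript: str     # gene transcript evaluated, with version
    hgvs_name: str      # sequence-level description of the change
    genome_build: str
    chrom: str
    position: int
    ref: str
    alt: str
    depth: int          # sequencing depth at the variant position
    quality: float
    technology: str

@dataclass
class NegativeResult:
    """What to keep when no reportable variant is found."""
    genome_build: str
    technology: str
    min_depth: int
    regions_interrogated: list = field(default_factory=list)
    regions_not_covered: list = field(default_factory=list)

# Placeholder values only; none of these describe a real patient or assay.
variant = ReportedVariant(
    gene="GENE1", transcript="NM_000000.0", hgvs_name="NM_000000.0:c.100A>G",
    genome_build="GRCh37", chrom="1", position=12345, ref="A", alt="G",
    depth=250, quality=60.0, technology="targeted NGS panel",
)
negative = NegativeResult(
    genome_build="GRCh37", technology="targeted NGS panel", min_depth=50,
    regions_interrogated=["GENE1 exons 1-5"], regions_not_covered=["GENE1 exon 2"],
)

# Serialize to JSON so the metadata can travel with the result.
payload = json.dumps({"variant": asdict(variant), "negative": asdict(negative)})
print(payload)
```

Recording the regions that were not covered matters most for the negative case: a "no mutation detected" result means little without knowing what was actually interrogated.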
Limitations of Current LIS/EMR structure
Reporting
• Our information communication standards do not support data formatting and metadata (data associated with the result), so we need to “dumb down” the result into text files and/or tables in order to report it.
• This limits the ability to convey information in a graphical or interactive manner and to provide access to additional information about the assay or the clinical relevance of the result.
• As it currently stands, text reporting of NGS-based assay results is suboptimal.
Report Example – Leukemia Panel
What did you find?
What does this mean for my patient?
What can I do next?
What is the evidence?
NGS Reporting Wish List
• Ability to deliver report metadata
− What was covered and how well?
o Was gene XYZ adequately covered?
− What was missed?
o Assay design information, historical performance, individual run performance
− What is the evidence for the interpretation?
o What are the sources? How many patients studied? Similar patients to mine? What were the outcome measures?
− Is there new information about a reported change?
o Access to “just in time” information
• HIS that can handle molecular and genomic data!
Is my gene of interest covered?
How well?
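The question "is my gene of interest covered, and how well?" can be answered from per-base read depths. The sketch below is a minimal illustration: the `coverage_summary` helper, the depth values, and the 20x threshold are all assumptions, and the adequate threshold for a given assay comes from its own validation.

```python
def coverage_summary(depths, min_depth=20):
    """Summarize coverage over a gene's targeted bases.

    depths: per-base read depths (one integer per targeted base).
    min_depth: depth regarded as adequate; 20 is an arbitrary placeholder.
    """
    if not depths:
        raise ValueError("no targeted bases supplied")
    covered = sum(1 for d in depths if d >= min_depth)
    return {
        "bases": len(depths),
        "mean_depth": sum(depths) / len(depths),
        "fraction_at_min_depth": covered / len(depths),
    }

# Invented depths for five targeted bases of a hypothetical gene.
summary = coverage_summary([50, 40, 30, 10, 0], min_depth=20)
print(summary)
```

Reporting the fraction of bases at or above the threshold alongside the mean depth avoids the case where a high average hides a stretch with no reads at all.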
Approaches with the current limitations of LIS/EMR structure
• Summarize key aspects of the results and provide high-level background and “on request” access to more detailed information
− Short concise report with key findings (as example before)
− Feedback from oncologists indicates this is preferred
• Report all available and potentially useful information
− Long report with extensive graphical and bibliographic information
• Do something new
The eReport
• BCM iPad Exome Reporting app
Integration to Pathology Report
• In Diagnostic section, Comment or Addendum
− Does it change tumor classification?
− Does it impact clinical management?
• Currently at Texas Children’s Hospital the Cancer Exome Report is a stand alone report in the EMR
− Pathologists may or may not issue a comment/addendum
• How to best fit this information into pathology reports?
• AP-LISs will need modifications to accommodate reports from NGS data
Raw Data Release
• Controversial point
• Patient owns their genome?
• Can physicians have access to all raw
data regardless of their capacity to
analyze and interpret?
• Clash of models:
– Patient only has access to interpreted
results through physician (physician knows
best)
– Patient owns their data and can decide on
how best to use it (patient knows best)
• Ethical, social and legal implications
In Closing
• Challenges in bioinformatics pipeline for consistent sequence re-alignments and cross-check of T/N sequences
• Challenges in the access of relevant information for gene/mutation
• Challenges in the amount of clinical evidence for interpretation
• Challenges in our understanding of biology to make sense of all observed variation.
• Commercial and open source tools are
− Most have been developed for research use
− Increasing number of tools suitable for clinical use
• These are not new challenges; we encounter these issues in all “genomic” technologies (panels, arrays, etc.)
• These are actually opportunities to improve our management of genomic data
Conclusions
• Interpretation and reporting of NGS data requires the evaluation and availability of different sources of information
− This information needs to be available to the clinical care team with informatics tools that conform to the clinical environment
− Need to develop tools that provide the geneticist/pathologist with the information needed to provide adequate interpretation of genetic variants
• Web enabled reporting technologies are a potential solution to enable graphical and interactive display of NGS results
− Implementation of these solutions in a HIPAA compliant manner
− Interactive reports can present the information in different ways to members of the healthcare team and patient
− Links to internal and external information sources that allow members of the healthcare team to further explore the results and the evidence used to guide the interpretation
• As adoption of clinical sequencing kicks into high gear, molecular diagnosticians will be faced with managing genomic information and producing high-content reports that support clinical decision making
Acknowledgements
• Senior Leadership
– Arthur Beaudet, Richard Gibbs, Jim Lupski, Sharon Plon, David
Wheeler, Kent Osborne, Martha Mims, Jim Versalovic, Tom
Wheeler
• BASIC3 Leadership
– Sharon Plon, Will Parsons, Amy McGuire...
• CGL, TCH Pathology
– Marilyn Li, Liu Liu, Angshumoy Roy, Lola Lopez-Terrada…
• Whole Genome Laboratory / HGSC / MGL
– Yaping Yang, Donna Muzny, Christine Eng, Jeff Reid, Matthew
Bainbridge, Peter Pham, Doreen Ng….
WHOLE EXOME SEQUENCING
In Vitae Team
CROP Team: Jill Hagenkord, Scott Topper, Jon Sorenson, Tim Chiu, Steven
Ciraolo, Geoff Nilsen, David Pirkle, and Emily Hare
Questions?
invitae.com
415.374.7782
Federico A. Monzon M.D., FCAP