Sequencing in Microbiology and Infectious Disease: What’s Available Today?
Michael G Smith, PhD
Sr. Sequencing Specialist
[email protected]
COMPANY CONFIDENTIAL – INTERNAL USE ONLY
© 2014 Illumina, Inc. All rights reserved.
Illumina, 24sure, BaseSpace, BeadArray, BlueFish, BlueFuse, BlueGnome, cBot, CSPro, CytoChip, DesignStudio, Epicentre, GAIIx, Genetic Energy, Genome Analyzer, GenomeStudio, GoldenGate,
HiScan, HiSeq, HiSeq X, Infinium, iScan, iSelect, ForenSeq, MiSeq, MiSeqDx, MiSeqFGx, NeoPrep, Nextera, NextBio, NextSeq, Powered by Illumina, SeqMonitor, SureMDA, TruGenome, TruSeq,
TruSight, Understand Your Genome, UYG, VeraCode, verifi, VeriSeq, the pumpkin orange color, and the streaming bases design are trademarks of Illumina, Inc. and/or its affiliate(s) in the U.S. and/or
other countries. All other names, logos, and other trademarks are the property of their respective owners.
Increasing System Price & Output, Decreasing Price Per Gb
Instrument | Output | Reads | Read length
MiSeq | 15Gb | 25M | 2x300
NextSeq | 120Gb | 400M | 2x150
HiSeq 2500 | 1000Gb | 4B | 2x125
HiSeq X Ten | 1800Gb | 6B | 2x150
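The output figures on this slide are consistent with reads x read length x 2 for paired-end runs. The sketch below reproduces that arithmetic, assuming the listed read counts are clusters (read pairs) and every read is full length. The 5 Mb genome and 100x depth in the usage line are illustrative assumptions, not values from the slide.

```python
# Specs as listed on the slide: (reads per run, cycles per read of a paired-end run).
SPECS = {
    "MiSeq": (25e6, 300),
    "NextSeq": (400e6, 150),
    "HiSeq 2500": (4e9, 125),
    "HiSeq X Ten": (6e9, 150),
}

def output_gb(reads, read_len, paired=True):
    """Yield in gigabases: reads x read length, doubled for paired-end runs."""
    return reads * read_len * (2 if paired else 1) / 1e9

def genomes_per_run(total_gb, genome_mb, depth):
    """Whole genomes of genome_mb megabases that fit in a run at a given mean depth."""
    return int(total_gb * 1e9 // (genome_mb * 1e6 * depth))

for name, (reads, length) in SPECS.items():
    print(f"{name}: {output_gb(reads, length):.0f} Gb")

# Hypothetical planning example: 5 Mb bacterial genomes at 100x on one MiSeq run.
print(genomes_per_run(output_gb(25e6, 300), genome_mb=5, depth=100))
```

In practice usable yield is lower than this upper bound because of quality filtering, adapter trimming and uneven pooling.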
Throughput to Match Microbiology Applications (ordered from high to low throughput)
Shotgun metagenomics
– Microbial diversity
– Gene content and discovery
rRNA Metagenomics
– Relative abundance of microbial diversity
– 16S for bacteria and archaea
– 18S for eukaryotes
Microbial genomics
– Detection
– Identification
– Antibiotic sensitivity testing
– Molecular epidemiology
MiSeq Reporter
Workflow options
De Novo Assembly
Resequencing
PCR Amplicon
Library QC
Amplicon
Metagenomics
Small RNA
Targeted RNA
Illumina Informatics
Complete NGS Analysis Solutions
Workflow • Storage • Analysis • Sharing
INTEGRATED SAMPLE MANAGEMENT
CORE APPS: All major biological applications
3RD PARTY APPS: A broad and growing ecosystem
EASY SHARING: BaseSpace is the place where ideas foster
ONE CLICK DELIVERY: No FTP site, no hard drive to ship
Next Generation Sequencing (NGS) & Infectious Disease
High-resolution genomic information enables a range of research:
– Discovery
– Characterization
– Rearrangements & Evolution
– Host-Pathogen Interaction
Applications
WGS (fragment libraries with single- or paired-end reads, and mate-pair (MP) libraries)
– De novo
– Resequencing
RNA-Seq
– Viruses and Meta-transcriptomics
Targeted
– Amplicon
Metagenomics (Mixed sample or CIDT)
Whole Genome Sequencing
De novo Assembly and Resequencing
WGS Sequencing: Viral detection on the MiSeq from both cultured and non-cultured specimens
2 cases of HAV, appeared unrelated
HAV detected by RT-PCR
Only one sample showed infectivity in Frp/3 cells, but total RNA extracted directly from the berry samples was confirmed positive for HAV by NGS.
Full genome sequencing of subgenotype IA HAVs from frozen berries linked to the European outbreak associated with the consumption of mixed berries
Chiapponi et al., Food Environ Virol 2014
Hepatitis A Sequencing Workflow
Sample Prep
• RNA extracted from inoculated FRhK-4 derived cells and directly from the berries sample
Library Prep
• Genome amplification using overlapping primers followed by Nextera XT library prep
Sequencing
• Sequencing on MiSeq
• Sequencing kit: 500 cycle kit, 2 x 250 bp paired-end sequencing.
Analysis
• De novo assembly with SeqMan NGen (DNASTAR)
• Alignment with ClustalW
• Cluster analysis with MEGA5
Bold: Analysis tool on MSR and BaseSpace
Chiapponi et al., Food Environ Virol 2014
Re-Sequencing
Alignment and Variant Calling
[Figure: reads aligned to a reference, with coverage labels 2X and 3X; an SNV recurs across reads, random errors appear in single reads]
reads:
GCTATGCATTGGCATGGCATGCTAGCTACGGGATGCTGATCGATTTCGAAACTGACT
 CTATGCATTGGCATGGCATGCTAGCTACAGGATGCTGATCGATTTCGAAACTGACTG
  TATGCATTGGCATGGCATGCTAGCTACGGGATGCTGATCGATTTCGAAACTGACTGT
   ATGCATTGGCATGTCATGCTAGCTACAGGATGCTGATCGATTTCGAAACTGACTGTT
    TGCATTGGCATGGCATGCTAGCTACGGGATGCTGATCGATTTCGAAACTGACTGTTA
     GCATTGGCATGGCATGCTAGCTACGGGATGCTGATCGATTTCGAAACTGACTGTTAG
      CATTGGCATGGCATGCTAGCTACAGGATGCTGATCGATTTCGAAACTGACTGTTAG
       ATTGGCATGGCATGCTAGCTACGGGATGCTGATCGATATCGAAACTGACTGTTAG
        TTGGCATGGCATGCTAGCTACAGGATGCTGATCGATTTCGAAACTGACTGTTAG
         TGGCATGGCATGCTAGCTACAGGATGCTGATCGATTTCGAAACTGACTGTTAG
reference:
GCTATGCATTGGCATGGCATGCTAGCTACAGGATGCTGATCGATTTCGAAACTGACTGTTAG
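The two steps on this slide, alignment and variant calling, can be sketched with the reads and reference shown above. This is a toy illustration only: it places each read by ungapped mismatch counting and calls an SNV when an alternate base passes two thresholds. The threshold values and function names are assumptions for the example, not what BWA, MiSeq Reporter or any production caller does.

```python
from collections import Counter

REFERENCE = "GCTATGCATTGGCATGGCATGCTAGCTACAGGATGCTGATCGATTTCGAAACTGACTGTTAG"
READS = [
    "GCTATGCATTGGCATGGCATGCTAGCTACGGGATGCTGATCGATTTCGAAACTGACT",
    "CTATGCATTGGCATGGCATGCTAGCTACAGGATGCTGATCGATTTCGAAACTGACTG",
    "TATGCATTGGCATGGCATGCTAGCTACGGGATGCTGATCGATTTCGAAACTGACTGT",
    "ATGCATTGGCATGTCATGCTAGCTACAGGATGCTGATCGATTTCGAAACTGACTGTT",
    "TGCATTGGCATGGCATGCTAGCTACGGGATGCTGATCGATTTCGAAACTGACTGTTA",
    "GCATTGGCATGGCATGCTAGCTACGGGATGCTGATCGATTTCGAAACTGACTGTTAG",
    "CATTGGCATGGCATGCTAGCTACAGGATGCTGATCGATTTCGAAACTGACTGTTAG",
    "ATTGGCATGGCATGCTAGCTACGGGATGCTGATCGATATCGAAACTGACTGTTAG",
    "TTGGCATGGCATGCTAGCTACAGGATGCTGATCGATTTCGAAACTGACTGTTAG",
    "TGGCATGGCATGCTAGCTACAGGATGCTGATCGATTTCGAAACTGACTGTTAG",
]

def best_offset(read, ref):
    """Ungapped alignment: the reference offset with the fewest mismatches."""
    def mismatches(off):
        return sum(a != b for a, b in zip(read, ref[off:off + len(read)]))
    return min(range(len(ref) - len(read) + 1), key=mismatches)

def call_snvs(reads, ref, min_alt_reads=2, min_alt_fraction=0.3):
    """Pile up aligned bases per position and report recurrent alternate bases."""
    pileup = [Counter() for _ in ref]
    for read in reads:
        off = best_offset(read, ref)
        for i, base in enumerate(read):
            pileup[off + i][base] += 1
    calls = []
    for pos, counts in enumerate(pileup):
        depth = sum(counts.values())
        for base, n in counts.items():
            if base != ref[pos] and n >= min_alt_reads and n / depth >= min_alt_fraction:
                # (1-based position, reference base, alternate base, alt reads, depth)
                calls.append((pos + 1, ref[pos], base, n, depth))
    return calls

print(call_snvs(READS, REFERENCE))
```

The two single-read mismatches are filtered out as random errors because they fail the `min_alt_reads` threshold, while the A>G change seen in half of the reads is reported.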
Scaffolding
After mapping, contigs can be ordered and oriented to produce even longer sequences called scaffolds, exploiting mate-pair library preparations.
Scaffolds
Reads
Reference
Alignment tools for Microbiology Projects
Software | Type | Interface | Notes
BWA | Free SW | Command Line | SAM/BAM output
Bowtie | Free SW | Command Line |
Stampy | Free SW | Command Line | Can be slow
SHRiMP2 | Free SW | Command Line | Higher sensitivity than BWA
SNP-o-matic | Free SW | Command Line | Very fast on small genomes <100 Mbp
CLC Workstation | Commercial | GUI | Easy to use
Novoalign | Commercial | Command Line | Fast and accurate
De novo Assembly in BaseSpace
De novo assemblies produce contigs without the aid of a reference genome.
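To make "contigs without a reference" concrete, here is a toy greedy overlap assembler: it repeatedly merges the two sequences with the longest suffix-prefix overlap. This is a sketch under simplifying assumptions (error-free reads, one strand, no repeats). The reads and the `min_len` parameter are hypothetical, and production assemblers, including the BaseSpace apps, use more sophisticated graph-based methods.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a that equals a prefix of b (0 if < min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(reads, min_len=3):
    """Merge the best-overlapping pair until no overlap of min_len remains."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs.append(merged)

# Hypothetical reads drawn from the 15-base sequence ATGGCGTACGTTAGC.
print(greedy_assemble(["ATGGCGTA", "GCGTACGT", "ACGTTAGC"]))
```

Reads that share no sufficient overlap stay as separate contigs, which is why real assemblies of short-read data usually end up as many contigs rather than one closed genome.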
Bacterial Resequencing
Alignment
Variant Calling
Phylogenetic Analysis
Coming Soon On BaseSpace!
Annotation
Prokka
http://www.vicbioinformatics.com/software.prokka.shtml
12 infants were identified as colonized with a novel MRSA subtype
Conventional methods misidentified members of the outbreak
WGS revealed that the outbreak expanded beyond the Special Care Baby Unit, blurring the classification of hospital-acquired and community-acquired MRSA
Modes of transmission: infants-to-mothers; mothers-to-mothers; mothers-to-partners
154 staff members were screened for MRSA
One tested positive for the novel subtype
Isolation and decolonization of the individual effectively ended the outbreak
Cost of the outbreak to the hospital: £10000
Cost of sequencing per strain: £95
Whole-genome sequences were obtained for 141 P. aeruginosa isolates from patients, hospital water and the ward environment.
Isolates from three patients had identical genotypes compared with water isolates from the same room.
Whole-genome shotgun sequencing of biofilm DNA extracted from a thermostatic mixer valve revealed this was the source of a P. aeruginosa subpopulation previously detected in water.
Sample Prep
• DNA extracted using the MOBIO UltraClean Microbial DNA Kit, or Qiagen Genomic-Tip 100G
Library Prep
• Libraries generated using Illumina Nextera XT
Sequencing
• Performed on MiSeq 2x150 or 2x300
• One sample also sequenced on PacBio
Analysis
• Aligned reads using BWA-MEM
• VarScan used for variant calls
• MLST performed by Velvet/BLAST
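Once variants are called against a common reference, "identical genotypes" can be checked with pairwise SNP distances. The sketch below is a minimal illustration with hypothetical isolate names and variant positions; it is not the study's pipeline, and it assumes every site is covered in every isolate so that a missing call means the reference base.

```python
from itertools import combinations

# Hypothetical variant calls per isolate: position -> alternate base.
CALLS = {
    "patient_1": {1021: "T", 58813: "G"},
    "water_room_A": {1021: "T", 58813: "G"},
    "patient_2": {1021: "T", 90417: "C"},
    "ward_sink": {300: "A"},
}

def snp_distance(a, b):
    """Number of sites at which two isolates carry different bases."""
    return sum(a.get(pos) != b.get(pos) for pos in set(a) | set(b))

def identical_pairs(calls):
    """Pairs of isolates separated by zero SNPs, in sorted name order."""
    return [
        (x, y)
        for x, y in combinations(sorted(calls), 2)
        if snp_distance(calls[x], calls[y]) == 0
    ]

print(identical_pairs(CALLS))
print(snp_distance(CALLS["patient_1"], CALLS["patient_2"]))
```

A distance of zero between a patient isolate and a water isolate from the same room is the kind of evidence the slide summarizes; real analyses also mask low-coverage and repetitive regions before counting.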
RNA-Seq
Viruses and Meta-transcriptomics
RNA Viruses: Vaccine Development for Seasonal Flu
Influenza A and B on the MiSeq Platform
“Seasonal influenza vaccines, an essential part of strategic prevention plans, are updated continuously to match circulating strains each year.”
“The design and composition of the influenza vaccine vary depending on any changes in the combined lengths of influenza viruses A (13.6 kb) and B (14.6 kb) genome or variations of the 8 genomic fragments”
Rutvisuttinunt et al. 2013
Influenza A and B Sequencing Workflow
Sample Prep
• RNA extracted from cell lines infected with influenza.
Library Prep
• Illumina TruSeq RNA library prep (standard protocol).
Sequencing
• Sequencing on MiSeq
• Pooled 6 viral samples for sequencing in one run.
• Sequencing kit: 300 cycle kit, 2 x 150 bp paired-end sequencing.
Analysis
• Used analysis workflows available on MiSeq Reporter and BaseSpace
• De novo assembly using Velvet
• Variant analysis using BWA-GATK and Broad IGV
• Also used 3rd party tools
Bold: Analysis tool on MSR and BaseSpace
Sequencing Data
High Quality and Deep Coverage
Quality Scores
Coverage
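The two panels named on this slide, quality scores and coverage, rest on simple formulas. A Phred score Q corresponds to an error probability of 10^(-Q/10), and mean depth is total sequenced bases divided by genome length. The sketch assumes the Phred+33 encoding used in current Illumina FASTQ files. The read count in the usage line is hypothetical; the 13.6 kb genome length is the influenza A figure quoted earlier in the deck.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into integer Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_string]

def error_probability(q):
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)

def mean_depth(n_reads, read_len, genome_len):
    """Average number of reads covering each base, assuming all reads align."""
    return n_reads * read_len / genome_len

print(phred_scores("II5!"))
print(error_probability(30))
# Hypothetical: 100,000 reads of 150 bp over a 13.6 kb influenza A genome.
print(round(mean_depth(100_000, 150, 13_600)))
```

Q30, the usual quality benchmark, means one expected error per 1000 base calls.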
Sequencing Data
Low Variant Detection
Table 3. Number of sequence variations from matched references.
Number | Name | References | Fragment | Variant differences / total bases (Variant %)
2 | A/Thailand/VIRO AF2/2012 | Influenza A (A/Brisbane/11/2010) | (1) PB2 | 17/2308 (0.7%)
4 | B/Thailand/VIRO AF4/2012 | Influenza B (B/Wisconsin/01/2010) | (1) PB2 | 12/2358 (0.5%)
The depth of coverage and accuracy were suitable to subtype strains for vaccine preparations.
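The Variant % column in Table 3 is the number of differing bases divided by the total bases compared. A short check of the two PB2 values from the table, using a helper name chosen only for this example:

```python
def variant_percent(differences, total_bases):
    """Percentage of compared bases that differ from the matched reference."""
    return 100.0 * differences / total_bases

# Values from Table 3 (PB2 fragment).
for sample, diff, total in [
    ("A/Thailand/VIRO AF2/2012", 17, 2308),
    ("B/Thailand/VIRO AF4/2012", 12, 2358),
]:
    print(f"{sample}: {diff}/{total} = {variant_percent(diff, total):.1f}%")
```

Both values round to the percentages shown on the slide (0.7% and 0.5%).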
Metagenomics
Metagenomics, what is it?
Mixed population samples, 16S rRNA, metagenome (DNA) or metatranscriptome (RNA)
Hypothesis-free
Culture-independent diagnostic testing (CIDT)
Metagenomics
Hypothesis-free Pathogen Detection using the MiSeq
Background:
14 year old boy (SCID as infant), presents 3 times to the hospital over 14 months with multiple symptoms including vomiting, headaches, conjunctivitis and fever.
On and off symptoms for over a year with over 40 hospital tests performed and no diagnosis.
Research Study Aim:
Can Next Generation Sequencing (NGS) help identify a pathogen?
Results:
Clinical assay for Leptospira, CLIA validated – negative result
NGS identified Leptospira infection in CSF
Treated with standard Penicillin G and recovered within a couple of weeks
http://www.nejm.org/doi/full/10.1056/NEJMoa1401268
Metagenomics, UCSF Protocol
Application: Shotgun WGS
Sample Prep
• NO culture
• Extracted nucleic acids
Library Prep
• Modified TruSeq Protocol
• Random priming to cDNA
Sequencing
• MiSeq workflow:
• 500 cycles
• Generate FASTQ
Analysis
• Data analyzed with the SURPI pipeline.
http://genome.cshlp.org/content/early/2014/05/16/gr.171934.113.full.pdf+html
Targeted Resequencing
Amplicon
Amplicon-based ReSequencing
16S User Guide
– Homebrew Amplicons
TSCA not for Microorganisms
Concierge Service
Workflow Overview
16S rRNA end-to-end solution
Sample Extraction
V3–V4 region Amplification
Library Prep
MiSeq Sequencing
Analysis
Webinar Link:
https://www.youtube.com/watch?v=XbR8Qnfzn04
doi:10.1128/mBio.01012-14
► Compared OTUs of oral microbiota of healthy and periodontal disease samples.
► OTUs present varied across all samples.
► Transcriptome analysis revealed that enzyme expression was well conserved among disease samples.
16S rRNA Sequencing
The 16S Metagenomics app performs taxonomic classification of 16S rRNA targeted amplicon reads using an Illumina-curated version of the GreenGenes taxonomic database.
QIIME is an open source software package for comparison and analysis of microbial communities, primarily based on high-throughput amplicon sequencing data.
Mothur is a bioinformatics tool for analyzing 16S rRNA gene sequences.
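Whichever tool assigns reads to taxa, the usual summary is relative abundance per taxon. The sketch below shows only that final tallying step on hypothetical genus-level assignments. The classification itself, done by the 16S Metagenomics app, QIIME or mothur, is not reproduced here.

```python
from collections import Counter

def relative_abundance(assignments):
    """Fraction of classified reads per taxon, most abundant first."""
    counts = Counter(assignments)
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.most_common()}

# Hypothetical per-read genus assignments for one sample.
reads = ["Streptococcus"] * 5 + ["Prevotella"] * 3 + ["Fusobacterium"] * 2
for taxon, fraction in relative_abundance(reads).items():
    print(f"{taxon}: {fraction:.0%}")
```

Read-based abundance is relative, not absolute: it reflects amplicon read proportions, which are also shaped by 16S copy number and primer bias.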
MiSeq
MiSeqDx
NextSeq
HiSeq
HiSeq X Ten