Sequencing in Microbiology and Infectious Disease
Transcription
Sequencing in Microbiology and Infectious Disease: What’s Available Today?
Michael G Smith, PhD
Sr. Sequencing Specialist
[email protected]
COMPANY CONFIDENTIAL – INTERNAL USE ONLY

© 2014 Illumina, Inc. All rights reserved. Illumina, 24sure, BaseSpace, BeadArray, BlueFish, BlueFuse, BlueGnome, cBot, CSPro, CytoChip, DesignStudio, Epicentre, GAIIx, Genetic Energy, Genome Analyzer, GenomeStudio, GoldenGate, HiScan, HiSeq, HiSeq X, Infinium, iScan, iSelect, ForenSeq, MiSeq, MiSeqDx, MiSeqFGx, NeoPrep, Nextera, NextBio, NextSeq, Powered by Illumina, SeqMonitor, SureMDA, TruGenome, TruSeq, TruSight, Understand Your Genome, UYG, VeraCode, verifi, VeriSeq, the pumpkin orange color, and the streaming bases design are trademarks of Illumina, Inc. and/or its affiliate(s) in the U.S. and/or other countries. All other names, logos, and other trademarks are the property of their respective owners.

Increasing System Price & Output, Decreasing Price Per Gb
System | Output | Reads | Read length
MiSeq | 15 Gb | 25M | 2x300
NextSeq | 120 Gb | 400M | 2x150
HiSeq 2500 | 1000 Gb | 4B | 2x125
HiSeq X Ten | 1800 Gb | 6B | 2x150

Throughput to Match Microbiology Applications
High Throughput
Shotgun metagenomics
• Microbial diversity
• Gene content and discovery
rRNA Metagenomics
• Relative abundance of microbial diversity
– 16S for bacteria and archaea
– 18S for eukaryotes
Microbial genomics
• Detection
• Identification
• Antibiotic sensitivity testing
• Molecular epidemiology
Low Throughput

MiSeq Reporter Workflow options
• De Novo Assembly
• Resequencing
• PCR Amplicon
• Library QC
• Amplicon
• Metagenomics
• Small RNA
• Targeted RNA

Illumina Informatics
Complete NGS Analysis Solutions
INTEGRATED SAMPLE MANAGEMENT
CORE APPS: All major biological applications
Workflow • Storage • Analysis • Sharing
3RD PARTY APPS: A broad and growing ecosystem
EASY SHARING: BaseSpace is the place where ideas foster
ONE CLICK DELIVERY: No FTP site, no hard drive
to ship

Next Generation Sequencing (NGS) & Infectious Disease
High resolution genomic information enables a range of research
• Discovery
• Characterization
• Rearrangements & Evolution
• Host Pathogen Interaction

Applications
WGS (fragment libraries with single- or paired-end reads, and mate-pair, MP)
– De novo
– Resequencing
RNA-Seq
– Viruses and Meta-transcriptomics
Targeted
– Amplicon
Metagenomics (Mixed sample or CIDT)

Whole Genome Sequencing
De novo Assembly and Resequencing

WGS Sequencing: Viral detection on the MiSeq from both cultured and non-cultured specimens
• 2 cases of HAV, which appeared unrelated
• HAV detected by RT-PCR
• Only one sample showed infectivity in Frp/3 cells, but total RNA extracted directly from the berries samples confirmed positive results for HAV by NGS.
Full genome sequencing of subgenotype IA HAVs from frozen berries linked to the European outbreak associated with the consumption of mixed berries
Chiapponi et al., Food Environ Virol 2014

Hepatitis A Sequencing Workflow
Sample Prep
• RNA extracted from inoculated FRhK-4 derived cells and directly from the berries sample
Library Prep
• Genome amplification using overlapping primers followed by Nextera XT library prep
Sequencing
• Sequencing on MiSeq
• Sequencing kit: 500 cycle kit, 2 x 250 bp paired-end sequencing.
Analysis
• De novo assembly: SeqMan NGen (DNAStar)
• Alignment with ClustalW
• Cluster analysis with MEGA5
Bold: Analysis tool on MSR and BaseSpace
Chiapponi et al., Food Environ Virol 2014

Re-Sequencing
Alignment, then Variant Calling
Figure labels: 3X, 2X, reference, SNV, random errors

reference:
GCTATGCATTGGCATGGCATGCTAGCTACAGGATGCTGATCGATTTCGAAACTGACTGTTAG
reads:
GCTATGCATTGGCATGGCATGCTAGCTACGGGATGCTGATCGATTTCGAAACTGACT
 CTATGCATTGGCATGGCATGCTAGCTACAGGATGCTGATCGATTTCGAAACTGACTG
  TATGCATTGGCATGGCATGCTAGCTACGGGATGCTGATCGATTTCGAAACTGACTGT
   ATGCATTGGCATGTCATGCTAGCTACAGGATGCTGATCGATTTCGAAACTGACTGTT
    TGCATTGGCATGGCATGCTAGCTACGGGATGCTGATCGATTTCGAAACTGACTGTTA
     GCATTGGCATGGCATGCTAGCTACGGGATGCTGATCGATTTCGAAACTGACTGTTAG
      CATTGGCATGGCATGCTAGCTACAGGATGCTGATCGATTTCGAAACTGACTGTTAG
       ATTGGCATGGCATGCTAGCTACGGGATGCTGATCGATATCGAAACTGACTGTTAG
        TTGGCATGGCATGCTAGCTACAGGATGCTGATCGATTTCGAAACTGACTGTTAG
         TGGCATGGCATGCTAGCTACAGGATGCTGATCGATTTCGAAACTGACTGTTAG

Scaffolding
After mapping, contigs can be ordered and oriented to produce even longer sequences called “scaffolds”, exploiting the mate-pair library preparations.
Figure labels: Scaffolds, Reads, Reference

Alignment tools for Microbiology Projects
Software | Type | Interface | Notes
BWA | Free SW | Command Line | SAM/BAM output
Bowtie | Free SW | Command Line |
Stampy | Free SW | Command Line | Can be slow
SHRiMP2 | Free SW | Command Line | Higher sensitivity than BWA
SNP-o-matic | Free SW | Command Line | Very fast on small genomes <100 Mbp
CLC Workstation | Commercial | GUI | Easy to use
Novoalign | Commercial | Command Line | Fast and accurate

De novo Assembly in BaseSpace
De novo assemblies produce contigs without the aid of a reference genome.

Bacterial Resequencing
• Alignment
• Variant Calling
• Phylogenetic Analysis
Coming Soon On BaseSpace!
Annotation: Prokka, http://www.vicbioinformatics.com/software.prokka.shtml

• 12 infants were identified as MRSA-colonized with a novel subtype
• Conventional methods mis-identified members of the outbreak
• WGS revealed that the outbreak expanded beyond the Special Care Baby Unit, blurring the classification of hospital-acquired and community-acquired MRSA
• Modes of transmission: infants-to-mothers; mothers-to-mothers; mothers-to-partners
• 154 staff members were screened for MRSA; one tested positive for the novel subtype
• Isolation and de-colonization of the individual effectively ended the outbreak
• Cost of the outbreak to the hospital: £10000
• Cost of sequencing per strain: £95

WGS was performed on 141 P. aeruginosa isolates obtained from patients, hospital water and the ward environment. Isolates from three patients had identical genotypes compared with water isolates from the same room. Whole-genome shotgun sequencing of biofilm DNA extracted from a thermostatic mixer valve revealed this was the source of a P. aeruginosa subpopulation previously detected in water.
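The outbreak comparisons above rest on calling variants against a reference, the step the Re-Sequencing slide illustrates: a true SNV shows up in many overlapping reads, while a random error shows up in only one. The following is a minimal Python sketch of that idea, not the method of VarScan or any other tool named in this deck. The function name call_snvs, the thresholds min_depth and min_fraction, and the toy reference and reads are all assumptions made for illustration.

```python
from collections import Counter


def call_snvs(reference, reads, min_depth=3, min_fraction=0.3):
    """Call SNVs from already-aligned reads (hypothetical helper).

    reads is a list of (offset, sequence) pairs, where offset is the
    0-based position on the reference at which the read starts.
    Returns tuples of (1-based position, ref base, alt base, alt count, depth).
    """
    # Build a pileup: one base counter per reference position.
    pileup = [Counter() for _ in reference]
    for offset, seq in reads:
        for i, base in enumerate(seq):
            pos = offset + i
            if 0 <= pos < len(reference):
                pileup[pos][base] += 1

    calls = []
    for pos, counts in enumerate(pileup):
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too little coverage to judge this position
        for base, n in counts.items():
            # Keep a non-reference base only if enough reads support it;
            # a mismatch seen in a single read is treated as a random error.
            if base != reference[pos] and n / depth >= min_fraction:
                calls.append((pos + 1, reference[pos], base, n, depth))
    return calls


# Toy data (assumed for illustration): one real SNV and one random error.
reference = "GCTAGCTACAGGATGC"
reads = [
    (0, "GCTAGCTACGGGATGC"),  # carries the SNV (A to G at position 10)
    (0, "GCTAGCTACAGGATGC"),  # matches the reference
    (1, "CTAGCTACGGGATGC"),   # carries the SNV
    (2, "TAGCTACAGGTTGC"),    # single random error (A to T at position 13)
    (3, "AGCTACGGGATGC"),     # carries the SNV
]

calls = call_snvs(reference, reads)
for position, ref_base, alt_base, alt_count, depth in calls:
    print(f"pos {position}: {ref_base}>{alt_base} ({alt_count}/{depth} reads)")
```

With these toy reads the sketch reports one SNV at position 10 (A to G, supported by 3 of 5 reads); the single-read T at position 13 stays below the 30% threshold and is treated as a random error.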
Sample Prep
• DNA extracted using the MOBIO UltraClean Microbial DNA Kit, or Qiagen Genomic-Tip 100G
Library Prep
• Libraries generated using Illumina Nextera XT
Sequencing
• Performed on MiSeq 2x150 or 2x300
• One sample also sequenced on PacBio
Analysis
• Aligned reads using BWA-MEM
• VarScan used for variant calls
• MLST performed by Velvet/BLAST

RNA-Seq
Viruses and Meta-transcriptomics

RNA Viruses: Vaccine Development for Seasonal Flu
Influenza A and B on the MiSeq Platform
“Seasonal influenza vaccines, an essential part of strategic prevention plans, are updated continuously to match circulating strains each year.”
“The design and composition of the influenza vaccine vary depending on any changes in the combined lengths of influenza viruses A (13.6 kb) and B (14.6 kb) genome or variations of the 8 genomic fragments”
Rutvisuttinunt et al. 2013

Influenza A and B Sequencing Workflow
Sample Prep
• RNA extracted from cell lines infected with influenza.
Library Prep
• Illumina TruSeq RNA library prep (standard protocol).
Sequencing
• Sequencing on MiSeq
• Pooled 6 viral samples for sequencing in one run.
• Sequencing kit: 300 cycle kit, 2 x 150 bp paired-end sequencing.
Analysis
• Used analysis workflows available on MiSeq Reporter and BaseSpace
• Assembly using Velvet
• Variant analysis using BWA-GATK and Broad IGV
• Also used 3rd party tools
Bold: Analysis tool on MSR and BaseSpace

Sequencing Data: High Quality and Deep Coverage
Figure panels: Quality Scores; Coverage

Sequencing Data: Low Variant Detection
Variant % = variant differences / total bases
Table 3. Number of sequence variations from matched references.
Number | 2 | 4
Name | A/Thailand/VIRO AF2/2012 | B/Thailand/VIRO AF4/2012
References | Influenza A (A/Brisbane/11/2010) | Influenza B (B/Wisconsin/01/2010)
Fragment (1) PB2 | 17/2308 (0.7%) | 12/2358 (0.5%)

The depth of coverage and accuracy were suitable to subtype strains for vaccine preparations.

Metagenomics

Metagenomics, what is it?
• Mixed population samples: 16S rRNA, metagenome (DNA) or metatranscriptome (RNA)
• Hypothesis-free
• Culture independent diagnostic testing (CIDT)

Metagenomics: Hypothesis-free Pathogen Detection using the MiSeq
Background: A 14-year-old boy (SCID as an infant) presented 3 times to the hospital over 14 months with multiple symptoms including vomiting, headaches, conjunctivitis and fever. Symptoms came and went for over a year, with over 40 hospital tests performed and no diagnosis.
Research Study Aim: Can Next Generation Sequencing (NGS) help identify a pathogen?
Results:
• Clinical assay for Leptospira, CLIA validated: negative result
• NGS identified Leptospira in CSF
• Treated with standard Penicillin G and recovered within a couple of weeks
http://www.nejm.org/doi/full/10.1056/NEJMoa1401268

Metagenomics, UCSF Protocol
Application: Shotgun WGS
Sample Prep
• NO culture
• Extracted nucleic acids
Library Prep
• Modified TruSeq Protocol
Sequencing
• MiSeq workflow:
• 500 cycles
Analysis
• Data analyzed with the SURPI pipeline.
• Generate FASTQ
• Random priming to cDNA
http://genome.cshlp.org/content/early/2014/05/16/gr.171934.113.full.pdf+html

Targeted Resequencing
Amplicon

Amplicon-based ReSequencing
• 16S User Guide – Homebrew Amplicons
• TSCA not for Microorganisms
• Concierge Service

Workflow Overview: 16S rRNA end-to-end solution
• Sample Extraction
• V3–V4 region Amplification
• Library Prep
• MiSeq Sequencing
• Analysis
Webinar Link: https://www.youtube.com/watch?v=XbR8Qnfzn04

doi:10.1128/mBio.01012-14
► Compared OTUs of oral microbiota of healthy and periodontal disease samples.
► OTUs present varied across all samples.
► Transcriptome analysis revealed that enzyme expression was well conserved among disease samples.

16S rRNA Sequencing
• The 16S Metagenomics app performs taxonomic classification of 16S rRNA targeted amplicon reads using an Illumina-curated version of the GreenGenes taxonomic database.
• QIIME is an open source software package for comparison and analysis of microbial communities, primarily based on high-throughput amplicon sequencing data.
• Mothur is a bioinformatic tool for analyzing 16S rRNA gene sequences.

MiSeq, MiSeqDx, NextSeq, HiSeq, HiSeq X Ten
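
The 16S rRNA slides above describe classifying amplicon reads taxonomically and comparing the relative abundance of microbial groups. As a rough illustration of that last step only, the Python sketch below turns per-read taxon labels into relative abundances. It does not reproduce the 16S Metagenomics app, QIIME, or Mothur; the function name relative_abundance and the genus labels and counts are made-up assumptions, and real pipelines would first cluster or classify reads to produce such labels.

```python
from collections import Counter


def relative_abundance(read_taxa):
    """Return each taxon's share of classified reads (hypothetical helper).

    read_taxa is an iterable of taxon labels, one per classified read.
    The result maps taxon -> fraction of reads, most abundant first.
    """
    counts = Counter(read_taxa)
    total = sum(counts.values())
    if total == 0:
        return {}  # no classified reads, nothing to report
    return {taxon: n / total for taxon, n in counts.most_common()}


# Toy labels (assumed for illustration): one genus label per read.
read_taxa = ["Streptococcus"] * 5 + ["Prevotella"] * 3 + ["Fusobacterium"] * 2

abundance = relative_abundance(read_taxa)
for taxon, fraction in abundance.items():
    print(f"{taxon}\t{fraction:.1%}")
```

Comparing such tables between samples, for example healthy versus disease, is the kind of analysis the OTU comparison slide refers to; the sketch covers only the per-sample normalization.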