Seminar Presentation - National Institute for Animal Agriculture
Transcription
Genomics in food security: 100K Pathogen Genome Project
Bart Weimer, Ph.D.
Professor, UC Davis - School of Veterinary Medicine
Director, BGI@UCDavis

The Agricultural System
[Diagram labels: Biosecurity & Safety; Health; Agriculture; Environment]

Changing world & food safety challenges
• World population predicted to reach 9.2 billion by 2050
• Increased urbanization
• Developing countries increase ~1.5 to 2.0
• ~25 mega-cities around world
  • Increased density
  • Increased distance from food supply
  • Predicted increase in world-wide food-related disease outbreaks
• Intensive agriculture needed to feed the world
[Chart: population (billions), 1990 to 2050; Industrialized = Aging, Developing = Baby boom (modified from Z_punkt)]

Food Safety & Quality
• Example outbreaks associated with food:
  • 2013 – Pet food, turtles, hedgehogs, tea, chicken, beef, pork, & Salmonella
  • 2012 – Pet food, fruit, nuts, peanut butter, hamburger, tuna, chicken & Salmonella
  • 2011 – Veggies and re-emergence of E. coli O104 (genomics required to solve)
  • 2010 – Eggs and S. Enteritidis
  • 2008 – Peanut butter and S. Typhimurium
  • 2006 – Tomatoes, peppers and S. Typhimurium
• Estimated economic impact in the EU >€5 billion from Campylobacter & Salmonella (EFSA, 2012)
Food Safety News Jan 2012

Total FBI 2011
[Pie chart: share of foodborne illness (FBI) by pathogen]
• Norovirus 61%
• Salmonella (non-typhoid) 12%
• Clostridium perfringens 11%
• Campylobacter spp. 9%
• Staph aureus 3%
• Toxoplasma gondii 2%
• E. coli O157:H7 1%
• Listeria monocytogenes 1%
http://www.cdc.gov/foodborneburden/PDFs/FACTSHEET_A_FINDINGS_updated4-13.pdf

FBI deaths 2011
[Pie chart: share of FBI deaths by pathogen]
• Salmonella (non-typhoid) 30%
• Toxoplasma gondii 26%
• Listeria monocytogenes 21%
• Norovirus 12%
• Campylobacter spp. 7%
• E. coli O157:H7 4%
• Staphylococcus aureus 0%
• Clostridium perfringens 0%
http://www.cdc.gov/foodborneburden/PDFs/FACTSHEET_A_FINDINGS_updated4-13.pdf

Foodborne Pathogens
• Salmonella particularly devastating:
  • High serotype diversity
  • High mobile element diversity
  • Frequent horizontal gene transfer
  • Emerging stable hypervirulence
    • Heithoff et al. PLoS Pathog 8(4):e1002647
  • Large genomic diversity within serotype
• E. coli O104 example of NGS & solutions

Salmonella Diversity
Salmonella species                  Number of serovars
S. enterica                         2,557
  S. enterica subsp. enterica       1,531
  S. enterica subsp. salamae        505
  S. enterica subsp. arizonae       99
  S. enterica subsp. diarizonae     336
  S. enterica subsp. houtenae       73
  S. enterica subsp. indica         13
S. bongori                          22
Total (genus Salmonella)            2,579
• 1,630 serotypes important in food animals
• Approximately 50-60 serotypes are most common to cause FBI in humans
• New foods becoming associated with new serotypes

Dynamic Microbial Communities & Disease
[Chart: relative rate (log change) by serotype: Newport, Javiana, Enteritidis, Montevideo, Heidelberg, Typhimurium. Source: FoodNet]
As one serotype declines, others increase

Salmonella Phylogenomics
• 16S alignment: SNPs in 16S rDNA
• Whole genome alignment: entire genome reflects all similarities

Active Outbreaks & Complete Genomes
• Salmonella enterica subsp. enterica serovar Javiana
  • Common in fresh cut produce
  • Only one previously sequenced genome (JCVI, 2008), 19 contigs
• Isolate CFSAN001992_73:
  • Clinical Arizona isolate from produce-related 2012 outbreak
  • Complete process from isolate to finished genomic sequence <1 week
  • 1 chromosome; 2 plasmids containing never-seen sequence
Collaboration with M. Allard, E. Brown, E. Strain, M. Hoffman, T. Muravanda, S. Musser (FDA), B. Weimer (UC Davis), Jonas Korlach (PacBio)

Sequencing discovers new genes
[Chart labels: New gene families; Core gene families; Variable Salmonella genome]
Pan-genome increases with each isolate sequenced
Micro. Ecol. 2011

The Value of Microbial Genomes
• We know <1% of the Earth’s microbiome
• Horizontal gene transfer is wide-spread and frequent
• High-quality, finished genomes are the starting point for:
  • Functional genomic studies
  • Comparative genomics
  • Forensics
  • Metagenomics
Fraser et al. (2002) J Bacteriology 184: 6403-6405
Chain et al. (2009) Science 326: 236-237

Pathogen Evolution
• Vibrio evolution rapid
  • Example for all enteric bacteria
  • Also shown with environmental organisms
• Enterobacteria genome evolution
  • HGT more common than appreciated
  • Genome rearrangements influenced by biogeography & other bacterial community members
• Evidence for local pressure to induce population genome evolution
  • Biogeography differences
  • Likely to find footprints of geographical origin
  • Requires large number of genomes to estimate
  • Creates chimeric genomes
• Stress
  • Induces SNPs
  • Induces new virulence and drug resistance
• Mutations in DNA repair genes lead to SNPs
• Recombination events
  • SNPs
  • Large segments
  • HGT
Shapiro et al., ’12 Science; Denef & Banfield, ’12 Science

New Detection Paradigm
• Specific gene (PCR)
• Genome (NGS, multiplex)

Microbial Communities & Health
[Diagram labels: Community structure; Who’s there?; What are they doing?; Growth & metabolism; Microbe changes; Host association; Host changes]
Kawamoto et al., ’12 Science
Olszak et al., ’12 Science
Desai & Weimer, ’09

Bacteria Testing in Food
[Diagram labels, as extracted: Collect sample (25 or 325 g); Protein; Pre-enrich sample in 4 L of broth; Nonculturable bacteria; Fermentation product inhibitors; Fat; Immune inhibitors; Detection method; Detection timing]
• Aim: reduce time to result with low cost, sensitivity, and accuracy
• Too little genomic information = difficult molecular methods

Project aims to combine 2 of 3 steps
• Detection: genus, species; qPCR, etc. 8-30 h (traditional: 3-5 days)
• Serotyping: serogroup, serovar; serology 4-5 days
• Strain typing: genovar, genome; PFGE, MLST, genotype 5-10 days
• Faster response: < 30 h
• More informative results
[Diagram labels: Specificity; Sensitivity; Surveillance; Outbreak response; Outbreak investigation]

Reduce time to result
• Traditional - Enrichment (timeline: T0, T24, T36, T48, T60, T72+)
  [Workflow diagram; step labels as extracted: Ship to lab; Pre-enrich; Collect sample; Log sample; Prep; Enrich; Plate; Examine plate; Selective enrich; Confirm; Presumptive; Bank; Examine plate; Confirm; Bank; ID; Characterize; Genomics]
• Next generation – Culture independent (timeline: T0, T0.5, T1, T2, T3, T4+)
  [Workflow diagram; step labels as extracted: Capture & concentrate; Collect sample; Presumptive; Relative amt.; Directed plate; Multiplex PCR; Confirm PCR; Directed enrich; qPCR; Sequence; DNA prep; Sequencing prep; Bank DNA; ID; Characterize; In/out of event]

NGS Costs are Falling
(DeWitt et al., 2011)

Genomics & Rapid detection
• German outbreak
  • Chimeric genome – deleted genes = False negative
  • Eae deletion – main PCR marker missing = False negative
• FSMA
  • International reach with new regulations
  • Open to investigate alternate detection methods - CID
  • International trade now firmly in the FDA plan
• Public health effort
  • Using whole genome sequencing (WGS) to investigate outbreaks
  • Installing Illumina sequencers in field offices
• More robust biomarkers are needed for routine testing
2013 Poultry Sci 92:562–572

100K Pathogen genome project
2012 HHSInnovate Secretary’s Choice Awardee
Increase food safety using microbe systems biology
http://100kgenome.vetmed.ucdavis.edu
• Discover the genetic constituents that are robust enough to be predictive biomarkers for specific traits
• Rapid ID and tracking
• Understand evolution to build more robust detection systems
• New isolate emergence and persistence
• Integration into current practices

Integration of 100K Project
• Produce a database of phylogenomic diversity of important FBI
• Industry representative genomes important
• Background organisms
M. Allard

100K Consortium
• Founding Members, Executive committee
  • NIH (NCBI)
  • CDC
  • USDA
  • Mars, Inc.
  • Pacific Biosciences
Steering committee provides guidance for overall project direction and goals
ADDITIONAL PARTICIPANTS WELCOME
• Affiliate Members
  • Agilent Technologies
  • UC Davis (Weimer lab)
  • FDA
• Additional Steering Committee members
  • UC Davis Food Science; Veterinary Diagnostic Lab
  • Salisbury University (US)
  • DoD - Walter Reed Hospital
  • Mass General - Harvard hospital system
  • RIVM (Netherlands)
  • DTU (Denmark)
  • MEFOSA (Lebanon)
  • Sydney Technical University (Australia)
  • Rajiv Gandhi Biotechnology Institute (India)
  • Institute of Environmental Science & Research (NZ)
  • Oak Ridge National Laboratory (ORNL)
Additional negotiations in process with groups from Asia, Africa, Europe
• Corporate Affiliates
  • Pacific Biosystems
  • cBio
  • OpGen
  • Kapa Biosystems
  • BGI@UCDavis

Process & outcome
[Flow diagram; labels as extracted: Collect metadata; Collect isolate; Sequence genome; Merge genome sequence & metadata; Validate & verify sequence; Validate & verify information; Genome evolution; Analyze genome for actionable features; Traceability features; Common traits; Diagnostic features; Ecology features; Infection features; Release genomes to public]

Submission Logistics
• Affiliates
  • Many options
    • In-kind
    • Funding
    • Isolates
    • Analysis
    • Data hosting
  • Various levels of commitment available
  • Groups providing funding & linking sequencing to important isolates
• Isolate submission
  • Isolate agreement
    • MTA
    • Timing and specific isolates
  • Submit isolates & metadata
  • Authentication
  • Bank isolates for DNA isolation & library construction
  • Sequence –
    • BGI@UCDavis
    • Return data to submitter
    • 12 months for review
    • Deposit in NCBI for public access
• Data return & analysis
• Publication

Organisms of Interest
Initial focus
• Salmonella
• Norovirus
• Listeria
• Hepatitis A
• Campylobacter
• Enteroviruses
• Vibrio
• E. coli
• Shigella
• Yersinia
• Clostridium
• Enterococcus
• Cronobacter
Approach
• Short-read sequencing followed by long read technologies for a sub-set of isolates to complete the genome
• Optical mapping will be used for a selected set to ensure genome quality
• Long read technology will be used to close 1,000 genomes
• Capture genomic diversity to represent pan-genome for the most important organisms
• World-wide representation

Isolate bank
[Two pie charts of isolates by genus: authenticated & banked isolates (~3,500 isolates) and isolates pending authentication & banking (~15,000 isolates). The extracted labels do not show which chart is which. Largest slices in one chart: Salmonella 1,342 (59%), Vibrio 287 (13%), Enterococcus 210 (9%), Escherichia 203 (9%), Campylobacter 110 (5%), Listeria 89 (4%). Largest slices in the other: Campylobacter 3,295 (55%), Salmonella 1,674 (28%), Vibrio 305 (5%), Listeria 281 (5%), Enterococcus 210 (3%), Escherichia 203 (3%). Remaining genera, each under 1%: Bacillus, Brenneria, Carnobacterium, Citrobacter, Cronobacter, Enterobacter, Erwinia, Exiguobacterium, Klebsiella, Lactococcus, Leuconostoc, Moraxella, Proteus, Rummeliibacillus, Shigella, Staphylococcus, Streptococcus, Weissella]

Isolates by region
[Pie chart: number of isolates by region]
• North America 74%
• Unknown 10%
• South America / Europe 7% / 1% (as extracted)
• Asia 4%
• Africa 3%
• Middle East 1%
• Australia 0%

100K Sequencing Process
[Flow diagram; labels as extracted: Bank culture; Isolate DNA; Make library; Sequence library; Short read technologies; Library automation; Long read technologies; Optical mapping; Genome DB]

Project progress
• Year 1:
  • Focus on the top 50 Salmonella outbreak serotypes
  • Banked ~3500 isolates
  • Developing world-wide partnerships
  • Automation of sequence library construction
  • Sequence 1800 isolates
• Year 2-5
  • Bank additional isolates
  • Automated, routine library construction
  • Sequence ~25K genomes/year
  • Finish 1000 genomes to a single closed genome
  • Generate epigenomic data
  • Define high resolution map assemblies for small set
  • Define need for additional bioinformatics
100K project Web site: http://100kgenome.vetmed.ucdavis.edu

Public health Outcomes
• Federal agencies embracing NGS for outbreak investigation, trace backs, and monitoring
  • PFGE vs NGS
  • Implementation is not a simple path
• Pan-genome value
  • Discover new sets of genes that are >50% of the genome that we ignore today
  • New robust testing methods that allow routine testing in plant
• Human and animal public health
  • 100K database enables a new era of diagnostics tools
  • Definition of virulence, antibiotic resistance, source, insights for mitigation, and window into emerging strain differences as sentinels
  • Host adaptation, zoonotic movement, supply chain changes

Innovation in Methods
• Culture independent methods (CIM) to capture & concentrate
  • Enrich
  • Detect
  • ID
• Coupling genomics & biomarkers with existing methods and CIM
  • Increase speed
  • Increase information diversity
    • Serotypes
    • Pathogens

Microbial Isolation & detection
• Classical approach – grow what we can…
  • BAM/ISO & growth
  • Enrichment and ELISA
  • Limited by those we know how to grow
  • We know how to grow ~1% of bacteria
  • Non-culturable bacteria common
• Rapid methods – looking for bacterial signatures…
  • Finding new organisms without growth
  • Enables customized approaches for screening
• Molecular methods
• Viruses often don’t grow in vitro – limits detection approaches
• Strain diversity and continued outbreaks creating demand for new, next generation methods
• Provides bacterial community structure information
• Individual genome sequencing
• Provides bacterial metabolism capability
• Can be linked to food characteristics

Examples of molecular tools
• Data analysis post sequencing
  • Comparison for SNPs
  • Gene content = annotation
  • Content comparison = forensics
    • Gene
    • Protein
    • COGs/GO use in statistical enrichment
• Sea mammal outbreak (SNP)
• Food outbreak (SNP)
• Going beyond SNPs and beyond
  • Biomarker hunting
  • Genomics – limited based on sequences available

Rapid Detection
E. coli (B. Findley)
Technologies to enhance food safety & security
• Eliminate enrichment
• Fast
• Sensitive
• Reliable
• Food, water & environment

Existing Limitations
• Molecular assays are limited by too few genomes to develop robust biomarker genes quickly
• Genome evolution more complex than previously appreciated
• Lack of information to create robust assays for improved detection using PCR
• Genome sequencing will enable technological advances and increase reliability of PCR assays

Rapid detection
[Timeline diagram. Time points: 0 min; 5 min; 30–35 min; 40 min; 120-135 min & beyond… Steps, as extracted: Capture with fluidized bed (15 mL to 40 L); Presumptive ID (ELISA, PCR); Detailed DNA/RNA analysis; Confirm ID, molecular serotype, sequence, toxin production, community analysis. Other labels: bead; microbe]
Blake & Weimer, ’97 AEM
Weimer et al., ’00 JBBM
Weimer et al., ’01 AEM
Walsh et al., ’01 JBBM
Desai et al., ’08 AEM
Maga et al., ’12 AEM

Detection in Food & Enrichment Broth
[Table: % positive by organism and food, with column headers Organism; Food; ImmunoFlow (In Food, Enrich Broth); Commercial Immunoppt. Rows: E. coli O157 (apple juice, hamburger, beer, sprouts); LM (halibut); SE (chicken). Cell values as extracted, not aligned to cells: 0 80 0 100; 40 30 40; 60 0 20 40 0 30 0 100]

GlycoBind Detection
• Replaces Ab–based tests
• Capture ligands are cell receptors
• Broad organism capture surface
• Used in static & flow setting
• Detection with RT-PCR
• WGS has arrived...
Table (columns aligned in the order extracted):
Food type      Aerobic plate count before      Lower detection limit
               E. coli O157:H7 spike (cfu/gm)  (cfu/gm)
Water          0                               4
Apple juice    8                               4
Spinach        577                             4
Hamburger      24856                           400
Salami         9350000                         40000
Milk           0                               40000
[Figure label: Myxobolus cerebralis]
• No enrichment
• 3 hour total time
• 4 cell sensitivity
• <5% variation
Desai et al., 2008. AEM 74:2254-2258

Molecular salmonella testing in 2 hours
• Colony
• Enrichment broth
• Lab medium
• Bead capture
[Workflow diagram; labels as extracted: Lyse cells; Add template; PCR (45 mins); Nano electrophoresis (30 mins); Report Salmonella, serogroup & serotype]
Action based on
1. Salmonella detection
2. serotype determination

Molecular detection validation
• Validation Approach
  • ~1750 isolates tested
  • Designed to detect ~40 most common serotypes
  • Multiple matrices done
• Verification & Validation
  • Used by 3 independent labs
  • 100% accurate in 6 independent blinded panels for Salmonella identification
  • 98% accurate in correct serotype determination

Complex implementation
• Total DNA/RNA extraction from food
  • Possible and being done for PCR and qRT-PCR
  • Metagenomics – HARD & UNCLEAR reliability
• Rapid detection strategies
  • Genomics and systems biology
  • Reduction in time to result
• Robust & accurate genomic tests
  • Requires access to genomes/genotypes
• The longest step is enrichment
  • Eliminate pre-enrichment step for direct detection?
• PulseNet & genomics
  • NGS is fast and direct
  • Data rich
  • Work flow remains unclear for best implementation

Workflow innovation
[Workflow diagram; labels as extracted: Culture independent capture/concentration; Presumptive ID with solid phase ELISA; Nano electrophoresis; Enrichment & colony isolation; Broth multiplex PCR; Colony multiplex PCR; Nano electrophoresis; NGS of entire genome; Library construction; NGS sequencing; Bioinformatics & statistics. Result / total time values as extracted: 1.5 hours; 6 hours; 48 hours; ~24-130 hours*]
Critical Needs
• Robust biomarkers
• Fast, actionable answers
• Novel sequencing strategies

SE SNP Analysis
[Figure labels, as extracted: Multiple SE snp-types; ’93 sea lion & otter in 3 tissues; ’93, ’02, ’08, ’11, rodent, equine, sea mammal; ’98, ’05, ’08, sea mammal; Yearly SNP evolution; 1993 elephant seal, lung; sea lion, liver; elephant seal, kidney; ’02 rodent feces; equine feces; ’08 sea lion, liver; ’05 sea lion, uterus; ’98 otter, abdomen; ’11 elephant seal urine; ’11 elephant seal brain]

Open questions
• Can individual food components modify the microbiome to enrich or exclude zoonotic pathogens?
• Genomic variation & outbreaks
  • Host adaptation
  • Transmission
  • Virulence
  • Antibiotic resistance
  • Robust biomarkers & detection reliability
• Serotype and genotype
• Detection methods
  • BAM –
  • Sample prep –
    • capture concentration coupled to trusted methods (ELISA)
  • Molecular assays – PCR, robust biomarkers, mass spec
  • Genomics – sequencing, metagenomics, directed sequencing
• Routine use vs outbreak investigation vs traceback

Acknowledgements
Weimer Lab
• Dr. Yi Xie
• Dr. Richard Jeannotte
• Dr. Holly Ganz
• Dr. Marie Forquin
• Dr. Prerak Desai
• Dr. Jigna Shah
• Ms. Nugget Dao
• Ms. Mai Lee Yang
• Ms. Kao Thao
• Ms. Winnie Ng
cBio
• Dr. Kumar Hari
• Dr. Ravi Jane
UCD/CAHFS/SVM
• Dr. Kris Clothier
• Dr. Barb Byrne
• Dr. Woutrina Miller
• Dr. Linda Harris
• Dr. Maria Marco
Agilent Technologies
• Dr. Rudi Grimm
• Dr. Lenore Kelly
• Dr. Steffan Müeller
• Dr. Steve Royce
• Dr. Paul Zavitsanos
PacBio
• Jonas Korlach
• Luke Hickey

Thanks to the sponsors:
• FDA
• USDA
• DARPA
• US Air Force
• Agilent Technologies
• CA Dairy industry
• Pacific BioSciences
• Mars, Inc.
• UCD Wildlife Health Center

Thank You… Questions?
Bart Weimer
Professor, UC Davis
Director, BGI@UC Davis
Director, 100K Genome Project
Director, Integration Core, NIH-West Coast Metabolomics Center
[email protected]
530.754.0109