Seminar Presentation - National Institute for Animal Agriculture

Transcription

Genomics in food security: 100K Pathogen Genome Project
Bart Weimer, Ph.D.
Professor
UC Davis - School of Veterinary Medicine
Director BGI@UCDavis
The Agricultural System
Biosecurity
& Safety
Health
Agriculture
Environment
Changing world & food safety challenges
• World population predicted to reach 9.2 billion by 2050
• Increased urbanization
  • Developing countries increase ~1.5 to 2.0
• ~25 mega-cities around the world
  • Increased density
  • Increased distance from food supply
  • Predicted increase in world-wide food-related disease outbreaks
• Intensive agriculture needed to feed the world
[Figure: Population (billions), 1990 to 2050. Industrialized = Aging; Developing = Baby boom (modified from Z_punkt)]
Food Safety & Quality
• Example outbreaks associated with food:
  • 2013 – Pet food, turtles, hedgehogs, tea, chicken, beef, pork, & Salmonella
  • 2012 – Pet food, fruit, nuts, peanut butter, hamburger, tuna, chicken & Salmonella
  • 2011 – Veggies and Re-emergence of E. coli O104 (genomics required to solve)
  • 2010 – Eggs and S. Enteritidis
  • 2008 – Peanut butter and S. Typhimurium
  • 2006 – Tomatoes, peppers and S. Typhimurium
• Estimated economic impact in the EU >€5 billion from Campylobacter & Salmonella (EFSA, 2012)
Food Safety News Jan 2012
Total FBI 2011
• Norovirus: 61%
• Salmonella (non-typhoid): 12%
• Clostridium perfringens: 11%
• Campylobacter spp.: 9%
• Staph aureus: 3%
• Toxoplasma gondii, E. coli O157:H7: 2%, 1%
• Listeria monocytogenes: 1%
http://www.cdc.gov/foodborneburden/PDFs/FACTSHEET_A_FINDINGS_updated4-13.pdf
FBI deaths 2011
• Salmonella (non-typhoid): 30%
• Toxoplasma gondii: 26%
• Listeria monocytogenes: 21%
• Norovirus: 12%
• Campylobacter spp.: 7%
• E. coli O157:H7: 4%
• Staphylococcus aureus: 0%
• Clostridium perfringens: 0%
http://www.cdc.gov/foodborneburden/PDFs/FACTSHEET_A_FINDINGS_updated4-13.pdf
Foodborne Pathogens
• Salmonella particularly devastating:
• High serotype diversity
• High mobile element diversity
• Frequent horizontal gene transfer
• Emerging stable hypervirulence (Heithoff et al. PLoS Pathog 8(4):e1002647)
• Large genomic diversity within serotype
• E. coli O104 example of NGS & solutions
Salmonella Diversity
Salmonella species                     Number of serovars
S. enterica                            2,557
  S. enterica subsp. enterica          1,531
  S. enterica subsp. salamae           505
  S. enterica subsp. arizonae          99
  S. enterica subsp. diarizonae        336
  S. enterica subsp. houtenae          73
  S. enterica subsp. indica            13
S. bongori                             22
Total (genus Salmonella)               2,579
1,630 serotypes important in food animals
• Approximately 50-60 serotypes most commonly cause FBI in humans
• New foods becoming associated with new serotypes
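The serovar counts in the table can be cross-checked with a few lines of arithmetic. This is only a sketch: the numbers are the ones listed on the slide, and the variable names are illustrative.

```python
# Serovar counts per S. enterica subspecies, as listed in the table.
subspecies_serovars = {
    "enterica": 1531,
    "salamae": 505,
    "arizonae": 99,
    "diarizonae": 336,
    "houtenae": 73,
    "indica": 13,
}
bongori_serovars = 22

enterica_total = sum(subspecies_serovars.values())
genus_total = enterica_total + bongori_serovars

# Share of the genus represented by the ~50-60 serotypes most often
# associated with FBI in humans.
common_share_low = 50 / genus_total
common_share_high = 60 / genus_total

print(f"S. enterica: {enterica_total}, genus Salmonella: {genus_total}")
print(f"Common FBI serotypes: {common_share_low:.1%} to {common_share_high:.1%} of all serovars")
```

The subspecies rows add up to the S. enterica row, and adding S. bongori gives the genus total, so the table is internally consistent.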
Dynamic Microbial Communities & Disease
[Figure: Relative rate (log change) by Salmonella serotype: Newport, Javiana, Enteritidis, Montevideo, Heidelberg, Typhimurium. Source: FoodNet]
As one serotype declines others increase
Salmonella Phylogenomics
• 16S alignment: SNPs in 16S rDNA
• Whole genome alignment: entire genome reflects all similarities
Active Outbreaks &
Complete Genomes
• Salmonella enterica subsp. enterica serovar Javiana
  • Common in fresh cut produce
  • Only one previously sequenced genome (JCVI, 2008), 19 contigs
• Isolate CFSAN001992_73:
  • Clinical Arizona isolate from produce-related 2012 outbreak
  • Complete process from isolate to finished genomic sequence <1 week
  • 1 chromosome; 2 plasmids containing never-seen sequence
Collaboration with M. Allard, E. Brown, E. Strain, M. Hoffman, T. Muravanda, S. Musser (FDA), B. Weimer (UC Davis), Jonas Korlach (PacBio)
Sequencing discovers new genes
[Figure: new gene families and core gene families in the variable Salmonella genome]
Pan-genome increases with each isolate sequenced
Micro. Ecol. 2011
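The growth of the pan-genome as isolates are added can be sketched with set operations. This is an illustration under stated assumptions, not the analysis behind the figure: each isolate is reduced to a set of gene family labels, and the labels and the function name `pan_core_curve` are made up for the example.

```python
from typing import Iterable, List, Optional, Set, Tuple


def pan_core_curve(genomes: Iterable[Set[str]]) -> List[Tuple[int, int]]:
    """Return (pan-genome size, core-genome size) after each isolate is added."""
    pan: Set[str] = set()
    core: Optional[Set[str]] = None
    curve = []
    for families in genomes:
        pan |= families  # pan-genome: union of all families seen so far
        # core genome: families present in every isolate seen so far
        core = set(families) if core is None else core & families
        curve.append((len(pan), len(core)))
    return curve


# Hypothetical isolates described by placeholder gene family labels.
isolates = [
    {"core1", "core2", "core3", "fam1"},
    {"core1", "core2", "core3", "fam2", "fam3"},
    {"core1", "core2", "core3", "fam1", "fam4"},
]

curve = pan_core_curve(isolates)
for n, (pan_size, core_size) in enumerate(curve, start=1):
    print(f"isolates={n} pan={pan_size} core={core_size}")
```

In this toy data the core stays at three families while the pan-genome keeps growing, which is the pattern the slide describes.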
The Value of
Microbial Genomes
• We know <1% of the Earth’s microbiome
• Horizontal gene transfer is wide-spread and frequent
• High-quality, finished genomes are the starting point for:
• Functional genomic studies
• Comparative genomics
• Forensics
• Metagenomics
Fraser et al. (2002) J Bacteriology 184: 6403-6405
Chain et al. (2009) Science 326: 236-237
Pathogen Evolution
• Vibrio evolution rapid
  • Example for all enteric bacteria
  • Also shown with environmental organisms
• Enterobacteria genome evolution
  • HGT more common than appreciated
  • Genome rearrangements influenced by biogeography & other bacterial community members
• Evidence for local pressure to induce population genome evolution
  • Biogeography differences
  • Likely to find footprints of geographical origin
  • Requires large number of genomes to estimate
  • Creates chimeric genomes
• Stress
  • Induces SNPs
  • Induces new virulence and drug resistance
• Mutations in DNA repair genes lead to SNPs
• Recombination events
  • SNPs
  • Large segments
  • HGT
Shapiro et al., ’12 Science; Denef & Banfield, ’12 Science
New Detection Paradigm
Specific gene
(PCR)
Genome
(NGS, multiplex)
Microbial Communities & Health
[Figure labels: Community structure (Who’s there?); Growth & metabolism (What are they doing?); Microbe changes; Host association; Host changes]
Kawamoto et al., ’12 Science
Olszak et al., ’12 Science
Desai & Weimer, ’09
Bacteria Testing in Food
• Collect sample (25 or 325 g)
• Pre-enrich sample in 4 L of broth
• Detection method
• Detection timing
[Figure labels: protein, fat, nonculturable bacteria, fermentation product inhibitors, immune inhibitors]
Aim: reduce time to result with a low-cost, sensitive, accurate method
Too little genomic information = difficult molecular methods
Project aims to combine 2 of 3 steps
Step             Level                  Method and time
Detection        Genus, species         qPCR, etc.: 8-30 h; Traditional: 3-5 days
Serotyping       Serogroup, Serovar     Serology: 4-5 days
Strain Typing    Genovar, Genome        PFGE, MLST, genotype: 5-10 days
Faster response: < 30 h
More informative results
• Specificity
• Sensitivity
• Surveillance
• Outbreak response
• Outbreak investigation
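The step times above can be added up to see what running the three steps one after another would cost. This is a minimal sketch; it assumes the steps run strictly in sequence with no overlap (the slide does not state this) and uses the traditional detection time. The variable names are illustrative.

```python
# Time ranges in days, as listed for each step.
# Assumption: the three steps run one after another with no overlap.
steps_days = {
    "detection (traditional)": (3, 5),
    "serotyping (serology)": (4, 5),
    "strain typing (PFGE, MLST, genotype)": (5, 10),
}

total_low = sum(low for low, _ in steps_days.values())
total_high = sum(high for _, high in steps_days.values())

# The stated target for the faster response is under 30 hours.
target_days = 30 / 24

print(f"Sequential total: {total_low}-{total_high} days")
print(f"Target: under {target_days:.2f} days")
print(f"Speed-up against the fastest sequential case: {total_low / target_days:.1f}x")
```

Under that assumption the sequential route takes 12 to 20 days, so a result in under 30 hours would be roughly a tenfold reduction even against the best case.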
Reduce time to result
Traditional - Enrichment
[Workflow figure. Steps: collect sample; ship to lab; log sample; prep; pre-enrich; enrich; selective enrich; plate; examine plate; presumptive; confirm; bank; ID; characterize; genomics. Timeline: T0, T24, T36, T48, T60, T72+]
Next generation – Culture independent
[Workflow figure. Steps: collect sample; capture & concentrate; presumptive; relative amt.; directed plate; multiplex PCR; confirm PCR; directed enrich; qPCR; DNA prep; sequencing prep; sequence; bank DNA; ID; characterize; in/out of event. Timeline: T0, T0.5, T1, T2, T3, T4+]
NGS Costs are Falling
(DeWitt et al., 2011)
Genomics & Rapid
detection
• German outbreak
• Chimeric genome – deleted genes = False negative
• Eae deletion – main PCR marker missing = False negative
• FSMA
• International reach with new regulations
• Open to investigate alternate detection methods - CID
• International trade now firmly in the FDA plan
• Public health effort
• Using whole genome sequencing (WGS) to investigate outbreaks
• Installing Illumina sequencers in field offices
• More robust biomarkers are needed for routine testing
2013 Poultry Sci 92:562–572
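The false-negative problem with a single PCR marker can be illustrated in silico. This is a toy sketch, not a real assay: the marker "sequences" are short made-up strings rather than actual gene sequences, the two genomes are invented, and the helper names `markers_present` and `call_positive` are hypothetical. The only thing taken from the slide is the scenario, a chimeric genome in which the main marker (eae) is deleted.

```python
def markers_present(genome: str, markers: dict) -> dict:
    """Report, per marker, whether its sequence occurs in the genome."""
    return {name: seq in genome for name, seq in markers.items()}


def call_positive(hits: dict, min_hits: int = 1) -> bool:
    """Call the sample positive when at least min_hits markers are found."""
    return sum(hits.values()) >= min_hits


# Made-up placeholder sequences standing in for marker genes.
panel = {
    "eae": "ATGCATGCAA",
    "stx2": "GGCTTAACCG",
    "aggR": "TTGACCGGTA",
}

# Invented genomes: one carries the main marker, the chimeric one lacks it.
typical_genome = "CCCC" + panel["eae"] + "TTTT" + panel["stx2"] + "AAAA"
chimeric_genome = "CCCC" + panel["stx2"] + "TTTT" + panel["aggR"] + "AAAA"

single_marker = {"eae": panel["eae"]}

single_call = call_positive(markers_present(chimeric_genome, single_marker))
panel_call = call_positive(markers_present(chimeric_genome, panel))

print("single-marker assay on chimeric genome:", single_call)
print("multi-marker panel on chimeric genome:", panel_call)
```

The single-marker call misses the chimeric genome while the panel still detects it, which is one way to read the call for more robust biomarkers.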
100K Pathogen genome project
2012 HHSInnovate Secretary’s Choice Awardee
http://100kgenome.vetmed.ucdavis.edu
• Increase food safety using microbe systems biology
• Discover the genetic constituents that are robust enough to be predictive biomarkers for specific traits
• Rapid ID and tracking
• Understand evolution to build more robust detection systems
• New isolate emergence and persistence
• Integration into current practices
Integration of 100K Project
• Produce a database of phylogenomic diversity of important FBI
• Industry representative genomes important
• Background organisms
M. Allard
100K Consortium
Founding Members, Executive committee
• Agilent Technologies
• UC Davis (Weimer lab)
• FDA
Additional Steering Committee members
• NIH (NCBI)
• CDC
• USDA
• Mars, Inc.
• Pacific Biosciences
Steering committee provides guidance for overall project direction and goals
ADDITIONAL PARTICIPANTS WELCOME
Affiliate Members
• UC Davis Food Science; Veterinary Diagnostic Lab
• Salisbury University (US)
• DoD - Walter Reed Hospital
• Mass General - Harvard hospital system
• RIVM (Netherlands)
• DTU (Denmark)
• MEFOSA (Lebanon)
• Sydney Technical University (Australia)
• Rajiv Gandhi Biotechnology Institute (India)
• Institute of Environmental Science & Research (NZ)
• Oak Ridge National Laboratory (ORNL)
Additional negotiations in process with groups from Asia, Africa, Europe
Corporate Affiliates
• Pacific Biosystems
• cBio
• OpGen
• Kapa Biosystems
• BGI@UCDavis
Process & outcome
Collect metadata
Collect isolate
Sequence genome
Merge genome sequence & metadata
Validate & verify sequence
Validate & verify information
Genome evolution
Analyze genome for actionable features
Traceability features
Common traits
Diagnostic features
Ecology features
Infection features
Release genomes to public
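The merge and validation steps can be pictured as a join on isolate ID followed by simple checks. This is a minimal sketch under stated assumptions, not the project's pipeline: the metadata field names, the isolate IDs, and the function `merge_and_validate` are all hypothetical.

```python
REQUIRED_FIELDS = ("genus", "source", "region", "year")  # hypothetical metadata fields
VALID_BASES = set("ACGTN")


def merge_and_validate(sequences: dict, metadata: dict):
    """Join sequences and metadata on isolate ID; return (merged, problems)."""
    merged, problems = {}, []
    for isolate_id in sorted(set(sequences) | set(metadata)):
        seq = sequences.get(isolate_id)
        meta = metadata.get(isolate_id)
        if seq is None:
            problems.append((isolate_id, "no sequence"))
        elif meta is None:
            problems.append((isolate_id, "no metadata"))
        elif set(seq.upper()) - VALID_BASES:
            # "Validate & verify sequence": only expected base symbols allowed
            problems.append((isolate_id, "unexpected characters in sequence"))
        elif any(not meta.get(field) for field in REQUIRED_FIELDS):
            # "Validate & verify information": every required field is filled in
            problems.append((isolate_id, "incomplete metadata"))
        else:
            merged[isolate_id] = {"sequence": seq.upper(), **meta}
    return merged, problems


# Invented example records.
sequences = {"iso-001": "ATGCCGTA", "iso-002": "ATGXCGTA", "iso-003": "ATGCCGTT"}
metadata = {
    "iso-001": {"genus": "Salmonella", "source": "produce", "region": "North America", "year": 2012},
    "iso-002": {"genus": "Salmonella", "source": "poultry", "region": "Europe", "year": 2011},
    "iso-004": {"genus": "Listeria", "source": "dairy", "region": "Asia", "year": 2010},
}

merged, problems = merge_and_validate(sequences, metadata)
print(sorted(merged))
print(problems)
```

Only records that pass both checks move on to analysis and release; everything else is reported back with a reason.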
Submission Logistics
• Affiliates
  • Many options
    • In-kind
    • Funding
    • Isolates
    • Analysis
    • Data hosting
  • Various levels of commitment available
  • Groups providing funding & linking sequencing to important isolates
• Isolate submission
  • Isolate agreement
    • MTA
    • Timing and specific isolates
  • Submit isolates & metadata
  • Authentication
  • Bank isolates for DNA isolation & library construction
  • Sequence –
    • BGI@UCDavis
    • Return data to submitter
    • 12 months for review
    • Deposit in NCBI for public access
• Data return & analysis
• Publication
Organisms of Interest
Initial focus
• Salmonella
• Norovirus
• Listeria
• Hepatitis A
• Campylobacter
• Enteroviruses
• Vibrio
• E. coli
• Shigella
• Yersinia
• Clostridium
• Enterococcus
• Cronobacter
• Short reads sequence followed by long read technologies for sub-set of isolates to complete genome
• Optical mapping will be used for a selected set to ensure genome quality
• Long read technology will be used to close 1,000 genomes
• Capture genomic diversity to represent pan-genome for the most important organisms
• World-wide representation
Isolate bank
Authenticated & banked isolates (~3,500 isolates)
Pending authentication & banking (~15,000 isolates)
[Two pie charts of isolates by genus; entries are genus, number of isolates, share]
Chart with Salmonella as the largest share:
• Salmonella, 1342, 59%
• Vibrio, 287, 13%
• Enterococcus, 210, 9%
• Escherichia, 203, 9%
• Campylobacter, 110, 5%
• Listeria, 89, 4%
Chart with Campylobacter as the largest share:
• Campylobacter, 3295, 55%
• Salmonella, 1674, 28%
• Vibrio, 305, 5%
• Listeria, 281, 5%
• Enterococcus, 210, 3%
• Escherichia, 203, 3%
Genera at 0% in both charts (number of isolates; two values where the charts differ):
• Bacillus, 8; Brenneria, 1; Carnobacterium, 1; Citrobacter, 1 and 6; Cronobacter, 3; Enterobacter, 4 and 5; Erwinia, 2; Exiguobacterium, 1; Klebsiella, 2 and 4; Lactococcus, 1; Leuconostoc, 3; Moraxella, 1; Proteus, 1 and 2; Rummeliibacillus, 1; Shigella, 4; Staphylococcus, 1; Streptococcus, 2; Weissella, 1
Isolates by region
Number of Isolates by Region
• North America: 74%
• Unknown: 10%
• South America, Europe: 7%, 1%
• Asia: 4%
• Africa: 3%
• Middle East: 1%
• Australia: 0%
100K Sequencing
Process
Bank culture
Isolate DNA
Make library
Sequence library
Short read technologies
Library automation
Long read technologies
Optical mapping
Genome DB
Project progress
• Year 1:
  • Focus on the top 50 Salmonella outbreak serotypes
  • Banked ~3500 isolates
  • Developing world-wide partnerships
  • Automation of sequence library construction
  • Sequence 1800 isolates
• Year 2-5
  • Bank additional isolates
  • Automated, routine library construction
  • Sequence ~25K genomes/year
  • Finish 1000 genomes to a single closed genome
  • Generate epigenomic data
  • Define high resolution map assemblies for small set
  • Define need for additional bioinformatics
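The sequencing plan can be checked against the 100K goal with simple arithmetic. This is a sketch that assumes the ~25K genomes/year rate holds for each of years 2 through 5; the variable names are illustrative.

```python
year1_genomes = 1800          # "Sequence 1800 isolates" in year 1
per_year_rate = 25_000        # "~25K genomes/year" in years 2-5 (assumed constant)
closed_target = 1000          # genomes finished to a single closed genome

cumulative = year1_genomes
totals = {1: cumulative}
for year in range(2, 6):
    cumulative += per_year_rate
    totals[year] = cumulative

closed_fraction = closed_target / totals[5]

for year, total in totals.items():
    print(f"end of year {year}: {total:,} genomes")
print(f"closed genomes as a share of the total: {closed_fraction:.1%}")
```

At that rate the cumulative count passes 100,000 during year 5, and the 1,000 closed genomes make up about 1% of the collection.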
100K project Web site
http://100kgenome.vetmed.ucdavis.edu
Public health Outcomes
• Federal agencies embracing NGS for outbreak investigation, trace backs, and monitoring
• PFGE vs NGS
• Implementation is not a simple path
• Pan-genome value
• Discover new sets of genes that are >50% of the genome that we ignore today
• New robust testing methods that allow routine testing in plant
• Human and animal public health
• 100K database enables a new era of diagnostic tools
• Definition of virulence, antibiotic resistance, source, insights for mitigation, and window into emerging strain differences as sentinels
• Host adaptation, zoonotic movement, supply chain changes
Innovation in Methods
• Culture independent methods (CIM) to capture & concentrate
  • Enrich
  • Detect
  • ID
• Coupling genomics & biomarkers with existing methods and CIM
  • Increase speed
  • Increase information diversity
    • Serotypes
    • Pathogens
Microbial Isolation & detection
• Classical approach – grow what we can…
  • BAM/ISO & growth
  • Enrichment and ELISA
• Limited by those we know how to grow
  • We know how to grow ~1% of bacteria
  • Non-culturable bacteria common
• Viruses often don’t grow in vitro – limits detection approaches
• Strain diversity and continued outbreaks creating demand for new, next generation methods
• Rapid methods – looking for bacterial signatures…
  • Finding new organisms without growth
  • Enables customized approaches for screening
• Molecular methods
  • Provides bacterial community structure information
  • Individual genome sequencing
  • Provides bacterial metabolism capability
  • Can be linked to food characteristics
Examples of molecular tools
• Data analysis post sequencing
• Comparison for SNPs
• Gene content = annotation
• Content comparison = forensics
• Gene
• Protein
• COGs/GO → use in statistical enrichment
• Sea mammal outbreak (SNP)
• Food outbreak (SNP)
• Going beyond SNPs
• Biomarker hunting
• Genomics – limited based on sequences available
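Two of the post-sequencing comparisons listed above, SNP comparison and gene content comparison, can be sketched in a few lines. This is an illustration only: it assumes the sequences are already aligned to the same length, uses short invented sequences and placeholder gene names, and the function names are hypothetical.

```python
def snp_positions(ref: str, query: str) -> list:
    """Positions where two aligned sequences carry different bases.

    Gaps and ambiguous symbols are skipped, so only base substitutions count.
    """
    if len(ref) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    return [
        i
        for i, (a, b) in enumerate(zip(ref, query))
        if a != b and a in "ACGT" and b in "ACGT"
    ]


def gene_content_difference(genes_a: set, genes_b: set):
    """Genes found only in the first set and only in the second set."""
    return sorted(genes_a - genes_b), sorted(genes_b - genes_a)


# Invented aligned sequences for illustration.
reference = "ATGCCGTAACGT"
outbreak_isolate = "ATGTCGTAACGA"
background_isolate = "ATGCCGTA-CGT"

outbreak_snps = snp_positions(reference, outbreak_isolate)
background_snps = snp_positions(reference, background_isolate)

# Placeholder gene names standing in for annotated gene content.
only_a, only_b = gene_content_difference(
    {"geneA", "geneB", "geneC"}, {"geneA", "geneC", "geneD"}
)

print("SNPs vs reference (outbreak):", outbreak_snps)
print("SNPs vs reference (background):", background_snps)
print("gene content differences:", only_a, only_b)
```

SNP lists of this kind feed outbreak comparisons, while the gene content differences are the starting point for looking beyond SNPs.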
Rapid Detection
E. coli (B. Findley)
• Technologies to enhance food safety & security
  • Eliminate enrichment
  • Fast
  • Sensitive
  • Reliable
  • Food, water & environment
Existing Limitations
• Molecular assays are limited by too few genomes to develop robust biomarker genes quickly
• Genome evolution more complex than previously appreciated
• Lack of information to create robust assays for improved detection using PCR
• Genome sequencing will enable technological advances and increase reliability of PCR assays
Rapid detection
[Timeline figure. Time points: 0 min, 5 min, 30–35 min, 40 min, 120-135 min & beyond… Steps: capture with fluidized bed (15 mL to 40 L), bead and microbe; presumptive ID (ELISA, PCR); detailed DNA/RNA analysis; confirm ID, molecular serotype, sequence, toxin production, community analysis]
Blake & Weimer, ’97 AEM
Weimer et al., ’00 JBBM
Weimer et al., ’01 AEM
Walsh et al., ’01 JBBM
Desai et al., ’08 AEM
Maga et al., ’12 AEM
Detection in Food & Enrichment Broth
Columns: Organism | Food | ImmunoFlow (In Food) | ImmunoFlow (Enrich Broth) | Commercial Immunoppt.
Rows: E. coli O157 (apple juice, hamburger, beer, sprouts); LM (halibut); SE (chicken)
% positive, in source order: 0, 80, 0, 100; 40, 30, 40; 60, 0, 20, 40, 0, 30, 0, 100
GlycoBind Detection
• Replaces Ab–based tests
• Capture ligands are cell receptors
• Broad organism capture surface
• Used in static & flow setting
Food Type      Aerobic Plate Count before E. coli O157:H7 spike (cfu/gm)    Lower Detection Limit (cfu/gm)
Water          0                                                            4
Apple juice    8                                                            4
Spinach        577                                                          4
Hamburger      24856                                                        400
Salami         9350000                                                      40000
Milk           0                                                            40000
1
• Detection with RT-PCR
  • No enrichment
  • 3 hour total time
  • 4 cell sensitivity
  • <5% variation
• WGS has arrived...
Myxobolus cerebralis
Desai et al., 2008. AEM 74:2254-2258
Molecular Salmonella testing in 2 hours
• Colony
• Enrichment broth
• Lab medium
• Bead capture
Lyse cells → Add template → PCR (45 mins) → Nano electrophoresis (30 mins) → Report Salmonella, serogroup & serotype
Action based on
1. Salmonella detection
2. Serotype determination
Molecular
detection validation
• Validation Approach
• ~1750 isolates tested
• Designed to detect ~40 most common serotypes
• Multiple matrices done
• Verification & Validation
• Used by 3 independent labs
• 100% accurate in 6 independent blinded panels for Salmonella identification
• 98% accurate in correct serotype determination
Complex implementation
• Total DNA/RNA extraction from food
  • Possible and being done for PCR and qRT-PCR
  • Metagenomics – HARD & UNCLEAR reliability
• Rapid detection strategies
  • Genomics and systems biology
  • Reduction in time to result
• Robust & accurate genomic tests
  • Requires access to genomes/genotypes
• The longest step is enrichment
  • Eliminate pre-enrichment step for direct detection?
• PulseNet & genomics
  • NGS is fast and direct
  • Data rich
  • Work flow remains unclear for best implementation
Workflow innovation
Critical Needs
• Robust biomarkers
• Fast, actionable answers
• Novel sequencing strategies
[Workflow figure. Steps: culture independent capture/concentration; presumptive ID with solid phase ELISA; broth multiplex PCR with nano electrophoresis; enrichment & colony isolation; colony multiplex PCR with nano electrophoresis; NGS of entire genome (library construction, NGS sequencing, bioinformatics & statistics). Result, total time: 1.5 hours, 6 hours, 48 hours, ~24-130 hours*]
SE SNP Analysis
Multiple SE SNP-types
• ’93 sea lion & otter in 3 tissues
• ’93, ’02, ’08, ’11, rodent, equine, sea mammal
• ’98, ’05, ’08, sea mammal
Yearly SNP evolution
[Tree figure labels: 1993 elephant seal, lung; sea lion, liver; elephant seal, kidney; ’02 rodent feces; equine feces; ’08 sea lion, liver; ’05 sea lion, uterus; ’98 otter, abdomen; ’11 elephant seal urine; ’11 elephant seal brain]
Open questions
• Can individual food components modify the microbiome to enrich or exclude zoonotic pathogens?
• Genomic variation & outbreaks
  • Host adaptation
  • Transmission
  • Virulence
  • Antibiotic resistance
  • Robust biomarkers & detection reliability
• Serotype and genotype
• Detection methods
  • BAM –
  • Sample prep –
    • capture concentration coupled to trusted methods (ELISA)
  • Molecular assays – PCR, robust biomarkers, mass spec
  • Genomics – sequencing, metagenomics, directed sequencing
• Routine use vs outbreak investigation vs traceback
Acknowledgements
Weimer Lab
• Dr. Yi Xie
• Dr. Richard Jeannotte
• Dr. Holly Ganz
• Dr. Marie Forquin
• Dr. Prerak Desai
• Dr. Jigna Shah
• Ms. Nugget Dao
• Ms. Mai Lee Yang
• Ms. Kao Thao
• Ms. Winnie Ng
cBio
• Dr. Kumar Hari
• Dr. Ravi Jane
UCD/CAHFS/SVM
• Dr. Kris Clothier
• Dr. Barb Byrne
• Dr. Woutrina Miller
• Dr. Linda Harris
• Dr. Maria Marco
Agilent Technologies
• Dr. Rudi Grimm
• Dr. Lenore Kelly
• Dr. Steffan Müeller
• Dr. Steve Royce
• Dr. Paul Zavitsanos
PacBio
• Jonas Korlach
• Luke Hickey
Thanks to the sponsors:
FDA
USDA
DARPA
US Air Force
Agilent Technologies
CA Dairy industry
Pacific BioSciences
Mars, Inc.
UCD Wildlife Health Center
Thank You…
Questions?
Bart Weimer
Professor, UC Davis
Director, BGI@UC Davis
Director, 100K Genome Project
Director, Integration Core, NIH-West Coast Metabolomics Center
[email protected]
530.754.0109