QIIME and the art of fungal community analysis

Greg Caporaso

www.QIIME.org

Inputs
•  Sequencing output (454, Illumina, Sanger): fastq, fasta, qual, or sff/trace files
•  Metadata: mapping file

Steps
•  Pre-processing: e.g., remove primer(s), demultiplex, quality filter
•  Denoise 454 data: PyroNoise, Denoiser
•  Pick OTUs and representative sequences
   –  Reference based: BLAST, UCLUST, USEARCH
   –  De novo: e.g., UCLUST, CD-HIT, MOTHUR, USEARCH
•  Assign taxonomy: BLAST, RDP Classifier
•  Align sequences: e.g., PyNAST, INFERNAL, MUSCLE, MAFFT
•  Build phylogenetic tree: e.g., FastTree, RAxML, ClearCut
•  Build 'OTU table': i.e., sample by observation matrix
•  α-diversity and rarefaction: e.g., Phylogenetic Diversity, Chao1, Observed Species
•  β-diversity and rarefaction: e.g., weighted and unweighted UniFrac, Bray-Curtis, Jaccard
•  Interactive visualizations: e.g., PCoA plots, distance histograms, taxonomy charts, rarefaction plots, network visualization, jackknifed hierarchical clustering
•  Database submission (in development)

Data products
•  OTU (or other sample by observation) table
•  Phylogenetic tree: evolutionary relationship between OTUs

Legend
•  Currently supported for marker-gene data only (i.e., 'upstream' step)
•  Currently supported for general sample by observation data (i.e., 'downstream' step)
•  Required step or input
•  Optional step or input
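
The 'Build OTU table' and rarefaction steps in the diagram can be illustrated with a short sketch. This is not QIIME's implementation. It assumes read identifiers of the form `SampleID_N` and a hypothetical `otu_map` dictionary from OTU id to member reads (both are illustrative choices, not taken from the slides), and it uses only the Python standard library.

```python
import random
from collections import Counter, defaultdict


def build_otu_table(otu_map):
    """Return {sample_id: Counter({otu_id: read_count})}.

    Assumes each read id looks like 'SampleID_N' (hypothetical naming).
    """
    table = defaultdict(Counter)
    for otu_id, read_ids in otu_map.items():
        for read_id in read_ids:
            # Everything before the last underscore is taken as the sample id.
            sample_id = read_id.rsplit("_", 1)[0]
            table[sample_id][otu_id] += 1
    return dict(table)


def rarefy(counts, depth, seed=0):
    """Subsample one sample's counts to `depth` reads without replacement."""
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of reads in the sample")
    rng = random.Random(seed)
    return Counter(rng.sample(pool, depth))


# Toy input: three OTUs whose reads come from two samples.
otu_map = {
    "OTU1": ["soil_1", "soil_2", "root_1"],
    "OTU2": ["soil_3"],
    "OTU3": ["root_2", "root_3", "root_4"],
}
table = build_otu_table(otu_map)
rarefied_root = rarefy(table["root"], depth=2)
print(table["soil"])
print(sum(rarefied_root.values()))
```

Rarefying every sample to the same depth before computing diversity keeps samples with more reads from appearing more diverse simply because they were sequenced more deeply.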
QIIME is not (just) a 16S pipeline!
16S, 18S, nifH, ITS, Phage metagenome
Coming soon: General metagenomics and metatranscriptomics

OTU picking
•  De novo
   –  Reads are clustered based on similarity to one another.
•  Reference-based
   –  Closed reference: any reads which don’t hit a reference sequence are discarded
   –  Open reference: any reads which don’t hit a reference sequence are clustered de novo

De novo OTU picking
•  Pros
   –  All reads are clustered
•  Cons
   –  Not parallelizable
   –  OTUs may be defined by erroneous reads

Closed-reference OTU picking
•  Pros
   –  Built-in quality filter
   –  Easily parallelizable
   –  OTUs are defined by high-quality, trusted sequences
•  Cons
   –  Reads that don’t hit the reference dataset are excluded, so you can never observe new OTUs

[Figure: Percentage of reads that do not hit the reference collection, by environment type.]

Open-reference OTU picking
•  Pros
   –  All reads are clustered
   –  Partially parallelizable
•  Cons
   –  Only partially parallelizable
   –  Mix of high quality sequences defining OTUs (i.e., the database sequences) and possible low quality sequences defining OTUs (i.e., the sequencing reads)

Processing amplicon data with no reference sequences
[Figure: QIIME workflow diagram (www.QIIME.org) for this case. Labels match the first diagram except that the denoising, taxonomy assignment, and database submission steps do not appear.]
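
The three OTU picking strategies differ only in what happens to reads that do not hit a reference sequence, and with no reference sequences at all only the de novo path applies. The sketch below illustrates that logic under strong simplifying assumptions: `identity` is a toy position-by-position score standing in for the alignment-based similarity used by tools such as UCLUST, and the function names, the `denovoN` OTU labels, and the toy sequences are all hypothetical. It is not how QIIME implements OTU picking.

```python
def identity(a, b):
    """Fraction of matching positions; toy stand-in for an alignment-based score."""
    if len(a) != len(b):
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def pick_de_novo(reads, threshold=0.97):
    """Greedy clustering: a read joins the first centroid it matches, else seeds a new OTU."""
    otus = {}  # centroid sequence -> list of read ids
    for read_id, seq in reads.items():
        for centroid in otus:
            if identity(seq, centroid) >= threshold:
                otus[centroid].append(read_id)
                break
        else:
            otus[seq] = [read_id]
    return otus


def pick_otus(reads, reference, mode, threshold=0.97):
    """Reference-based picking; `mode` is 'closed' or 'open'."""
    if mode not in ("closed", "open"):
        raise ValueError("mode must be 'closed' or 'open'")
    otus, failures = {}, {}
    for read_id, seq in reads.items():
        hit = next((ref_id for ref_id, ref_seq in reference.items()
                    if identity(seq, ref_seq) >= threshold), None)
        if hit is None:
            failures[read_id] = seq  # read did not hit the reference
        else:
            otus.setdefault(hit, []).append(read_id)
    if mode == "open":
        # Open reference: failures are clustered de novo instead of discarded.
        for i, members in enumerate(pick_de_novo(failures, threshold).values()):
            otus[f"denovo{i}"] = members
    return otus


reference = {"ref1": "ACGTACGTAC"}
reads = {
    "r1": "ACGTACGTAC",
    "r2": "ACGTACGTAA",
    "r3": "TTTTTGGGGG",
    "r4": "TTTTTGGGGC",
}
closed = pick_otus(reads, reference, "closed", threshold=0.9)
opened = pick_otus(reads, reference, "open", threshold=0.9)
print(closed)
print(opened)
```

In closed-reference mode reads `r3` and `r4` are dropped, while open-reference mode keeps them as a new OTU. The per-read reference search is independent for each read, which is why the reference-based stage parallelizes and the de novo stage, where each read depends on the centroids chosen so far, does not.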

Processing amplicon data with no reference tree
[Figure: QIIME workflow diagram (www.QIIME.org) for this case. Labels match the first diagram except that the denoising and database submission steps do not appear.]
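
Several of the metrics named in the diagrams (Observed Species, Chao1, Bray-Curtis, Jaccard) need only the OTU table, not a tree, so they remain available when no tree exists. The sketch below shows one way to compute them from per-sample count vectors. The function names and toy counts are hypothetical, the Chao1 variant is assumed to be the bias-corrected form, and this is not QIIME's own code.

```python
def observed_species(counts):
    """Number of OTUs with at least one read in the sample."""
    return sum(1 for n in counts if n > 0)


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)).

    F1 and F2 are the numbers of OTUs seen exactly once and exactly twice.
    """
    f1 = sum(1 for n in counts if n == 1)
    f2 = sum(1 for n in counts if n == 2)
    return observed_species(counts) + f1 * (f1 - 1) / (2 * (f2 + 1))


def bray_curtis(u, v):
    """Abundance-based dissimilarity between two samples (0 = identical)."""
    total = sum(u) + sum(v)
    if total == 0:
        return 0.0
    return sum(abs(a - b) for a, b in zip(u, v)) / total


def jaccard(u, v):
    """Presence/absence distance: 1 - |shared OTUs| / |OTUs in either sample|."""
    a = {i for i, n in enumerate(u) if n > 0}
    b = {i for i, n in enumerate(v) if n > 0}
    union = a | b
    if not union:
        return 0.0
    return 1 - len(a & b) / len(union)


# Toy count vectors over the same five OTUs, in the same order.
soil = [10, 1, 1, 2, 0]
root = [5, 0, 3, 2, 4]
print(observed_species(soil), chao1(soil))
print(bray_curtis(soil, root), jaccard(soil, root))
```

Phylogenetic Diversity and UniFrac, by contrast, take branch lengths as input, so they require a tree relating the OTUs.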

This work is licensed under the Creative Commons Attribution 3.0 United States License. To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0/us/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA. Feel free to use or modify these slides, but please credit me by placing the following attribution information where you feel that it makes sense: Greg Caporaso, www.caporaso.us.
