1/19/2016
An Integrated System-Wide Maize Atlas:
from Transcriptome to Proteome Networks
Justin Walley
Iowa State University
Plant Pathology
& Microbiology
What shapes phenotype?
Heirloom tomatoes (color, shape, taste…)
Butterfly spot patterns
Dog breeds (pigmentation, morphology…)
Advancing our understanding of biological
systems through transcriptomics
Root Spatiotemporal map
(Brady et al., Science 2007)
Mouse & human maps
(Su et al., PNAS 2004)
Maize gene expression atlas
Pericarp/Aleurone
•27 DAP
Endosperm
•8 DAP
•10 DAP
•12 DAP
•27 DAP
Embryo
•20 DAP
•38 DAP
• Germinating
Anther Tassel
•1 mm
•2 mm
Vegetative Meristem
Leaf
•Symmetrical DZ
•Stomatal DZ
•Growth Zone
•Juvenile
•Mature
Silk
Spikelet
Ear Primordia
•1 mm
•2-4 mm
•6-8 mm
Internode
•6-7
•7-8
Primary Root
•Whole Root
•Stele
•Cortex
•Maturation Zone
•Elongation Zone
Pollen
•Mature
•Germinated
Seminal Root
Chloroplast Mitochondria
•Seedling
Glyoxysome
•Bottom Leaf
•Middle Leaf
•Top Leaf
Complete maize atlas data summary
•Proteome: 33 tissues and developmental stages
•Sub-cellular organelles
•Transcriptome: 23 tissues and developmental stages
Total Identified
•Transcriptome (RNA-seq): 62,547
•Non-Modified Proteome: 18,646
•Phosphorylation Sites: 31,595
Most comprehensive integrated dataset for any organism
www.maizegdb.org
Walley et al., Unpublished
mRNA and Protein Quantification
[Figure: fragments or spectra mapped to Gene A and Gene B]
• Counting based relative quantification
• The number of mRNA fragments (RNA-seq) or spectra (protein) that
map to a given gene is proportional to the amount of the gene product
• mRNA quantified using Cufflinks
• Protein quantified using Spectral Counting
Liu et al., Analytical Chem 2004
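The counting idea on the quantification slide can be sketched in a few lines. This is a minimal illustration, not the pipeline used for the atlas: `fpkm` shows the standard fragments-per-kilobase-per-million formula that Cufflinks reports (without its handling of isoforms or multi-mapped fragments), and `relative_spectral_counts` is an assumed, very simple normalization of spectral counts by the sample total. Gene names, counts, and lengths are hypothetical.

```python
def fpkm(fragment_counts, gene_lengths_bp):
    """Fragments per kilobase of transcript per million mapped fragments."""
    total_fragments = sum(fragment_counts.values())
    return {
        gene: count * 1e9 / (gene_lengths_bp[gene] * total_fragments)
        for gene, count in fragment_counts.items()
    }


def relative_spectral_counts(spectral_counts):
    """Share of all spectra in a sample that map to each protein (assumed normalization)."""
    total_spectra = sum(spectral_counts.values())
    return {protein: count / total_spectra for protein, count in spectral_counts.items()}


# Hypothetical toy sample with two genes.
rna_fragments = {"GeneA": 300, "GeneB": 100}
gene_lengths = {"GeneA": 1500, "GeneB": 1000}
protein_spectra = {"GeneA": 30, "GeneB": 10}

mrna_abundance = fpkm(rna_fragments, gene_lengths)
protein_abundance = relative_spectral_counts(protein_spectra)
```

Both functions return relative values, so they are only comparable within one data type and after the same normalization has been applied to every sample.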
Assessing the relationship of mRNA to
protein
[Diagram: DNA, Transcriptional Regulation, mRNA (AAA), Post-transcriptional Regulation, Translational Regulation, Protein, Post-translational Regulation]
Low-abundance mRNAs rarely produce
protein
Proteins are more likely to originate from
genes annotated as protein coding
[Figure: Detected mRNA and Detected Protein, each split into Filtered Set and Working Set; values shown: 5%, 44%, 56%, 95%]
Weakly positive correlations between
mRNA and protein
[Figure: Pearson Correlation, axis from 0 to 0.7]
• Calculated using the 14,030 genes whose mRNA and protein were both quantified
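A minimal sketch of the mRNA-to-protein comparison follows. It assumes one Pearson correlation per tissue, computed across the genes quantified at both levels; the slide does not state whether the correlations were taken per tissue or per gene, so that choice, the function names, and the toy values are all assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists; NaN if either is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return float("nan")
    return covariance / (spread_x * spread_y)


def correlation_per_tissue(mrna, protein):
    """mrna and protein map tissue -> {gene: abundance}; only shared genes are used."""
    result = {}
    for tissue in mrna:
        if tissue not in protein:
            continue
        shared = sorted(set(mrna[tissue]) & set(protein[tissue]))
        x = [mrna[tissue][gene] for gene in shared]
        y = [protein[tissue][gene] for gene in shared]
        result[tissue] = pearson(x, y)
    return result


# Hypothetical values; g4 has no protein measurement and is skipped.
mrna = {"Leaf": {"g1": 1.0, "g2": 2.0, "g3": 3.0, "g4": 9.0}}
protein = {"Leaf": {"g1": 2.0, "g2": 4.0, "g3": 6.0}}
correlations = correlation_per_tissue(mrna, protein)
```

Restricting each comparison to shared genes mirrors the note on the slide that only genes with both mRNA and protein quantified were used.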
mRNA, protein, and phosphoprotein
exhibit different accumulation patterns
[Figure: heatmaps of mRNA, Non-modified Protein, and Phosphoprotein Abundance across the tissues listed below; mRNA Clustered]
Vegetative Meristem
Internode 6-7
Internode 7-8
Leaf Zone 1
Leaf Zone 2
Leaf Zone 3
Mature Leaf
Primary Root
Root Meristem
Root Cortex
Root Elong Zone
Secondary Root
Mature Pollen
Female Spikelet
Ear Prim 2-4mm
Ear Prim 6-8mm
Silk
Endosperm 12 DAP
En Crown 27 DAP
Per/Aleu 27 DAP
Embryo 20 DAP
Embryo 38 DAP
Germ Kern 2 DAI
How do mRNA and protein
coexpression networks compare?
[Figure: expression of genes A, B, and C across Samples 1-8]
• Gene coexpression networks are used to determine how genes are
related to one another
– Closely related genes often function in similar biological processes
• mRNA or proteins with correlated expression patterns are considered
coexpressed and are connected in a coexpression network
Horvath and Dong, PLoS Comp Bio 2008
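The network definition above can be illustrated with a small sketch: genes are nodes, and two genes are connected when their profiles across samples are strongly correlated. The hard cutoff of 0.9, the use of Pearson correlation, and the toy profiles for genes A, B, and C are assumptions made for illustration; the slides do not give the similarity measure or threshold that was used.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length lists; NaN if either is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return float("nan")
    return covariance / (spread_x * spread_y)


def coexpression_edges(profiles, threshold):
    """Return the set of gene pairs whose correlation is at least `threshold`."""
    edges = set()
    for gene_a, gene_b in combinations(sorted(profiles), 2):
        if pearson(profiles[gene_a], profiles[gene_b]) >= threshold:
            edges.add((gene_a, gene_b))
    return edges


# Hypothetical abundance of three genes across eight samples.
profiles = {
    "A": [1, 2, 3, 4, 5, 6, 7, 8],
    "B": [2, 4, 6, 8, 10, 12, 14, 16],
    "C": [5, 1, 4, 2, 5, 1, 4, 2],
}
edges = coexpression_edges(profiles, threshold=0.9)
```

The same function can be run once on mRNA profiles and once on protein profiles to obtain the two networks being compared.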
Coexpression network analyses
[Figure: mRNA vs mRNA and Protein vs Protein networks]
Topology of mRNA and protein
coexpression networks is different
[Figure: Hypothetical and Observed networks]
• Suggests that different conclusions on the functional relatedness of
genes will be reached
• Calculated using the 10,979 genes whose mRNA and protein were both quantified
in at least 5 tissues
Most hubs are not shared between mRNA
and protein co-expression networks
[Figure: # of edges per node (mRNA) vs. # of edges per node (protein)]
Gene Regulatory Network (GRN)
reconstruction
• Used GENIE3 to build the GRNs
Huynh-Thu et al. PLoS One 2010
GRN predicts known Kn1 targets
• GENIE3 predicts true TF targets determined using ChIP-seq
Bolduc et al. Genes Dev 2012
Topology of mRNA and protein derived
GRNs is different
• Little overlap in the predicted TF targets when using mRNA versus
protein abundance as the predictor
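The comparisons on these slides (shared hubs, and predicted TF targets confirmed by ChIP-seq) reduce to counting edges and intersecting sets. The sketch below is one way to do that, under assumptions: hubs are taken as the `top_n` nodes by edge count, overlap is reported as a Jaccard index, and all gene and target identifiers are hypothetical. It does not reproduce GENIE3 itself, only the bookkeeping applied to its output.

```python
from collections import Counter


def node_degrees(edges):
    """Count the number of edges touching each node."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree


def top_hubs(edges, top_n):
    """The `top_n` nodes with the most edges (ties broken by name)."""
    ranked = sorted(node_degrees(edges).items(), key=lambda item: (-item[1], item[0]))
    return {node for node, _ in ranked[:top_n]}


def jaccard(first, second):
    """Size of the intersection divided by the size of the union."""
    union = first | second
    return len(first & second) / len(union) if union else 0.0


def fraction_confirmed(predicted_targets, bound_targets):
    """Share of predicted TF targets that also appear in a ChIP-seq target set."""
    if not predicted_targets:
        return 0.0
    return len(predicted_targets & bound_targets) / len(predicted_targets)


# Hypothetical networks built from mRNA and from protein abundance.
mrna_edges = [("g1", "g2"), ("g1", "g3"), ("g1", "g4"), ("g2", "g3")]
protein_edges = [("g4", "g1"), ("g4", "g2"), ("g4", "g3"), ("g2", "g3")]
mrna_hubs = top_hubs(mrna_edges, top_n=1)
protein_hubs = top_hubs(protein_edges, top_n=1)
hub_overlap = jaccard(mrna_hubs, protein_hubs)

# Hypothetical predicted targets of one TF and its ChIP-seq bound genes.
predicted = {"t1", "t2", "t3", "t4"}
chip_seq_bound = {"t2", "t4", "t9"}
confirmed = fraction_confirmed(predicted, chip_seq_bound)
```

A low `hub_overlap` corresponds to the observation that most hubs are not shared, and `confirmed` is the kind of score that lets mRNA-based, protein-based, and combined predictions be compared against the same ChIP-seq set.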
Integrated network predicts more true Kn1
targets
Summary & Perspectives
• Proteomic profiling provides a genome-wide view of
cellular composition and signaling that cannot be inferred
from mRNA observations alone
• Limited overlap between mRNA and protein derived coexpression networks
• GRNs reconstructed using mRNA, protein, or
phosphorylation are largely complementary
• http://maizeproteome.ucsd.edu/
• http://www.maizegdb.org/
Acknowledgments
Steve Briggs
Zhouxin Shen
Ryan Sartor
Kevin Wu
Josh Osborn
NRSA Postdoctoral
Fellowship
Collaborators
Joe Ecker (Salk Institute)
Bob Schmitz (Univ Georgia)
Vineet Bafna (UCSD)
Natalie Castellana
San Diego Center for
Systems Biology
Collaborators
Laurie Smith (UCSD)
Michelle Facette
Virginia Walbot (Stanford)
John Fowler (Oregon State Univ)
Frank Hochholdinger (Univ Bonn)
Eric Schmelz (USDA)
Alisa Huffaker (USDA)
Plant Genome Research
Program
Some abundant proteins lack mRNA
[Figure: Protein Abundance Rank (Thousands) vs. mRNA Abundance Rank (Thousands) for Endosperm 12 DAP and Embryo 20 DAP; genes classed as Protein > mRNA, Protein = mRNA, or mRNA > Protein]
Microarray: Sekhon Plant J 2011
Walley et al., PNAS 2013
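The rank comparison behind the "Some abundant proteins lack mRNA" plot can be sketched as follows. This is an assumed reconstruction: genes are ranked from most to least abundant at each level, and a gene is called "Protein = mRNA" when the two ranks differ by no more than a tolerance. The tolerance value, function names, and toy abundances are hypothetical; the slide does not say how the three classes were drawn.

```python
def abundance_ranks(abundance):
    """Rank genes from most (rank 1) to least abundant; ties broken by name."""
    ordered = sorted(abundance, key=lambda gene: (-abundance[gene], gene))
    return {gene: position + 1 for position, gene in enumerate(ordered)}


def compare_ranks(mrna, protein, tolerance=1):
    """Label each shared gene by which level ranks it as more abundant."""
    mrna_rank = abundance_ranks(mrna)
    protein_rank = abundance_ranks(protein)
    classes = {}
    for gene in sorted(set(mrna_rank) & set(protein_rank)):
        difference = mrna_rank[gene] - protein_rank[gene]
        if abs(difference) <= tolerance:
            classes[gene] = "Protein = mRNA"
        elif difference > 0:
            # A larger mRNA rank number means the mRNA is relatively less abundant.
            classes[gene] = "Protein > mRNA"
        else:
            classes[gene] = "mRNA > Protein"
    return classes


# Hypothetical abundances in one tissue; g3 has an abundant protein but no detected mRNA.
mrna = {"g1": 100.0, "g2": 50.0, "g3": 0.0, "g4": 10.0}
protein = {"g1": 20.0, "g2": 40.0, "g3": 90.0, "g4": 1.0}
classes = compare_ranks(mrna, protein, tolerance=1)
```

Working on ranks rather than raw values sidesteps the fact that fragment counts and spectral counts are on different scales.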
Hypothesis 1: Transcript levels cycle
diurnally while the protein is stable
[Figure: Model (mRNA and Protein Abundance vs. Time (h)) and Observed (# of Transcripts, Cycling vs. Non-Cycling)]
• In Arabidopsis, of ~1700 diurnally cycling mRNA only 2 produced
proteins that cycled
Circadian Microarray: Khan et al. 2010
Baerenfaller et al., MSB 2012; Walley et al., PNAS 2013
Hypothesis 2: Transcription and
translation occur earlier in development
[Figure: Model (mRNA and Protein Abundance vs. Time (DAP)) and Observed (Embryo and Endosperm at 16 DAP, 18 DAP, and 20 DAP)]
Microarray: Sekhon Plant J 2011
Walley et al., PNAS 2013
Hypothesis 3: Protein moves from another
tissue
[Figure: Model and Observed (# of Transcripts at 20 DAP; Endosperm = En, Embryo = Em, Whole Seed = WS)]
Microarray: Sekhon Plant J 2011
Walley et al., PNAS 2013
Eukaryotic Gene Expression
[Diagram: DNA, Transcriptional Regulation, mRNA (AAA), Post-transcriptional Regulation, Translational Regulation, Protein, Post-translational Regulation (P)]
Predicting protein kinase-substrate
relationships
Kinase activation is independent of
abundance
Kinases are activated by phosphorylation of their activation loop domain
[Figure: Relative Abundance (0 to 1) of Activated Kinase and Non-modified Kinase across Germ Em 2 DAI, Em 38 DAP, Em 20 DAP, Per/Aleu 27 DAP, En Crown 27 DAP, En 12 DAP, En 10 DAP, and En 8 DAP]
Walley et al., PNAS 2013
Predict kinase-substrate relationships on
the basis of coexpression
[Figure: Abundance of Activated kinase A, Phosphopeptide 1, and Phosphopeptide 2 across Samples 1-8]
• Correlate the expression pattern for each activated kinase to the
expression pattern of all phosphopeptides
– Phosphopeptide 1 is a predicted substrate of activated kinase A
Protein kinase-substrate network reconstruction
• Network of 9 activated kinases and 762 substrates
• Significant overlap between predicted ZmMPK6 substrates and
known Arabidopsis MPK6 substrates
[Figure: network with activated kinases numbered 1-9; Yellow dots = Activated Kinase, Red dots = Predicted Substrate]
Walley et al., PNAS 2013
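The prediction step described above (correlate each activated kinase with every phosphopeptide) can be sketched like this. The profile of an "activated kinase" is taken to be the abundance of its phosphorylated activation-loop peptide, and a phosphopeptide is called a predicted substrate when the Pearson correlation reaches a cutoff. The cutoff of 0.9, the function names, and the toy profiles are assumptions; the slides do not state the scoring used.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists; NaN if either is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return float("nan")
    return covariance / (spread_x * spread_y)


def predict_substrates(kinase_profiles, phosphopeptide_profiles, threshold):
    """Map each activated kinase to the phosphopeptides that track its profile."""
    predictions = {}
    for kinase, kinase_profile in kinase_profiles.items():
        predictions[kinase] = sorted(
            peptide
            for peptide, peptide_profile in phosphopeptide_profiles.items()
            if pearson(kinase_profile, peptide_profile) >= threshold
        )
    return predictions


# Hypothetical abundances across eight samples.
kinases = {"Activated kinase A": [1, 3, 2, 5, 4, 6, 8, 7]}
phosphopeptides = {
    "Phosphopeptide 1": [2, 6, 4, 10, 8, 12, 16, 14],
    "Phosphopeptide 2": [8, 7, 6, 5, 4, 3, 2, 1],
}
predicted = predict_substrates(kinases, phosphopeptides, threshold=0.9)
```

In this toy case only Phosphopeptide 1 follows the kinase, matching the example on the slide.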
Conserved amino acids are predicted to be
phosphorylated by the same kinase
• bZIP TFs OPH1 & 2 are predicted to
be phosphorylated by a GSK kinase
• Can mutation of this GSK enhance
the nutritional quality of maize?
[Figure: OPH1 and OPH2 with the conserved site marked (*); model panels labeled GSK, P, X, OPH1, OPH2, O2, and α, γ zein (Low / High)]
Walley et al., PNAS 2013
Spatiotemporal patterns of developmental
and biochemical processes
Clustering of non-modified proteins (≥5 normalized spectral counts; 4439 proteins)
[Figure: heatmap across Germ Em 2 DAI, Em 38 DAP, Em 20 DAP, Per/Aleu 27 DAP, En Crown 27 DAP, En 12 DAP, En 10 DAP, and En 8 DAP; cluster labels: Starch & AGPase Synthesis, Cysteine Protease, Starch Degradation, Phenylpropanoids & Jasmonate, Cell Wall, ABA regulated, Lipid Metabolism, Protein Synthesis, Protein Degradation]
• Lipid biosynthesis is enriched in the early embryo
• Starch metabolism is enriched in the crown endosperm
• Jasmonate biosynthesis is enriched in the pericarp
– Suggests that this defense hormone plays a role in protecting the seed
Walley et al., PNAS 2013
Identified sites of phosphorylation that may
regulate starch biosynthesis
[Figure: starch biosynthesis pathway across Cytosol and Plastid, with phosphorylation sites (P) marked and Relative Abundance (Low to High) shown for Germ Em 2 DAI, Em 38 DAP, Em 20 DAP, Per/Aleu 27 DAP, En Crown 27 DAP, En 12 DAP, En 10 DAP, and En 8 DAP]
Metabolites shown: Sucrose, Fructose, UDP, UTP, UDP-glucose, Glucose 1-P, Glucose 6-P, PPi, ATP, ADP, ADP-glucose, Starch
Proteins shown: SH1, SUS1, SUS2, SU1, SU2, ISO2, ISO3, ZPU1, SBEI, SBE1, SBE2a, AE, SSIV, SSIIIb, SSIIb, SSIIc, SS1, DU1, WX1, GBSII, UGP1, UGPa, UGPb, BT1, BT2, SH2, AGPa, AGPb, AGPc, AGPSLZM, AGPSEMZM, PGMa, PGMb, PGIa, PGIb, PHI1, FRK1, FRK2, FRKa, FRKb, GPT, SP, Kinase, Phosphatase
Phosphorylation regulates mitochondrial carrier family proteins
Walley et al., PNAS 2013