CATCH makes it easy Recover known profiles + Discover new

Transcription

CATCH those ChIP profiles!
Fiona G. Nielsen1,2, Kasper Markus3, Henk Stunnenberg1,
Lene Monrad Favrholdt3, Martijn Huynen2
1) Department of Molecular Biology, Nijmegen Centre for Molecular Life Sciences,
the Netherlands; 2) University of Nijmegen Medical Centre, the Netherlands; 3)
University of Southern Denmark Odense, Denmark
What is noise, what is biology?
Patterns correlating with known annotation are likely to be functional.
What about patterns outside the annotation?
Unsupervised clustering is needed to find the recurring patterns.
[Figure: annotated gene (<<<) with its TSS; patterns outside the annotation
marked "?"]
CATCH algorithm
CATCH performs hierarchical clustering with alignment of profiles at every
clustering step.
repeat until a single profile remains:
1. calculate all pairwise similarities of profiles
2. cluster the two most similar profiles P1 and P2 (removing them from the set)
3. make a new average profile Pnew of P1, P2
4. add Pnew to the set of profiles
Similarity is evaluated at all possible alignments of each pair of profiles. The
distance measure used for clustering is the similarity of each pair of profiles
at their best alignment.
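The clustering loop and the best-alignment similarity described under "CATCH algorithm" can be sketched roughly as follows. This is an illustration, not the CATCH implementation: the poster does not name the similarity measure, so Pearson correlation over the overlapping region is an assumption, as are the single-track (1-D) profiles, the `min_overlap` parameter, the unweighted averaging in `merge`, and all function names.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is flat)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    denom = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return sum(a * b for a, b in zip(dx, dy)) / denom if denom else 0.0


def best_alignment(p, q, min_overlap=3):
    """Return (similarity, shift) for the best shift of q against p.

    A shift s aligns q[j] with p[j + s]. Only shifts that leave at least
    min_overlap overlapping positions are tried (assumed: both profiles
    are at least min_overlap long).
    """
    best = (-2.0, 0)
    for s in range(-(len(q) - min_overlap), len(p) - min_overlap + 1):
        lo, hi = max(0, s), min(len(p), s + len(q))
        sim = pearson(p[lo:hi], q[lo - s:hi - s])
        if sim > best[0]:
            best = (sim, s)
    return best


def merge(p, q, shift):
    """Average p and q at the given shift.

    Positions covered by only one profile keep that profile's value.
    The average is unweighted, a simplification for this sketch.
    """
    merged = []
    for i in range(min(0, shift), max(len(p), shift + len(q))):
        vals = []
        if 0 <= i < len(p):
            vals.append(p[i])
        if 0 <= i - shift < len(q):
            vals.append(q[i - shift])
        merged.append(sum(vals) / len(vals))
    return merged


def cluster_profiles(profiles, min_overlap=3):
    """Greedy agglomerative clustering following the four listed steps.

    Returns one (id1, id2, new_id, similarity) tuple per merge.
    """
    active = dict(enumerate(profiles))
    next_id = len(profiles)
    merges = []
    while len(active) > 1:
        # Step 1: all pairwise similarities, each at its best alignment.
        scored = [
            (best_alignment(active[a], active[b], min_overlap), a, b)
            for a, b in combinations(sorted(active), 2)
        ]
        # Step 2: pick the most similar pair.
        (sim, shift), a, b = max(scored, key=lambda item: item[0][0])
        # Step 3: build the average profile at that alignment.
        new_profile = merge(active.pop(a), active.pop(b), shift)
        # Step 4: put the new profile back into the set.
        active[next_id] = new_profile
        merges.append((a, b, next_id, sim))
        next_id += 1
    return merges
```

Recomputing every pairwise similarity in each round mirrors the listed steps literally; caching the unchanged pairs would be the obvious optimisation. Very short overlaps can give misleadingly high correlations, so a larger `min_overlap` is probably sensible for real data.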
Recover known profiles
We used CATCH to cluster the epigenetic profiles of all transcription start
sites (TSSs) in the ENCODE ChIP-on-chip dataset from Heintzman et al. [1].
Active promoter
Promoters show a characteristic pattern of increase in acetylation and
methylation marks. The profile is asymmetric, with the highest signals
extending in the direction of transcription.
Inactive promoter
In the absence of transcription, the ChIP signal averages to zero or even
shows a slight depletion of these histone modifications.
Olfactory receptor (OR)
This ChIP profile shows depletion of common active marks as well as an
increase in H3 core. Profiles in this cluster were found to correspond to the
OR gene promoters. ORs are tightly regulated, with only one OR expressed in
each neuron [2]. The OR genes were previously shown to be lacking H3K4me3 [3].
Bi-directional promoter
This pattern is similar to the average active promoter pattern, but symmetric
and narrower. Positional analysis showed overlap of this pattern with regions
of bi-directional promoters.
+ Discover new profiles
To explore more patterns in the data we selected profiles around all peaks in
the dataset and browsed the profile patterns using CATCH.
[Figures: ChIP profile plots. Legend: H3, H3ac, H4ac, RNAP, p300, TAF,
H3K4me1, H3K4me2, H3K4me3]
CATCH makes it easy
Read common file formats (.wig, .bed)
Browse the clustering results
Export profiles
Export cluster tree
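As an illustration of selecting profiles around peaks from a BED file, the sketch below cuts fixed-width windows out of a binned signal track. It is not CATCH code: the poster does not say how peaks are called or how wide the windows are, so the `tracks` layout (chromosome name mapped to a list of per-bin values), `bin_size`, `flank_bins` and all function names are assumptions. Only the first three BED columns (chromosome, 0-based start, exclusive end) are used.

```python
def parse_bed_line(line):
    """Parse the first three BED columns: chromosome, start, end."""
    chrom, start, end = line.split()[:3]
    return chrom, int(start), int(end)


def window_around(signal, center, flank):
    """Return the 2 * flank + 1 values centred on `center`, or None if the
    window would run off either end of the track."""
    lo, hi = center - flank, center + flank + 1
    if lo < 0 or hi > len(signal):
        return None
    return signal[lo:hi]


def profiles_around_peaks(bed_lines, tracks, bin_size, flank_bins):
    """Collect one profile window per BED peak that fits inside its track."""
    profiles = []
    for line in bed_lines:
        # Skip blank lines, comments and BED header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = parse_bed_line(line)
        # Use the peak midpoint, converted from base pairs to a bin index.
        center_bin = ((start + end) // 2) // bin_size
        window = window_around(tracks.get(chrom, []), center_bin, flank_bins)
        if window is not None:
            profiles.append(window)
    return profiles
```

Peaks too close to a track boundary, or on a chromosome without signal, are dropped here rather than padded, so that all resulting profiles have the same length.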
References
[1] Heintzman et al., Nature Genetics 39, 311-318, 2007
[2] Lomvardas et al., Cell 126(2), 403-413, 2006
[3] Guenther et al., Cell 130(1), 77-88, 2007
Acknowledgements
The analysis was facilitated by Moniek Riemersma and Maarten Kooyman
through their student projects at the CMBI. The CATCH user interface was mainly
developed by the GiPCATCH project team. The work of F. Nielsen is partly funded
by the European HEROIC grant for mouse epigenetics.
Please visit: www.cmbi.ru.nl/~fnielsen/CATCH