Methods of Structural Analysis (Méthodes d'Analyse Structurale)

2006-2007
Pedro Coutinho
http://www.afmb.univ-mrs.fr/Pedro-M-Coutinho
M1 - BBSG
Course Organization (1)
■ Introduction (PC)
■ 1. Structural Analysis Techniques (MAS1)
  ◆ Nuclear Magnetic Resonance - NMR (HD)
  ◆ Electron Paramagnetic Resonance - EPR (BG)
  ◆ X-ray Crystallography (MC, GP)
  ◆ Fluorescence, Circular Dichroism, Infrared (JS)
  ◆ Electron Microscopy and Atomic Force Microscopy (PC, JS)
  ◆ Small-Angle Scattering and DLS (VRB)
■ 2. In-Depth Introduction (MAS2)
  ◆ NMR (HD)
  ◆ EPR (BG)
  ◆ Crystallography (MC, GP)
■ Final (PC)
Biochemistry has a (bio)physical basis
Most biological phenomena take place on small scales of space and time
Structural Biology: Dimensions
■ Biomolecules are too small to be observed directly (e.g. in the way classical optical microscopy allows).
Magnification 100X
(a) collection of cells
(b) human egg
(c) grain of salt
(d) human hair
(e) protozoan Paramecium multimicronucleatum
(f) protozoan Amoeba proteus
Magnification 1000X
(a) 5 cells of the bacterium Escherichia coli (2000 nm)
(b) 2 cells of the yeast Saccharomyces cerevisiae
(c) red blood cell
(d) white blood cell
(e) spermatozoon
(f) epidermal cell
(g) striated muscle cell
(h) nerve cell
Structural Biology: Dimensions
Magnification 100 000X
(a) collection of molecules
(b) E. coli cell (2000 nm)
(c) tobacco mosaic virus
(d) HIV (the AIDS virus)
(e) bacteriophage
Magnification 1 000 000X
(a) carbon atom (0.1 nm)
(b) glucose (0.7 nm)
(c) ATP (adenosine triphosphate)
(d) chlorophyll
(e) tRNA
(f) antibody
(g) ribosome
(h) poliomyelitis virus (30 nm)
(i) myosin
(j) DNA
(k) actin
(l) the 10 enzymes of glycolysis
(m) pyruvate dehydrogenase complex
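The scale of these panels can be checked with simple arithmetic: apparent size = real size × magnification. The following is a minimal sketch using only the sizes quoted on the slides; the function name `apparent_size_mm` is an illustrative choice, not part of the course material.

```python
# Apparent size of an object under a given magnification.
# Real sizes (in nm) are the ones quoted on the slides.
NM_PER_MM = 1_000_000  # 1 mm = 1e6 nm


def apparent_size_mm(size_nm: float, magnification: float) -> float:
    """Return the apparent size, in millimetres, of an object of size_nm."""
    return size_nm * magnification / NM_PER_MM


examples = [
    ("E. coli cell", 2000.0, 1_000),
    ("E. coli cell", 2000.0, 100_000),
    ("poliomyelitis virus", 30.0, 1_000_000),
    ("glucose", 0.7, 1_000_000),
    ("carbon atom", 0.1, 1_000_000),
]

for name, size_nm, mag in examples:
    size_mm = apparent_size_mm(size_nm, mag)
    print(f"{name:20s} {size_nm:7.1f} nm at {mag:>9,d}X -> {size_mm:7.2f} mm")
```

Even at a million-fold magnification a carbon atom would appear only about a tenth of a millimetre across, which is why indirect physical methods are needed.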
Structural Biology: Definition
■ The study of the architecture and shape of biological macromolecules (in particular proteins and nucleic acids) and of all associated aspects.
■ Important for biology because macromolecules take part in the majority of cellular functions.
■ Their function often depends on their ability to fold into a particular conformation (so that they can carry out their functions).
Structural Biology: Levels of Structure
■ The functional structure is often referred to as the tertiary structure and/or quaternary structure.
■ These levels of structure depend on the lower levels of structure: primary structure (linked to composition) and secondary structure (linked to local arrangements).
Primary Structure
[Figure: chemical structures of modified nucleosides: inosine (I), pseudouridine (ψ), ribothymidine (T), 1-methylinosine (m1I), 5,6-dihydrouridine (DHU), 1-methylguanosine (m1G), N2-methylguanosine (m2G), N2,N2-dimethylguanosine (m2,2G)]
Secondary Structure
Tertiary Structure
Quaternary Structure...
Nucleic Acids: RNA vs DNA
Domains are Structurally Distinct Lobes in Proteins
■ Domains are structurally independent units that each have the characteristics of a small globular protein.
■ Most domains consist of 100 to 200 amino acid residues and have an average diameter of ~25 Å.
■ Domains of recently evolved proteins are frequently encoded by exons, reflecting gene fusion of simpler modules.
■ The number of domains defined by unique folds is probably limited.
■ Databases of domain folds are available on the internet, including SCOP (http://scop.mrc-lmb.cam.ac.uk/scop) and CATH (http://www.biochem.ucl.ac.uk/bsm/cath/).
CATH: C = Class, A = Architecture, T = Topology, H = Homologous superfamily
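The four CATH levels map directly onto the dot-separated numbers of a CATH identifier. A minimal sketch of splitting such an identifier into its levels is given below; the helper names and the sample identifier `3.40.50.300` are used purely for illustration and are not taken from the slides.

```python
from typing import NamedTuple


class CathCode(NamedTuple):
    """One number per level of the CATH hierarchy."""
    class_: int
    architecture: int
    topology: int
    homologous_superfamily: int


def parse_cath(code: str) -> CathCode:
    """Split a 'C.A.T.H' identifier into its four integer levels."""
    parts = code.strip().split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated numbers, got {code!r}")
    return CathCode(*(int(p) for p in parts))


def lineage(code: str) -> list[str]:
    """Return the node identifiers from Class down to Homologous superfamily."""
    parts = code.strip().split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]


sample = "3.40.50.300"  # illustrative identifier
print(parse_cath(sample))
for level, node in zip("CATH", lineage(sample)):
    print(level, node)
```

Two domains sharing the first three numbers have the same topology (fold) even if they belong to different homologous superfamilies.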
Polypeptide Chains may Associate: Form High Order Macromolecular Complexes
■ To form a complex machinery which is able to fulfil complex functions - ribosome, RNA polymerase II
■ To bring enzymes of a metabolic pathway close together so that the loss of metabolic intermediates is avoided - pyruvate dehydrogenase multienzyme complex
■ To build structures of a given geometry - virus
■ To reduce the osmotic pressure - insulin crystal
■ To enlarge the number of possible enzyme activity characteristics by introducing cooperativity between subunits and various types of regulation - hemoglobin, cAMP-dependent protein kinase
Approaches/Constraints in Structural Biology
■ Nature/states of the macromolecules
  ◆ Solution
  ◆ Crystal (two- and three-dimensional)
  ◆ Adsorbed on a surface
■ Number of macromolecules
  ◆ Ensembles (solution, crystals, ...)
  ◆ Single (macro)molecule (microscopy; Biacore??)
■ Combination of experimental approaches (multi-resolution)
Structural Biology: Molecules Analysed
■ Structure determination methods are generally based on simultaneous measurements of a large number of identical molecules (detectable signals / averaging of results).
■ These methods include crystallography and most spectroscopic techniques.
■ Most techniques study the static "native states" of macromolecules. Variations of these methods allow the observation of dynamic phenomena linked to the denatured/native transition as well as to conformational changes linked to function.
Origins of 3D Structural Data
■ Most 3D structure data in the PDB were obtained by one of three methods:
  ✦ X-ray crystallography (over 80%)
  ✦ solution nuclear magnetic resonance (NMR) (about 16%)
  ✦ theoretical modeling (2%), non-experimental
■ A few structures were determined by other methods.
PDB holdings by experimental method and molecule type

Exp. Method   Proteins   Nucleic Acids   Protein/NA Complexes   Other   Total
X-ray            30288             916                   1391      28   32623
NMR               4796             720                    122       6    5644
EM                  90              10                     29       0     129
Other               76               4                      3       0      83
Total            35250            1650                   1545      34   38479
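As a rough check of the method shares quoted on the previous slide, the per-method counts from the PDB holdings table can be summed and turned into percentages. This is a minimal sketch; the variable names are illustrative, and the counts are the four molecule-type values for each method as given in the table.

```python
# Counts per experimental method, in the order:
# proteins, nucleic acids, protein/NA complexes, other.
holdings = {
    "X-ray": [30288, 916, 1391, 28],
    "NMR": [4796, 720, 122, 6],
    "EM": [90, 10, 29, 0],
    "Other": [76, 4, 3, 0],
}

# Total number of structures per method and overall.
row_totals = {method: sum(counts) for method, counts in holdings.items()}
grand_total = sum(row_totals.values())

# Share of each method as a percentage of all structures.
shares = {method: 100.0 * total / grand_total for method, total in row_totals.items()}

for method, total in row_totals.items():
    print(f"{method:6s} {total:6d} {shares[method]:6.2f} %")
print(f"{'Total':6s} {grand_total:6d}")
```

The computed X-ray share agrees with the "over 80%" statement, while the NMR share from this table comes out slightly below the quoted "about 16%".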
[Figure: Increasing Number of Known Protein Atomic Structures]
Structural Determination
[Figure: instruments for X-ray crystallography (synchrotron radiation X-ray crystallography), NMR spectroscopy (high-field NMR spectrometer) and cryo-electron microscopy (electron microscope for cryo-EM)]
Structural Biology: Electron Microscopy (cryo-EM)
◆ High-resolution information can be obtained from images of macromolecules at a resolution of 3-10 Å (0.3-1.0 nm)
3D Determination: Electron Microscopy (EM)
Provides shape information for macromolecules.
Techniques for averaging particles observed after
staining exist, but are limited to a resolution of 20 Å.
Cryo techniques have provided structures with
resolutions as high as 7 Å.
New methods of tomography are increasingly
providing shape information by combining multiple
images of a single particle.
It is sometimes possible to determine the interaction
interfaces between subunits and individual structures
may be fitted into the low-resolution density.
“Traditional” Experimental Determination of 3D Structures
X-ray: X-rays → diffraction pattern; crystals; direct detection of atom positions
NMR: RF → resonance → RF, in a magnetic field H0; in solution; indirect detection of H-H distances
X-ray crystallography
3D Determination
Uses diffraction patterns observed after bombarding a
crystallized protein with X-rays to construct 3D structures
(up to 0.8 Å resolution). In principle there is no size limit
on the proteins studied using this technique, and the
majority of large complexes known to atomic resolution
have been solved by this method.
Often difficult to obtain
crystals, and large complexes
require high quality crystals for
diffraction at high resolution.
Synchrotron radiation
X-ray crystallography
X-Ray Structural Data
■ X-ray crystal diffraction usually cannot resolve the positions of hydrogen atoms or reliably distinguish nitrogen from oxygen from carbon.
■ The chemical identity of the terminal side-chain atoms is therefore uncertain for Asn, Gln and Thr and is usually inferred from the protein environment of the side chain (i.e. the side-chain orientation which forms the most hydrogen bonds or makes the best electrostatic interactions is selected and built).
■ Sometimes there is also uncertainty about whether an atom that is not part of the protein is a bound water oxygen or a metal ion.
■ Some newer X-ray crystal diffraction PDB files contain hydrogen positions; these hydrogens were added by modeling. Only for the relatively small number of structures for which the resolution of the data extends beyond about 1.2 Å is it possible to locate some of the hydrogen positions from the X-ray diffraction data.
3D Determination
Nuclear Magnetic Resonance (NMR) Spectroscopy
Measures transitions between different nuclear spin states
within a magnetic field, which provide information about
distances between atoms within a macromolecule. Technique
limited to proteins of up to 40 kDa. It has an increasing
role in studying interaction interfaces between structures
determined independently.
High Field
NMR
Spectrometer
NMR 3D Structural Data
■
NMR determines structures of proteins in solution, but is limited to
molecules not much greater than 30 kD. NMR is the method of choice
for small proteins which are not readily crystallized, and yields the
positions of some hydrogen atoms. The results of NMR analysis are an
ensemble of alternative models, in contrast to the unique model
obtained by crystallography.
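Because an NMR result is an ensemble rather than a single model, a common way to summarize it is to compute an average structure and the deviation of each model from it. The sketch below assumes the models are already superimposed and uses invented toy coordinates (three atoms, three models), not data from any real entry; all names are illustrative.

```python
import math

Coord = tuple[float, float, float]
Model = list[Coord]


def rmsd(a: Model, b: Model) -> float:
    """Root-mean-square deviation between two models with matching atoms."""
    if len(a) != len(b):
        raise ValueError("models must contain the same number of atoms")
    squared = sum((p - q) ** 2 for pa, pb in zip(a, b) for p, q in zip(pa, pb))
    return math.sqrt(squared / len(a))


def mean_model(models: list[Model]) -> Model:
    """Average each atom position over all models of the ensemble."""
    n = len(models)
    n_atoms = len(models[0])
    return [
        tuple(sum(m[i][k] for m in models) / n for k in range(3))
        for i in range(n_atoms)
    ]


# Toy ensemble (hypothetical coordinates in Å, already superimposed).
ensemble = [
    [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.5, 0.3, 0.0), (3.0, 0.6, 0.0)],
    [(0.0, 0.0, 0.0), (1.5, -0.3, 0.0), (3.0, -0.6, 0.0)],
]

average = mean_model(ensemble)
for index, model in enumerate(ensemble, start=1):
    print(f"model {index}: RMSD to mean = {rmsd(model, average):.3f} Å")
```

A tight ensemble (small RMSD to the mean) indicates well-defined regions; a real analysis would first superimpose the models, which this sketch deliberately leaves out.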
Modeling Data (in PDB)
■
Structures obtained by theoretical modeling tend to be less accurate
than those obtained by experimental methods. One kind of modeling,
called homology modeling, involves fitting a known sequence to the
experimentally determined 3D structure of a sequence-similar
molecule. Results of homology modeling are more likely to be reliable
than are results derived purely from theory (ab initio modeling).
Limitations of 3D Structural Data
■ Crystallization sometimes distorts portions of a structure due to contacts between neighboring molecules in the crystal.
■ However, protein crystals as used for diffraction studies are highly hydrated ("wet and gelatinous"), so structures determined from crystals are not much different from the structures of soluble proteins in aqueous solution.
■ Some molecules have been studied both by crystallography and by solution NMR, and in these cases the agreement has been excellent.
Atomic Resolution: NMR or Crystallography?
■ Both techniques are used to determine protein structures
■ NMR uses protein in solution
■ X-ray crystallography uses protein crystals
■ Both techniques require large amounts of pure protein
■ Both techniques require expensive equipment!
Xtal vs NMR
Parameter
● Resolution
● Large proteins
● Active-site details
● Stereochemical quality
● Secondary structure
● Surface structure
● Structure in solution
● Sample preparation
● “Non-crystallizable” proteins
● Intramolecular interactions
● Intermolecular interactions
Xtal
NMR
+++
++
+++
+++/+
++
+
+
-
+
+
+
+++
+++/+
+++/+
+
+
++
++
PDB: example
HEADER    LYASE(OXO-ACID)                         01-OCT-91   12CA      12CA   2
COMPND    CARBONIC ANHYDRASE /II (CARBONATE DEHYDRATASE) (/HCA II)      12CA   3
COMPND   2 (E.C.4.2.1.1) MUTANT WITH VAL 121 REPLACED BY ALA (/V121A)   12CA   4
SOURCE    HUMAN (HOMO SAPIENS) RECOMBINANT PROTEIN                      12CA   5
AUTHOR    S.K.NAIR,D.W.CHRISTIANSON                                     12CA   6
REVDAT   1   15-OCT-92 12CA    0                                        12CA   7
JRNL        AUTH   S.K.NAIR,T.L.CALDERONE,D.W.CHRISTIANSON,C.A.FIERKE   12CA   8
JRNL        TITL   ALTERING THE MOUTH OF A HYDROPHOBIC POCKET.          12CA   9
JRNL        TITL 2 STRUCTURE AND KINETICS OF HUMAN CARBONIC ANHYDRASE   12CA  10
JRNL        TITL 3 /II$ MUTANTS AT RESIDUE VAL-121                      12CA  11
JRNL        REF    J.BIOL.CHEM.                  V. 266 17320 1991      12CA  12
JRNL        REFN   ASTM JBCHA3  US ISSN 0021-9258                 071   12CA  13
REMARK   1                                                              12CA  14
REMARK   2                                                              12CA  15
REMARK   2 RESOLUTION. 2.4 ANGSTROMS.                                   12CA  16
REMARK   3                                                              12CA  17
REMARK   3 REFINEMENT.                                                  12CA  18
REMARK   3   PROGRAM                    PROLSQ                          12CA  19
REMARK   3   AUTHORS                    HENDRICKSON,KONNERT             12CA  20
REMARK   3   R VALUE                    0.170                           12CA  21
REMARK   3   RMSD BOND DISTANCES        0.011 ANGSTROMS                 12CA  22
REMARK   3   RMSD BOND ANGLES           1.3   DEGREES                   12CA  23
REMARK   4                                                              12CA  24
REMARK   4 N-TERMINAL RESIDUES SER 2, HIS 3, HIS 4 AND C-TERMINAL       12CA  25
REMARK   4 RESIDUE LYS 260 WERE NOT LOCATED IN THE DENSITY MAPS AND,    12CA  26
REMARK   4 THEREFORE, NO COORDINATES ARE INCLUDED FOR THESE RESIDUES.   12CA  27
...
SHEET    9 S10 LYS   257  ALA   258 -1  O   LYS   257   N   THR   193   12CA  74
SHEET   10 S10 LYS    39  TYR    40  1  O   LYS    39   N   ALA   258   12CA  75
TURN     1 T1  GLN    28  VAL    31     TYPE VIB (CIS-PRO 30)           12CA  76
...
CRYST1   42.700   41.700   73.000  90.00 104.60  90.00 P 21          2  12CA  82
ORIGX1      1.000000  0.000000  0.000000        0.00000                 12CA  83
ORIGX2      0.000000  1.000000  0.000000        0.00000                 12CA  84
ORIGX3      0.000000  0.000000  1.000000        0.00000                 12CA  85
SCALE1      0.023419  0.000000  0.006100        0.00000                 12CA  86
SCALE2      0.000000  0.023981  0.000000        0.00000                 12CA  87
SCALE3      0.000000  0.000000  0.014156        0.00000                 12CA  88
ATOM      1  N   TRP     5       8.519  -0.751  10.738  1.00 13.37      12CA  89
ATOM      2  CA  TRP     5       7.743  -1.668  11.585  1.00 13.42      12CA  90
ATOM      3  C   TRP     5       6.786  -2.502  10.667  1.00 13.47      12CA  91
...
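The ATOM and SCALE records of the 12CA example can be read programmatically. The sketch below assumes the usual fixed-column layout of ATOM records (serial in columns 7-11, atom name in 13-16, residue name in 18-20, residue number in 23-26, x/y/z in 31-54, occupancy in 55-60, temperature factor in 61-66) and applies the SCALE matrix to convert orthogonal Å coordinates into fractional cell coordinates. The names `Atom`, `parse_atom`, `parse_scale` and `to_fractional` are illustrative, not part of any library.

```python
from typing import NamedTuple

# Records copied from the 12CA example (trailing identifiers omitted).
ATOM_LINES = [
    "ATOM      1  N   TRP     5       8.519  -0.751  10.738  1.00 13.37",
    "ATOM      2  CA  TRP     5       7.743  -1.668  11.585  1.00 13.42",
    "ATOM      3  C   TRP     5       6.786  -2.502  10.667  1.00 13.47",
]
SCALE_LINES = [
    "SCALE1      0.023419  0.000000  0.006100        0.00000",
    "SCALE2      0.000000  0.023981  0.000000        0.00000",
    "SCALE3      0.000000  0.000000  0.014156        0.00000",
]


class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    res_seq: int
    x: float
    y: float
    z: float
    occupancy: float
    b_factor: float


def parse_atom(line: str) -> Atom:
    """Parse one ATOM record by slicing its fixed-width columns."""
    return Atom(
        serial=int(line[6:11]),
        name=line[12:16].strip(),
        res_name=line[17:20].strip(),
        res_seq=int(line[22:26]),
        x=float(line[30:38]),
        y=float(line[38:46]),
        z=float(line[46:54]),
        occupancy=float(line[54:60]),
        b_factor=float(line[60:66]),
    )


def parse_scale(lines: list[str]) -> tuple[list[list[float]], list[float]]:
    """Return the 3x3 SCALE matrix and its translation vector.

    Whitespace splitting is enough for these three lines; fixed-width
    slicing would be the safer choice for arbitrary files.
    """
    matrix, shift = [], []
    for line in lines:
        values = [float(v) for v in line.split()[1:5]]
        matrix.append(values[:3])
        shift.append(values[3])
    return matrix, shift


def to_fractional(atom: Atom, matrix, shift) -> tuple[float, ...]:
    """Convert orthogonal coordinates (Å) to fractional cell coordinates."""
    xyz = (atom.x, atom.y, atom.z)
    return tuple(
        sum(m * c for m, c in zip(row, xyz)) + u for row, u in zip(matrix, shift)
    )


atoms = [parse_atom(line) for line in ATOM_LINES]
scale_matrix, scale_shift = parse_scale(SCALE_LINES)
for atom in atoms:
    frac = to_fractional(atom, scale_matrix, scale_shift)
    print(atom.serial, atom.name, atom.res_name, atom.res_seq,
          " ".join(f"{f:8.5f}" for f in frac))
```

The first SCALE element is simply 1/a (1/42.700 Å), which ties the SCALE records back to the cell dimensions in CRYST1.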
Size Range of Xtal Macromolecular Assemblies
Size distribution in Protein Quaternary Structure (PQS: http://pqs.ebi.ac.uk).
a) Structures of: the PDZ domain of dishevelled; CheA, a dimeric multidomain bacterial signalling molecule; aquaporin, which serves as a transmembrane water channel; and the 70S ribosome, which is the molecular machine for protein biosynthesis.
b) Distribution of the size of the entries in the PQS database. The structures of large complexes are underrepresented, given an estimated average size of a yeast complex of
Structural Biology: Types of Structural Information
■ Subunit and assembly structure: atomic or near-atomic resolution (≤ 3 Å)
■ Subunit and assembly shape: density or surface envelope at a resolution > 3 Å
■ Subunit-subunit contact: protein pairs in contact with each other (in some cases the face involved in the contact)
■ Subunit proximity: proteins close to each other in the assembly, but not necessarily in direct contact
■ Subunit stoichiometry: number of each subunit in the assembly
■ Assembly symmetry: symmetry of the arrangement of subunits in the assembly
■ Alert: some of this information is extremely difficult to obtain
Macromolecular “Structural” Techniques (1) and (2)
[Figure: table of techniques against the types of structural information listed above. Grey boxes: extreme difficulty in obtaining the corresponding information.]
Complementary Aspects
■ The structural data produced by the different structure determination techniques each describe macromolecules and their complexes in a distinctive way and at different levels of detail.
■ Some techniques are highly complementary.
■ E.g. SAXS is complementary to other analysis techniques (Xtal, EM, analytical ultracentrifugation).
■ This complementarity enables multi-resolution approaches.
Hybrid Approaches to Structure Determination
of Macromolecular Complexes
a Integration of a diverse set of
structures varying in reliability
and resolution into a hypothetical
hybrid assembly structure
b Hybrid assembly of the 80S
ribosome from yeast.
Superposition of a comparative
protein structure model for a
domain in protein L2 from Bacillus
stearothermophilus with the
actual structure (1RL2) (left). A
partial molecular model of the
whole yeast ribosome (right) was
calculated by fitting atomic rRNA
(not shown) and comparative
protein structure models (ribbon
representation) into the electron
density of the 80S ribosomal
particle.
Structural biology complements the approaches of molecular biology
3D Structure to Function
3D Structure to Function: Possibilities
■ The atomic structure reveals the overall organization of the protein chain in three dimensions. From this we can identify:
  ◆ the residues that are buried in the core or exposed to solvent on the protein surface
  ◆ the shape and molecular composition of the surface
  ◆ the relative juxtaposition of individual groups
  ◆ the quaternary structure of the protein in the crystal environment or in solution at high concentration
3D Structure to Function: Complexes
■ Protein-ligand complexes are perhaps the most useful for functional information. They reveal:
  ◆ the nature of the ligand and where it is bound
  ◆ the disposition of residues in the active site, from which a catalytic mechanism may be postulated (for enzymes)
■ Classically, structures of such complexes have been determined by design, for example by adding the appropriate ligand to the crystallization medium.
In structural genomics, where the ligand is unknown, several
examples have already been documented in which the
structure has inadvertently included a ligand from the
3D Structure to Function:
Biological Function
■ Structural data usually only carry information about the biochemical function of the protein.
■ Its biological role in the cell or organism is much more complex and additional experimental information is needed in order to elucidate this.
■ In the search to determine biological function, some clues to the biochemical function of the protein will guide the choice of the appropriate experiments to extend the structure-based functional predictions.
Structural Biology: Source of Data for Organizing and Understanding Biological Data
[Figure: two axes. Depth (rational drug design, physics): gene sequence, gene finding, protein sequence, structure prediction, protein structure, geometry calculation, protein surface, molecular simulation, force field, structure docking, ligand complex. Breadth (homologs, large-scale surveys, informatics): from 1, to 2 (pairwise comparison, sequence and structure alignment), to 3-100 (multiple alignment, patterns, templates, trees), to 100+ (databases, scoring schemes, consensus).]
3D Structure to Function: How to infer functional knowledge?
■ Comparison of the protein fold or structural motifs within a protein to the structural databases may reveal similarities, from which biochemical and biological functional information may be inferred.
■ As well as global similarities, it is sometimes possible to identify local structural motifs that capture the essence of the biochemical function and can be used to assign function. These may result from divergent or convergent evolution.
Convergent
Evolution
Trypsin and
Chymotrypsin are
divergently evolved,
sharing the same
fold and active site.
Subtilisin is
convergently evolved
as compared to the
other two, having
the same catalytic
triad, but completely
different fold.
3D Profile:
Aspartic Proteases
True hits
False hits