[14] Macromolecular TLS Refinement in REFMAC at

map interpretation and refinement
In general the crystallographer should reserve a day in which they have
no defined task to complete and practice on data that is not important (i.e.,
the supplied example data). In this way frustration can be reduced, and the
advantages of the applications described quickly become obvious.
[14] Macromolecular TLS Refinement in REFMAC at
Moderate Resolutions
By Martyn D. Winn, Garib N. Murshudov, and Miroslav Z. Papiz
Introduction
To interpret the results of an X-ray diffraction experiment completely,
one would like to supplement knowledge of the average atomic positions
with information about their variation in space and over time. Within the
usual Gaussian approximation,1,2 the full probability distribution is reduced to the mean square deviation of each atom from its mean position.
One is then left with two tasks: how best to estimate these mean square deviations, and how to interpret them physically.
In general, each atom can deviate anisotropically from its mean position, and six parameters are necessary to describe the mean square displacements fully. These parameters are referred to as the anisotropic
displacement parameters (ADPs),2 are usually denoted U, and can be
visualized as so-called thermal ellipsoids. The addition of an extra six parameters per atom in macromolecular refinement is usually not justified by
the data, except when atomic resolution data are available (<1.2 Å). One
can reduce the number of refinement parameters by assuming isotropic displacements, in which case there is a single extra parameter per atom, the
so-called Debye–Waller B factor. At medium resolutions (ca. 1.2–3.0 Å),
this is an oversimplification, ignoring anisotropy in the data that could be
modeled.
Often one might like to model anisotropic displacements without
resorting to the use of independent ADPs. Restraints are commonly applied in ADP refinement, which allows the use of ADPs at lower than
atomic resolutions. Alternatively, one can apply constraints rather than
1 B. T. M. Willis and A. W. Pryor, “Thermal Vibrations in Crystallography.” Cambridge University Press, Cambridge, 1975.
2 K. N. Trueblood, H.-B. Bürgi, H. Burzlaff, J. D. Dunitz, C. M. Gramaccioli, H. H. Schulz, U. Shmueli, and S. C. Abrahams, Acta Crystallogr. A 52, 770 (1996).
METHODS IN ENZYMOLOGY, VOL. 374
Copyright 2003, Elsevier Inc.
All rights reserved.
0076-6879/03 $35.00
restraints to the ADPs, and this is done most easily by deriving the ADPs
from a more general model. The best known examples are the TLS (translation, libration, screw-rotation) model,3 which we describe here, and the
normal mode model4–6; in principle, the approach could be very general.
Collective modes are devised to model the anisotropic displacements, from
which ADPs and hence calculated structure factors can be derived. To be
useful, these modes should be specified by fewer parameters than the full
ADP model, and to be successful the modes should be based on a reasonable physical model. Note that each individual mode need not be accurate,
provided the subspace spanned by all modes covers the necessary region of
parameter space. Thus, the Gaussian network model7,8 is based on a simple
single-parameter force field.9 The individual modes derived from this force
field are expected to be less accurate than those calculated from more sophisticated force fields, but nevertheless they have been found to describe
mean square atomic displacements well.7,8
In the current article, we consider the TLS parameterization of ADPs.3
The physical model here is that the anisotropic displacements can be modeled as those of a set of rigid bodies. In this case, the collective modes are
the three translations and three librations of each rigid body. Taking into
account cross-correlations, the mean square displacements of atoms in each
rigid body are described by 21 parameters, although in fact one of these
cannot be fixed by the usual second-order expansion of the libration. For
rigid groups of four atoms or larger, the TLS parameterization represents
a decrease in the number of parameters over a full ADP description.
Crucially, however, it still provides an anisotropic description.
TLS parameterization illustrates well the second task mentioned above,
that of interpreting the refined parameters. As is well known, Bragg reflection data give no information on correlations between atomic displacements (see, e.g., the discussion in Kidera and Go6). Therefore, although
such a correlation is built into the rigid body model with all atoms moving
in phase, refinement against Bragg reflection data does not distinguish this
model from a similar model in which, for example, some atoms move in
antiphase. Generally speaking, rigid body motion occurs as a low-frequency mode, and is more likely to be thermally activated than more
3 V. Schomaker and K. N. Trueblood, Acta Crystallogr. B 24, 63 (1968).
4 R. Diamond, Acta Crystallogr. A 46, 425 (1990).
5 A. Kidera and N. Go, Proc. Natl. Acad. Sci. USA 87, 3718 (1990).
6 A. Kidera and N. Go, J. Mol. Biol. 225, 457 (1992).
7 T. Haliloglu, I. Bahar, and B. Erman, Phys. Rev. Lett. 79, 3090 (1997).
8 A. R. Atilgan, S. R. Durell, R. L. Jernigan, M. C. Demirel, O. Keskin, and I. Bahar, Biophys. J. 80, 505 (2001).
9 M. M. Tirion, Phys. Rev. Lett. 77, 1905 (1996).
complex motions, but this is not proved by the data. In fact, TLS parameterization acts, like all parameterizations of mean square displacements, as
a sink for many kinds of displacements as well as model errors. This is particularly true when TLS is the only modeling of anisotropy that is being
used. Therefore, while rigid body motion may be a physically reasonable
model, its contribution to the observed mean square displacements is likely
to be overestimated.
If one assumes for the moment that the rigid body model is a good one
for the problem in hand, one must also remember that it may represent one
of several kinds of displacement, or more likely a combination of all kinds.
First, it may represent thermally activated motion occurring within the
time scale of the experiment. This is likely to be of greatest interest for
the biology of the protein. Second, it may represent static disorder, with
rigid bodies in different unit cells of the crystal lying in slightly different
positions. If these differences apply to the whole unit cell, then this represents lattice disorder. Finally, it may represent errors in the model that
happen to fit a rigid body form. From a single experiment at a single temperature, it is impossible to distinguish between these. Measuring data sets
for the same crystal at several temperatures may allow one to separate out
the dynamic and static contributions, but this relies on the data sets being
equal in all other respects.
Despite these provisos, the TLS model can be extremely useful. At the
moderate resolutions that are the subject of the current chapter, only a few
sets of TLS parameters are used by modeling entire domains or molecules
as quasi-rigid bodies. Thus, some modeling of anisotropic displacements is
achieved at the expense of a few tens of additional parameters. Isotropic
atomic B factors are retained, which model deviations from the rigid body
description. The TLS parameters may give some insight into domain displacements, while the B factors provide information on local displacements
after overall domain displacements have been removed.
The first application of the refinement of TLS parameters against X-ray
data for a macromolecule was that of Holbrook and co-workers,10,11 who
used a modified version of the corels program to refine TLS parameters
for each phosphate, ribose, and base of a duplex DNA dodecamer. Subsequently, Moss and co-workers extended the program restrain12,13 to
10 S. R. Holbrook and S. Kim, J. Mol. Biol. 173, 361 (1984).
11 S. R. Holbrook, R. E. Dickerson, and S. Kim, Acta Crystallogr. B 41, 255 (1985).
12 H. Driessen, M. I. J. Haneef, G. W. Harris, B. Howlin, G. Khan, and D. S. Moss, J. Appl. Crystallogr. 22, 510 (1989).
13 Collaborative Crystallographic Project No. 4, Acta Crystallogr. D Biol. Crystallogr. 50, 760 (1994).
allow refinement of TLS parameters, and applied this to bovine ribonuclease
A at 1.45 Å14 and papain at 1.6 Å.15 Šali et al.16 used restrain to refine
TLS parameters for the two domains of an endothiapepsin complex at 1.8 Å.
Stec et al.17 refined crambin against atomic resolution data (0.83 Å) with the
protein molecule divided into one, two, or three TLS groups, and compared
the predicted anisotropic U values against directly refined values. Wilson
and Brunger18 have compared a full anisotropic refinement of calmodulin
with a TLS refinement, in order to identify domain displacements.
Thus, there have been several studies of the use of TLS parameters in
macromolecular refinement, but TLS refinement is not yet used regularly.
To encourage this, we included TLS refinement in the maximum likelihood
refinement program refmac,19 a detailed description of which is presented in Winn et al.20 In the current chapter we discuss some of the practical issues involved in running TLS refinement in refmac, and review a
number of applications. We also discuss the analysis program
tlsanl.13,21 In giving details of these programs, it should be noted that
the software continues to evolve, and the reader must examine the latest
documentation for up-to-date guidance.
TLS Parameterization
Definition of TLS Parameters
TLS parameterization has been described in detail by Schomaker and
Trueblood,3 with useful summaries in Howlin et al.14 and Schomaker
and Trueblood,22 and the reader is referred to these articles for background
theory.
A single set of TLS parameters is defined for each putative rigid body
identified in the model. An instantaneous displacement of one of these
14 B. Howlin, D. S. Moss, and G. W. Harris, Acta Crystallogr. A 45, 851 (1989).
15 G. W. Harris, R. W. Pickersgill, B. Howlin, and D. S. Moss, Acta Crystallogr. B 48, 67 (1992).
16 A. Šali, B. Veerapandian, J. B. Cooper, D. S. Moss, T. Hofmann, and T. L. Blundell, Proteins Struct. Funct. Genet. 12, 158 (1992).
17 B. Stec, R. Zhou, and M. M. Teeter, Acta Crystallogr. D Biol. Crystallogr. 51, 663 (1995).
18 M. A. Wilson and A. T. Brunger, J. Mol. Biol. 301, 1237 (2000).
19 G. N. Murshudov, A. A. Vagin, and E. J. Dodson, Acta Crystallogr. D Biol. Crystallogr. 53, 240 (1997).
20 M. D. Winn, M. N. Isupov, and G. N. Murshudov, Acta Crystallogr. D Biol. Crystallogr. 57, 122 (2001).
21 B. Howlin, S. A. Butler, D. S. Moss, G. W. Harris, and H. P. C. Driessen, J. Appl. Crystallogr. 26, 622 (1993).
22 V. Schomaker and K. N. Trueblood, Acta Crystallogr. B 54, 507 (1998).
rigid bodies can be described in terms of a rotation about an axis passing
through a fixed point, together with a translation of that fixed point. For
small rotations, the corresponding instantaneous displacement u of an
atom in the rigid body at point r relative to the fixed point is given by
$$\mathbf{u} = \mathbf{t} + \mathbf{l} \times \mathbf{r} \qquad (1)$$
where t is the translation, l is a vector along the rotation axis with a
magnitude equal to the angle of rotation, and × denotes a cross-product.
Figure 1 illustrates these rigid body displacements for a protein divided
into two rigid bodies.
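
As a minimal sketch of Eq. (1), the following Python fragment evaluates the small-rotation displacement of one atom. The function names and the numerical values are illustrative assumptions and are not part of refmac; the rotation angle is taken to be in radians.

```python
def cross(a, b):
    """Return the cross product a x b of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def rigid_body_displacement(t, l, r):
    """Eq. (1): u = t + l x r, valid for small rotation angles (radians)."""
    lxr = cross(l, r)
    return tuple(ti + ci for ti, ci in zip(t, lxr))


# Hypothetical values: a 0.01 rad rotation about z with no translation,
# applied to an atom 10 A from the fixed point along x.
u = rigid_body_displacement(t=(0.0, 0.0, 0.0),
                            l=(0.0, 0.0, 0.01),
                            r=(10.0, 0.0, 0.0))
print(u)  # the atom moves along +y by roughly angle * distance
```

The displacement grows linearly with the distance of the atom from the fixed point, which is why librations dominate the derived ADPs at the periphery of a large group.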
The TLS contribution to the ADP of an atom in the rigid body can be
derived from the square of the atomic displacement given in Eq. (1), averaged over all unit cells and over the time scale of the experiment:
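$$U^{\mathrm{TLS}} \equiv \langle \mathbf{u}\mathbf{u}^{T} \rangle = \mathbf{T} + \mathbf{S}^{T} \times \mathbf{r}^{T} - \mathbf{r} \times \mathbf{S} - \mathbf{r} \times \mathbf{L} \times \mathbf{r}^{T} \qquad (2)$$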
The symmetric T tensor describes the mean square translation of the rigid
body; the symmetric L tensor describes the mean square libration of the
rigid body; and the nonsymmetric S tensor describes the mean square correlation between the translational and librational displacements. The expansion of the right-hand side of Eq. (2) does not include the trace of S,
Fig. 1. A schematic diagram, showing a protein molecule divided into two rigid groups.
The instantaneous displacement of each rigid group can be described in terms of a translation
t and a libration l. r1 and r2 denote the rest positions of atoms in the two groups. The
displacements of the two groups are assumed to be completely independent.
and so S contributes only eight independent parameters to Eq. (2). Together with the independent elements of T and L, there are thus 20 TLS
parameters per group.
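
The following Python sketch evaluates Eq. (2) for a single atom and checks numerically that the trace of S drops out. It assumes a matrix convention that is not spelled out in the text: the cross product with r is written as a matrix A with A l = l × r, so that squaring Eq. (1) gives U = T + A L Aᵀ + A S + Sᵀ Aᵀ with S = ⟨l tᵀ⟩. It also assumes L in rad² and S in rad Å (refmac reports deg² and Å deg). All names and values are illustrative.

```python
def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]


def add(*ms):
    """Element-wise sum of any number of 3x3 matrices."""
    return [[sum(m[i][j] for m in ms) for j in range(3)] for i in range(3)]


def lever_matrix(r):
    """Antisymmetric matrix A such that A applied to l equals l x r."""
    x, y, z = r
    return [[0.0, z, -y], [-z, 0.0, x], [y, -x, 0.0]]


def u_tls(T, L, S, r):
    """TLS contribution to the ADP of an atom at r (assumed convention)."""
    A = lever_matrix(r)
    At = transpose(A)
    return add(T, matmul(matmul(A, L), At), matmul(A, S),
               matmul(transpose(S), At))


# Hypothetical group: isotropic translation plus libration about z only.
T = [[0.1, 0.0, 0.0], [0.0, 0.1, 0.0], [0.0, 0.0, 0.1]]       # A^2
L = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 1e-4]]      # rad^2
S = [[0.0] * 3 for _ in range(3)]                             # rad A
r = (10.0, 0.0, 0.0)                                          # A

U = u_tls(T, L, S, r)

# Adding a multiple of the identity to S changes only its trace.
S_shifted = [[S[i][j] + (0.05 if i == j else 0.0) for j in range(3)]
             for i in range(3)]
U_shifted = u_tls(T, L, S_shifted, r)
```

Because A is antisymmetric, the terms A S and Sᵀ Aᵀ cancel for any multiple of the identity in S, so `U` and `U_shifted` agree. This is the numerical counterpart of the statement that only eight elements of S are determinable.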
Given a set of refined ADPs, Eq. (2) can be used to determine the TLS
parameters for each rigid body via a least-squares fit. This approach is
common practice in small molecule crystallography, where it is used as a
tool of analysis rather than for structure refinement. Alternatively, and in
the approach we use here, Eq. (2) can be used to derive ADPs and hence
calculated structure factors from TLS parameters during structure refinement. The TLS parameters are thus used as refinement parameters, rather
than the ADPs themselves. Usually, isotropic atomic B factors are added
to UTLS to give the total displacement parameter used in the calculated
structure factor.
Thus, the mean square displacements of each rigid group are described
by the 20 independent TLS parameters, together with a single isotropic
parameter for each atom, rather than the six parameters per atom of a full
anisotropic description. In other words, the large number of parameters involved in refining each atom anisotropically is reduced by introducing constraints relevant to rigid body motion. The number of extra parameters
used to model the anisotropy depends on the number of TLS groups defined. A single TLS group for the whole molecule may prove useful, and requires only 20 extra parameters. Alternatively, one may define a TLS
group for every rigid side chain, introducing a few thousand extra parameters (see, e.g., Harris et al.15). By using only a few TLS groups, however,
TLS refinement can be used at moderate resolution, for example, 2.0 Å.
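
The parameter counts discussed above can be tabulated with a short Python sketch. Only displacement parameters are counted (positional parameters are the same in every model), and the atom and group numbers are hypothetical.

```python
def displacement_parameter_counts(n_atoms, n_tls_groups):
    """Number of displacement parameters under three models."""
    return {
        # one B factor per atom
        "isotropic": n_atoms,
        # 20 TLS parameters per group plus one residual B factor per atom
        "tls_plus_residual_b": 20 * n_tls_groups + n_atoms,
        # six independent ADP elements per atom
        "full_anisotropic": 6 * n_atoms,
    }


# Hypothetical protein of 2000 atoms refined with two TLS groups.
counts = displacement_parameter_counts(n_atoms=2000, n_tls_groups=2)
for model, n in counts.items():
    print(f"{model:22s} {n:6d}")
```

For this example the TLS model adds 40 parameters to the isotropic model, whereas the full anisotropic model adds 10,000.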
Implementation of TLS in refmac
Maximum likelihood refinement of individual ADPs using a fast Fourier transform method was implemented previously in refmac.23 Refinement of TLS parameters, or indeed any other collective description, is then
implemented as a wrapper to ADP refinement: the necessary derivatives of
the likelihood function with respect to the elements of the tensors T, L, and
S for each TLS group defined are obtained by the chain rule from the derivatives with respect to individual ADPs. All calculations are done in an
orthogonal coordinate system, and in particular TLS parameters are referred to orthogonal axes.
TLS refinement is currently performed as a separate step to the
scaling calculation, and to the refinement of atomic positions and atomic
23 G. N. Murshudov, A. A. Vagin, A. Lebedev, K. S. Wilson, and E. J. Dodson, Acta Crystallogr. D Biol. Crystallogr. 55, 247 (1999).
displacement parameters. In principle, there is some redundancy in the use
of overall scaling parameters, TLS parameters, and atomic parameters. For
example, any amount can be removed from the trace of T and added to the
individual B factors of atoms in the TLS group. While the refinement of the
different parameter classes is performed separately, there is no numerical
instability, although the redundancy should be remembered when interpreting actual values of T or individual B factors.
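
The redundancy between the trace of T and the individual B factors can be illustrated numerically. The sketch below uses the standard relation B = 8π²U for an isotropic displacement and considers only the diagonal of T, since the L and S contributions are unaffected. The function names and values are illustrative assumptions.

```python
import math

EIGHT_PI_SQ = 8.0 * math.pi ** 2


def total_u_diagonal(t_diag, b_iso):
    """Diagonal of T (A^2) plus the isotropic term U_iso = B / (8 pi^2)."""
    u_iso = b_iso / EIGHT_PI_SQ
    return [t + u_iso for t in t_diag]


def shift_into_b(t_diag, b_iso, delta):
    """Remove delta (A^2) from each diagonal element of T and add the
    equivalent amount to the atomic B factor."""
    return [t - delta for t in t_diag], b_iso + EIGHT_PI_SQ * delta


# Hypothetical values: diagonal of T in A^2 and a residual B in A^2.
t_diag, b_iso = [0.30, 0.25, 0.20], 20.0
t_new, b_new = shift_into_b(t_diag, b_iso, delta=0.10)

before = total_u_diagonal(t_diag, b_iso)
after = total_u_diagonal(t_new, b_new)
print(before)
print(after)  # same total displacement from different T and B values
```

Both parameter sets give the same total displacement, so the absolute values of T and of the residual B factors should be interpreted together rather than separately.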
Using refmac to do TLS Refinement
TLS refinement is designed to provide a simple description of anisotropic displacements when the resolution is not good enough to justify refinement of atomic ADPs. At marginal resolutions, TLS groups may be
assigned to individual side chains, and a detailed description of the atomic
displacements obtained (e.g., Howlin et al.14 at 1.45 Å and Harris et al.15 at
1.6 Å). However, a more common scenario is at medium resolution (the
examples described below span the approximate resolution range 1.5 to
3.0 Å) when the data-to-parameter ratio justifies the modeling of anisotropy
only at the molecule or domain level. Such anisotropy may or may
not be apparent from the data, but seems to occur frequently. Indeed, the
fact that a crystal does not diffract to high resolution may indicate the presence of large molecular displacements.
If there are several molecules in the asymmetric unit, then the average
mean square displacements often differ from molecule to molecule. An
additional benefit of TLS refinement of molecules or domains is to account
for these differences at a global level. In fact, if the anisotropy is small, the
same benefit may be obtained by applying global B factors to each molecule, but such a case is included automatically in the TLS refinement.
We therefore recommend that TLS refinement at the level of molecules
should be considered for all refinements. In this sense, TLS refinement can
be considered an extension of the refinement of scaling parameters. More
thought needs to go into a decision to do detailed TLS refinement. Does
the data-to-parameter ratio justify it? Is the choice of TLS model a good
one? For a detailed description, other models such as ones based on normal
modes may be more appropriate.
Having asserted that TLS refinement should be used to model overall
molecular displacements, one should mention that TLS refinement appears
to work better toward the end of the refinement. There is anecdotal evidence that TLS refinement can be unstable when the model is incomplete
and contains many errors. In this scenario, ADPs would describe predominantly the errors in the model, and such errors may not conform well to a
rigid body description.
In other cases, the TLS refinement of an incomplete model may be
stable but may give unrealistic values to the TLS parameters. An example
of this is a refinement of a siderophore-binding protein,24 described below.
Alternatively, large discrepancies between the individually refined B
factors and those derived from the TLS parameters may indicate model
errors25 or unmodeled multiple conformations.17
At atomic resolutions, it is more usual to refine individual atomic ADPs
rather than TLS parameters. Having obtained refined atomic ADPs, one
can attempt to fit TLS parameters to these ADPs (using, e.g., the program
THMA22 or the program anisoanl13,26). Note that this is a process that
is distinct from the refinement of TLS parameters. It may be possible to
refine both TLS parameters and residual anisotropic U parameters, but
there has been no systematic study of this approach.
Choice of Rigid Groups
For the moderate resolutions considered here, the data-to-parameter
ratio justifies the use of only a few TLS groups. Some choices may be obvious, for example, treating each molecule in the asymmetric unit as a single
group. Of course, macromolecules are not rigid, but there may be a significant component of the atomic displacements that can be attributed to a
rigid body motion, with nonrigid displacements superimposed on top. In
addition, a molecule may have obvious domains that can be treated as
separate groups, as illustrated schematically in Fig. 1.
As well as modeling the protein molecules, it may be useful to model
displacements of large cofactors via TLS parameters. For example, in the
study of light-harvesting complex II described below,27 the bacteriochlorophyll and carotenoid molecules were treated as separate TLS groups. In
their study of class I peptide–MHC complexes, Rudolph et al.28 modeled
the bound peptides as a TLS group. In addition, it could be argued that
tightly bound waters should be included within large TLS groups, since
their displacements would follow those of the protein to which they are
bound. In fact, we have not found this to be helpful, and we recommend
that waters be omitted from TLS group definitions.
More robust definitions of TLS groups require additional information.
If more than one crystal form is available, then dynamic domains can be
24 D. Goetz, M. A. Holmes, N. Borregaard, M. E. Bluhm, K. N. Raymond, and R. K. Strong, Mol. Cell 10, 1033 (2002).
25 J. Kuriyan and W. I. Weis, Proc. Natl. Acad. Sci. USA 88, 2773 (1991).
26 M. D. Winn, CCP4 Newsletter, No. 39 (2001).
27 M. Z. Papiz, S. M. Prince, T. Howard, R. J. Cogdell, and N. W. Isaacs, J. Mol. Biol. 326, 1523 (2003).
28 M. Rudolph, et al., unpublished work (2001).
identified from relative displacements of atoms between the crystal forms.
In conflating these dynamic domains with TLS groups, the assumption is
that relative displacements between different crystal forms reflect likely
displacements within a single crystal form. The estimation of dynamic
domains has been implemented in the computer program dyndom.29 A
different method for identifying dynamic domains has been implemented
in the program escet.30 Multiple configurations may also be generated
by molecular dynamics simulations. One of us has used this method in a
TLS refinement of light-harvesting complex II, using restrain.27
TLS groups may also be optimized by fitting to the refined ADPs of a
related, high-resolution structure. Such an approach was adopted by Holbrook et al.,10,11 who compared seven different rigid body models of deoxycytidine 5′-phosphate, and used the best (as measured by two indices for
the agreement between refined and derived U values) in subsequent TLS
refinements of other nucleic acids. Another approach using refined individual ADPs is to use the rigid body criterion,31 in which a matrix is
built up between all pairs of atoms, with elements equal to the difference
in the projected U values along the interatomic vector. Pairs of atoms
belonging to the same quasi-rigid group should have a value close to
zero. Brock et al.32 used this approach in their analysis of triphenylphosphine oxide, and similar ideas were used by Schneider33 for the protein
SP445. This approach has been implemented in the computer program
anisoanl.13,26
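
The rigid body criterion described above can be sketched in a few lines of Python. This is an illustration of the idea only, not the anisoanl implementation; the function names, the choice to store absolute differences, and the example values are assumptions.

```python
import math


def projected_u(U, n):
    """Mean square displacement along unit vector n: n^T U n."""
    return sum(n[a] * U[a][b] * n[b] for a in range(3) for b in range(3))


def rigid_body_delta(coords, adps):
    """Matrix of |difference in projected U| along each interatomic vector.

    Values near zero suggest that a pair of atoms could belong to the
    same quasi-rigid group.
    """
    n_atoms = len(coords)
    delta = [[0.0] * n_atoms for _ in range(n_atoms)]
    for i in range(n_atoms):
        for j in range(i + 1, n_atoms):
            d = [coords[j][k] - coords[i][k] for k in range(3)]
            norm = math.sqrt(sum(c * c for c in d))
            n = [c / norm for c in d]
            diff = abs(projected_u(adps[i], n) - projected_u(adps[j], n))
            delta[i][j] = delta[j][i] = diff
    return delta


def diag(a, b, c):
    return [[a, 0.0, 0.0], [0.0, b, 0.0], [0.0, 0.0, c]]


# Hypothetical atoms: the first two share one ADP, the third has extra
# displacement along y.
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 1.5, 0.0)]
adps = [diag(0.2, 0.2, 0.2), diag(0.2, 0.2, 0.2), diag(0.2, 0.5, 0.2)]
delta = rigid_body_delta(coords, adps)
```

In this example the first two atoms satisfy the criterion, while the third shows a clear difference along its vector to either of the others and would be a candidate for exclusion from the group.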
Program Input
To do TLS refinement in refmac, the program needs the following
information: first, the choice of TLS groups is specified in the TLSIN file.
The format of this file follows that used by restrain,12,13 but for input
one generally needs to use only the TLS and RANGE records. The TLS
record starts a group definition, and includes an optional title. The group
definition that follows consists of one or more RANGE records, which specify a range of atoms to be included in the group. Ranges contributing to a
single group need not be contiguous, since protein domains are often made
up of stretches separated along the primary sequence.
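
As a hedged illustration, the Python sketch below writes TLS and RANGE records for a group built from two non-contiguous stretches. The record keywords come from the description above, but the quoting of chain and residue, the field widths, and the trailing ALL atom selection are assumptions recalled from the refmac documentation and should be checked against the current version. The group titles and residue ranges are hypothetical.

```python
def tlsin_text(groups):
    """Build TLSIN-style text.

    groups: list of (title, [(chain, first_residue, last_residue), ...]).
    The exact RANGE layout used here is an assumption, not a specification.
    """
    lines = []
    for title, ranges in groups:
        lines.append(f"TLS    {title}")
        for chain, first, last in ranges:
            lines.append(
                f"RANGE  '{chain}{first:4d}.' '{chain}{last:4d}.' ALL")
    return "\n".join(lines) + "\n"


# Hypothetical two-domain protein; domain 1 is made of two stretches.
text = tlsin_text([
    ("Domain 1", [("A", 1, 120), ("A", 200, 250)]),
    ("Domain 2", [("A", 121, 199)]),
])
print(text)
```

Generating the file from a list of ranges makes it easy to try alternative group definitions and compare the resulting free R values.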
The program also needs to know the number of cycles of TLS refinement. These cycles are performed after initial estimation of scaling
29 S. Hayward and H. J. C. Berendsen, Proteins Struct. Funct. Genet. 30, 144 (1998).
30 T. R. Schneider, Acta Crystallogr. D Biol. Crystallogr. 56, 714 (2000).
31 R. E. Rosenfield, K. N. Trueblood, and J. D. Dunitz, Acta Crystallogr. A 34, 828 (1978).
32 C. P. Brock, W. B. Schweizer, and J. D. Dunitz, J. Am. Chem. Soc. 107, 6964 (1985).
33 T. R. Schneider, in “Proceedings of the CCP4 Study Weekend,” p. 133 (1996).
parameters and before refining coordinates and B factors. The convergence
of the free R value for this stage should be checked to see if more cycles are
needed. Convergence of the TLS refinement is usually improved if all individual B factors are initialized to a constant value (e.g., the average B value
from earlier rounds of refinement, or the Wilson B factor). The precise
value is not important since it will be compensated for by the scaling function. The B values of individual atoms are refined after the TLS
parameters have been determined.
When the asymmetric unit contains several molecules of the same
species, it is often useful to apply NCS restraints. However, if these molecules have widely differing displacement parameters, as is often the case,
then restraining B factors to be similar can be problematic. In the case of
TLS refinement, however, B factor restraints are applied between residual
B values, which are more likely to be similar, and NCS restraints can be
applied.
The parameters needed for TLS refinement can be set by keywords to
refmac, and the reader is referred to the refmac documentation for
details. TLS refinement is also implemented in the CCP4 Graphical User
Interface ‘‘ccp4i,’’ and can be selected in the protocol section of the
refmac interface. There are additional interfaces for preparing the
TLSIN file, and for analyzing the refined TLS parameters.
Interpretation of Results
The output from TLS refinement in refmac is in most respects identical to other modes of refinement. Hence, one should check global statistics such as free R value and correlation coefficient, as well as detailed
geometric information. In addition, one obtains the following:
Refined TLS parameters written to the log file, to the TLSOUT file,
and also to the header of the XYZOUT file. The TLS parameters
can be analyzed further with the auxiliary program tlsanl13,21
(see below).
Refined B factors in the XYZOUT file. These are the ‘‘residual’’ B
factors that are refined separately after the TLS parameters are
determined. It is important to note that the residual B factors do not
contain any contribution from the TLS parameters, and do not
represent the full mean square displacement of the atom.
The TLSOUT file containing the refined TLS parameters has the same
format as the TLSIN file described above. In addition to the TLS and
RANGE records, it will usually have ORIGIN, T, L, and S records. The
T and L records list the six elements of the symmetric T and L tensors,
while the S record lists the eight determinable elements of the asymmetric
S tensor. The values of the T and S tensors depend on the origin of
calculation, and this is given in the ORIGIN record.
The T tensor is given in units of Å², L in units of deg², and S in units of
Å deg. T contributes additively to all ADPs in the TLS group [see Eq. (2)],
and its values can be compared directly to overall U values. The values of
the L tensor are found to depend to a large extent on the size of the TLS
group. Finally, values of the S tensor tend to be small for the domain-sized
groups considered here. In general, the sizes of the TLS parameters are a
good guide to the overall disorder of the group. In particular, if there are
several copies of the same molecule in the asymmetric unit, they often
have different levels of disorder that are mirrored in the relative values
of the TLS parameters (see, e.g., mannitol dehydrogenase, below34). In
contrast, the residual B factors are often similar between molecules. NCS
restraints are therefore applied between residual B factors rather than full
displacement parameters.
The size dependence of the L tensor can be rationalized as follows. Derived U values on the periphery of a TLS group are proportional to the
radius of the group squared [see Eq. (2)]. If these U values retain typical
values irrespective of the size of the TLS group, as seems to be the case,
then the size of the L tensor must decrease as the reciprocal of the radius
squared. Alternatively, if one adopts a purely dynamic interpretation of the
L values (which is unlikely to be true) then one can assume from classic
equipartition that a constant amount of energy goes into each libration.
Given that the moment of inertia increases as the radius of the group
squared, the mean square libration must decrease by a similar factor. An
example of the size dependency is given by the peptide–MHC complex
described in Case Studies (below).28
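
The first argument can be made concrete with a rough numerical sketch. It assumes that the libration contribution at the periphery is approximately U ≈ L r², with L in rad², and converts to deg² for comparison with refmac output. The target U value and the radii are hypothetical.

```python
import math


def libration_for_peripheral_u(u_periphery, radius):
    """Mean square libration (deg^2) that yields a libration-only
    displacement u_periphery (A^2) at distance radius (A), using U ~ L r^2."""
    l_rad2 = u_periphery / radius ** 2
    return l_rad2 * (180.0 / math.pi) ** 2


# Same peripheral U (0.25 A^2) for a small and a large group.
small_group = libration_for_peripheral_u(0.25, radius=10.0)
large_group = libration_for_peripheral_u(0.25, radius=20.0)
print(small_group, large_group)
```

Doubling the radius at constant peripheral U reduces the required mean square libration by a factor of four, which is the inverse-square dependence described above.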
One can pass the output files XYZOUT and TLSOUT from refmac
to the auxiliary program tlsanl in order to get a clearer picture of the
rigid body displacements represented by the T, L, and S tensors (see the
next section). In particular, one can look to see whether there are any dominant displacements that may have interesting implications. As noted in the
first section, however, caution should be exercised when interpreting TLS
results, since these will tend to overestimate rigid body displacements,
and will not discriminate between dynamic and static displacements.
Having examined the results from a TLS refinement procedure, one
should consider other possible choices for TLS groups. In particular,
one can look to break up large TLS groups into component domains, to
see whether an improved description can be obtained. One can, for
34 S. Hörer, J. Stoop, H. Mooibroek, U. Baumann, and J. Sassoon, J. Biol. Chem. 276, 27555 (2001).
example, look for step improvements in free R value (see, e.g., the study of
light-harvesting complex II described below).27 It should be noted, however, that global statistics are not sensitive to the precise make-up of each
TLS group when one is using domain-sized groups.
Analysis with tlsanl
tlsanl13,21 provides various analyses of TLS tensors that can be
useful. The TLS parameters are provided via the TLSIN file (the TLSOUT
file from refmac or restrain) and analyzed in the context of the
structure provided in XYZIN (the XYZOUT file from refmac or restrain). Program operation is controlled by keywords as usual, and we
mention briefly two of these in connection with the B factors held in the
ATOM records of PDB files. These B factors may represent the ‘‘residual’’
B factors, the equivalent isotropic displacement factors derived from the
TLS tensors, or the sum of these two contributions. The keyword BRESID
is used to specify that XYZIN contains the residual B factor only, as is the
case for refmac output. The keyword ISOOUT controls which of
the three possibilities is written to XYZOUT, and is useful for comparing
the different contributions.
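
The three B factor options can be related by a short Python sketch. It uses the standard definitions U_eq = tr(U)/3 for a Cartesian ADP and B = 8π²U; the function name and values are illustrative and this is not the tlsanl code.

```python
import math

EIGHT_PI_SQ = 8.0 * math.pi ** 2


def b_factor_contributions(u_tls, b_residual):
    """Return the residual B, the equivalent isotropic B derived from the
    TLS contribution U_TLS (A^2), and their sum."""
    u_eq = (u_tls[0][0] + u_tls[1][1] + u_tls[2][2]) / 3.0
    b_tls = EIGHT_PI_SQ * u_eq
    return {"residual": b_residual, "tls": b_tls,
            "total": b_residual + b_tls}


# Hypothetical atom: anisotropic TLS contribution plus a residual B of 12 A^2.
u_tls_atom = [[0.3, 0.0, 0.0], [0.0, 0.2, 0.0], [0.0, 0.0, 0.1]]
b = b_factor_contributions(u_tls_atom, b_residual=12.0)
print(b)
```

Only the total is comparable with B factors from a conventional isotropic refinement; the residual value alone will generally be much smaller.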
tlsanl also outputs ANISOU records to XYZOUT, and these include both the contribution from TLS [see Eq. (2)] and the individual isotropic contribution, irrespective of the keyword ISOOUT. This
information is useful for a detailed examination of the anisotropy, and
for preparing ORTEP-style pictures.35 It must be remembered, however,
that the ADPs held in the ANISOU records are not independent, but are
derived from the TLS model. Inspection of individual ADPs may inform
the choice of TLS group, in that atoms having unreasonable ADPs should
be excluded from the TLS group and the refinement rerun.
For each TLS group, tlsanl gives several representations of the T,
L, and S tensors. Two coordinate origins are considered:
1. The origin used in refinement, and given by the ORIGIN record in
TLSIN.
2. The center of reaction, which is the origin that makes S symmetric
and minimizes the trace of T.3
Three axial systems are considered.
1. Orthogonal axes, as used in XYZIN and TLSIN
35 M. N. Burnett and C. K. Johnson, “ORTEP-III: Oak Ridge Thermal Ellipsoid Plot Program for Crystal Structure Illustrations.” Oak Ridge National Laboratory Report ORNL-6895 (1996).
2. Libration axes, that is, the principal axes of the L tensor
3. Rigid body axes, that is, as calculated from the atomic coordinates
Full details can be found in Howlin et al.,21 but some representations
are useful for checking or for interpretation.
‘‘INPUT TENSOR MATRICES WRT ORTHOGONAL AXES
USING ORIGIN OF CALCULATIONS’’ should echo the contents
of the TLSIN file, with the values now displayed as matrices
‘‘TENSOR MATRICES WRT ORTHOGONAL AXES USING
CENTRE OF REACTION’’: The change of origin implies changes
to the T and S matrices, but not L. In particular, S should now be
symmetric.
‘‘FOR TLS TENSOR USING CENTRE OF REACTION’’: For this
choice of origin, T, L, and S can be diagonalized to give principal axes.
This section gives the orientation of these axes in various coordinate
frames, as well as the magnitudes along these axes. A selection of
these axes can be output in a format suitable for inclusion in
molscript,36 using the AXES keyword. Figure 2 gives an example of
libration axes superimposed against the chain of a class I MHC.
The principal axes of T and L should be checked for dominant translations or librations. It is usually convenient to quote the eigenvalues of T
and L, that is, the magnitudes of the mean square displacements along or
about the principal axes, rather than the entire tensor. It may be possible
to relate the principal axes and the associated eigenvalues to features of
the atomic structure.
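Quoting eigenvalues rather than full tensors amounts to a symmetric eigen-decomposition. A minimal sketch with NumPy follows. The function name and the tensor values are hypothetical, and in practice the matrices reported by tlsanl would be substituted.

```python
import numpy as np

def principal_axes(tensor):
    """Eigenvalues (descending) and matching unit axes (columns) of a symmetric tensor."""
    sym = 0.5 * (tensor + tensor.T)          # guard against rounding asymmetry
    values, vectors = np.linalg.eigh(sym)    # eigh returns ascending eigenvalues
    order = np.argsort(values)[::-1]
    return values[order], vectors[:, order]

# Hypothetical libration tensor in deg^2, orthogonal axes.
L_deg2 = np.array([[2.0, 0.5, 0.0],
                   [0.5, 1.0, 0.0],
                   [0.0, 0.0, 0.3]])

values, axes = principal_axes(L_deg2)
anisotropy = values.min() / values.max()     # 1.0 would be isotropic libration
non_physical = values < 0                    # negative mean square librations

print("mean square librations (deg^2):", np.round(values, 3))
print("dominant libration axis:", np.round(axes[:, 0], 3))
print("min/max ratio:", round(float(anisotropy), 3))
print("any negative eigenvalue:", bool(non_physical.any()))
```

A small min/max ratio flags a dominant libration, and the corresponding column of `axes` gives the direction to compare with features of the structure.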
The S tensor describes the mean square correlation between the translational and librational displacements. The correlation is typically small for
large TLS groups, but may play a significant role for smaller groups. The
librational displacements may be reinterpreted as screw rotations by
the use of nonintersecting axes,3 and a rough guide to the significance of
the screw component may be obtained by comparing the nonintersecting
screw axes with the original libration axes.
Case Studies
Applications of TLS refinement to glyceraldehyde-3-phosphate dehydrogenase37 and GerE38 have been described in detail elsewhere.20 Here
we describe briefly a number of case studies; a summary is given in Table I.
36 P. J. Kraulis, J. Appl. Crystallogr. 24, 946 (1991).
37 M. N. Isupov, T. M. Fleming, A. R. Dalby, G. S. Crowhurst, P. C. Bourne, and J. A.
Littlechild, J. Mol. Biol. 291, 651 (1999).
Fig. 2. Example of libration axes as output by the AXES option of tlsanl. The structure
is the chain of a class I MHC, as obtained from PDB entry 1fzk.41 Superimposed are the
libration axes derived from a TLS refinement (see Class I Peptide–MHC Complexes). The axes
intersect at the center of reaction of the TLS, and the length of each axis is proportional to the
mean square libration about that axis. Prepared using Molscript36 and Raster3D.39
TABLE I
Examples of TLS Refinement Using REFMAC^a

Protein                          Resolution (Å)   Comments on TLS model               PDB code
GAPDH                            2.05             1, 2, and 4 groups compared         1b7g
GerE                             2.05             1 TLS group per monomer             1fse
Light-harvesting complex II      2.0              Several models compared             1kzu
Mannitol dehydrogenase           1.5              1 TLS group per monomer             1h5q
Class I peptide–MHC complexes    1.9–1.7          TLS group for bound peptide         1fzj, 1fzk, 1fzm, 1fzo
Siderophore-binding protein      2.4              TLS groups for ligands              -
S100A12                          1.95             1 TLS group per monomer             1e8a
Benzoate dioxygenase reductase   1.5              1 TLS group per domain              -
GABARAP                          1.75             1 TLS group                         -
rhoE                             2.1              1 TLS group per monomer             -
DLM-1–Z-DNA complex              1.85             Nucleotides divided into 3 groups   1j75
Thioredoxin reductase            3.0              1 TLS group per monomer             1h6v

^a Note that when TLS refinement occurred after deposition, the quoted PDB entry is for
coordinates only.

Light-Harvesting Complex II
Light-harvesting complex II (LH2) from Rhodopseudomonas acidophila is an
integral membrane complex composed of bacteriochlorophyll a (Bchl a),
carotenoids, and small peptides.40 By trapping solar energy it is involved in
the early stages of photosynthesis. The complex is a nonamer composed of 63
separate molecules; within the crystal the nonameric axis is incorporated into
the rhombohedral (R32) 3-fold axis with one-third of the complex in the
asymmetric unit. Each asymmetric unit has three copies of the nonameric
repeat, each with an α and a β peptide, three Bchl a molecules, and one
carotenoid molecule (the second carotenoid molecule is ill-defined and
excluded). Data were collected to 2.0-Å resolution and a model was initially
refined with isotropic B factors to an R value of 0.219 and a free R value of 0.249.
38 V.M-A. Ducros, R. J. Lewis, C. S. Verma, E. J. Dodson, G. Leonard, J. P. Turkenburg, G.
N. Murshudov, A. J. Wilkinson, and J. A. Brannigan, J. Mol. Biol. 306, 759 (2001).
39 E. A. Merritt and D. J. Bacon, Methods Enzymol. 277, 505 (1997).
40 G. McDermott, S. M. Prince, A. A. Freer, A. M. Hawthornethwaite-Lawless, M. Z. Papiz,
R. J. Cogdell, and N. W. Isaacs, Nature 374, 517 (1995).
Subsequently, the model was rerefined with TLS parameters.27 The
choice of TLS groups was made by exploring the significance of the
improvement made to the refinement as the structure was progressively
divided into smaller TLS groups of atoms. For example, one TLS tensor
for the whole asymmetric unit reduces R to 0.201 and free R to 0.224, while
further subdividing to three TLS tensors, one for each NCS unit, changes
R to 0.200 and free R to 0.223. A natural choice of groups is the individual
molecules, 18 in all for the asymmetric unit; with 18 TLS tensors
we see a further improvement to an R of 0.176 and a free R of 0.198. The α
and β peptides each form three domains: two surface-lying segments
and one long transmembrane α-helical domain. With three TLS groups
for each peptide, there is a total of 30 groups, but there is only a small
further improvement to an R of 0.175 and a free R of 0.197.
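The trade-off between added parameters and improvement in free R can be tabulated directly from the figures quoted above. The sketch below assumes 20 refinable parameters per TLS group (6 for T, 6 for L, and 8 for S, whose trace is not determined). The "gain per parameter" measure is an illustrative choice, not a statistic used by the authors.

```python
# T and L are symmetric (6 values each); S has 9 elements but its trace
# is not determined, leaving 8: 20 parameters per TLS group (assumed here).
PARAMS_PER_GROUP = 6 + 6 + 8

# (TLS model, number of groups, R, free R) as quoted in the text.
models = [
    ("isotropic B factors only", 0, 0.219, 0.249),
    ("whole asymmetric unit", 1, 0.201, 0.224),
    ("one group per NCS unit", 3, 0.200, 0.223),
    ("one group per molecule", 18, 0.176, 0.198),
    ("three groups per peptide", 30, 0.175, 0.197),
]

def summarize(models):
    """Return (name, n_params, R, free R, drop in free R per added parameter)."""
    rows = []
    prev_params, prev_free = None, None
    for name, n_groups, r, r_free in models:
        n_params = n_groups * PARAMS_PER_GROUP
        if prev_params is None:
            gain = None  # nothing to compare the first model with
        else:
            gain = (prev_free - r_free) / (n_params - prev_params)
        rows.append((name, n_params, r, r_free, gain))
        prev_params, prev_free = n_params, r_free
    return rows

for name, n_params, r, r_free, gain in summarize(models):
    gain_text = "-" if gain is None else f"{1e4 * gain:.2f}"
    print(f"{name:26s} {n_params:4d}  R={r:.3f}  Rfree={r_free:.3f}  "
          f"free R drop per parameter (x1e-4): {gain_text}")
```

The first 20 parameters give by far the largest drop per parameter. The step to 18 groups is smaller per parameter but still substantial in total, whereas the steps to 3 and to 30 groups add parameters for almost no gain.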
Figure 3 shows the variation of the R values with the number of TLS
parameters. The initial decrease associated with the use of a single TLS
group suggests that there is a group of motions correlated over the whole
asymmetric unit. This apparent motion may also represent disorder in the
lattice and not real nuclear motions. However, a second stepwise improvement occurs when a group of intracomplex nuclear motions is defined by
subdividing into the individual molecular groups. In general, for this complex, the TLS motions are dominated by vibrations in the plane of the
membrane and in a direction tangential to the ring of the nonamer. Since
there are three copies of each molecule in the asymmetric unit, it is possible
to check the quality of the TLS refinement which, unlike the coordinates, is
not NCS constrained. The equivalent isotropic atomic B factors agree to
within 5% for equivalent molecules.
Fig. 3. Plot of R value (dashed line with circles) and free R value (solid line with squares)
against the number of TLS parameters included in the model for light-harvesting complex II
from R. acidophila.27 The plot shows the sharp improvements associated with the initial 20
parameters, and the increase to 360 parameters, as compared with the smaller improvements
with 60 and 600 parameters.
Mannitol Dehydrogenase
Hörer et al.34 have solved the structure of mannitol dehydrogenase
from Agaricus bisporus, which crystallized with three tetramers in the
asymmetric unit, and diffracted to 1.5 Å. One of the three tetramers was
found to have poorer electron density and higher B factors, correlated to
the fact that this tetramer has fewer crystal contacts (38) than the other
two (49 and 50).
TLS refinement was carried out to account for these differences, with a
single TLS group refined for each monomer (12 groups overall). After TLS
refinement, all tetramers had similar residual B factors and the electron
density was much clearer. The TLS parameters for the ‘‘bad’’ tetramer refined to larger values, reflecting the larger overall displacements of this tetramer. The improved description of this tetramer was reflected in the R and
free R values, which fell by 3.0 and 2.7%, respectively.
Figure 4 shows the full equivalent B factors from TLS refinement (top
three curves), together with the residual B factors (lower three curves), for
the three tetrameric units. One of the three tetramers has higher total B
factors than the other two, reflecting greater disorder. With the TLS
contribution removed, the residual B factors are close for all tetramers. Note,
however, that NCS restraints were applied to the residual B factors in
this plot.
Fig. 4. Equivalent isotropic B values for mannitol dehydrogenase.34 The abscissa runs over
the 1040 residues of the biological tetramer, and the ordinate is the average main-chain B
factor. The upper three curves show the equivalent
isotropic B values from the TLS refinement for the three tetramers in the asymmetric unit.
One tetramer (the top line) is clearly more disordered than the other two. The lower three
curves show the residual B values, with the TLS contribution removed. These curves are close,
and almost indistinguishable. For each residue, the B factors are averaged over the main-chain
atoms.
Class I Peptide–MHC Complexes
Rudolph et al.41 have solved the structures of four similar class I
peptide–MHC complexes (PDB entries 1fzj, 1fzk, 1fzm, and 1fzo) at
resolutions in the range of 1.7 to 1.9 Å. MHC molecules specifically bind
peptides in an extended conformation and present them to the T cell
receptor on cytotoxic T cells. Class I MHC molecules are heterodimers
consisting of a heavy chain (α) and a light chain (β). The bound peptides were
eight or nine amino acids long.
The deposited models were rerefined,28 using the TLS procedure of
refmac. The α chain was modeled with one or two TLS groups (residues
1–180 and 181–274), while the β chain and the bound peptide were each
treated as a single TLS group. The TLS refinement produced no apparent
differences in the maps, but a slight decrease in free R value for two of the
models: 1fzj reduced from 0.222 to 0.216 and 1fzm from 0.223 to 0.211. The
other models had unchanged free R values.
Therefore, in this particular case TLS refinement had only a minor
effect, but the resulting TLS tensors are illustrative. In all cases the libration tensor for the peptide is asymmetric, with the largest libration about
an axis parallel to the long axis of the peptide. For example, for 1fzk the
eigenvalues of L are 64.692, 0.852, and 4.108 (see Table II). Note that
the negative eigenvalue of L is allowed by the refinement procedure, although it can no longer be interpreted in terms of a rigid body model.
These results confirm the expectation that the dominant displacement is a
libration about the major axis of the extended peptide which has minimal
steric hindrance.
In addition, the largest eigenvalue of L for the peptide is significantly
greater than any of the eigenvalues for the protein, and also the eigenvalues of L for the β chain (see Fig. 2) are larger than those for the α chain.
This is the size effect noted previously.
TABLE II
Eigenvalues of T Tensor with Respect to Center of Reaction and Eigenvalues of L
Tensor for Class I MHC Complexed with Sendai Virus Nucleoprotein^a

Chain   Number of residues   Eigenvalues of T (Å²)   Eigenvalues of L (deg²)
A       274                  0.047, 0.012, 0.000     1.327, 0.547, 0.354
B       99                   0.039, 0.014, 0.003     5.259, 1.342, 0.831
P       9                    0.202, 0.071, 0.037     64.692, 0.852, 4.108

^a PDB code 1fzk.

Siderophore-Binding Protein
Goetz and co-workers24 have solved the structure of a siderophore-binding
protein complexed with enterobactin using data to 2.4 Å. The
asymmetric unit contained three copies of the protein plus ligand. Each
protein molecule has 177 residues, while enterobactin is a cyclic trimer of
2,3-dihydroxybenzoylserine, which forms a compact sphere of approximate
radius 5 Å.
41 M. G. Rudolph, J. A. Speir, A. Brunmark, N. Mattsson, M. R. Jackson, P. A. Peterson, L.
Teyton, and I. A. Wilson, Immunity 14, 231 (2001).
TLS refinement was used throughout building and refinement, with
one TLS group for each protein and enterobactin molecule, that is, a total
of six TLS groups. For the first round of refinement, all B factors were
set to a constant value equal to the Wilson B factor, while for subsequent
rounds the B factors were not reset. The TLS parameters were, however,
reset to zero for each round of refinement. Refinement was also carried
out in CNS (without TLS refinement) and this provided a number of
interesting comparisons.
The final free R value was 0.27 compared with 0.32, using CNS without
TLS refinement. The difference density was in general flatter from REFMAC than from CNS, indicating that the TLS parameters were modeling
some of the differences. For the ligand, however, it became clear that some
of these differences were in fact model errors. Enterobactin is known to be
quite unstable in solution, and had degraded during the time it took to
grow the crystals. The TLS refinement tried to compensate for the use of
the full enterobactin model, and this is evident from the abnormally large
values of the diagonal elements of L (up to 300 deg²). In contrast, the CNS
refinement without TLS showed large negative peaks around the ligand
atoms that were nonexistent due to degradation, and this suggested likely
degradation products.
After modeling the most likely degradation products by removing
atoms and breaking bonds, the TLS parameters were more realistic, with
diagonal elements of L on the order of 50 deg². If there are a number
of degradation products, then it is possible that a significant part of the
TLS parameters is still reflecting model error rather than rigid body
displacements.
This experience supports the view that TLS refinement is most effective
when the model is complete and accurate. Conversely, if TLS refinement
performs poorly, then it may indicate problems with the model.
Other Examples
Moroz et al.42 have solved the structure of human calcium-binding protein
S100A12 at 1.95-Å resolution, using TLS refinement in the final stages.
S100A12 crystallized with two molecules in the asymmetric unit, and each
molecule was treated as a single TLS group. TLS refinement contributed a
drop of 3% in R and free R values. Chain B was found to have slightly
greater TLS parameters than chain A, but the difference in disorder was
too slight to be observed in the electron density.
Karlsson and coworkers43 used TLS refinement on benzoate dioxygenase reductase, a 348-residue protein from an Acinetobacter species. Each
protein molecule consists of three similarly sized domains, each binding
FAD, NADH, and a 2Fe2S center. There are two molecules in the asymmetric
unit and the best data set extends to 1.5 Å.
Conventional refinement without TLS gave an R value of 0.268 and a
free R value of 0.291. When one examines the maps and B factors, one
can see that one of the two molecules in the asymmetric unit was more disordered, and in particular two of the three domains in this molecule were
disordered. Running refmac with six TLS groups, one for each domain
of the two molecules, lowered the R and free R values to 0.241 and 0.249,
respectively. In addition, the quality of the maps was improved.
Keep and co-workers44 have used TLS refinement on GABARAP and
rhoE structures. GABARAP is a 117-amino acid protein that crystallized
in P2₁2₁2₁ with a single copy in the asymmetric unit, and diffracted to
1.75 Å. The whole protein, excluding water, was modeled with a single
TLS group. Refinement gave R and free R values of 0.203 and 0.230, respectively, compared with 0.215 and 0.239 without TLS. Furthermore, it
lowered the free R value more than a full anisotropic analysis (while using
considerably fewer refinement parameters).
The rhoE structure consists of two copies of a 180-amino acid protein
with cofactors at 2.1 Å. Each copy of the protein, with associated cofactors,
was treated as a separate group. TLS refinement gave R and free R values
of 0.181 and 0.214, respectively, compared to 0.192 and 0.226 without TLS.
For both structures, differences in the electron density maps were at the
level of a few water peaks and some hints of alternative conformations,
rather than any major new features.
42 O. V. Moroz, A. A. Antson, G. N. Murshudov, N. J. Maitland, G. G. Dodson, K. S. Wilson,
I. Skibshoj, E. M. Lukanidin, and I. B. Bronstein, Acta Crystallogr. D Biol. Crystallogr. 57,
20 (2001).
43 A. Karlsson, Z. M. Beharry, D. M. Eby, E. D. Coulter, E. L. Neilde, D. M. Kurtz, H.
Eklund, and S. Ramaswamy, J. Mol. Biol. 318, 261 (2002).
44 N. Keep, H. Garavini, K. Reinto, J. P. Phelan, M. S. B. McAlister, and A. J. Ridley,
Biochemistry 41, 6303 (2002).
Schwartz et al.45 included TLS refinement as the final stage of the structure
determination of a DLM-1–Z-DNA complex at 1.85 Å. They achieved a
3% reduction in R values, but comment that the selection of TLS groups
was crucial. Each nucleotide was divided into three groups (in a similar
manner to Holbrook and Kim10 and Holbrook et al.11), while the protein
chain was treated as a single group.
Finally, Sandalova et al.46 used TLS refinement on a model of a mammalian
thioredoxin reductase at 3.0 Å. The structure has three dimers in
the asymmetric unit, and each monomer was treated as a separate TLS
group.
Conclusions
TLS parameterization of anisotropic displacement parameters has been
around for many years, including a number of examples of TLS refinement
of macromolecules. However, it is only now that TLS refinement is beginning to be used routinely. It is clear from the above-described examples
that, with the addition of a small number of extra refinement parameters,
a much improved description of diffraction data can be obtained.
As well as the global improvement resulting from the inclusion of anisotropy, as evidenced by the free R value, a number of other pieces of information may be obtained from TLS refinement.
1. The overall level of disorder of monomers or domains can be
quantified and perhaps rationalized, as in the case of mannitol
dehydrogenase described above.
2. Removal of the domain-level displacements may reveal local
displacements that are features of the local geometry, and therefore exist
for all copies of a molecule.
3. If the TLS refinement performs poorly, and the TLS parameters are
unreasonable, then this may be a clue to the necessity of rebuilding, as in
the case of the siderophore binding protein above.
4. Comparison of different parameterizations can help to categorize
TLS modes as lattice modes, monomer displacements, or internal modes;
see the analysis of light-harvesting complex II given above.
The displacement parameters determined from diffraction data have
many sources. The TLS parameterizations described in this chapter can
be seen as a simple attempt to separate global displacements from local
displacements. Other parameterizations may identify other collective
displacements, such as molecular normal modes.4–6 With a correct
interpretation of these modes, we can perhaps say something about the
biology of these molecules. This is still a relatively unexplored area but
has great potential.
45 T. Schwartz, J. Behlke, K. Lowenhaupt, U. Heinemann, and A. Rich, Nat. Struct. Biol. 8,
761 (2001).
46 T. Sandalova, L. Zhong, Y. Lindqvist, A. Holmgren, and G. Schneider, Proc. Natl. Acad.
Sci. 98, 9533 (2001).
Acknowledgments and Program Availability
M.D.W. and G.N.M. are supported by the BBSRC through a CCP4 grant (B10200).
M.D.W. is grateful for all feedback from users of the TLS option in refmac. We are
particularly grateful to Stefan Hörer for the description of the refinement of mannitol
dehydrogenase, and for the data used to produce Fig. 4; to Markus Rudolph for the description
of the refinement of peptide–MHC complexes, and for the data used to produce Fig. 2; and to
David Goetz, Andreas Karlsson, and Nicholas Keep for the examples of siderophore-binding
protein, benzoate dioxygenase reductase, and GABARAP and rhoE, respectively.
refmac, tlsanl, and anisoanl are all available as part of the CCP4 software
suite, from release 4.1. See http://www.ccp4.ac.uk for information on downloading and
licensing. The operation of these programs as described in this chapter is that pertaining to
version 4.1.
[15] Structural Information Content at High Resolution:
MAD versus Native
By Alberto Podjarny, Thomas R. Schneider, Raul E. Cachau, and
Andrzej Joachimiak
Introduction
Structure determination by X-ray crystallography is firmly established
as the main way of obtaining accurate three-dimensional information about
molecular structure. In the case of macromolecules, the last decade has
seen an exponential growth in the number of structures solved,1 and this
tendency is gaining even more speed with current efforts in structural genomics.2 The increase in the speed and number of structures has been accompanied by a remarkable improvement in the quality of the data collected
thanks to the introduction of third-generation synchrotron X-ray sources.
Thus, today X-ray crystallography is the method of choice to obtain macromolecular details at subatomic resolutions in the absence of neutron
1 W. G. Schulz, Chem. Eng. News 79, 23 (2001).
2 U. Heinemann, G. Illing, and H. Oschkinat, Curr. Opin. Biotechnol. 12, 348 (2001).
METHODS IN ENZYMOLOGY, VOL. 374
Copyright 2003, Elsevier Inc.
All rights reserved.
0076-6879/03 $35.00