Map Interpretation and Refinement

In general the crystallographer should reserve a day in which they have no defined task to complete and practice on data that is not important (i.e., the supplied example data). In this way frustration can be reduced, and the advantages of the applications described quickly become obvious.

[14] Macromolecular TLS Refinement in REFMAC at Moderate Resolutions

By Martyn D. Winn, Garib N. Murshudov, and Miroslav Z. Papiz

Introduction

To interpret the results of an X-ray diffraction experiment completely, one would like to supplement knowledge of the average atomic positions with information about their variation in space and over time. Within the usual Gaussian approximation,1,2 the full probability distribution is reduced to the mean square deviation of each atom from its mean position. One is then left with two tasks: how best to estimate these mean square deviations, and how to interpret them physically.

In general, each atom can deviate anisotropically from its mean position, and six parameters are necessary to describe the mean square displacements fully. These parameters are referred to as the anisotropic displacement parameters (ADPs),2 are usually denoted U, and can be visualized as so-called thermal ellipsoids. The addition of an extra six parameters per atom in macromolecular refinement is usually not justified by the data, except when atomic resolution data are available (<1.2 Å). One can reduce the number of refinement parameters by assuming isotropic displacements, in which case there is a single extra parameter per atom, the so-called Debye–Waller B factor. At medium resolutions (ca. 1.2–3.0 Å), this is an oversimplification, ignoring anisotropy in the data that could be modeled.

Often one might like to model anisotropic displacements without resorting to the use of independent ADPs. Restraints are commonly applied in ADP refinement, which allows the use of ADPs at lower than atomic resolutions.
Alternatively, one can apply constraints rather than restraints to the ADPs, and this is done most easily by deriving the ADPs from a more general model. The best known examples are the TLS (translation, libration, screw-rotation) model,3 which we describe here, and the normal mode model4–6; in principle, the approach could be very general. Collective modes are devised to model the anisotropic displacements, from which ADPs and hence calculated structure factors can be derived. To be useful, these modes should be specified by fewer parameters than the full ADP model, and to be successful the modes should be based on a reasonable physical model. Note that each individual mode need not be accurate, provided the subspace spanned by all modes covers the necessary region of parameter space. Thus, the Gaussian network model7,8 is based on a simple single-parameter force field.9 The individual modes derived from this force field are expected to be less accurate than those calculated from more sophisticated force fields, but nevertheless they have been found to describe mean square atomic displacements well.7,8

1 B. T. M. Willis and A. W. Pryor, "Thermal Vibrations in Crystallography." Cambridge University Press, Cambridge, 1975.
2 K. N. Trueblood, H.-B. Bürgi, H. Burzlaff, J. D. Dunitz, C. M. Gramaccioli, H. H. Schulz, U. Shmueli, and S. C. Abrahams, Acta Crystallogr. A 52, 770 (1996).

METHODS IN ENZYMOLOGY, VOL. 374. Copyright 2003, Elsevier Inc. All rights reserved. 0076-6879/03 $35.00

In the current article, we consider the TLS parameterization of ADPs.3 The physical model here is that the anisotropic displacements can be modeled as those of a set of rigid bodies. In this case, the collective modes are the three translations and three librations of each rigid body.
Taking into account cross-correlations, the mean square displacements of atoms in each rigid body are described by 21 parameters, although in fact one of these cannot be fixed by the usual second-order expansion of the libration. For rigid groups of four atoms or larger, the TLS parameterization represents a decrease in the number of parameters over a full ADP description. Crucially, however, it still provides an anisotropic description.

3 V. Schomaker and K. N. Trueblood, Acta Crystallogr. B 24, 63 (1968).
4 R. Diamond, Acta Crystallogr. A 46, 425 (1990).
5 A. Kidera and N. Go, Proc. Natl. Acad. Sci. USA 87, 3718 (1990).
6 A. Kidera and N. Go, J. Mol. Biol. 225, 457 (1992).
7 T. Haliloglu, I. Bahar, and B. Erman, Phys. Rev. Lett. 79, 3090 (1997).
8 A. R. Atilgan, S. R. Durell, R. L. Jernigan, M. C. Demirel, O. Keskin, and I. Bahar, Biophys. J. 80, 505 (2001).
9 M. M. Tirion, Phys. Rev. Lett. 77, 1905 (1996).

TLS parameterization illustrates well the second task mentioned above, that of interpreting the refined parameters. As is well known, Bragg reflection data give no information on correlations between atomic displacements (see, e.g., the discussion in Kidera and Go6). Therefore, although such a correlation is built into the rigid body model with all atoms moving in phase, refinement against Bragg reflection data does not distinguish this model from a similar model in which, for example, some atoms move in antiphase. Generally speaking, rigid body motion occurs as a low-frequency mode, and is more likely to be thermally activated than more complex motions, but this is not proved by the data. In fact, TLS parameterization acts, like all parameterizations of mean square displacements, as a sink for many kinds of displacements as well as model errors. This is particularly true when TLS is the only modeling of anisotropy that is being used.
Therefore, while rigid body motion may be a physically reasonable model, its contribution to the observed mean square displacements is likely to be overestimated. If one assumes for the moment that the rigid body model is a good one for the problem in hand, one must also remember that it may represent one of several kinds of displacement, or more likely a combination of all kinds. First, it may represent thermally activated motion occurring within the time scale of the experiment. This is likely to be of greatest interest for the biology of the protein. Second, it may represent static disorder, with rigid bodies in different unit cells of the crystal lying in slightly different positions. If these differences apply to the whole unit cell, then this represents lattice disorder. Finally, it may represent errors in the model that happen to fit a rigid body form. From a single experiment at a single temperature, it is impossible to distinguish between these. Measuring data sets for the same crystal at several temperatures may allow one to separate out the dynamic and static contributions, but this relies on the data sets being equal in all other respects. Despite these provisos, the TLS model can be extremely useful. At the moderate resolutions that are the subject of the current chapter, only a few sets of TLS parameters are used by modeling entire domains or molecules as quasi-rigid bodies. Thus, some modeling of anisotropic displacements is achieved at the expense of a few tens of additional parameters. Isotropic atomic B factors are retained, which model deviations from the rigid body description. The TLS parameters may give some insight into domain displacements, while the B factors provide information on local displacements after overall domain displacements have been removed. 
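The parameter bookkeeping behind "a few tens of additional parameters" can be made concrete with a short sketch. It uses only the counts quoted in the text (six ADP parameters per atom, 20 determinable TLS parameters per group, one residual B factor per atom); the function names and the 1500-atom domain are hypothetical and chosen purely for illustration.

```python
def adp_parameters(n_atoms: int) -> int:
    """Displacement parameters for independently refined anisotropic atoms."""
    return 6 * n_atoms


def tls_parameters(n_groups: int, n_atoms: int = 0) -> int:
    """20 determinable TLS parameters per group plus one residual B per atom.

    Pass n_atoms=0 to count the TLS tensors alone.
    """
    return 20 * n_groups + n_atoms


def smallest_group_with_saving() -> int:
    """Smallest rigid group for which one TLS set needs fewer parameters
    than individual ADPs (TLS tensors only, no residual B factors)."""
    n = 1
    while adp_parameters(n) <= tls_parameters(1):
        n += 1
    return n


if __name__ == "__main__":
    # Hypothetical 1500-atom domain treated as a single TLS group.
    print("full ADPs      :", adp_parameters(1500))
    print("TLS + residual :", tls_parameters(1, 1500))
    print("break-even size:", smallest_group_with_saving(), "atoms")
```

For the hypothetical domain, the anisotropic description costs 20 parameters on top of the isotropic B factors, against 9000 for individually refined ADPs.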
The first application of the refinement of TLS parameters against X-ray data for a macromolecule was that of Holbrook and co-workers,10,11 who used a modified version of the corels program to refine TLS parameters for each phosphate, ribose, and base of a duplex DNA dodecamer. Subsequently, Moss and co-workers extended the program restrain12,13 to allow refinement of TLS parameters, and applied this to bovine ribonuclease A at 1.45 Å14 and papain at 1.6 Å.15 Šali et al.16 used restrain to refine TLS parameters for the two domains of an endothiapepsin complex at 1.8 Å. Stec et al.17 refined crambin against atomic resolution data (0.83 Å) with the protein molecule divided into one, two, or three TLS groups, and compared the predicted anisotropic U values against directly refined values. Wilson and Brunger18 have compared a full anisotropic refinement of calmodulin with a TLS refinement, in order to identify domain displacements.

10 S. R. Holbrook and S. Kim, J. Mol. Biol. 173, 361 (1984).
11 S. R. Holbrook, R. E. Dickerson, and S. Kim, Acta Crystallogr. B 41, 255 (1985).
12 H. Driessen, M. I. J. Haneef, G. W. Harris, B. Howlin, G. Khan, and D. S. Moss, J. Appl. Crystallogr. 22, 510 (1989).
13 Collaborative Crystallographic Project No. 4, Acta Crystallogr. D Biol. Crystallogr. 50, 760 (1994).

Thus, there have been several studies of the use of TLS parameters in macromolecular refinement, but TLS refinement is not yet used regularly. To encourage this, we included TLS refinement in the maximum likelihood refinement program refmac,19 a detailed description of which is presented in Winn et al.20 In the current chapter we discuss some of the practical issues involved in running TLS refinement in refmac, and review a number of applications.
We also discuss the analysis program tlsanl.13,21 In giving details of these programs, it should be noted that the software continues to evolve, and the reader should consult the latest documentation for up-to-date guidance.

14 B. Howlin, D. S. Moss, and G. W. Harris, Acta Crystallogr. A 45, 851 (1989).
15 G. W. Harris, R. W. Pickersgill, B. Howlin, and D. S. Moss, Acta Crystallogr. B 48, 67 (1992).
16 A. Šali, B. Veerapandian, J. B. Cooper, D. S. Moss, T. Hofmann, and T. L. Blundell, Proteins Struct. Funct. Genet. 12, 158 (1992).
17 B. Stec, R. Zhou, and M. M. Teeter, Acta Crystallogr. D Biol. Crystallogr. 51, 663 (1995).
18 M. A. Wilson and A. T. Brunger, J. Mol. Biol. 301, 1237 (2000).
19 G. N. Murshudov, A. A. Vagin, and E. J. Dodson, Acta Crystallogr. D Biol. Crystallogr. 53, 240 (1997).
20 M. D. Winn, M. N. Isupov, and G. N. Murshudov, Acta Crystallogr. D Biol. Crystallogr. 57, 122 (2001).
21 B. Howlin, S. A. Butler, D. S. Moss, G. W. Harris, and H. P. C. Driessen, J. Appl. Crystallogr. 26, 622 (1993).
22 V. Schomaker and K. N. Trueblood, Acta Crystallogr. B 54, 507 (1998).

TLS Parameterization

Definition of TLS Parameters

TLS parameterization has been described in detail by Schomaker and Trueblood,3 with useful summaries in Howlin et al.14 and Schomaker and Trueblood,22 and the reader is referred to these articles for background theory. A single set of TLS parameters is defined for each putative rigid body identified in the model. An instantaneous displacement of one of these rigid bodies can be described in terms of a rotation about an axis passing through a fixed point, together with a translation of that fixed point.
For small rotations, the corresponding instantaneous displacement u of an atom in the rigid body at point r relative to the fixed point is given by

\[ \mathbf{u} = \mathbf{t} + \boldsymbol{\lambda} \times \mathbf{r} \tag{1} \]

where t is the translation, λ is a vector along the rotation axis with a magnitude equal to the angle of rotation, and × denotes a cross-product. Figure 1 illustrates these rigid body displacements for a protein divided into two rigid bodies.

Fig. 1. A schematic diagram, showing a protein molecule divided into two rigid groups. The instantaneous displacement of each rigid group can be described in terms of a translation t and a libration λ. r1 and r2 denote the rest positions of atoms in the two groups. The displacements of the two groups are assumed to be completely independent.

The TLS contribution to the ADP of an atom in the rigid body can be derived from the square of the atomic displacement given in Eq. (1), averaged over all unit cells and over the time scale of the experiment:

\[ \mathbf{U}^{TLS} \equiv \langle \mathbf{u}\mathbf{u}^{T} \rangle = \mathbf{T} + \mathbf{S}^{T} \times \mathbf{r}^{T} - \mathbf{r} \times \mathbf{S} - \mathbf{r} \times \mathbf{L} \times \mathbf{r}^{T} \tag{2} \]

The symmetric T tensor describes the mean square translation of the rigid body; the symmetric L tensor describes the mean square libration of the rigid body; and the nonsymmetric S tensor describes the mean square correlation between the translational and librational displacements. The expansion of the right-hand side of Eq. (2) does not include the trace of S, and so S contributes only eight independent parameters to Eq. (2). Together with the independent elements of T and L, there are thus 20 TLS parameters per group.

Given a set of refined ADPs, Eq. (2) can be used to determine the TLS parameters for each rigid body via a least-squares fit. This approach is common practice in small molecule crystallography, where it is used as a tool of analysis rather than for structure refinement. Alternatively, and in the approach we use here, Eq.
(2) can be used to derive ADPs and hence calculated structure factors from TLS parameters during structure refinement. The TLS parameters are thus used as refinement parameters, rather than the ADPs themselves. Usually, isotropic atomic B factors are added to UTLS to give the total displacement parameter used in the calculated structure factor. Thus, the mean square displacements of each rigid group are described by the 20 independent TLS parameters, together with a single isotropic parameter for each atom, rather than the six parameters per atom of a full anisotropic description. In other words, the large number of parameters involved in refining each atom anisotropically is reduced by introducing constraints relevant to rigid body motion.

The number of extra parameters used to model the anisotropy depends on the number of TLS groups defined. A single TLS group for the whole molecule may prove useful, and requires only 20 extra parameters. Alternatively, one may define a TLS group for every rigid side chain, introducing a few thousand extra parameters (see, e.g., Harris et al.15). By using only a few TLS groups, however, TLS refinement can be used at moderate resolution, for example, 2.0 Å.

Implementation of TLS in refmac

Maximum likelihood refinement of individual ADPs using a fast Fourier transform method was implemented previously in refmac.23 Refinement of TLS parameters, or indeed any other collective description, is then implemented as a wrapper to ADP refinement: the necessary derivatives of the likelihood function with respect to the elements of the tensors T, L, and S for each TLS group defined are obtained by the chain rule from the derivatives with respect to individual ADPs. All calculations are done in an orthogonal coordinate system, and in particular TLS parameters are referred to orthogonal axes.

TLS refinement is currently performed as a separate step to the scaling calculation, and to the refinement of atomic positions and atomic displacement parameters. In principle, there is some redundancy in the use of overall scaling parameters, TLS parameters, and atomic parameters. For example, any amount can be removed from the trace of T and added to the individual B factors of atoms in the TLS group. While the refinement of the different parameter classes is performed separately, there is no numerical instability, although the redundancy should be remembered when interpreting actual values of T or individual B factors.

23 G. N. Murshudov, A. A. Vagin, A. Lebedev, K. S. Wilson, and E. J. Dodson, Acta Crystallogr. D Biol. Crystallogr. 55, 247 (1999).

Using refmac to do TLS Refinement

TLS refinement is designed to provide a simple description of anisotropic displacements when the resolution is not good enough to justify refinement of atomic ADPs. At marginal resolutions, TLS groups may be assigned to individual side chains, and a detailed description of the atomic displacements obtained (e.g., Howlin et al.14 at 1.45 Å and Harris et al.15 at 1.6 Å). However, a more common scenario is at medium resolution (the examples described below span the approximate resolution range 1.5 to 3.0 Å) when the data-to-parameter ratio justifies the modeling of anisotropy only at the molecule or domain level. Such anisotropy may or may not be apparent from the data, but seems to occur frequently. Indeed, the fact that a crystal does not diffract to high resolution may indicate the presence of large molecular displacements.

If there are several molecules in the asymmetric unit, then the average mean square displacements often differ from molecule to molecule. An additional benefit of TLS refinement of molecules or domains is to account for these differences at a global level. In fact, if the anisotropy is small, the same benefit may be obtained by applying global B factors to each molecule, but such a case is included automatically in the TLS refinement.
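The statements that a global B factor is a special case of TLS, and that part of the trace of T can be exchanged against individual B factors, both follow from Eq. (2). The sketch below is not the refmac implementation. It writes Eq. (2) in the equivalent matrix form U = T + A L Aᵀ + A S + Sᵀ Aᵀ, where A is the matrix for which Aλ = λ × r, and assumes the convention S = ⟨λtᵀ⟩, with T in Å², L in rad², S in Å rad, and r in Å from the TLS origin. All function names and numerical values are hypothetical, and B = 8π²U is the standard relation between B and U.

```python
import math

EIGHT_PI_SQ = 8.0 * math.pi ** 2  # standard relation B = 8 pi^2 U


def cross_matrix(r):
    """Return the matrix A for which A applied to lam gives lam x r."""
    x, y, z = r
    return [[0.0, z, -y],
            [-z, 0.0, x],
            [y, -x, 0.0]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]


def u_tls(T, L, S, r):
    """TLS contribution to the ADP of an atom at r, Eq. (2) in matrix form."""
    A = cross_matrix(r)
    At = transpose(A)
    terms = [T, matmul(matmul(A, L), At), matmul(A, S),
             matmul(transpose(S), At)]
    return [[sum(m[i][j] for m in terms) for j in range(3)]
            for i in range(3)]


def u_total(T, L, S, r, b_residual):
    """Total ADP: TLS contribution plus the isotropic residual B factor."""
    u_iso = b_residual / EIGHT_PI_SQ
    U = u_tls(T, L, S, r)
    return [[U[i][j] + (u_iso if i == j else 0.0) for j in range(3)]
            for i in range(3)]


def shift_trace(T, delta):
    """Remove delta (A^2) from each diagonal element of T."""
    return [[T[i][j] - (delta if i == j else 0.0) for j in range(3)]
            for i in range(3)]


ZERO = [[0.0] * 3 for _ in range(3)]

if __name__ == "__main__":
    T = [[0.30, 0.0, 0.0], [0.0, 0.20, 0.0], [0.0, 0.0, 0.10]]
    L = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.01]]
    r = (2.0, 0.0, 0.0)
    # Moving 0.05 A^2 from T into the residual B leaves the total ADP unchanged.
    print(u_total(T, L, ZERO, r, 20.0))
    print(u_total(shift_trace(T, 0.05), L, ZERO, r,
                  20.0 + EIGHT_PI_SQ * 0.05))
```

With L and S set to zero and T isotropic, `u_tls` returns the same isotropic tensor for every atom in the group, which is the "global B factor per molecule" case mentioned in the text.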
We therefore recommend that TLS refinement at the level of molecules should be considered for all refinements. In this sense, TLS refinement can be considered an extension of the refinement of scaling parameters. More thought needs to go into a decision to do detailed TLS refinement. Does the data-to-parameter ratio justify it? Is the choice of TLS model a good one? For a detailed description, other models such as ones based on normal modes may be more appropriate.

Having asserted that TLS refinement should be used to model overall molecular displacements, one should mention that TLS refinement appears to work better toward the end of the refinement. There is anecdotal evidence that TLS refinement can be unstable when the model is incomplete and contains many errors. In this scenario, ADPs would describe predominantly the errors in the model, and such errors may not conform well to a rigid body description. In other cases, the TLS refinement of an incomplete model may be stable but may give unrealistic values to the TLS parameters. An example of this is a refinement of a siderophore-binding protein,24 described below. Alternatively, large discrepancies between the individually refined B factors and those derived from the TLS parameters may indicate model errors25 or unmodeled multiple conformations.17

At atomic resolutions, it is more usual to refine individual atomic ADPs rather than TLS parameters. Having obtained refined atomic ADPs, one can attempt to fit TLS parameters to these ADPs (using, e.g., the program THMA22 or the program anisoanl13,26). Note that this is a process that is distinct from the refinement of TLS parameters. It may be possible to refine both TLS parameters and residual anisotropic U parameters, but there has been no systematic study of this approach.

Choice of Rigid Groups

For the moderate resolutions considered here, the data-to-parameter ratio justifies the use of only a few TLS groups.
Some choices may be obvious, for example, treating each molecule in the asymmetric unit as a single group. Of course, macromolecules are not rigid, but there may be a significant component of the atomic displacements that can be attributed to a rigid body motion, with nonrigid displacements superimposed on top. In addition, a molecule may have obvious domains that can be treated as separate groups, as illustrated schematically in Fig. 1.

As well as modeling the protein molecules, it may be useful to model displacements of large cofactors via TLS parameters. For example, in the study of light-harvesting complex II described below,27 the bacteriochlorophyll and carotenoid molecules were treated as separate TLS groups. In their study of class I peptide–MHC complexes, Rudolph et al.28 modeled the bound peptides as a TLS group. In addition, it could be argued that tightly bound waters should be included within large TLS groups, since their displacements would follow those of the protein to which they are bound. In fact, we have not found this to be helpful, and we recommend that waters be omitted from TLS group definitions.

24 D. Goetz, M. A. Holmes, N. Borregaard, M. E. Bluhm, K. N. Raymond, and R. K. Strong, Mol. Cell 10, 1033 (2002).
25 J. Kuriyan and W. I. Weis, Proc. Natl. Acad. Sci. USA 88, 2773 (1991).
26 M. D. Winn, CCP4 Newsletter, No. 39 (2001).
27 M. Z. Papiz, S. M. Prince, T. Howard, R. J. Cogdell, and N. W. Isaacs, J. Mol. Biol. 326, 1523 (2003).
28 M. Rudolph, et al., unpublished work (2001).

More robust definitions of TLS groups require additional information. If more than one crystal form is available, then dynamic domains can be identified from relative displacements of atoms between the crystal forms. In conflating these dynamic domains with TLS groups, the assumption is that relative displacements between different crystal forms reflect likely displacements within a single crystal form.
The estimation of dynamic domains has been implemented in the computer program dyndom.29 A different method for identifying dynamic domains has been implemented in the program escet.30 Multiple configurations may also be generated by molecular dynamics simulations. One of us has used this method in a TLS refinement of light-harvesting complex II, using restrain.27

TLS groups may also be optimized by fitting to the refined ADPs of a related, high-resolution structure. Such an approach was adopted by Holbrook et al.,10,11 who compared seven different rigid body models of deoxycytidine 5′-phosphate, and used the best (as measured by two indices for the agreement between refined and derived U values) in subsequent TLS refinements of other nucleic acids. Another approach using refined individual ADPs is to use the rigid body criterion,31 in which a matrix is built up between all pairs of atoms, with elements equal to the difference in the projected U values along the interatomic vector. Pairs of atoms belonging to the same quasi-rigid group should have a value close to zero. Brock et al.32 used this approach in their analysis of triphenylphosphine oxide, and similar ideas were used by Schneider33 for the protein SP445. This approach has been implemented in the computer program anisoanl.13,26

Program Input

To do TLS refinement in refmac, the program needs the following information. First, the choice of TLS groups is specified in the TLSIN file. The format of this file follows that used by restrain,12,13 but for input one generally needs to use only the TLS and RANGE records. The TLS record starts a group definition, and includes an optional title. The group definition that follows consists of one or more RANGE records, each of which specifies a range of atoms to be included in the group. Ranges contributing to a single group need not be contiguous, since protein domains are often made up of stretches separated along the primary sequence.
The program also needs to know the number of cycles of TLS refinement. These cycles are performed after initial estimation of scaling parameters and before refining coordinates and B factors. The convergence of the free R value for this stage should be checked to see if more cycles are needed. Convergence of the TLS refinement is usually improved if all individual B factors are initialized to a constant value (e.g., the average B value from earlier rounds of refinement, or the Wilson B factor). The precise value is not important since it will be compensated for by the scaling function. The B values will be refined individually after the TLS parameters have been determined.

29 S. Hayward and H. J. C. Berendsen, Proteins Struct. Funct. Genet. 30, 144 (1998).
30 T. R. Schneider, Acta Crystallogr. D Biol. Crystallogr. 56, 714 (2000).
31 R. E. Rosenfield, K. N. Trueblood, and J. D. Dunitz, Acta Crystallogr. A 34, 828 (1978).
32 C. P. Brock, W. B. Schweizer, and J. D. Dunitz, J. Am. Chem. Soc. 107, 6964 (1985).
33 T. R. Schneider, in "Proceedings of the CCP4 Study Weekend," p. 133 (1996).

When the asymmetric unit contains several molecules of the same species, it is often useful to apply NCS restraints. However, if these molecules have widely differing displacement parameters, as is often the case, then restraining B factors to be similar can be problematic. In the case of TLS refinement, however, B factor restraints are applied between residual B values, which are more likely to be similar, and NCS restraints can be applied.

The parameters needed for TLS refinement can be set by keywords to refmac, and the reader is referred to the refmac documentation for details. TLS refinement is also implemented in the CCP4 Graphical User Interface "ccp4i," and can be selected in the protocol section of the refmac interface.
There are additional interfaces for preparing the TLSIN file, and for analyzing the refined TLS parameters.

Interpretation of Results

The output from TLS refinement in refmac is in most respects identical to that of other modes of refinement. Hence, one should check global statistics such as free R value and correlation coefficient, as well as detailed geometric information. In addition, one obtains the following:

1. Refined TLS parameters written to the log file, to the TLSOUT file, and also to the header of the XYZOUT file. The TLS parameters can be analyzed further with the auxiliary program tlsanl13,21 (see below).
2. Refined B factors in the XYZOUT file. These are the "residual" B factors that are refined separately after the TLS parameters are determined. It is important to note that the residual B factors do not contain any contribution from the TLS parameters, and do not represent the full mean square displacement of the atom.

The TLSOUT file containing the refined TLS parameters has the same format as the TLSIN file described above. In addition to the TLS and RANGE records, it will usually have ORIGIN, T, L, and S records. The T and L records list the six elements of the symmetric T and L tensors, while the S record lists the eight determinable elements of the asymmetric S tensor. The values of the T and S tensors depend on the origin of calculation, and this is given in the ORIGIN record.

The T tensor is given in units of Å², L in units of deg², and S in units of Å deg. T contributes additively to all ADPs in the TLS group [see Eq. (2)], and its values can be compared directly to overall U values. The values of the L tensor are found to depend to a large extent on the size of the TLS group. Finally, values of the S tensor tend to be small for the domain-sized groups considered here. In general, the sizes of the TLS parameters are a good guide to the overall disorder of the group.
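Because L and S are reported in degrees while Eq. (2) assumes angles in radians, a unit conversion is needed before T and L values can be compared on the same footing. The sketch below uses hypothetical helper names and hypothetical numbers. For a pure libration about a single axis, Eq. (2) reduces to a mean square displacement of L r² for an atom at perpendicular distance r from that axis, which is the quantity `peripheral_u` evaluates.

```python
import math

DEG_TO_RAD = math.pi / 180.0
EIGHT_PI_SQ = 8.0 * math.pi ** 2  # standard relation B = 8 pi^2 U


def l_deg2_to_rad2(l_deg2: float) -> float:
    """Convert a mean square libration from deg^2 to rad^2."""
    return l_deg2 * DEG_TO_RAD ** 2


def s_adeg_to_arad(s_adeg: float) -> float:
    """Convert an S element from A deg to A rad."""
    return s_adeg * DEG_TO_RAD


def peripheral_u(l_deg2: float, radius: float) -> float:
    """Mean square displacement (A^2) produced by a libration of l_deg2
    about one axis, for an atom at perpendicular distance radius (A)."""
    return l_deg2_to_rad2(l_deg2) * radius ** 2


def u_to_b(u: float) -> float:
    """Convert a mean square displacement U (A^2) to a B factor (A^2)."""
    return EIGHT_PI_SQ * u


if __name__ == "__main__":
    # Hypothetical domain: 4 deg^2 libration, atom 20 A from the axis.
    u = peripheral_u(4.0, 20.0)
    print(f"U = {u:.3f} A^2, equivalent B = {u_to_b(u):.1f} A^2")
    # A group of twice the radius reaches the same U with a quarter of the L.
    print(f"U = {peripheral_u(1.0, 40.0):.3f} A^2")
```

The second call shows why L values of only a few deg² can still matter for domain-sized groups: the displacement they produce grows with the square of the distance from the axis.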
In particular, if there are several copies of the same molecule in the asymmetric unit, they often have different levels of disorder that are mirrored in the relative values of the TLS parameters (see, e.g., mannitol dehydrogenase, below34). In contrast, the residual B factors are often similar between molecules. NCS restraints are therefore applied between residual B factors rather than full displacement parameters. The size dependence of the L tensor can be rationalized as follows. Derived U values on the periphery of a TLS group are proportional to the radius of the group squared [see Eq. (2)]. If these U values retain typical values irrespective of the size of the TLS group, as seems to be the case, then the size of the L tensor must decrease as the reciprocal of the radius squared. Alternatively, if one adopts a purely dynamic interpretation of the L values (which is unlikely to be true) then one can assume from classic equipartition that a constant amount of energy goes into each libration. Given that the moment of inertia increases as the radius of the group squared, the mean square libration must decrease by a similar factor. An example of the size dependency is given by the peptide–MHC complex described in Case Studies (below).28 One can pass the output files XYZOUT and TLSOUT from refmac to the auxiliary program tlsanl in order to get a clearer picture of the rigid body displacements represented by the T, L, and S tensors (see the next section). In particular, one can look to see whether there are any dominant displacements that may have interesting implications. As noted in the first section, however, caution should be exercised when interpreting TLS results, since these will tend to overestimate rigid body displacements, and will not discriminate between dynamic and static displacements. Having examined the results from a TLS refinement procedure, one should consider other possible choices for TLS groups. 
In particular, one can look to break up large TLS groups into component domains, to see whether an improved description can be obtained. One can, for example, look for step improvements in free R value (see, e.g., the study of light-harvesting complex II described below).27 It should be noted, however, that global statistics are not sensitive to the precise make-up of each TLS group when one is using domain-sized groups.

34 S. Hörer, J. Stoop, H. Mooibroek, U. Baumann, and J. Sassoon, J. Biol. Chem. 276, 27555 (2001).

Analysis with tlsanl

tlsanl13,21 provides various analyses of TLS tensors that can be useful. The TLS parameters are provided via the TLSIN file (the TLSOUT file from refmac or restrain) and analyzed in the context of the structure provided in XYZIN (the XYZOUT file from refmac or restrain). Program operation is controlled by keywords as usual, and we mention briefly two of these in connection with the B factors held in the ATOM records of PDB files. These B factors may represent the "residual" B factors, the equivalent isotropic displacement factors derived from the TLS tensors, or the sum of these two contributions. The keyword BRESID is used to specify that XYZIN contains the residual B factor only, as is the case for refmac output. The keyword ISOOUT controls which of the three possibilities is written to XYZOUT, and is useful for comparing the different contributions.

tlsanl also outputs ANISOU records to XYZOUT, and these include both the contribution from TLS [see Eq. (2)] and the individual isotropic contribution, irrespective of the keyword ISOOUT. This information is useful for a detailed examination of the anisotropy, and for preparing ORTEP-style pictures.35 It must be remembered, however, that the ADPs held in the ANISOU records are not independent, but are derived from the TLS model.
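The three B factor contributions described above can be illustrated with a short sketch. This is not the tlsanl code: the function names are hypothetical, the equivalent isotropic B is taken as 8π² times one-third of the trace of the TLS-derived U tensor, and the ANISOU field layout (U11, U22, U33, U12, U13, U23, each scaled by 10⁴) follows the PDB file format convention.

```python
import math

EIGHT_PI_SQ = 8.0 * math.pi ** 2  # standard relation B = 8 pi^2 U


def b_equivalent(U):
    """Equivalent isotropic B factor (A^2) of a 3x3 U tensor (A^2)."""
    return EIGHT_PI_SQ * (U[0][0] + U[1][1] + U[2][2]) / 3.0


def b_contributions(U_tls, b_residual):
    """Return the residual, TLS-derived, and total B factors for one atom."""
    b_tls = b_equivalent(U_tls)
    return {"residual": b_residual, "tls": b_tls,
            "total": b_residual + b_tls}


def anisou_fields(U_tls, b_residual):
    """Six integers for an ANISOU record: TLS plus isotropic residual,
    in the order U11, U22, U33, U12, U13, U23, scaled by 10^4."""
    u_iso = b_residual / EIGHT_PI_SQ
    order = [(0, 0), (1, 1), (2, 2), (0, 1), (0, 2), (1, 2)]
    return tuple(round(1.0e4 * (U_tls[i][j] + (u_iso if i == j else 0.0)))
                 for i, j in order)


if __name__ == "__main__":
    # Hypothetical TLS-derived tensor for one atom, with residual B = 10 A^2.
    U = [[0.30, 0.02, 0.00], [0.02, 0.20, 0.01], [0.00, 0.01, 0.10]]
    print(b_contributions(U, 10.0))
    print(anisou_fields(U, 10.0))
```

Comparing the "residual" and "total" entries for each atom makes it clear why the B factors in refmac output should not be read as the full mean square displacement.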
Inspection of individual ADPs may inform the choice of TLS group, in that atoms having unreasonable ADPs should be excluded from the TLS group and the refinement rerun.

For each TLS group, tlsanl gives several representations of the T, L, and S tensors. Two coordinate origins are considered:

1. The origin used in refinement, and given by the ORIGIN record in TLSIN.
2. The center of reaction, which is the origin that makes S symmetric and minimizes the trace of T.3

Three axial systems are considered:

1. Orthogonal axes, as used in XYZIN and TLSIN
2. Libration axes, that is, the principal axes of the L tensor
3. Rigid body axes, that is, as calculated from the atomic coordinates

35 M. N. Burnett and C. K. Johnson, "ORTEP-III: Oak Ridge Thermal Ellipsoid Plot Program for Crystal Structure Illustrations." Oak Ridge National Laboratory Report ORNL-6895 (1996).

Full details can be found in Howlin et al.,21 but some representations are useful for checking or for interpretation.

"INPUT TENSOR MATRICES WRT ORTHOGONAL AXES USING ORIGIN OF CALCULATIONS" should echo the contents of the TLSIN file, with the values now displayed as matrices.

"TENSOR MATRICES WRT ORTHOGONAL AXES USING CENTRE OF REACTION": The change of origin implies changes to the T and S matrices, but not L. In particular, S should now be symmetric.

"FOR TLS TENSOR USING CENTRE OF REACTION": For this choice of origin, T, L, and S can be diagonalized to give principal axes. This section gives the orientation of these axes in various coordinate frames, as well as the magnitudes along these axes. A selection of these axes can be output in a format suitable for inclusion in molscript,36 using the AXES keyword. Figure 2 gives an example of libration axes superimposed against the chain of a class I MHC.

The principal axes of T and L should be checked for dominant translations or librations.
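The change of origin to the center of reaction can be sketched from Eq. (1). If the origin is moved by ρ, then the translation becomes t' = t + λ × ρ, so L is unchanged, S' = S + L Aᵀ, and T' = T + A S + Sᵀ Aᵀ + A L Aᵀ, where A is the matrix for which Aλ = λ × ρ. Requiring S' to be symmetric gives three linear equations for ρ. The code below is an illustration under the convention S = ⟨λtᵀ⟩ with consistent units (Å and radians). It is not the tlsanl implementation, and all names and values are hypothetical.

```python
def cross_matrix(r):
    """Return the matrix A for which A applied to lam gives lam x r."""
    x, y, z = r
    return [[0.0, z, -y], [-z, 0.0, x], [y, -x, 0.0]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]


def u_tls(T, L, S, r):
    """Eq. (2) in matrix form: T + A L A^T + A S + S^T A^T."""
    A = cross_matrix(r)
    At = transpose(A)
    terms = [T, matmul(matmul(A, L), At), matmul(A, S),
             matmul(transpose(S), At)]
    return [[sum(m[i][j] for m in terms) for j in range(3)]
            for i in range(3)]


def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(m, b):
    """Solve the 3x3 linear system m x = b by Cramer's rule."""
    d = det3(m)
    return [det3([[b[i] if j == col else m[i][j] for j in range(3)]
                  for i in range(3)]) / d for col in range(3)]


def center_of_reaction(T, L, S):
    """Return (rho, T_new, S_new): the origin shift rho that makes S
    symmetric, and the tensors referred to the shifted origin."""
    tr = L[0][0] + L[1][1] + L[2][2]
    # Antisymmetric part of L A^T, written as a vector, equals K rho.
    K = [[(tr if i == j else 0.0) - L[i][j] for j in range(3)]
         for i in range(3)]
    v = [S[2][1] - S[1][2], S[0][2] - S[2][0], S[1][0] - S[0][1]]
    rho = solve3(K, [-c for c in v])
    LAt = matmul(L, transpose(cross_matrix(rho)))
    S_new = [[S[i][j] + LAt[i][j] for j in range(3)] for i in range(3)]
    T_new = u_tls(T, L, S, rho)  # same algebraic form as Eq. (2) at rho
    return rho, T_new, S_new
```

A useful check is that the ADP of any atom is unchanged when computed from the shifted tensors at the shifted position r − ρ, since the change of origin alters only the description, not the displacements.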
It is usually convenient to quote the eigenvalues of T and L, that is, the magnitudes of the mean square displacements along or about the principal axes, rather than the entire tensor. It may be possible to relate the principal axes and the associated eigenvalues to features of the atomic structure. The S tensor describes the mean square correlation between the translational and librational displacements. The correlation is typically small for large TLS groups, but may play a significant role for smaller groups. The librational displacements may be reinterpreted as screw rotations by the use of nonintersecting axes,3 and a rough guide to the significance of the screw component may be obtained by comparing the nonintersecting screw axes with the original libration axes.

Case Studies

Applications of TLS refinement to glyceraldehyde-3-phosphate dehydrogenase37 and GerE38 have been described in detail elsewhere.20 Here we describe briefly a number of case studies; a summary is given in Table I.

36 P. J. Kraulis, J. Appl. Crystallogr. 24, 946 (1991).
37 M. N. Isupov, T. M. Fleming, A. R. Dalby, G. S. Crowhurst, P. C. Bourne, and J. A. Littlechild, J. Mol. Biol. 291, 651 (1999).

Fig. 2. Example of libration axes as output by the AXES option of tlsanl. The structure is the chain of a class I MHC, as obtained from PDB entry 1fzk.41 Superimposed are the libration axes derived from a TLS refinement (see Class I Peptide–MHC Complexes). The axes intersect at the center of reaction of the TLS, and the length of each axis is proportional to the mean square libration about that axis. Prepared using Molscript36 and Raster3D.39

Light-Harvesting Complex II

Light-harvesting complex II (LH2) from Rhodopseudomonas acidophila is an integral membrane complex composed of bacteriochlorophyll a (Bchl a), carotenoids, and small peptides.40 By trapping solar energy it is involved in the early stages of photosynthesis.
The complex is a nonamer composed of 63 separate molecules; within the crystal the nonameric axis is incorporated into the rhombohedral (R32) 3-fold axis with one-third of the complex in the asymmetric unit. Each asymmetric unit has three copies of the nonameric repeat, each comprising an α and a β peptide, three Bchl a, and one carotenoid molecule (the second carotenoid molecule is ill-defined and excluded). Data were collected to 2.0-Å resolution and a model was initially refined with isotropic B factors to an R value of 0.219 and a free R value of 0.249.

TABLE I
Examples of TLS Refinement Using Refmac (a)

Protein                          Resolution (Å)   Comments on TLS model               PDB code
GAPDH                            2.05             1, 2, and 4 groups compared         1b7g
GerE                             2.05             1 TLS group per monomer             1fse
Light-harvesting complex II      2.0              Several models compared             1kzu
Mannitol dehydrogenase           1.5              1 TLS group per monomer             1h5q
Class I peptide–MHC complexes    1.9–1.7          TLS group for bound peptide         1fzj, 1fzk, 1fzm, 1fzo
Siderophore-binding protein      2.4              TLS groups for ligands              —
S100A12                          1.95             1 TLS group per monomer             1e8a
Benzoate dioxygenase reductase   1.5              1 TLS group per domain              —
GABARAP                          1.75             1 TLS group                         —
rhoE                             2.1              1 TLS group per monomer             —
DLM-1–Z-DNA complex              1.85             Nucleotides divided into 3 groups   1j75
Thioredoxin reductase            3.0              1 TLS group per monomer             1h6v

(a) Note that when TLS refinement occurred after deposition, the quoted PDB entry is for coordinates only.

38 V. M.-A. Ducros, R. J. Lewis, C. S. Verma, E. J. Dodson, G. Leonard, J. P. Turkenburg, G. N. Murshudov, A. J. Wilkinson, and J. A. Brannigan, J. Mol. Biol. 306, 759 (2001).
39 E. A. Merritt and D. J. Bacon, Methods Enzymol. 277, 505 (1997).
40 G. McDermott, S. M. Prince, A. A. Freer, A. M. Hawthornethwaite-Lawless, M. Z. Papiz, R. J. Cogdell, and N. W. Isaacs, Nature 374, 517 (1995).
Subsequently, the model was rerefined with TLS parameters.27 The choice of TLS groups was made by exploring the significance of the improvement made to the refinement as the structure was progressively divided into smaller TLS groups of atoms. For example, one TLS tensor for the whole asymmetric unit reduces R to 0.201 and free R to 0.224, while further subdividing to three TLS tensors, one for each NCS unit, changes R to 0.200 and free R to 0.223. A natural choice of groups is the individual molecules, 18 in all for the asymmetric unit; with 18 TLS tensors we see a further improvement to an R of 0.176 and a free R of 0.198. The α and β peptides each form three domains: two surface-lying segments and one long transmembrane α-helical domain. With three TLS groups for each peptide, there is a total of 30 groups, but there is only a small further improvement to an R of 0.175 and a free R of 0.197.

Figure 3 shows the variation of the R values with the number of TLS parameters. The initial decrease associated with the use of a single TLS group suggests that there is a group of motions correlated over the whole asymmetric unit. This apparent motion may also represent disorder in the lattice and not real nuclear motions. However, a second stepwise improvement occurs when a group of intracomplex nuclear motions is defined by subdividing into the individual molecular groups. In general, for this complex, the TLS motions are dominated by vibrations in the plane of the membrane and in a direction tangential to the ring of the nonamer. Since there are three copies of each molecule in the asymmetric unit, it is possible to check the quality of the TLS refinement which, unlike the coordinates, is not NCS constrained. The equivalent isotropic atomic B factors agree to within 5% for equivalent molecules.
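The comparison of TLS models described above can be tabulated in a few lines. The sketch below is illustrative only: it takes the R and free R values quoted in the text and assumes 20 refinable parameters per TLS group (6 for T, 6 for L, and 8 for S, since the trace of S is indeterminate). The variable and function names are hypothetical.

```python
PARAMS_PER_GROUP = 20  # assumed: 6 (T) + 6 (L) + 8 (S, trace not determined)

# (description, number of TLS groups, R, free R) as quoted for LH2
MODELS = [
    ("isotropic B factors only", 0, 0.219, 0.249),
    ("whole asymmetric unit", 1, 0.201, 0.224),
    ("one group per NCS unit", 3, 0.200, 0.223),
    ("one group per molecule", 18, 0.176, 0.198),
    ("three groups per peptide", 30, 0.175, 0.197),
]


def free_r_steps(models):
    """Drop in free R between successive models, with the parameter counts involved."""
    steps = []
    for previous, current in zip(models, models[1:]):
        _, groups_before, _, free_before = previous
        name, groups_after, _, free_after = current
        total_params = groups_after * PARAMS_PER_GROUP
        added_params = (groups_after - groups_before) * PARAMS_PER_GROUP
        steps.append((name, total_params, round(free_before - free_after, 3), added_params))
    return steps


if __name__ == "__main__":
    for name, total_params, drop, added_params in free_r_steps(MODELS):
        print(f"{name:26s} {total_params:4d} parameters  "
              f"free R drop {drop:.3f}  (+{added_params} parameters)")
```

Laid out this way, the two large drops in free R coincide with the introduction of a single group and with the subdivision into 18 molecular groups, while the intermediate and final subdivisions change free R very little.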
Fig. 3. Plot of R value (dashed line with circles) and free R value (solid line with squares) against the number of TLS parameters included in the model for light-harvesting complex II from R. acidophila.27 The plot shows the sharp improvements associated with the initial 20 parameters, and the increase to 360 parameters, as compared with the smaller improvements with 60 and 600 parameters.

Mannitol Dehydrogenase

Hörer et al.34 have solved the structure of mannitol dehydrogenase from Agaricus bisporus, which crystallized with three tetramers in the asymmetric unit, and diffracted to 1.5 Å. One of the three tetramers was found to have poorer electron density and higher B factors, correlated to the fact that this tetramer has fewer crystal contacts (38) than the other two (49 and 50). TLS refinement was carried out to account for these differences, with a single TLS group refined for each monomer (12 groups overall). After TLS refinement, all tetramers had similar residual B factors and the electron density was much clearer. The TLS parameters for the “bad” tetramer refined to larger values, reflecting the larger overall displacements of this tetramer. The improved description of this tetramer was reflected in the R and free R values, which fell by 3.0 and 2.7%, respectively.

Figure 4 shows the full equivalent B factors from TLS refinement (top three curves), together with the residual B factors (lower three curves), for the three tetrameric units. One of the three tetramers has different total B factors than the other two, reflecting greater disorder. With the TLS contribution removed, the residual B factors are close for all tetramers. Note, however, that NCS restraints were applied to the residual B factors in this plot.

Fig. 4. Equivalent isotropic B values for mannitol dehydrogenase,34 plotted as average main-chain B factor against residue number. The abscissa runs over the 1040 residues of the biological tetramer. The upper three curves show the equivalent isotropic B values from the TLS refinement for the three tetramers in the asymmetric unit. One tetramer (the top line) is clearly more disordered than the other two. The lower three curves show the residual B values, with the TLS contribution removed. These curves are close, and almost indistinguishable. For each residue, the B factors are averaged over the main-chain atoms.

Class I Peptide–MHC Complexes

Rudolph et al.41 have solved the structures of four similar class I peptide–MHC complexes (PDB entries 1fzj, 1fzk, 1fzm, and 1fzo) at resolutions in the range of 1.7 to 1.9 Å. MHC molecules specifically bind peptides in an extended conformation and present them to the T cell receptor on cytotoxic T cells. Class I MHC molecules are heterodimers consisting of a heavy chain (α) and a light chain (β). The bound peptides were eight or nine amino acids long.

The deposited models were rerefined,28 using the TLS procedure of refmac. The α chain was modeled with one or two TLS groups (residues 1–180 and 181–274), while the β chain and the bound peptide were each treated as a single TLS group. The TLS refinement produced no apparent differences in the maps, but a slight decrease in free R value for two of the models: 1fzj reduced from 0.222 to 0.216 and 1fzm from 0.223 to 0.211. The other models had unchanged free R values. Therefore, in this particular case TLS refinement had only a minor effect, but the resulting TLS tensors are illustrative.

In all cases the libration tensor for the peptide is strongly anisotropic, with the largest libration about an axis parallel to the long axis of the peptide. For example, for 1fzk the eigenvalues of L are 64.692, −0.852, and 4.108 (see Table II).
Note that the negative eigenvalue of L is allowed by the refinement procedure, although it can no longer be interpreted in terms of a rigid body model. These results confirm the expectation that the dominant displacement is a libration about the major axis of the extended peptide, which has minimal steric hindrance. In addition, the largest eigenvalue of L for the peptide is significantly greater than any of the eigenvalues for the protein, and also the eigenvalues of L for the β chain (see Fig. 2) are larger than those for the α chain. This is the size effect noted previously.

TABLE II
Eigenvalues of T Tensor with Respect to Center of Reaction and Eigenvalues of L Tensor for Class I MHC Complexed with Sendai Virus Nucleoprotein (a)

Chain   Number of residues   Eigenvalues of T (Å²)   Eigenvalues of L (deg²)
A       274                  0.047, 0.012, 0.000     1.327, 0.547, 0.354
B       99                   0.039, 0.014, 0.003     5.259, 1.342, 0.831
P       9                    0.202, 0.071, 0.037     64.692, −0.852, 4.108

(a) PDB code 1fzk.

41 M. G. Rudolph, J. A. Speir, A. Brunmark, N. Mattsson, M. R. Jackson, P. A. Peterson, L. Teyton, and I. A. Wilson, Immunity 14, 231 (2001).

Siderophore-Binding Protein

Goetz and co-workers24 have solved the structure of a siderophore-binding protein complexed with enterobactin using data to 2.4 Å. The asymmetric unit contained three copies of the protein plus ligand. Each protein molecule has 177 residues, while enterobactin is a cyclic trimer of 2,3-dihydroxybenzoylserine, which forms a compact sphere of approximate radius 5 Å.

TLS refinement was used throughout building and refinement, with one TLS group for each protein and enterobactin molecule, that is, a total of six TLS groups. For the first round of refinement, all B factors were set to a constant value equal to the Wilson B factor, while for subsequent rounds the B factors were not reset. The TLS parameters were, however, reset to zero for each round of refinement.
Refinement was also carried out in CNS (without TLS refinement) and this provided a number of interesting comparisons. The final free R value was 0.27, compared with 0.32 using CNS without TLS refinement. The difference density was in general flatter from REFMAC than from CNS, indicating that the TLS parameters were modeling some of the differences. For the ligand, however, it became clear that some of these differences were in fact model errors. Enterobactin is known to be quite unstable in solution, and had degraded during the time it took to grow the crystals. The TLS refinement tried to compensate for the use of the full enterobactin model, and this is evident from the abnormally large values of the diagonal elements of L (up to 300 deg²). In contrast, the CNS refinement without TLS showed large negative peaks around those ligand atoms that were absent owing to degradation, and this suggested likely degradation products. After modeling the most likely degradation products by removing atoms and breaking bonds, the TLS parameters were more realistic, with diagonal elements of L on the order of 50 deg². If there are a number of degradation products, then it is possible that a significant part of the TLS parameters is still reflecting model error rather than rigid body displacements.

This experience supports the view that TLS refinement is most effective when the model is complete and accurate. Conversely, if TLS refinement performs poorly, then it may indicate problems with the model.

Other Examples

Moroz et al.42 have solved the structure of human calcium-binding protein S100A12 at 1.95-Å resolution, using TLS refinement in the final stages. S100A12 crystallized with two molecules in the asymmetric unit, and each molecule was treated as a single TLS group. TLS refinement contributed a drop of 3% in R and free R values.
Chain B was found to have slightly greater TLS parameters than chain A, but the difference in disorder was too slight to be observed in the electron density.

Karlsson and coworkers43 used TLS refinement on benzoate dioxygenase reductase, a 348-residue protein from an Acinetobacter species. Each protein molecule consists of three similarly sized domains, which bind FAD, NADH, and a 2Fe2S center, respectively. There are two molecules in the asymmetric unit and the best data set extends to 1.5 Å. Conventional refinement without TLS gave an R value of 0.268 and a free R value of 0.291. Examination of the maps and B factors showed that one of the two molecules in the asymmetric unit was more disordered, and in particular that two of the three domains in this molecule were disordered. Running refmac with six TLS groups, one for each domain of the two molecules, lowered the R and free R values to 0.241 and 0.249, respectively. In addition, the quality of the maps was improved.

Keep and co-workers44 have used TLS refinement on GABARAP and rhoE structures. GABARAP is a 117-amino acid protein that crystallized in P2₁2₁2₁ with a single copy in the asymmetric unit, and diffracted to 1.75 Å. The whole protein, excluding water, was modeled with a single TLS group. Refinement gave R and free R values of 0.203 and 0.230, respectively, compared with 0.215 and 0.239 without TLS. Furthermore, it lowered the free R value more than a full anisotropic analysis (while using considerably fewer refinement parameters). The rhoE structure consists of two copies of a 180-amino acid protein with cofactors at 2.1 Å. Each copy of the protein, with associated cofactors, was treated as a separate group. TLS refinement gave R and free R values of 0.181 and 0.214, respectively, compared to 0.192 and 0.226 without TLS. For both structures, differences in the electron density maps were at the level of a few water peaks and some hints of alternative conformations, rather than any major new features.
Schwartz et al.45 included TLS refinement as the final stage of the structure determination of a DLM-1–Z-DNA complex at 1.85 Å. They achieved a 3% reduction in R values, but comment that the selection of TLS groups was crucial. Each nucleotide was divided into three groups (in a similar manner to Holbrook and Kim10 and Holbrook et al.11), while the protein chain was treated as a single group.

Finally, Sandalova et al.46 used TLS refinement on a model of a mammalian thioredoxin reductase at 3.0 Å. The structure has three dimers in the asymmetric unit, and each monomer was treated as a separate TLS group.

42 O. V. Moroz, A. A. Antson, G. N. Murshudov, N. J. Maitland, G. G. Dodson, K. S. Wilson, I. Skibshoj, E. M. Lukanidin, and I. B. Bronstein, Acta Crystallogr. D Biol. Crystallogr. 57, 20 (2001).
43 A. Karlsson, Z. M. Beharry, D. M. Eby, E. D. Coulter, E. L. Neilde, D. M. Kurtz, H. Eklund, and S. Ramaswamy, J. Mol. Biol. 318, 261 (2002).
44 N. Keep, H. Garavini, K. Reinto, J. P. Phelan, M. S. B. McAlister, and A. J. Ridley, Biochemistry 41, 6303 (2002).
45 T. Schwartz, J. Behlke, K. Lowenhaupt, U. Heinemann, and A. Rich, Nat. Struct. Biol. 8, 761 (2001).
46 T. Sandalova, L. Zhong, Y. Lindqvist, A. Holmgren, and G. Schneider, Proc. Natl. Acad. Sci. 98, 9533 (2001).

Conclusions

TLS parameterization of anisotropic displacement parameters has been around for many years, including a number of examples of TLS refinement of macromolecules. However, it is only now that TLS refinement is beginning to be used routinely. It is clear from the above-described examples that, with the addition of a small number of extra refinement parameters, a much improved description of diffraction data can be obtained. As well as the global improvement resulting from the inclusion of anisotropy, as evidenced by the free R value, a number of other pieces of information may be obtained from TLS refinement.

1. The overall level of disorder of monomers or domains can be quantified and perhaps rationalized, as in the case of mannitol dehydrogenase described above.
2. Removal of the domain-level displacements may reveal local displacements that are features of the local geometry, and therefore exist for all copies of a molecule.
3. If the TLS refinement performs poorly, and the TLS parameters are unreasonable, then this may be a clue to the necessity of rebuilding, as in the case of the siderophore-binding protein above.
4. Comparison of different parameterizations can help to categorize TLS modes as lattice modes, monomer displacements, or internal modes; see the analysis of light-harvesting complex II given above.

The displacement parameters determined from diffraction data have many sources. The TLS parameterizations described in this chapter can be seen as a simple attempt to separate global displacements from local displacements. Other parameterizations may identify other collective displacements, such as molecular normal modes.4–6 With a correct interpretation of these modes, we can perhaps say something about the biology of these molecules. This is still a relatively unexplored area but has great potential.

Acknowledgments and Program Availability

M.D.W. and G.N.M. are supported by the BBSRC through a CCP4 grant (B10200). M.D.W. is grateful for all feedback from users of the TLS option in refmac. We are particularly grateful to Stefan Hörer for the description of the refinement of mannitol dehydrogenase, and for the data used to produce Fig. 4; to Markus Rudolph for the description of the refinement of peptide–MHC complexes, and for the data used to produce Fig. 2; and to David Goetz, Andreas Karlsson, and Nicholas Keep for the examples of siderophore-binding protein, benzoate dioxygenase reductase, and GABARAP and rhoE, respectively.
refmac, tlsanl, and anisoanl are all available as part of the CCP4 software suite, from release 4.1. See http://www.ccp4.ac.uk for information on downloading and licensing. The operation of these programs as described in this chapter is that pertaining to version 4.1.

[15] Structural Information Content at High Resolution: MAD versus Native

By Alberto Podjarny, Thomas R. Schneider, Raul E. Cachau, and Andrzej Joachimiak

Introduction

Structure determination by X-ray crystallography is firmly established as the main way of obtaining accurate three-dimensional information about molecular structure. In the case of macromolecules, the last decade has seen an exponential growth in the number of structures solved,1 and this tendency is gaining even more speed with current efforts in structural genomics.2 The increase in the speed and number of structures has been accompanied by a remarkable improvement in the quality of the data collected thanks to the introduction of third-generation synchrotron X-ray sources. Thus, today X-ray crystallography is the method of choice to obtain macromolecular details at subatomic resolutions in the absence of neutron

1 W. G. Schulz, Chem. Eng. News 79, 23 (2001).
2 U. Heinemann, G. Illing, and H. Oschkinat, Curr. Opin. Biotechnol. 12, 348 (2001).

METHODS IN ENZYMOLOGY, VOL. 374 Copyright 2003, Elsevier Inc. All rights reserved. 0076-6879/03 $35.00