Vol 465 | 13 May 2010 | doi:10.1038/nature08993
Stepwise [FeFe]-hydrogenase H-cluster assembly
revealed in the structure of HydADEFG
David W. Mulder1,2, Eric S. Boyd1,2, Ranjana Sarma1,2, Rachel K. Lange1,2, James A. Endrizzi1,2, Joan B. Broderick1,2
& John W. Peters1,2
Complex enzymes containing Fe–S clusters are ubiquitous in nature,
where they are involved in a number of fundamental processes
including carbon dioxide fixation, nitrogen fixation and hydrogen
metabolism1,2. Hydrogen metabolism is facilitated by the activity of
three evolutionarily and structurally unrelated enzymes: the [NiFe]-hydrogenases, [FeFe]-hydrogenases and [Fe]-hydrogenases3,4 (Hmd).
The catalytic core of the [FeFe]-hydrogenase (HydA), termed the
H-cluster, exists as a [4Fe–4S] subcluster linked by a cysteine thiolate
to a modified 2Fe subcluster with unique non-protein ligands5,6. The
2Fe subcluster and non-protein ligands are synthesized by the hydrogenase maturation enzymes HydE, HydF and HydG; however, the
mechanism, synthesis and means of insertion of H-cluster components remain unclear7–10. Here we show the structure of HydADEFG
(HydA expressed in a genetic background devoid of the active site
H-cluster biosynthetic genes hydE, hydF and hydG) revealing the
presence of a [4Fe–4S] cluster and an open pocket for the 2Fe subcluster. The structure indicates that H-cluster synthesis occurs in a
stepwise manner, first with synthesis and insertion of the [4Fe–4S]
subcluster by generalized host-cell machinery11,12 and then with
synthesis and insertion of the 2Fe subcluster by specialized hydE-,
hydF- and hydG-encoded maturation machinery7–10. Insertion of the
2Fe subcluster presumably occurs through a cationically charged
channel that collapses following incorporation, as a result of conformational changes in two conserved loop regions. The structure,
together with phylogenetic analysis, indicates that HydA emerged
within bacteria most likely from a Nar1-like ancestor lacking the
2Fe subcluster, and that this was followed by acquisition in several unicellular eukaryotes.
The biosynthesis and assembly of active-site metallo-cofactors
requires multiple enzymes, scaffolds and carriers2,11,13. For [FeFe]-hydrogenases, the gene products HydE, HydF and HydG are required
for the maturation of the active-site H-cluster14 (Fig. 1). These gene
products function to couple radical S-adenosyl-L-methionine (SAM)
chemistry and nucleotide binding and hydrolysis to ligand synthesis,
cluster assembly and insertion, and, ultimately, [FeFe]-hydrogenase
maturation7–9,15,16. Although several plausible schemes have been proposed for the generation of the carbon monoxide, cyanide and dithiolate ligands at the Fe site, including radical SAM-mediated sulphur
insertion coupled to the decomposition or condensation of amino
acids7–9,17,18, the precise mechanism by which the various enzymes, scaffolds and carriers coordinate H-cluster maturation is unknown. Owing
to their high catalytic rates of hydrogen production, much interest
surrounds [FeFe]-hydrogenases as alternative biological catalysts to
those containing precious metals such as platinum in hydrogen-fuel-cell technology. Advancements in understanding how the H-cluster is
synthesized by HydE, HydF and HydG could contribute significantly to
both the genetic engineering of hydrogen-producing microorganisms
and the synthesis of biomimetic hydrogen-production catalysts.
HydF has been shown to transfer a cluster precursor to HydA in
the final stage of [FeFe]-hydrogenase maturation and may act as a
scaffold on which an H-cluster precursor is assembled8. Our most
recent results indicate that the H-cluster maturation machinery (the
activities of HydE, HydF and HydG) is directed at the synthesis of
only the 2Fe unit of the 6Fe cluster, and that the [4Fe–4S] subcluster
can be synthesized independently10. To better understand the synthesis and insertion of the individual [4Fe–4S]-subcluster and 2Fe-subcluster components of the H-cluster during [FeFe]-hydrogenase
maturation, we determined the X-ray crystal structure of HydADEFG.
We determined the structure of HydADEFG from Chlamydomonas
reinhardtii, heterologously expressed in Escherichia coli, by molecular
replacement, using the structure of the [FeFe]-hydrogenase from
Clostridium pasteurianum5 (CpI) as a search model, and refined it
to a resolution of 1.97 Å (Fig. 2a). We focused the study on the
[FeFe]-hydrogenase from C. reinhardtii because our complementary
biochemical and spectroscopic analyses examining maturation were
conducted using this enzyme10. In addition, C. reinhardtii HydA is of
biotechnological interest and no structural information about it yet exists.
The overall structure of C. reinhardtii HydADEFG (Fig. 2a) is similar
to that observed for active-site domains of the previously characterized
[FeFe]-hydrogenases from C. pasteurianum5 (Fig. 2c) and Desulfovibrio
Figure 1 | Ball-and-stick representation of the H-cluster in [FeFe]-hydrogenase from Clostridium pasteurianum29. The H-cluster is bound to
the protein (tube representation) by four cysteine ligands of the [4Fe–4S]
subcluster, which is further linked by a cysteine thiolate ligand to a 2Fe
subcluster with unique non-protein ligands including CO, CN– and a
dithiolate ligand30. A water molecule is coordinated to the distal Fe of the 2Fe
subcluster in the presumed-active, oxidized state. Dark red, Fe; orange, S;
grey, C; blue, N; light red, oxygen; magenta, central atom of the dithiolate
ligand. Protein Data Bank ID, 3C8Y.
1Astrobiology Biogeocatalysis Research Center, 2Department of Chemistry & Biochemistry, Montana State University, Bozeman, Montana 59717, USA.
©2010 Macmillan Publishers Limited. All rights reserved
NATURE | Vol 465 | 13 May 2010
desulfuricans6 (DdH) purified from the native hosts. The primary difference in [FeFe]-hydrogenase expressed in a background devoid of the
maturation machinery (that is, in HydADEFG) is the absence of the 2Fe
subcluster (Figs 2b, d and 3), which leaves an open cavity adjacent to the
[4Fe–4S] cluster that is linked to the protein surface through an open,
cationically charged channel. The channel is formed by significant
structural rearrangement in two loop regions (loop 1, residues
240–255; loop 2, residues 279–286) in HydADEFG, as revealed by the
superimposition of HydADEFG and CpI (Supplementary Fig. 1). These
structural differences indicate that the two loop regions adopt an
alternative conformation upon insertion of the 2Fe subcluster, effectively closing the channel and shielding the active site from surface
exposure. A sequence alignment of HydA from a diversity of organisms
indicates that loop 1 is highly conserved and that loop 2 is partly
conserved in HydA from bacteria and several unicellular eukaryotes
including C. reinhardtii, but not in eukaryotic Nar1 homologues (Supplementary Figs 2 and 3). Nar1 homologues contain only the [4Fe–4S]
cluster and are present in the genomes of nearly all eukaryotes, where
they function in cytosolic and nuclear Fe–S-cluster maturation19.
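The loop-conservation statement above rests on inspection of a multiple sequence alignment. As a minimal sketch of how such a comparison can be quantified, the following scores each alignment column by the fraction of sequences sharing the most common residue. The function name `column_conservation` and the toy sequences are hypothetical illustrations, not the alignment or the scoring used in this work.

```python
from collections import Counter

def column_conservation(alignment, start, end):
    """Per-column conservation over alignment columns [start, end).

    alignment: list of equal-length aligned sequences (strings).
    Returns, for each column, the fraction of sequences that carry the
    most common character in that column (a gap counts as a character).
    """
    scores = []
    for col in range(start, end):
        column = [seq[col] for seq in alignment]
        top_count = Counter(column).most_common(1)[0][1]
        scores.append(top_count / len(column))
    return scores

# Toy, made-up aligned fragments (NOT real HydA or Nar1 sequences)
toy_alignment = ["GCPAK", "GCPSK", "GCPAR", "ACPAK"]
print(column_conservation(toy_alignment, 0, 5))
```

A region would be called highly conserved under this simple measure when most of its columns score close to 1; more refined measures (for example, entropy-based scores) weight the residue distribution differently.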
Although it is clear that an intact 2Fe subcluster is not present at
the active site of HydADEFG (Fig. 3 and Supplementary Fig. 4), analysis of Fo–Fc electron density maps reveals some residual density
adjacent to the [4Fe–4S] cluster where the 2Fe subcluster would be
expected to reside (Fig. 3a and Supplementary Figs 4a and 5a). To
verify the absence of Fe in the residual density, we performed Fe-edge
anomalous-difference Fourier analysis, which confirmed that the
only anomalous scatterers at the Fe absorption edge were within the
[4Fe–4S] cluster (Fig. 3a and Supplementary Figs 4a and 5b). An
acetate molecule and a chloride ion, both of which were present in
the crystallization buffer, could be modelled and refined reasonably
well into the unknown density. Thus, the active site seems to exist as a
[4Fe–4S] cluster adjoined to an open binding cavity for insertion of
the 2Fe subcluster.
The overall surface representations of C. reinhardtii HydADEFG
reveal a positively charged channel leading to the active-site area
(Fig. 4a–c), a feature that is not present in the intact active hydrogenase,
presumably owing to conformational change following insertion of the
2Fe subcluster (Fig. 4d). The channel for 2Fe-subcluster insertion is
solvent accessible, 8–15 Å in width and ~25 Å deep, with positively
Figure 2 | X-ray crystal structure of
C. reinhardtii HydADEFG determined to a
resolution of 1.97 Å, compared with HydA from
CpI. a, Ribbon diagram of the overall HydADEFG
structure, with a space-filling representation of
the associated [4Fe–4S] cluster. The two
conserved loop regions thought to undergo
major conformational rearrangement are
coloured green. b, HydADEFG active-site area,
where a 2Fe subcluster recipient cavity is adjacent
to the [4Fe–4S] cluster. c, Ribbon diagram of the
overall CpI structure in the same orientation as
HydADEFG, with a space-filling representation of
the intact H-cluster. The regions of CpI
corresponding to the loop regions of HydADEFG
shown in a are coloured green. d, CpI active-site
region, with ball-and-stick representation of the
H-cluster. Protein representations are coloured
according to secondary structure (light blue,
α-helices and loops; violet, β-sheets). The atomic
colouring scheme is the same as in Fig. 1.
charged residues (Arg 275, Lys 288 and Lys 409) lining the channel
entrance. Lys 188, whose equivalent residue (Lys 358) hydrogen-bonds
to a cyanide ligand of the H-cluster in the intact CpI hydrogenase5, lies
at the end of the channel at the active cavity and may have a role in
orienting the 2Fe subcluster during insertion. Importantly, the
bridging cysteine thiolate ligand (Cys 381) between the [4Fe–4S] cluster and 2Fe subcluster is located ~19 Å into the channel, and its sulphur side chain is exposed on the surface of the channel, providing the
site for covalent attachment of the 2Fe subcluster following insertion.
The structure of HydADEFG presented here provides strong support for a stepwise mechanism for H-cluster biosynthesis in which
[4Fe–4S]-subcluster insertion precedes 2Fe-subcluster insertion.
Previously we showed that [4Fe–4S]-subcluster synthesis and 2Fe-subcluster synthesis occur independently10. Here we show that the
[4Fe–4S] subcluster makes up part of a binding cavity that is linked
to a channel leading to the surface of the protein. If the [4Fe–4S] subcluster were absent, the binding cavity would not exist. Furthermore,
because the [4Fe–4S] subcluster is at the base of the channel, it would be
impossible to insert it after insertion of the 2Fe subcluster as the pathway would effectively be blocked. In addition, if the process were conserved through all organisms, alternative paths for insertion of the
[4Fe–4S] cluster would be precluded in the majority of organisms that
express [FeFe]-hydrogenases with accessory Fe–S-cluster domains
adjacent to the [4Fe-4S] subcluster of the H-cluster. Therefore, the
structural observations from HydADEFG indicate that the [4Fe–4S]
subcluster must be synthesized and inserted first, by Fe–S-cluster
generalized host-cell machinery11,12, and that this is followed by the
synthesis and insertion of the 2Fe subcluster by Hyd-protein maturation machinery7–10.
The implied structural rearrangement that occurs during 2Fe-subcluster insertion reveals an interesting parallel to the maturation
of the nitrogenase complex. Molybdenum nitrogenase comprises an
iron protein (NifH) and a MoFe-protein (NifDK), with the latter
containing P-clusters (8Fe–7S) and FeMo-cofactors (Mo–7Fe–9S–
X–homocitrate, where X = C, O or N)20,21. Structural determination
of a form of Azotobacter vinelandii molybdenum nitrogenase (Av1)
NifDK deficient in FeMo-cofactor (Av1DnifB) reveals major structural
rearrangement in the domain involved in FeMo-cofactor binding,
resulting in channel formation to the FeMo-cofactor site in Av1DnifB
Figure 3 | Active-site comparison between C. reinhardtii HydADEFG and
HydA from CpI. a, The [4Fe–4S]-cluster active-site environment in
HydADEFG. The anomalous-difference Fourier map (blue) is shown
contoured at 4.5σ, indicating the positions of the Fe atoms localized to the
[4Fe–4S] cluster. An acetate molecule and a chloride ion are modelled into the
Fo–Fc map (magenta, contoured at 3.5σ) of the HydADEFG cavity. Residues are
labelled according to single-letter amino-acid abbreviations and sequence
number. The side chain of Cys 129 has increased conformational freedom and
can be refined in three conformations (one shown). b, H-cluster active-site
environment in CpI, shown in the same orientation as HydADEFG in a. The
atomic colouring scheme is the same as in Fig. 1.
(ref. 22; Fig. 4e–g). Superimposition of Av1 NifDK (containing
FeMo-cofactor; ref. 20) and Av1DnifB NifDK (lacking FeMo-cofactor;
ref. 22) indicates that the FeMo-cofactor site is at the end of a cationic
channel (Fig. 4f, g) that is absent from the structure of Av1 NifDK
(Fig. 4h). The structural rearrangements resulting in the formation of
a cationic channel in both HydA and NifDK indicate that the process
for complex Fe–S-cluster insertion into apoproteins may be conserved and that the evolution of functional HydA may have occurred
stepwise, a feature that is consistent with the evolution of nitrogenase23 and possibly Fe–S enzymes in general.
Phylogenetic analysis indicates that HydA and Nar1 homologues
form two separate stem lineages (Supplementary Figs 2 and 3), one of
which comprises Nar1 lineages from eukaryotic genomes that lack hydE,
hydF and hydG, and the other of which comprises well-supported
lineages of HydA from bacteria and several eukaryotic genomes that
generally contain hydE, hydF and hydG. The earliest-branching HydA
lineages are from bacteria with HydA from C. reinhardtii and
Trichomonas vaginalis nested among bacterial sequences, suggesting
Figure 4 | Channels for insertion into hydrogenase and nitrogenase during
complex Fe–S-cluster assembly. a, Surface representation of HydADEFG (grey)
superimposed on native CpI (green ribbon and ball and stick representations).
b, HydADEFG channel leading to the active-site 2Fe-subcluster recipient cavity.
c, d, Electrostatic surface representations of HydADEFG (c) and the catalytic
domain of CpI (d), calculated using the PYMOL plug-in APBS. Red, negative
(−10 kBT/e); blue, positive (+10 kBT/e); kB, Boltzmann’s constant; e, elementary charge.
e, Surface representation of the FeMo-cofactor-deficient form of NifDK
(Av1DnifB) (grey; PDB ID, 1L5H; ref. 22) superimposed on NifDK from native
nitrogenase MoFe-protein (Av1) (PDB ID, 3MIN; ref. 20) (green ribbon and
ball and stick representations). f, Av1DnifB NifDK channel leading to the active-site FeMo-cofactor recipient cavity. g, h, Electrostatic surface representations
of Av1DnifB NifDK (g) and Av1 NifDK (h) calculated with identical APBS
parameters as in c and d and represented using the same colouring scheme. The
atomic colouring scheme is the same as in Fig. 1 and the unknown atom (X) of
the FeMo-cofactor in f is blue.
that these HydA derive from lateral gene transfer from a bacterium
and/or endosymbiosis of a bacterium24. Collectively, these results indicate that the origin of the 2Fe subcluster containing HydA post-dates the
divergence of bacteria and archaea, a proposal that is consistent with the
absence of HydA from archaea. Consistent with previous results24,25, this
set of observations implies that Nar1 was acquired in eukarya by means
of endosymbiosis of, or lateral gene transfer with, a bacterium24. This
event is likely to have occurred before the recruitment of hydE, hydF and
hydG and the development of the ability to generate the 2Fe subcluster, a
hypothesis that is supported by the lack both of conservation in Nar1
loop regions and of hydE, hydF and hydG in eukaryotes with Nar1 (see
Supplementary Information for further discussion of this).
The structure of HydADEFG presented here reveals the stepwise
assembly of the H-cluster in [FeFe]-hydrogenases and the structural
pathway for 2Fe-subcluster insertion in the final step of [FeFe]-hydrogenase maturation. By providing significant insights into
H-cluster biosynthesis, the results provide a foundation for enhancing
mechanisms of biological hydrogen production in genetically engineered hydrogenases, for the synthesis of biomimetic catalysts and,
thus, ultimately for developing hydrogen as a renewable fuel. In addition, this structure reveals several novel and unifying themes for Fe–S-enzyme biosynthesis and evolution. Both the FeMo-cofactors of
NifDK and the 2Fe subcluster of the H-cluster of HydA are synthesized
on specialized scaffold proteins and are inserted into the respective
enzymes through a conserved, cationically charged channel, resulting
in catalytically active proteins7–9,13,26. Recent evolutionary studies of
proteins involved in synthesizing the FeMo-cofactor of nitrogenase
indicate that, like the 2Fe subcluster of HydA, the FeMo-cofactor in
the active site of NifDK was not present in the last universal common
ancestor and is thus a more recent innovation23. Therefore, the stepwise evolution of complex Fe–S metalloenzymes may be pervasive in
biology, resulting in protein complexes with improved specificity and/
or enhanced catalytic efficiency.
HydADEFG from C. reinhardtii was heterologously expressed in E. coli and purified
as described previously10. Crystal screens were set up in a nitrogen-atmosphere
glovebox (Unilab, MBRAUN), using the anaerobic microcapillary batch diffusion
method27. Favourable conditions for crystal growth at room temperature (298 K)
were found using 25.5% polyethylene glycol 8000 as a precipitant, 0.085 M sodium
cacodylate (pH 6.5), 0.17 M sodium acetate trihydrate and 1 mM dithionite. The
data were collected from a single flash-cooled crystal on beamline 9-1 (wavelength,
0.954 Å; resolution, 1.97 Å) and beamline 9-2 (wavelength, 1.74 Å; resolution,
3.00 Å) at the Stanford Synchrotron Radiation Lightsource (SSRL; Supplementary Table 1). The structure of HydADEFG was solved using molecular replacement
with CpI as a search model (PDB ID, 1FEH; ref. 5), and the final model was refined
using data at 1.97 Å to Rfactor = 17.0% and Rfree = 21.6%. Homologues of
HydADEFG were compiled using BLASTP, and phylogenetic analyses were conducted using Bayesian inference and maximum-likelihood approaches. Figures
were prepared using PYMOL28.
Full Methods and any associated references are available in the online version of
the paper at www.nature.com/nature.
Received 21 October 2009; accepted 5 March 2010.
Published online 25 April 2010.
1. Drennan, C. L. & Peters, J. W. Surprising cofactors in metalloenzymes. Curr. Opin.
Struct. Biol. 13, 220–226 (2003).
2. Fontecilla-Camps, J. C., Amara, P., Cavazza, C., Nicolet, Y. & Volbeda, A.
Structure-function relationships of anaerobic gas-processing metalloenzymes.
Nature 460, 814–822 (2009).
3. Vignais, P. M. & Billoud, B. Occurrence, classification, and biological function of
hydrogenases: an overview. Chem. Rev. 107, 4206–4272 (2007).
4. Shima, S. & Thauer, R. K. A third type of hydrogenase catalyzing H2 activation.
Chem. Rec. 7, 37–46 (2007).
5. Peters, J. W., Lanzilotta, W. N., Lemon, B. J. & Seefeldt, L. C. X-ray crystal structure
of the Fe-only hydrogenase (CpI) from Clostridium pasteurianum to 1.8 angstrom
resolution. Science 282, 1853–1858 (1998).
6. Nicolet, Y., Piras, C., Legrand, P., Hatchikian, C. E. & Fontecilla-Camps, J. C.
Desulfovibrio desulfuricans iron hydrogenase: the structure shows unusual
coordination to an active site Fe binuclear center. Structure 7, 13–23 (1999).
7. Nicolet, Y. et al. X-ray structure of the [FeFe]-hydrogenase maturase HydE from
Thermotoga maritima. J. Biol. Chem. 283, 18861–18872 (2008).
8. McGlynn, S. E. et al. HydF as a scaffold protein in [FeFe] hydrogenase H-cluster
biosynthesis. FEBS Lett. 582, 2183–2187 (2008).
9. Pilet, E. et al. The role of the maturase HydG in [FeFe]-hydrogenase active site
synthesis and assembly. FEBS Lett. 583, 506–511 (2009).
10. Mulder, D. W. et al. Activation of HydADEFG requires a preformed [4Fe-4S]
cluster. Biochemistry 48, 6240–6248 (2009).
11. Lill, R. Function and biogenesis of iron-sulphur proteins. Nature 460, 831–838 (2009).
12. Johnson, D. C., Dean, D. R., Smith, A. D. & Johnson, M. K. Structure, function, and
formation of biological iron-sulfur clusters. Annu. Rev. Biochem. 74, 247–281 (2005).
13. Schwarz, G., Mendel, R. R. & Ribbe, M. W. Molybdenum cofactors, enzymes and
pathways. Nature 460, 839–847 (2009).
14. Posewitz, M. C. et al. Discovery of two novel radical S-adenosylmethionine
proteins required for the assembly of an active [Fe] hydrogenase. J. Biol. Chem.
279, 25711–25720 (2004).
15. Rubach, J. K., Brazzolotto, X., Gaillard, J. & Fontecave, M. Biochemical
characterization of the HydE and HydG iron-only hydrogenase maturation
enzymes from Thermatoga maritima. FEBS Lett. 579, 5055–5060 (2005).
16. Brazzolotto, X. et al. The [Fe-Fe]-hydrogenase maturation protein HydF from
Thermotoga maritima is a GTPase with an iron-sulfur cluster. J. Biol. Chem. 281,
17. McGlynn, S. E., Mulder, D. W., Shepard, E. M., Broderick, J. B. & Peters, J. W.
Hydrogenase cluster biosynthesis: organometallic chemistry nature’s way. Dalton
Trans. 22, 4274–4285 (2009).
18. Peters, J. W., Szilagyi, R. K., Naumov, A. & Douglas, T. A radical solution for the
biosynthesis of the H-cluster of hydrogenase. FEBS Lett. 580, 363–367 (2006).
19. Balk, J., Pierik, A. J., Netz, D. J., Muhlenhoff, U. & Lill, R. The hydrogenase-like
Nar1p is essential for maturation of cytosolic and nuclear iron-sulphur proteins.
EMBO J. 23, 2105–2115 (2004).
20. Peters, J. W. et al. Redox-dependent structural changes in the nitrogenase
P-cluster. Biochemistry 36, 1181–1187 (1997).
21. Einsle, O. et al. Nitrogenase MoFe-protein at 1.16 Å resolution: a central ligand in
the FeMo-cofactor. Science 297, 1696–1700 (2002).
22. Schmid, B. et al. Structure of a cofactor-deficient nitrogenase MoFe protein.
Science 296, 352–356 (2002).
23. Fani, R., Gallo, R. & Lio, P. Molecular evolution of nitrogen fixation: the evolutionary
history of the nifD, nifK, nifE, and nifN genes. J. Mol. Evol. 51, 1–11 (2000).
24. Meyer, J. [FeFe] hydrogenases and their evolution: a genomic perspective. Cell.
Mol. Life Sci. 64, 1063–1084 (2007).
25. Hug, L. A., Stechmann, A. & Roger, A. J. Phylogenetic distributions and histories of
proteins involved in anaerobic pyruvate metabolism in eukaryotes. Mol. Biol. Evol.
27, 311–324 (2010).
26. Rubio, L. M. & Ludden, P. W. Biosynthesis of the iron-molybdenum cofactor of
nitrogenase. Annu. Rev. Microbiol. 62, 93–111 (2008).
27. Georgiadis, M. M. et al. Crystallographic structure of the nitrogenase iron protein
from Azotobacter vinelandii. Science 257, 1653–1659 (1992).
28. DeLano, W. L. PyMOL Molecular Viewer <http://www.pymol.org> (2002).
29. Pandey, A. S., Harris, T. V., Giles, L. J., Peters, J. W. & Szilagyi, R. K.
Dithiomethylether as a ligand in the hydrogenase H-cluster. J. Am. Chem. Soc. 130,
30. Silakov, A., Wenk, B., Reijerse, E. & Lubitz, W. 14N HYSCORE investigation of the
H-cluster of [FeFe] hydrogenase: evidence for a nitrogen in the dithiol bridge.
Phys. Chem. Chem. Phys. 11, 6592–6599 (2009).
Supplementary Information is linked to the online version of the paper at www.nature.com/nature.
Acknowledgments This work was supported by a US Air Force Office of Scientific
Research Multidisciplinary University Research Initiative Award
(FA9550-05-01-0365, J.W.P.) and the NASA Astrobiology Institute (NAI)-funded
Astrobiology Biogeocatalysis Research Center (NNA08C-N85A, J.B.B. and
J.W.P.). E.S.B. was supported by a NAI postdoctoral fellowship. Portions of this
research were carried out at the Stanford Synchrotron Radiation Lightsource
(SSRL), a national user facility operated by Stanford University on behalf of the US
Department of Energy, Office of Basic Energy Sciences. The SSRL Structural
Molecular Biology programme is supported by the US Department of Energy,
Office of Biological and Environmental Research, the US National Institutes of
Health, National Center for Research Resources, Biomedical Technology
programme, and the US National Institute of General Medical Sciences.
Author Contributions The structural work was conducted by D.W.M. with
contributions from R.S. and J.A.E. E.S.B. led the phylogenetic work with
contributions from R.K.L. J.W.P. supervised the work with assistance from J.B.B.
D.W.M., E.S.B. and J.W.P. led the manuscript preparation with contributions from
J.B.B., R.S., R.K.L. and J.A.E.
Author Information Coordinates and structure factors of C. reinhardtii HydADEFG
have been deposited in the Protein Data Bank under the accession code 3LX4.
Reprints and permissions information is available at www.nature.com/reprints.
The authors declare no competing financial interests. Correspondence and
requests for materials should be addressed to J.W.P.
Structure determination and refinement. HydADEFG from C. reinhardtii was
heterologously expressed in E. coli and purified under strict anaerobic conditions
as described previously10. Before crystallization, purified HydADEFG was diluted
over a Sephadex G-25 column (GE Healthcare) in 50 mM Tris buffer (pH 7.8)
with 300 mM NaCl, 20% glycerol and 1 mM dithionite, to a final concentration
of 28 mg ml⁻¹. Crystals were obtained at room temperature (298 K) by means of
the anaerobic microcapillary batch diffusion method27 in a nitrogen-atmosphere
glovebox (Unilab, MBRAUN), using 25.5% polyethylene glycol 8000 as precipitant and 0.085 M sodium cacodylate (pH 6.5) with 0.17 M sodium acetate trihydrate and 1 mM dithionite.
The initial data were collected from a single flash-cooled crystal on beamline
9-1 at the SSRL, with a continuous flow of liquid nitrogen at 100 K, and a single-wavelength data set (wavelength, 0.954 Å) was collected up to a resolution of
1.97 Å (Supplementary Table 1). An additional single anomalous data set at the
Fe edge (wavelength, 1.74 Å) was collected at a later time using the same crystal
on beamline 9-2 at the SSRL, up to a resolution of 3.00 Å (Supplementary Table
1). The data at 1.97 Å were processed and scaled using DENZO and SCALPACK
of the HKL-2000 software package (version 1.98.7)31.
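The choice of 1.74 Å for the anomalous data set can be checked with the standard photon wavelength-to-energy conversion, E = hc/λ. The short sketch below converts the two wavelengths quoted above to photon energies; the Fe K-edge value of 7.112 keV is a tabulated reference value, not a number taken from this work, and the function name is illustrative.

```python
# hc expressed in keV·Å (CODATA value of hc, rounded)
HC_KEV_ANGSTROM = 12.398419843

# Tabulated Fe K absorption edge (reference value, not from this paper)
FE_K_EDGE_KEV = 7.112

def wavelength_to_energy_kev(wavelength_angstrom):
    """Photon energy in keV for an X-ray wavelength given in angstroms."""
    return HC_KEV_ANGSTROM / wavelength_angstrom

for label, wavelength in [("native, beamline 9-1", 0.954),
                          ("Fe edge, beamline 9-2", 1.74)]:
    energy = wavelength_to_energy_kev(wavelength)
    print(f"{label}: {wavelength} Å = {energy:.3f} keV "
          f"({energy - FE_K_EDGE_KEV:+.3f} keV relative to the Fe K edge)")
```

Under these assumptions the 1.74 Å data set lies just above the Fe K edge, where the anomalous signal from Fe is strong, whereas the 0.954 Å data set lies far above it.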
The structure was solved by molecular replacement using AutoMR of the
CCP4 suite of programs (version 6.0)32 with CpI (PDB ID, 1FEH; ref. 5) as a
search model. The solution was subjected to rigid-body refinement in REFMAC5
(version 5.2.0019)33 and further improved using ARP/wARP (version 7.0)34. The
maps obtained were solvent-flattened and histogram-averaged using RESOLVE
(version 2.10)35. Model-building was subsequently completed manually using
COOT (version 0.3.3)36 with subsequent refinement (REFMAC5) using NCS
and B-factor restraints. This resulted in the determined structure. The final
model was refined on the data at 1.97 Å to Rfactor = 17.0% and Rfree = 21.6%
(Supplementary Table 1). We could not determine the positions of residues
1–24, 116–120, 201–203, 312–322 and 451–457, owing to disorder. Also, the
side-chain sulphur atom of Cys 129 could be modelled and refined in three
different conformational states. In the final model, 98.4% of all residues were
in favoured regions and 100% of all residues were in allowed regions of the
Ramachandran plot (calculated with MOLPROBITY (version 3.17)37).
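For readers less familiar with the refinement statistics quoted above, the conventional crystallographic R value is R = Σ| |Fo| − k|Fc| | / Σ|Fo|, and Rfree is the same quantity computed only over reflections withheld from refinement. The following is a minimal sketch of that definition; the amplitudes, the free-set flags and the function names are made-up illustrations, not data or code from this work (the refinement itself was carried out in REFMAC5).

```python
def r_value(f_obs, f_calc, scale=1.0):
    """Conventional R value: sum(| |Fo| - k|Fc| |) / sum(|Fo|)."""
    numerator = sum(abs(abs(fo) - scale * abs(fc))
                    for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator

def r_work_and_free(f_obs, f_calc, free_flags):
    """Split reflections by free-set flag and return (R_work, R_free)."""
    work = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, free_flags) if not free]
    free = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, free_flags) if free]
    r_work = r_value([fo for fo, _ in work], [fc for _, fc in work])
    r_free = r_value([fo for fo, _ in free], [fc for _, fc in free])
    return r_work, r_free

# Hypothetical structure-factor amplitudes (arbitrary units)
f_obs = [100.0, 50.0, 80.0, 40.0]
f_calc = [90.0, 55.0, 70.0, 50.0]
free_flags = [False, False, True, True]  # True marks the withheld test set

print(r_work_and_free(f_obs, f_calc, free_flags))
```

Because the free-set reflections are never used to adjust the model, Rfree is normally somewhat higher than the working R value, as is the case for the values reported here.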
Because of undetermined density adjacent to the [4Fe–4S]-cluster site, an
additional single anomalous data set at the Fe edge (wavelength, 1.74 Å), at a
resolution of 3.00 Å, was used to confirm the Fe positions in the determined
structure. The data were processed as described above and Fe positions were determined using SOLVE (version 2.13)38. The native and anomalous reflection data
were merged using CAD in CCP4 and a difference Fourier synthesis between the
native model phases and anomalous scattering was calculated using FFT39 in CCP4.
Phylogenetic analysis. BLASTP was used to compile all HydA, HydE, HydF,
HydG and HydA-homologue deduced amino-acid sequences from genomic
sequences using the DOE-IMG and the NCBI Genome Blast servers
(Supplementary Table 2). A total of 435 HydA and HydA-homologue sequences
were compiled and these were aligned using CLUSTALX (version 2.0.8) with the
Gonnet 250 protein matrix and default gap extension and opening penalties40.
The alignment was scrutinized and manually aligned using known catalytic
residues24. A single maximum-likelihood phylogenetic tree was computed
with PHYML (version 3.0)41 using the LG substitution matrix42 (Supplementary Fig. 2). This tree was used to empirically select 90 HydA sequences
that represented the primary lineages. Sequences were trimmed to contain only
the H-cluster domain present in HydADEFG in C. reinhardtii24. The trimmed
sequences were realigned using CLUSTALX as described above.
PROTTEST (version 2.0)43 was used to select WAG+I+G as the best-fit
protein evolutionary model. The phylogeny of each locus was evaluated using
PHYML with the WAG evolutionary model with a proportion of invariable sites
and gamma-distributed rate variation (I+G). A composite phylogram was constructed from 100 bootstrap replicate phylograms and the tree was projected
using FIGTREE (version 1.2.2) (http://tree.bio.ed.ac.uk/software/figtree)
(Supplementary Fig. 3). Similarly, the phylogeny of each locus was evaluated
using MRBAYES (version 3.1.2)44,45. A composite phylogram was constructed
using the WAG evolutionary model with gamma-distributed rate variation with
a proportion of invariable sites (I+G). Using MRBAYES, we sampled tree topologies at likelihood stationarity during two separate runs every 500 generations
over 3.7 × 10⁶ generations, with a ‘burnin’ parameter of 2.0 × 10⁶ (standard
deviation of split trees was <0.07).
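The sampling scheme above fixes how many tree topologies contribute to the Bayesian consensus. The arithmetic can be sketched as follows, under the assumption that the burn-in of 2.0 × 10⁶ is expressed in generations (the text does not state the unit, and a value in samples would exceed the number of samples drawn). The variable and function names are illustrative only.

```python
# Values quoted in the text
TOTAL_GENERATIONS = 3_700_000
SAMPLE_EVERY = 500
N_RUNS = 2

# ASSUMPTION: the burn-in is given in generations, not in sampled trees
BURNIN_GENERATIONS = 2_000_000

def samples_drawn(total_generations, sample_every):
    """Trees sampled per run (the generation-0 sample is ignored)."""
    return total_generations // sample_every

def samples_retained(total_generations, sample_every, burnin_generations):
    """Trees kept per run after discarding the burn-in portion."""
    return (total_generations - burnin_generations) // sample_every

drawn = samples_drawn(TOTAL_GENERATIONS, SAMPLE_EVERY)
kept = samples_retained(TOTAL_GENERATIONS, SAMPLE_EVERY, BURNIN_GENERATIONS)
print(f"sampled per run: {drawn}")
print(f"retained per run: {kept}")
print(f"retained over {N_RUNS} runs: {kept * N_RUNS}")
```

Under that assumption, roughly half of each chain is discarded before the posterior sample of trees is summarized.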
31. Otwinowski, Z. & Minor, W. Processing of X-ray diffraction data collected in
oscillation mode. Methods Enzymol. 276, 307–326 (1997).
32. Collaborative Computational Project, Number 4. The CCP4 suite: programs for
protein crystallography. Acta Crystallogr. D 50, 760–763 (1994).
33. Murshudov, G. N., Vagin, A. A. & Dodson, E. J. Refinement of macromolecular
structures by the maximum-likelihood method. Acta Crystallogr. D 53, 240–255
34. Perrakis, A., Morris, R. & Lamzin, V. S. Automated protein model building
combined with iterative structure refinement. Nature Struct. Biol. 6, 458–463
35. Terwilliger, T. C. SOLVE and RESOLVE: automated structure solution and density
modification. Methods Enzymol. 374, 22–37 (2003).
36. Emsley, P. & Cowtan, K. Coot: model-building tools for molecular graphics. Acta
Crystallogr. D 60, 2126–2132 (2004).
37. Davis, I. W. et al. MolProbity: all-atom contacts and structure validation for
proteins and nucleic acids. Nucleic Acids Res. 35, W375–W383 (2007).
38. Terwilliger, T. C. & Berendzen, J. Automated MAD and MIR structure solution.
Acta Crystallogr. D 55, 849–861 (1999).
39. Ten Eyck, L. F. Crystallographic fast Fourier transforms. Acta Crystallogr. A 29,
40. Larkin, M. A. et al. Clustal W and Clustal X version 2.0. Bioinformatics 23,
41. Guindon, S. & Gascuel, O. A simple, fast, and accurate algorithm to estimate large
phylogenies by maximum likelihood. Syst. Biol. 52, 696–704 (2003).
42. Le, S. Q. & Gascuel, O. An improved general amino acid replacement matrix. Mol.
Biol. Evol. 25, 1307–1320 (2008).
43. Abascal, F., Zardoya, R. & Posada, D. ProtTest: selection of best-fit models of
protein evolution. Bioinformatics 21, 2104–2105 (2005).
44. Ronquist, F. & Huelsenbeck, J. P. MrBayes 3: Bayesian phylogenetic inference
under mixed models. Bioinformatics 19, 1572–1574 (2003).
45. Huelsenbeck, J. P. & Ronquist, F. MRBAYES: Bayesian inference of phylogenetic
trees. Bioinformatics 17, 754–755 (2001).