Crystal Structure of the Extracellular Protein Secretion NTPase EpsE

doi:10.1016/j.jmb.2003.07.015
J. Mol. Biol. (2003) 333, 657–674
Crystal Structure of the Extracellular Protein Secretion
NTPase EpsE of Vibrio cholerae
Mark A. Robien1, Brian E. Krumm1,2, Maria Sandkvist3 and Wim G. J. Hol1,2*
1 Departments of Biochemistry and Biological Structure, Biomolecular Structure Center, University of Washington, P.O. Box 357742, Seattle, WA 98195, USA
2 Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
3 American Red Cross Holland Laboratory, Department of Biochemistry, Rockville, MD 20855, USA
Type II secretion systems consist of an assembly of 12–15 Gsp proteins
responsible for transporting a variety of virulence factors across the outer
membrane in several pathogenic bacteria. In Vibrio cholerae, the major
virulence factor cholera toxin is secreted by the Eps Type II secretion
apparatus consisting of 14 Eps proteins. One of these, EpsE, is a cytoplasmic putative NTPase that is essential for the functioning of the Eps system
and is a member of the GspE subfamily of Type II secretion ATPases. The
crystal structure of a truncated form of EpsE in nucleotide-liganded and
unliganded state has been determined, and reveals a two-domain architecture with the four characteristic sequence “boxes” of the GspE subfamily clustering around the nucleotide-binding site of the C-domain.
This domain contains two C-terminal subdomains not reported before in
this superfamily of NTPases. One of these subdomains contains a four-cysteine motif that appears to be involved in metal binding as revealed
by anomalous difference density. The EpsE subunits form a right-handed
helical arrangement in the crystal with extensive and conserved contacts
between the C and N domains of neighboring subunits. Combining the
most conserved interface with the quaternary structure of the C domain
in a distant homolog, a hexameric model for EpsE is proposed which
may reflect the assembly of this critical protein in the Type II secretion
system. The nucleotide ligand contacts both domains in this model. The
N2-domain-containing surface of the hexamer appears to be highly
conserved in the GspE family and most likely faces the inner membrane
interacting with other members of the Eps system.
© 2003 Elsevier Ltd. All rights reserved.
*Corresponding author
Keywords: Type II secretion system; GspE secretion ATPases; Type IV
secretion system
Abbreviation used: gsp, general secretion pathway.
E-mail address of the corresponding author: [email protected]
Introduction
Many Gram-negative bacteria possess a sophisticated multiprotein system for translocating periplasmic proteins across the outer membrane,
known as the Type II protein secretion system
(T2SS), which is also referred to as the “general
secretion pathway” (gsp) system.1 This machinery
is responsible for the final stage of protein translocation from the periplasm to the extracellular
space, while the initial stages of post-translational
processing and transportation across the inner
membrane are accomplished by the sec mechanism. In many pathogenic bacteria, such as
Pseudomonas aeruginosa, Klebsiella pneumoniae,
enterohemorrhagic and enterotoxigenic Escherichia
coli (EHEC and ETEC), and Vibrio cholerae, key
virulence factors are transported across the outer
membrane by the T2SS. The T2SS proteins in the
human pathogen V. cholerae are encoded by the
“extracellular protein secretion” (eps) genes.2–4
The T2SS of V. cholerae mediates secretion
of cholera toxin and several hydrolytic enzymes
across the outer membrane. Quite remarkably, the
~86 kDa AB5 cholera toxin is translocated in a
folded state across this membrane5,6 and is the
agent primarily responsible for the acute, life-threatening diarrhea that is the hallmark of
cholera.7
0022-2836/$ - see front matter © 2003 Elsevier Ltd. All rights reserved.
The typical T2SS consists of 12–15 different
proteins. In V. cholerae there are 14, designated
EpsA to EpsN8 and VcpD/PilD.9,10 The generic
names of the orthologs of these gsp proteins in
other species are GspA to GspN and GspO. The
focus of this article is the “E” component of the
T2SS from V. cholerae (Figure 1). This cytosolic
EpsE protein is associated with the cytoplasmic
face of the inner membrane of V. cholerae, forming
a complex with at least two bitopic inner membrane T2SS proteins, EpsL and EpsM.11,12 Several
reports have established that other GspE proteins
form similar complexes in other species.13–15 Furthermore, additional inner membrane components
including GspC, GspF, and GspG have been
found to interact with the GspE-L-M subcomplex.13,15,16
Figure 1. Sequence and structural homologs of EpsE. (A) Representative aligned sequences of five subfamilies of ATPases within the Type II secretion family of ATPases are
shown. These sequences are grouped by subfamily: the first group represents seven selected members of the GspE subfamily, including EpsE; the second group are three TFP
ATPases from the PilB/HofB subfamily; the third group is the TFP ATPase Vibrio cholerae TcpT, which is not closely related to members of other subfamilies in the Planet
analysis;17 the fourth group are five selected members of another large TFP ATPase subfamily, the PilT/PilU subfamily; and the fifth group are representatives of the
ComG1 subfamily of presumptive ATPases involved in competence (DNA uptake) in Gram-positive bacteria. Residues shown in highlighted text are identical within the subfamily sequences shown; TcpT residues are highlighted based on identity to EpsE, since there are no other subfamily members for comparison. Secondary structure symbols
above the EpsE sequence represent the observed secondary structure in our experimental model; shaded portions of the alignment represent the four distinguishing sequence
motifs common to the broad superfamily of ATPases that includes the T2SS, TFP and ComG1 ATPases as well as the T4SS ATPases. Domains and subdomains are shown by
colored bars above and below the five groups of sequences. The upright and inverted triangular symbols identify residues involved in the intersubunit interface of our experimental model, with dark triangular symbols reserved for residues which are identical in over 95% of the known GspE subfamily members. The alignment shown was chosen
from among those produced by Planet,17 which were prepared using CLUSTALX without experimental information about the tertiary structure of EpsE. Manual inspection of
this alignment together with our experimentally determined tertiary structure uncovers the previously obscured presence of a CM-like tetracysteine motif in the TcpT
sequence, in addition to the GspE and PilB/HofB subfamilies. (B) Structure-based sequence alignment of EpsE and HP0525, a member of the VirB11 subfamily of the Type
4 secretion system ATPases. Experimentally determined secondary structure is depicted above the corresponding sequence. Residues highlighted in red show instances
where the DALI superposition places an identical amino acid at corresponding positions in the alignment. Shaded boxes show the position of the four characteristic sequence
motifs found in the Type II/Type IV ATPases. The colored bar at the top of the block of sequences shows the boundaries of the domains identified from the experimental EpsE
structure. Residues marked with an asterisk make close contacts with the liganded nucleotide in the structures; black circles indicate residues implicated in NTPase activity by proximity to the
nucleotide and structural homology to similar residues in other NTPase structures. Red circles are placed at the proposed positions of N2-nucleotide contacts in a putative
HP0525-like “closed” conformation. Upright triangles mark the positions of the observed intersubunit N:C′ contacts in each protein; inverted triangles mark the residues
involved in the C:C′ interfaces. This Figure was prepared using ESPript 2.0.53
EpsE and its T2SS homologs in the GspE subfamily of proteins belong to a large superfamily
of “Type II/IV Secretion NTPases” which have
been analyzed by Planet et al.17 These authors17
divided 148 genes from this superfamily into a
“Type II family” and a “Type IV family”, which
are each further subdivided into subfamilies
according to characteristic amino acid sequence
patterns. The GspE subfamily to which EpsE
belongs represents a set of closely related putative
ATPases involved in Type II secretion. It should
be noted, however, that other subfamilies in the
“Type II family” identified by Planet are not components of Type II Secretion Systems. Many are,
instead, components involved in Type 4 pilus
(TFP) biogenesis. Although TFP assemblies and
the T2SS multi-protein complexes are distinctly
different, some components of these machineries
are homologous,18 particularly the TFP and GspE
ATPases.1,17 TFP ATPases comprise two subfamilies
exhibiting 30–50% amino acid sequence identity
with the GspE subfamily. In contrast, members of
the GspE and TFP ATPases are only distant homologs of the “Type IV family” of secretion ATPases
as classified by Planet et al.17 Another subfamily of
the “Type II family” of secretion NTPases,
ComG1, has yet another function and is associated
with a bacterial competence mechanism responsible for DNA uptake through a multi-protein
complex.19 Representatives of the subfamilies of
the “Type II family” of secretion NTPases are
shown in the family sequence alignment of Figure
1(A). This article describes the three-dimensional
structure of EpsE, the first member of the entire
“Type II family” of secretion NTPases whose crystal structure has been solved.
The structure of a Type IV Secretion NTPase has
been reported previously: HP0525,20 a member of
the VirB11 subfamily in the classification of Planet
et al.17 This ATPase from Helicobacter pylori is a
component of a Type IV Secretion System involved
in injecting the H. pylori CagA protein into gastric
epithelial cells.21 The 2.5 Å structure of HP0525
revealed a two-domain protein with ADP bound
to the anticipated nucleotide-binding site (Yeo et al.).20
Comparison of the EpsE and HP0525 amino acid
sequences using BLAST22 revealed a sequence
identity of ~32% confined to 113 amino acid residues in the C-terminal domain of the 330-residue
HP0525 ATPase. No sequence homology between
EpsE and HP0525 was detected for either the
250 N-terminal residues or for the remaining 130
C-terminal residues of EpsE.
Here, we (i) report the crystal structure of an
N-terminally truncated version of V. cholerae EpsE
in the liganded and unliganded states; (ii) compare
these structures with those of the closest, yet distant, relatives; (iii) analyze the effects of published
mutagenesis studies in the GspE subfamily of
ATPases; and (iv) propose a model for a hexameric
arrangement of EpsE which may reflect features of
the GspE ATPases when functioning in the Type II
secretion systems. Knowledge of the EpsE structure provides a platform for the design of inhibitors of this subfamily of secretion ATPases. Given
that EpsE and its subfamily members are essential
for the functioning of the Type II Secretion Systems
in a group of pathogens of major medical relevance,8 interfering with the functioning of these
enzymes could significantly diminish the extracellular transport of virulence factors and consequently decrease the severity of several important
bacterial diseases.
Results
Structure of the monomer
The crystal structures of both unliganded and
liganded EpsEΔ90, an N-terminal deletion variant
of hexahistidine-tagged EpsE, were solved in
space group P6₁22 using SeMet SAD methods,
followed by model building and refinement to
resolutions of 2.5 Å and 2.7 Å, respectively
(Table 1). Twelve heavy atom sites were found by
SOLVE.23 There are 12 methionine residues in the
sequence of EpsEΔ90, but the selenium of the
N-terminal selenomethionine was not detectable.
One of the 12 sites appeared to be a metal-binding
site.
Figure 2. Structure of the EpsE monomer. (A) EpsE ribbon structure with domains colored as follows: N2, cyan; C1,
dark blue; CM, yellow; C2, green. Due to loops with weak electron density, some residues between αA and αB, β4 and
β5, and αJ and β14 could not be incorporated into the final structure. Hence, these loops are shown with dotted lines.
The position of a bound molecule of AMPPNP observed in the 2.7 Å dataset is indicated. The position of 11 selenomethionine residues and the metal site identified by SOLVE are also shown, together with the anomalous Fourier
map contoured at +4σ. This Figure and Figures 3(A) and (B), 4, 5(A) and (B), and 6(A)–(C) were generated with the
programs MOLSCRIPT54 and Xtalview,49 and rendered with Raster3D.55 (B) Surface representation of the EpsE
monomer and bound AMPPNP, showing the positions of the domains and subdomains, with N2 colored cyan; C1,
dark blue; CM, yellow; and C2, green. The small ~300 Å² contact between the N2 domain (cyan) and CM subdomain
(yellow) is seen in the foreground. The surface of the linker residues (white) is visible behind the nucleotide, AMPPNP.
Table 1. Data reduction and refinement statistics
                                      Unliganded      Liganded
Wavelength (Å)                        0.9748          0.9791
Resolution (Å)                        60–2.50         60–2.70
Unique reflections                    18,982          15,414
Total reflections                     241,296         160,321
Rsym (last shell)                     10.2 (65.3)     15.2 (57.7)
Rpim (last shell)                     3.0 (18.6)      4.6 (16.7)
I/σ (last shell)                      18.6 (4.3)      11.1 (3.8)
Multiplicity                          12.71           10.40
Refinement
Number of protein atoms               3045            3034
Number of water molecules             111             25
Number of metal ions                  1               1
Number of other atoms                 1 (Cl−)         31 (AMPPNP)
Resolution (Å)                        50–2.5          50–2.7
Rwork/Rfree                           21.4/26.5       23.8/28.4
RMS deviations from ideal geometry
Bond lengths (Å)                      0.011           0.010
Bond angles (°)                       1.374           1.348
Chirality                             0.090           0.089
Ramachandran analysis
Most favorable                        302 (92.1%)     300 (89.8%)
Allowed                               22 (6.7%)       29 (8.7%)
Generously allowed                    2 (0.6%)        3 (0.9%)
Disallowed                            2 (0.6%)        2 (0.6%)
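As a quick consistency check on the data reduction statistics in Table 1, the reported multiplicity can be reproduced as the ratio of total to unique reflections. The following Python sketch uses only the counts listed in the table; the function and variable names are illustrative and are not part of the original work.

```python
# Reflection counts and reported multiplicities transcribed from Table 1.
datasets = {
    "unliganded": {"total": 241_296, "unique": 18_982, "reported": 12.71},
    "liganded": {"total": 160_321, "unique": 15_414, "reported": 10.40},
}


def multiplicity(total, unique):
    """Average number of observations per unique reflection."""
    return total / unique


for name, d in datasets.items():
    m = multiplicity(d["total"], d["unique"])
    print(f"{name}: computed {m:.2f}, reported {d['reported']:.2f}")
```

Both computed values agree with the tabulated multiplicities to two decimal places.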
The EpsEΔ90 subunit (Figure 2(A)) is a two-domain
protein composed of an N2 domain and a
C domain connected by a 15-residue linker. The C
domain is further subdivided into C1, CM and C2
subdomains (Figure 2(B)). This nomenclature
implicitly allows the deleted 90 residues to be
referred to as the N1 domain. The N2 domain
comprises residues 101–225 and consists of a six-stranded antiparallel β sheet forming one
concave face of the domain, and helices αA, αB and
αC, which comprise the convex face of N2. The C1
domain exhibits a topology with a central β sheet
composed of six parallel strands with a seventh
antiparallel β15 strand; three and four helices flank
the two faces of this sheet. Consultation of the
SCOP database24 of protein structures reveals that
this topology is similar to that observed in ABC
transporter ATPases, AAA ATPases, and other
RecA-like proteins, but differs in strand order or
the presence and position of the antiparallel strand
from other classes of P-loop NTPases. The C1
domain, made up of residues 240–392 and 442–450,
contains all four characteristic sequence
motifs, previously identified in all members of the
T2SS and T4SS subfamilies:1,17,25 the “Walker A”,
“Walker B”, “Asp” and “His” boxes.
The small CM subdomain, containing residues
393–441, is a hairpin-like meandering loop with a
conserved tetracysteine motif that binds a metal
cation near the sharply bent proximal end of the
loop. The CM subdomain protrudes from the C1
domain, resulting in effectively no contacts
between these domains. The C2 subdomain, spanning the C-terminal residues 451–500, is spatially
interposed between the C1 and CM subdomains,
and consists of four short helical segments
arranged in a single layer along the convex face of
the large C1 domain (Figure 2(A)). The interface
between the C2 and CM subdomains, which buries
~1230 Å² of surface area, features several residues,
including a salt bridge between Arg394
and Glu495, which are strictly conserved within
the GspE subfamily (Figure 1(A)), and several
hydrogen bonds between the loops composed of
residues 393–396 and 489–495. Contacts between
the N2 domain and the C domains within a
single subunit are quite limited, with a buried
interdomain surface of less than 300 Å² between
the β2–β3 and β14–β15 loops (Figure 2(B)) in the
structures from both liganded and unliganded
crystals.
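The domain boundaries quoted in this section can be collected into a small lookup, which is convenient when mapping a residue number to the domain or subdomain it belongs to. The Python sketch below is illustrative only: the names are hypothetical, the residue ranges are those stated in the text, and the linker range (226–240) is taken from the Linker section of this article. Because residue 240 is quoted for both the linker and C1, the lookup returns every matching label rather than choosing one.

```python
# Inclusive residue ranges as quoted in the text. Residue 240 appears in both
# the linker (226-240) and C1 (240-392), so a lookup can return two labels.
DOMAINS = {
    "N2": [(101, 225)],
    "linker": [(226, 240)],
    "C1": [(240, 392), (442, 450)],
    "CM": [(393, 441)],
    "C2": [(451, 500)],
}


def domains_of(residue):
    """Return the names of all domains whose quoted ranges contain `residue`."""
    return [
        name
        for name, ranges in DOMAINS.items()
        if any(lo <= residue <= hi for lo, hi in ranges)
    ]


print(domains_of(397))  # a cysteine of the tetracysteine motif
print(domains_of(270))  # a Walker A lysine
print(domains_of(240))  # boundary residue listed for both linker and C1
```

Residues outside the modeled EpsEΔ90 ranges, such as those of the deleted N1 region, return an empty list.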
Nucleotide-binding site
One crystal, co-crystallized in the presence of
10 mM AMPPNP at 14 °C and solved at 2.7 Å
resolution (Table 1), has the active site occupied
by a molecule of AMPPNP in the anti conformation. The AMPPNP ligand has well-defined
density (Figure 3(A)), and makes contacts solely
with residues belonging to the C1 domain or the
linker (Figures 2(C) and 3(B)). The protein structures of unliganded EpsEΔ90 and liganded
EpsEΔ90 are very similar (rmsd ~0.3 Å for 377 Cα
atoms). We did not find structural evidence to
suggest that the presence of bound nucleotide
leads to a significant conformational change, such
as a change in the spatial relationship between the N2 and the
C domains.
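For readers unfamiliar with the rmsd measure used to compare the liganded and unliganded models, the following Python sketch shows how a root-mean-square deviation is computed over paired atom positions. It assumes the two coordinate sets have already been superposed and paired atom by atom; the superposition step itself is not shown, and the coordinates in the example are toy values, not taken from the EpsE models.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) points, paired by index.

    Assumes the two sets are already superposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy coordinates (hypothetical): each atom is displaced by 0.3 Å along z.
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
b = [(0.0, 0.0, 0.3), (1.0, 0.0, -0.3)]
print(f"{rmsd(a, b):.2f} Å")
```

In practice the same calculation would run over the 377 paired Cα positions after an optimal rigid-body superposition.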
The adenyl moiety of the AMPPNP makes
hydrogen bonds with the backbone carbonyls of
Leu239 and Arg441. The side-chains of Leu234
and Leu239 make additional hydrophobic contacts
with the adenyl ring. The ribose moiety of
AMPPNP makes hydrogen bonds with the side-chains of Thr232 and Arg441. In contrast to the
adenyl and ribose moieties, the phosphate tail of
the nucleotide forms an extensive network of close
contacts with EpsE (Figure 3(B)). Oxygen atoms of
the α-phosphate form hydrogen bonds with the
backbone amide groups of Gly269, Lys270, Ser271,
Thr272 and the side-chain of Thr272. Oxygen
atoms of the β-phosphate hydrogen bond with the
backbone amide groups of Ser268, Gly269, Lys270,
Ser271 and the side-chain of Lys270. The oxygen
atoms of the γ-phosphate form hydrogen bonds
with the backbone amide of Gly267 and the side-chains of Thr266 and Lys270. The phosphates are
located at the N terminus of helix a allowing for a
favorable interaction of the charge with the helix
dipole.26
The protein surface in the vicinity of the nucleotide-binding site forms a binding groove with a
bowl-shaped depression surrounding the position
of the γ-phosphate. The Walker A, Walker B, Asp,
and His boxes contribute key residues that form
the walls of this bowl (Figure 3(D)). The Walker A
box is responsible for forming an extensive
hydrogen-bonding network with the phosphate
tail. For each of the remaining three boxes, the
protein surface features a prominently exposed
residue that likely plays a role in the putative
NTPase activity of EpsE. The Asp box contains an
exposed Glu296 side-chain that is positioned
within 5.5 Å of an oxygen of the γ-phosphate and
3.9 Å of a terminal oxygen of the β-phosphate.
The Walker B box features the Glu334 side-chain
within 5.3 Å of the γ-phosphate. In the His box, an
imidazole nitrogen of His359 is 6.9 Å from the
nearest phosphate oxygen atoms of the AMPPNP.
Metal-binding site in the CM domain
A metal ion is found tetrahedrally coordinated
by the Sγ atoms of four cysteine residues, Cys397,
Cys400, Cys430 and Cys433. Regarding the nature
of the metal, crystallography cannot discriminate
readily between different divalent cations. However, the cysteine-coordinated metal in EpsE is
tentatively modeled as zinc for the following
reasons. First, the vast majority of tetracoordinated
metal sites with four ligating cysteine residues
contain zinc. Consultation of the MDB database27
of metalloprotein sites from the PDB discloses 172
structures with a total of 300 metal-containing
Cys4 sites. Of these 300 sites, 264 contain Zn as the
metal, 31 contain Fe, and 2 contain Ga. The remaining three instances contain Cu, Cd, and Ni, respectively. Secondly, at the wavelengths we employed,
Zn has an anomalous signal of 2.5 electrons, Se
has an anomalous signal of over 3.8 electrons, and
Fe, Ni, Co, Cu, and Cd have anomalous signals of
between 1.5 and 2.2 electrons. The
anomalous peaks in the anomalous difference
Fourier maps at the metal sites are 8.3σ and 11.8σ
in the liganded and unliganded EpsE structures,
respectively, which is between 59% and 76% of the
mean anomalous peak heights for the selenium
sites. The ratio between the anomalous peak
heights is thus more consistent with Zn than with the
other metals, in particular Fe.
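The second argument can be made explicit with a short calculation. The Python sketch below compares the ratio of anomalous signals expected for each candidate metal, relative to Se, with the observed range of 59% to 76%. It is a simplification under stated assumptions: peak height is treated as proportional to anomalous signal alone, ignoring occupancy and B-factor differences, and Se is taken as exactly 3.8 electrons although the text gives "over 3.8", so the expected ratios are upper bounds. All names are illustrative.

```python
# Anomalous signals in electrons, as quoted in the text.
SE_SIGNAL = 3.8  # the text says "over 3.8", so the ratios below are upper bounds
METAL_SIGNAL = {"Zn": (2.5, 2.5), "Fe/Ni/Co/Cu/Cd": (1.5, 2.2)}

# Observed metal peak height as a fraction of the mean Se peak height.
OBSERVED_RATIO = (0.59, 0.76)


def expected_ratio(signal_range, se_signal=SE_SIGNAL):
    """Expected (low, high) metal:Se ratio for a range of anomalous signals."""
    lo, hi = signal_range
    return lo / se_signal, hi / se_signal


def overlaps(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


for metal, signal in METAL_SIGNAL.items():
    lo, hi = expected_ratio(signal)
    ok = overlaps((lo, hi), OBSERVED_RATIO)
    print(f"{metal}: expected {lo:.2f}-{hi:.2f}, consistent with observed: {ok}")
```

Under these assumptions only Zn (expected ratio about 0.66) falls within the observed range; the other listed metals top out at about 0.58.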
Construction of the model in the area of the
metal was complicated by relatively weak electron
density resulting in high B-factors, particularly for
residues 427 through 437. The elongated shape of
the metal in the anomalous difference Fourier
(Figure 4) suggests several positions for the metal
coupled with multiple conformations of the
ligands and surrounding residues, explaining the
unclear electron density in the neighborhood of
the metal ion.
Linker
In both liganded and unliganded EpsE structures, a linker of 15 residues, composed of residues 226–240, connects the N2 and C1 domains.
The linker makes few contacts with either of
these domains, with only five hydrogen bonds
and ~10 additional contacts of less than 3.5 Å.
The limited contacts between the N and C
domains in the EpsE monomer demonstrate that
the two domains of EpsE are quite independent
entities if one considers a single subunit. This
becomes entirely different once subunit–subunit
interactions are taken into account, as shown in
the next section.
Quaternary structure
Of the 12 subunits per unit cell of EpsEΔ90
(Figure 5(A)), six can be found within a single
helical filament. Another six form a second antiparallel filament, related to the initial filament by
a crystallographic 2-fold axis perpendicular to
the 6₁ axis. Together, the two antiparallel helical
filaments form a hollow cylinder with a central solvent-filled space running along the central axis
(not shown). This cylinder has an inner diameter
of ~37 Å and an outer diameter of ~105 Å. The
buried surface area of a ~710 Å² interface between
contacting subunits from antiparallel helical filaments within this cylinder is quite small. An even
smaller interface of ~420 Å² is buried between
subunits from adjacent cylinders. The residues
comprising these interfaces are not conserved
within the GspE subfamily. These interfaces
between antiparallel filaments or adjoining
cylinders therefore are not likely to be relevant
for the in vivo conformation of the GspE proteins.
Figure 4. Metal site. Metal coordinated by the Sγ atoms of the four surrounding cysteine residues in the CM subdomain.
Orange contours represent the anomalous difference Fourier map contoured at +4σ. This dataset was collected at a
wavelength of 0.9748 Å, slightly above the Se edge.
A single helical filament, generated by the 6-fold
screw axis, has a rise of 27 Å per subunit, six subunits per turn and extensive intermolecular contacts (Figure 5(B)). The extensive intersubunit
contacts are due to the C domains of one monomer
being positioned in the large space between the
N2′ and C′ domains of an adjacent monomer, with
the primed notation signifying a domain in the
adjacent monomer within a helical strand. This
arrangement buries a total of ~3240 Å² of surface
area between neighboring subunits. The C1:N2′
interaction is composed of ~1800 Å² of buried
surface area, which is ~12% of the total C1′ + N2
surface area. We will refer to this as the C:N′ interface (orange ellipse, Figure 5(B)). The C1 +
C2:CM′ + C2′ interaction buries ~1440 Å², which
is ~8% of the total surface area of these domains
(magenta ellipse, Figure 5(B)). This interface will
be referred to as the C:C′ contacts.
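The geometry of the helical filament follows directly from the screw-axis parameters quoted above: six subunits per turn gives a rotation of 60° per subunit, and a rise of 27 Å per subunit gives a pitch of 162 Å per turn. The Python sketch below generates the position of one reference point per subunit along such a right-handed helix. The radius is a hypothetical placeholder, and the function name is illustrative; only the rise and the number of subunits per turn come from the text.

```python
import math

RISE_PER_SUBUNIT = 27.0  # Å, from the text
SUBUNITS_PER_TURN = 6    # six subunits per turn of the screw axis


def screw_positions(radius, n_subunits):
    """(x, y, z) of one reference point per subunit, with the screw axis along z.

    `radius` is a hypothetical distance of the reference point from the axis.
    Counter-clockwise rotation with increasing z gives a right-handed helix.
    """
    step = 2 * math.pi / SUBUNITS_PER_TURN  # 60 degrees per subunit
    return [
        (radius * math.cos(i * step), radius * math.sin(i * step), i * RISE_PER_SUBUNIT)
        for i in range(n_subunits)
    ]


pitch = RISE_PER_SUBUNIT * SUBUNITS_PER_TURN
print(f"pitch per turn: {pitch} Å")

for x, y, z in screw_positions(radius=35.0, n_subunits=7):
    print(f"{x:8.2f} {y:8.2f} {z:8.2f}")
```

The seventh subunit in the example lies directly above the first, displaced by one full pitch along the axis.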
Examination of the sequence conservation of
these two interfaces discloses marked differences
between them. The large C:N′ interface is composed of 28 interacting residues including four
Arg–Asp salt bridges and a total of 19 hydrogen
bonds between highly conserved residues within
the GspE subfamily. A total of 23 of these 28 residues are strictly conserved within the selected
GspE subfamily sequences shown in Figure 1(A).
Of these, 13 are also strictly conserved within the
PilB/HofB subfamily, the closest homologous TFP
ATPase subfamily.17 In contrast, the C:C′ interface
comprises 24 residues with seven hydrogen
bonds. Only 11 of the 24 interacting residues are
highly conserved even when considering only the
closely related members within the GspE subfamily (Figure 1(A)). The sequence conservation
and the large relative buried surface area suggest
that the C:N′ interface is likely of physiological
relevance, as will be discussed later. The in vivo
relevance of the C:C′ contacts is less clear.
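The comparison between the two interfaces can be summarized numerically. The following Python sketch tabulates the values quoted in the text and derives the conserved fraction of each interface and the combined buried area. The names are illustrative. Note that the text counts "strictly conserved" residues for the C:N′ interface but "highly conserved" residues for the C:C′ interface, so the two fractions are not measured by an identical criterion.

```python
# Values as quoted in the text. Buried areas are approximate (Å^2).
# "conserved" uses the criterion stated for each interface in the text:
# strictly conserved for C:N', highly conserved for C:C'.
interfaces = {
    "C:N'": {"area": 1800, "residues": 28, "conserved": 23, "h_bonds": 19},
    "C:C'": {"area": 1440, "residues": 24, "conserved": 11, "h_bonds": 7},
}


def conserved_fraction(entry):
    """Fraction of interface residues reported as conserved."""
    return entry["conserved"] / entry["residues"]


total_area = sum(entry["area"] for entry in interfaces.values())
print(f"total buried area: ~{total_area} Å^2")

for name, entry in interfaces.items():
    print(f"{name}: {conserved_fraction(entry):.0%} of residues conserved, "
          f"{entry['h_bonds']} hydrogen bonds")
```

The combined area matches the ~3240 Å² total given for neighboring subunits, and the conserved fraction is roughly 82% for C:N′ against roughly 46% for C:C′.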
Figure 3. Active site of EpsE with bound AMPPNP. (A) The electron density of AMPPNP, contoured at +1σ, of a
sigma-A weighted 2mFo − DFc map. (B) Stereo view of the model at the active site. Residues in orange are suspected
of playing a role in enzyme function. This includes one strictly conserved residue from each of the four distinctive
sequence motifs found in the Type II/Type IV secretory NTPase superfamily, i.e. Thr266 (Walker A), Glu296 (Asp
box), Glu334 (Walker B), and His359 (His box) (Figure 1(A)). Shown in red dotted lines are hydrogen bonds or salt
bridges between the protein and the AMPPNP. Other EpsE residues with bold bonds are strictly conserved in the
GspE subfamily of ATPases; residues in lighter bonds are less conserved. (C) Schematic representation of EpsE interacting with bound AMPPNP. (D) Surface of the nucleotide-binding site, with residues of the Walker A, Asp box,
Walker B, and His boxes (Figure 1(A)) colored cyan, green, blue and magenta, respectively.
Figure 5. Quaternary structure of the experimental EpsE. (A) Van der Waals representation of 12 subunits of EpsE
contained in the unit cell, viewed along a crystallographic dyad perpendicular to the 6₁ axis. (B) Adjacent subunits of
EpsE within a helical strand, showing the two major interfaces. The C:N′ interface (magenta ellipse) buries approximately 1800 Å² of surface area. The C:C′ interface (orange ellipse) buries approximately 1440 Å² of surface area.
DALI results: structurally similar proteins
A DALI28 search was conducted using the core
portions of the N2 and C1 domains, to find structurally homologous proteins. HP0525 was the top
scoring homolog in both searches. No other proteins were found which contained high-scoring
structural homologs to both of the EpsE domains.
A total of 71 proteins homologous to the N2
domain of EpsE were reported by DALI. Only two
of these had a Z-score greater than 4. HP0525 was
the top scoring homolog of the EpsE N2 domain
with a Z-score of 6.8. HP0525 exhibits an rms
deviation of 3.3 Å for 92 aligned residues out of
104 residues in the EpsE N2 domain, with a
sequence identity of 8%. Residues that are identical
in this alignment of the N domains of EpsE and
HP0525 are widely dispersed through the N
domains. The overall topologies of the N2 domain
of EpsE and the N domain of HP0525 are clearly
similar (Figure 6(A)), but the positions of the initial
helices of the two proteins are quite different.
The initial kinked helix of HP0525 juts out from
the central globular portion of the N domain on
the left side of this diagram, while the small initial
helical segment of EpsEΔ90 is found on the right
side of this diagram. Hence, the initial EpsEΔ90
helix has no counterpart in HP0525. This initial
helix of EpsE, spanning residues Phe101 through
Glu107, is separated from the bulk of the N2
domain by 13 disordered residues, raising the
possibility that the initial helix of EpsE could be
part of a separate N1 domain truncated by the
deletion of the first 90 residues of EpsE.
DALI returned a total of 173 proteins homologous to the core C1 domain. None of the homologs
contain structural elements with recognizable
similarity to the CM or C2 subdomains. Many of
these are relatively distant structural homologs, as
only 47 of the proteins have a Z-score greater than
4. With a Z-score of 19.2, H. pylori HP0525 is the
top scoring homolog to the EpsE C1 domain. The
residues of HP0525 homologous to the EpsE C1
domain are all contained within the C domain of
HP0525, with an rms deviation of 2.0 Å for 155
Cα atoms, out of 161 in the EpsE C1 domain, and
a sequence identity of 21%.
The superposition of the HP0525 C domain and
the EpsE C1 domain (Figure 6(B)) shows the similarity of the topology between these chains. The
ADP in HP0525 and the AMPPNP in the liganded
EpsE structure both adopt the anti conformation,
and occupy very similar positions with respect
to the nucleotide-binding domains with an rms deviation of 1.4 Å for 26 comparable atoms in the two
ligands. By several measures, such as the sequence
identity of 21% (versus 8%) and rms deviation
of 2.0 Å (versus 3.3 Å), the C domain of HP0525
and the C1 domain of EpsE are more similar than
the N domains of these two proteins. A major
difference between the C domains of these proteins
is that HP0525 lacks a structural counterpart to
both the EpsE CM and C2 subdomains. The
alignment of HP0525 and EpsE also discloses
12 residues, residues 290–301, which are found
in HP0525 but not in EpsE nor in other
members of the GspE subfamily. In the quaternary
structure of HP0525, these 12 residues form a constriction at the narrow opening of a hexameric
ring.20
Figure 6. Superposition of EpsE with its homolog HP0525. (A) Superposition of the N2 domain (cyan) of EpsE and
the N domain (light red) of the structure of HP0525. Dotted lines are shown in positions where the EpsE sequence
could not be placed into electron density. (B) Superposition of the C domains of EpsE and the C domain (light red)
of HP0525. The C1 domain of EpsE is dark blue, the CM domain yellow, and the C2 domain green. The position of
ADP bound to HP0525 is red and the AMPPNP bound to EpsE is cyan. Dotted lines connect residues of the CM domain
bridging residues 415–419 that were not modeled due to weak electron density. (C) Superposition of the experimental
structure of EpsE (“open” configuration, dark blue), and the structure of HP0525 (“closed” configuration, light red).
The C domains of the two structures are superimposed, in order to demonstrate the relative position of the corresponding N domains in the two proteins.
The mutual orientation of the N and C domains
is markedly different in EpsE and HP0525. In
HP0525, the two domains form a closed jaw in
which both domains tightly interact with the
bound ADP molecule found in the active site.
Despite this close interaction with the ligand, direct
interactions between the two domains of HP0525
are limited as evidenced by the very small ~90 Å²
buried interdomain surface area.20 In the EpsE
structure, the N2 and C domains are in a more
open configuration (Figure 6(C)). As a result, the
N2 domain does not make any contacts with the
AMPPNP bound to the C1 domain. Interestingly,
HP0525 N domain residues implicated in binding
ADP are either strictly conserved, such as Arg113
and Arg133, equivalent to Arg210 and Arg225 in
EpsE, or conservatively substituted, such as Thr45
and Asn61 in HP0525 which are, respectively,
equivalent to Ser140 and Asp158 in EpsE. All four
of these positions are strictly conserved within the
GspE subfamily. Despite the limited sequence
identity between the N domains of HP0525 and
EpsE, the sequence identity or conservative substitution of residues participating in N domain–ADP contacts of HP0525 suggests that it is quite
possible that EpsE is capable of adopting a similar
closed jaw formation, as will be discussed further
below.
The intersubunit relationships in the EpsE and
HP0525 crystal structures are more diverse. The
large C:C′ interface (24 residues, ~1440 Å² buried)
between adjacent subunits in the “6₁-helix” of the
EpsE helical filament (Figure 5(B)) bears no significant structural similarity to the small ~450 Å² C:C′
interface in the HP0525 cylindrical hexamer.20 This
is reflected in the non-overlapping positions of the
inverted triangles in Figure 1(B), which represent
residues participating in this C:C′ interface. On the
other hand, the C:N′ interfaces of adjacent subunits
are more comparable in buried surface area:
~1370 Å² in HP0525 and ~1800 Å² in EpsE. Fourteen out of 28 of the EpsE residues involved in the
C:N′ interface structurally align with intersubunit
interacting residues in HP0525 (Figure 1(B),
upright triangles), showing that similar regions
of the corresponding domains are involved in the
largest intersubunit interface in both crystal structures. Nonetheless, the chemical nature of these
intersubunit interactions is not strongly conserved.
For example, a salt bridge in the HP0525 N:C′
interface between residues Glu47 and Arg240 is
not conserved in EpsE. Conversely, salt bridges in
the EpsE C:N′ interface, Arg156–Asp326, Arg156–Asp328,
and Asp195–Arg324, are without counterparts in the HP0525 N:C′ interface. Thus, although
many of the residues involved in C:N′ intersubunit
contacts are structurally equivalent, the chemical
nature of the interacting residues is not very well
conserved.
Insights into catalysis based on
homologous proteins
Analysis of common structural elements among
previously studied homologous ATPases20,29–33
may provide insight into the ATPase activity of
EpsE. The side-chains of Glu296 and Glu334 are
prominent surface-exposed elements of the cavity
adjacent to the γ-phosphate of AMPPNP in our
structure (Figure 3(B)). Two corresponding acidic
residues of RecA, Glu96 and Asp144, are found in
similar locations. In RecA, Glu96 is proposed to
activate an attacking water during ATP hydrolysis
and Asp144 participates in the coordination of the
divalent cation cofactor, magnesium.29 Diverse
RecA-like ATPases, such as H. pylori HP0525,20 bacteriophage T7 helicase,30 and the AAA+ ATPase
NSF,31,32 display a similar arrangement of acidic
residues. Thus, this cavity suggests itself
as the active-site cleft for the putative ATPase
activity of EpsE.
His359, a strictly conserved residue of the His
box in the GspE subfamily of secretory NTPases
(Figure 1(A)), is also a surface-exposed element of
the cavity adjacent to the γ-phosphate. The structural alignment reported by the DALI server indicates that this residue is a histidine in seven of the
top 20 structural homologs, including HP0525. In
the second highest scoring homolog of EpsE, the
bacteriophage T7 helicase domain, a role for the
homologous His465 has been proposed. In the T7
helicase, this histidine may act as a γ-phosphate
sensor, with nucleotide hydrolysis promoting a
conformational change in the position of this
side-chain. As noted by Sawaya et al.,30 similar
conformational switching mechanisms have been
proposed for RecA29 and PcrA,33 both of which are
structural homologs of the C1 domain of EpsE.
Although the distances from the imidazole nitrogen atoms to the nearest terminal oxygen of
AMPPNP are ~7–8 Å in the current liganded
EpsE structure (Figure 3(B)), a rotation of the
solvent-exposed His359 side-chain about the χ1
dihedral angle could reduce this distance to as
little as 2.7 Å (Figure 3(B)). Hence, it is plausible
that His359 of EpsE may act as a phosphate sensor
in GspE proteins.
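The geometric argument about the His359 χ1 rotation can be illustrated with a minimal sketch. It rotates a side-chain atom about the Cα→Cβ bond (the χ1 axis) using Rodrigues' rotation formula and scans for the rotation that brings it closest to a target atom. The function names and the step size are assumptions made for illustration; no coordinates from the EpsE structure are used, and this is not the procedure by which the 2.7 Å value above was obtained.

```python
import math


def rotate_about_axis(point, axis_start, axis_end, angle_deg):
    """Rotate `point` by angle_deg about the axis from axis_start to axis_end
    (Rodrigues' rotation formula)."""
    axis = [axis_end[i] - axis_start[i] for i in range(3)]
    norm = math.sqrt(sum(c * c for c in axis))
    k = [c / norm for c in axis]                      # unit vector along the axis
    v = [point[i] - axis_start[i] for i in range(3)]  # point relative to the axis origin
    t = math.radians(angle_deg)
    cos_t, sin_t = math.cos(t), math.sin(t)
    k_cross_v = [k[1] * v[2] - k[2] * v[1],
                 k[2] * v[0] - k[0] * v[2],
                 k[0] * v[1] - k[1] * v[0]]
    k_dot_v = sum(k[i] * v[i] for i in range(3))
    return tuple(axis_start[i] + v[i] * cos_t + k_cross_v[i] * sin_t
                 + k[i] * k_dot_v * (1.0 - cos_t) for i in range(3))


def closest_approach(atom, c_alpha, c_beta, target, step_deg=5):
    """Scan rotations about the CA->CB axis; return (angle, distance) at the minimum."""
    best = None
    for angle in range(0, 360, step_deg):
        moved = rotate_about_axis(atom, c_alpha, c_beta, angle)
        d = math.dist(moved, target)
        if best is None or d < best[1]:
            best = (angle, d)
    return best
```

In practice, `atom` would be an imidazole nitrogen and `target` a terminal phosphate oxygen taken from a coordinate file, and every side-chain atom beyond Cβ would be rotated together so that steric clashes could be checked as well.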
EpsE structure and mutation studies
The EpsE structure allows the loss of function
associated with several reported mutations to be
more fully explained. Mutations of the EpsE
Walker A residue Lys270 are known to cause a
marked loss of function in several T2SS and TFP
systems.11,16,25,34 As anticipated, the Lys270 side-chain amine interacts with the β-phosphate of the
AMPPNP in our liganded structure. The Gly to
Ala or Ser mutations at the position corresponding
to EpsE Gly269 cause a loss of type II secretion in
P. aeruginosa and the pullulanase secretion system
of Klebsiella oxytoca,25 respectively.35 The backbone
dihedral angles for this glycine are in the region of
the Ramachandran plot with positive values for
both the φ and ψ angles. This helps explain the
preference for glycine over other residues at this
position, as non-glycine residues are relatively
infrequently observed with these backbone
dihedral angles. A Thr to Ile mutation at the
position corresponding to EpsE Thr273 also causes
a temperature-sensitive phenotype in
P. aeruginosa.16 The contacts of the Oγ of Thr273
include the Oε and Nε of Gln390 and the main-chain carbonyl of Gly269. A Thr273Ile mutation
would abolish this network of favorable hydrophilic interactions.
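The backbone dihedral argument made above for Gly269 can be checked on any coordinate set with a short sketch like the following. It computes a signed dihedral angle from four atom positions; φ is defined by the atoms C(i−1), N(i), Cα(i), C(i) and ψ by N(i), Cα(i), C(i), N(i+1). The helper names are hypothetical, and the simple "both positive" test is only a coarse stand-in for a proper Ramachandran region definition.

```python
import math


def _sub(p, q):
    return tuple(p[i] - q[i] for i in range(3))


def _dot(u, v):
    return sum(u[i] * v[i] for i in range(3))


def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(c / norm for c in b1)
    # Components of b0 and b2 perpendicular to the central bond.
    v = tuple(b0[i] - _dot(b0, b1) * b1[i] for i in range(3))
    w = tuple(b2[i] - _dot(b2, b1) * b1[i] for i in range(3))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


def has_positive_phi_psi(phi, psi):
    """Coarse test for the positive-phi, positive-psi region discussed in the text."""
    return phi > 0.0 and psi > 0.0
```

Applied to the five backbone atoms around a residue, two calls to `dihedral` give φ and ψ, and `has_positive_phi_psi` flags residues in the region where glycine is strongly preferred.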
A different Thr to Ile mutation, at the position
corresponding to EpsE Thr266, in the TFP ATPase
PilQ also causes a loss of function of the Type IV
sex pilus of the R64 plasmid.34 In EpsE, the Thr266
side-chain is ~3.5 Å from a terminal oxygen of
the γ-phosphate. Thr266 may be able to act as a
phosphate sensor, or alternatively, may form a
stabilizing hydrogen bond with ATP in vivo; in
either case, the hydrophobic side-chain of isoleucine at this position would likely provide an
unfavorable environment for the hydrophilic
γ-phosphate.
Within the Asp box, an Asp to Asn mutation at a
position corresponding to Asp293 in EpsE has been
studied in both the T2SS system of K. oxytoca25 and
in the TFP system of the R64 plasmid sex pilus.34
This aspartate residue is strictly conserved
throughout the GspE subfamily. In the EpsE structure, the side-chain carboxylate oxygen atoms of
Asp293 are both more than 9.7 Å from the nearest
oxygen of the AMPPNP phosphate tail. This
suggests that this side-chain is likely too
distant to participate in ATP hydrolysis. In
the EpsE structures, the side-chain of Asp293
forms a salt bridge with the side-chain of Walker
B residue Arg336, an arginine which is strictly conserved in all GspE subfamily members. This salt
bridge would be lost by mutation of Asp293 to the
neutral asparagine residue. The loss of function
caused by such mutations25,34 may be due to the loss of
this favorable interaction. The Asp293 side-chain
is solvent-exposed in the EpsE structure, and thus
may also participate in interactions with other
components within the fully assembled Type II
secretion machinery, which could be an alternative
explanation for the loss of function caused by this
mutation.
Tetracysteine motif in related proteins
In EpsE, the tetracysteine motif consists of two
CxxC motifs with 29 intervening residues that
form the extended hairpin-like loop. The tetracysteine motif of the CM domain occurs in all
known members of the GspE subfamily (Figure
1(A)), with the exception of Xanthomonas campestris
XpsE and Xylella fastidiosa XpsE (not shown),
which have CM domains in which the four cysteine
residues are replaced in the former by Asp, Asn,
Thr, and Ala and in the latter by Glu, His, Ser, and
Ala. Additionally, in another GspE subfamily
member, Pseudomonas putida XcpR (not shown),
the first CxxC is replaced by CxC. Within the
GspE subfamily, the separation between the CxxC
motifs ranges from 21 to 40 residues.
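The sequence pattern described above lends itself to a simple regular-expression search. The sketch below reports candidate tetracysteine motifs: two Cx(1–2)C pairs (allowing the CxC variants mentioned in the text) separated by a loop of configurable length. The function name and the default loop limits of 9 to 40 residues are assumptions chosen to span the separations quoted in this section; this is not the alignment procedure used for Figure 1.

```python
import re


def find_tetracysteine(seq, min_loop=9, max_loop=40):
    """Return (start, first_pair, loop_length, second_pair) for each candidate.

    `start` is the 1-based position of the first cysteine. A lookahead is used
    so that overlapping candidates are not skipped.
    """
    pattern = re.compile(
        r"(?=(C.{1,2}C)(.{%d,%d}?)(C.{1,2}C))" % (min_loop, max_loop)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        hits.append((m.start() + 1, m.group(1), len(m.group(2)), m.group(3)))
    return hits
```

For a sequence with two CxxC pairs and 29 intervening residues, as described for EpsE, the function returns a single hit with a loop length of 29. Hits in real sequences would still need to be checked against a structure-based alignment, since unrelated cysteine pairs can satisfy the pattern by chance.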
The CM domain is also found throughout both the
PilB and HofB branches of the PilB subfamily. In
some of the HofB sequences, such as H. influenzae
HofB (Figure 1(A)), the second CxxC motif is
replaced by CxC. The separation between the
second and third cysteine in the PilB subfamily is
more variable than in the GspE family, ranging
from as few as nine residues in several HofB
sequences, up to 32 residues.
The CM domain is also found in the TFP ATPase
V. cholerae TcpT (Figure 1(A)). This sequence has
two CxxC motifs with 20 intervening residues
between the second and third cysteine residues.
Thus, a CM domain is found in members of the
GspE and PilB subfamilies, with some variability
noted, especially in the number of residues found
in the loop between the second and third cysteine
residues. However, the tetracysteine sequence
motif occurs neither in any of the members of the
ComG1 and PilT/PilU subfamilies (Figure 1(A)),
nor in the VirB11 subfamily of sequences, such as
HP0525 (Figure 1(B)). A large gap is found in this
region of sequences from subfamilies without the
tetracysteine motif, suggesting that the CM hairpin
loop and the sharp bend found at the site of metal
ligation are likely to be replaced by a much simpler
and shorter loop, such as that found in HP0525
(Figure 6(B)).20
The importance of the CM domains is underlined
by several studies. Mutation of one or two of the
cysteine residues to serine in the GspE of K. oxytoca
led to diminished secretion of the protein
pullulanase.36 Simultaneous mutation of three of
the four cysteine residues to serine led to abolition
of Type II protein secretion in this organism.36
Clearly, the cysteine residues play a crucial role in
the functioning of T2SS. Our structure shows
(Figure 4) that these cysteine residues are implicated in metal binding. How metal binding and
the consequent specific organization of the CM
domains affect other components of the Type II
secretion system still remains to be determined.
GspE proteins are not known to interact with
DNA or RNA as eukaryotic zinc-finger domains
do, although this has not been excluded. Some
Zn-binding proteins, such as the eukaryotic RING
proteins, are implicated in protein–protein
interactions.37 As the CM subdomain of the EpsE
monomer has a very large exposed surface area
(Figure 2(B)), a role in protein–protein interactions is an attractive hypothesis for
this subdomain.
A hexameric ring model of EpsE
Eight of the ten top scoring DALI homologs to
the C1 domain of EpsE are reported to be capable
of forming multimeric ring assemblies. There are
several ATPases with experimental evidence for
an in vivo hexameric ring structure despite a helical
filament arrangement of protomers in the crystal
structure. For instance, the bacteriophage T7 helicase-primase protein forms a hexameric,
topologically closed ring, as observed in electron
microscopy studies. Nevertheless, the isolated helicase domain crystallizes as a right-handed helical
filament.30 Sawaya et al. describe a model of the
T7 helicase in a hexameric ring by collapsing the
subunits forming the helix into a circle, followed
by an 18° rotation and 10 Å translation of the
subunits.30 Cryoelectron microscopy images show
that E. coli RecA can form a ring structure,38 while
the crystal structure is a helical filament with P6₁
symmetry. Another likely hexameric ring motor,
the AAA+ ATPase chaperone ClpA, forms a left-handed helical filament in the observed crystal
with space group P6₅.39 In this case, a hexameric
ring model of ClpA was constructed by using
known closely related structural homologs as
templates.
Studies of the multimerization state of GspE
proteins have been conflicting, with a monomer
reported for purified histidine-tagged EpsE,11 and
in vivo dimers or higher order multimers reported
for K. pneumoniae PulE,25 Erwinia carotova OutE14
and P. aeruginosa XcpR.40 These differences suggest
that oligomerization of GspE proteins may require
other type II secretion components as well as phospholipids. Interestingly, two different forms of
EpsE were detected during subcellular fractionation of V. cholerae cells; a cytoplasmic soluble
form and a cytoplasmic membrane-associated
form.11 It is possible that these forms represent
different multimerization states of EpsE; the
soluble form being a monomer and the membrane-associated form present within the T2SS
complex being a larger oligomer. Additionally, a
possible homooctameric form has been reported
for the homologous TFP ATPase PilQ.34 The aggregation of the purified GspE proteins, particularly
at higher concentrations25,36 (M.A.R. & B.E.K.,
unpublished observations), may have made size
exclusion studies difficult to pursue, in the absence
of possibly stabilizing partner proteins. To our
knowledge, stoichiometric and other structural
information about the GspE-GspL-GspM subcomplex that could shed light on the presence of a
multimeric ring form for GspE proteins has not
been reported.
Although the multimerization state of EpsE
within the T2SS complex is unknown, we sought
to construct a model of EpsE with C6 point group
symmetry while maintaining the extensive intersubunit
interactions observed in our crystal structure, since many structural homologs of EpsE can
form hexameric rings. A very interesting result
(Figure 7(A)) was obtained by (i) placing six
C-domains of EpsE onto the six C-domains of the
hexameric HP0525 arrangement reported by Yeo
et al.;20 and (ii) adding to each of the EpsE
C-domains so positioned the N′-domain of EpsE
as observed in the helical 6₁ filament in our crystals
(Figure 5), i.e. maintaining the extensive and conserved C:N′ interface described above. After this
procedure, the AMPPNP bound to the C-domain is
approached by the conserved residues (red circles
in Figure 1(B)) Ser140, Asp158, Arg210, and
Arg224 of the N-domain, with e.g. the guanido
side-chain of Arg224 less than 2.3 Å from an oxygen of the γ-phosphate of AMPPNP, Ser140
4.4 Å from the O3 of the ribose, and a side-chain
carbonyl oxygen of Asp158 simultaneously 4.2 Å
from the O4 of the ribose and 4.1 Å from the N3 of
the adenine base of AMPPNP. This construction is
more convincing than an EpsE hexamer obtained
by simply placing the C and N domains of EpsE
onto the C and N domains of the HP0525 hexamer
since (i) in our model the C:N′ interface is considerably more extensive, 1800 Å² versus 1470 Å²;
and (ii) the residues from the N domain approach
AMPPNP more closely (not shown).
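The geometric relationship between a sixfold screw filament and a C6 ring can be made concrete with a minimal sketch. It generates six copies of a protomer by successive 60° rotations about the z axis, with an optional rise per subunit: a non-zero rise gives a right-handed 6₁-like filament, and a rise of zero collapses it into a closed ring. The function names and the choice of z as the symmetry axis are assumptions for illustration. The model described above was instead built by superposition onto the HP0525 hexamer, so this sketch does not reproduce that procedure.

```python
import math


def screw_copy(coords, k, rise_per_subunit=0.0, n=6):
    """Return copy k of a protomer under an n-fold (screw) axis along z.

    coords is a list of (x, y, z) tuples; copy k is rotated by k * 360/n degrees
    and shifted by k * rise_per_subunit along z.
    """
    angle = math.radians(k * 360.0 / n)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return [(x * cos_a - y * sin_a,
             x * sin_a + y * cos_a,
             z + k * rise_per_subunit)
            for x, y, z in coords]


def build_assembly(coords, rise_per_subunit=0.0, n=6):
    """Build n copies: a closed ring if the rise is zero, a helical turn otherwise."""
    return [screw_copy(coords, k, rise_per_subunit, n) for k in range(n)]
```

Calling `build_assembly(protomer)` yields a planar ring, while `build_assembly(protomer, rise_per_subunit=c / 6)` yields one turn of a filament with pitch `c`. A realistic ring model would also need the per-subunit rotation and translation adjustments of the kind Sawaya et al. describe, which are not attempted here.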
Mapping the degree of conservation of residues
in the GspE family (Figure 1(A)) onto the hexameric EpsE model reveals a striking difference
between the two sides of the hexamer (Figure
7(B)): the “lower” side with the extra C-subdomains is much less well conserved than the
“upper” side where the N2-domain is positioned.
The 90 N-terminal EpsE residues that are absent
in our structure are most likely located near this
upper side of the full-length hexamer. Since these
residues are essential for interacting with the EpsL
components of the Eps apparatus,41 it is reasonable
to assume that the “upper” surface of the EpsE
hexamer faces the bacterial inner membrane. The
C2 subdomain, with less conserved residues,
would then face the cytosolic compartment of
V. cholerae and other T2SS-containing bacteria. The
CM subdomain is found on the periphery of the
ring, with the long axis of this domain essentially
parallel with the central ring axis and perpendicular to the putative plane of the membrane.
Within this domain, the more conserved residues
surrounding the cysteine residues are found on
the membrane-facing side of the C domain, with
the less conserved residues of the elongated loop
directed down toward the cytoplasm. As mentioned before, it remains to be determined what
the functions of these subdomains are. They could
either be involved in contacts with as-yet-undiscovered,
possibly transiently interacting,
cytosolic proteins or, alternatively, could engage in
interactions with one or more components of
Type II secretion systems during the protein translocation cycle.
Figure 7. Hexameric ring model of EpsE. (A) View from the proposed membrane-facing side (left), side view
(middle) and the cytoplasmic face (right) of the hexameric ring model of EpsE constructed as described in the text.
One monomer is shown with the domains colored as follows: N2, cyan; C1, dark blue; CM, yellow; and C2, green.
The other five monomers are colored in a lighter shade of the same colors. This Figure was prepared using GRASP56
and RASTER 3D.55 (B) Same views as in (A), but with the surface colored by sequence conservation.
Given the possibility of major conformational
changes of the secretion machinery during the protein translocation process, a helical
filament arrangement of EpsE resembling that seen
in the crystals may have physiological relevance. It
has been proposed that the GspG, H, I, J, and K
proteins, i.e. the pilin-like components of the Type
II secretion systems, may form a dynamic, piston-like arrangement which is involved in pushing
substrate proteins like cholera toxin through the
GspD pore in the outer membrane.42–45 Given the
tendency of GspE subfamily members and RecA-like ATPases to engage in helical arrangements,
transitions from hexameric toroidal to helical
assemblies, and vice versa, may play a key role in
the functioning of these marvelous multiprotein
machineries.
Experimental Procedures
Cloning, expression and purification of EpsEΔ90
EpsEΔ90, containing a modified C-terminal sequence
with a hexahistidine tag, was cloned into E. coli strain
MC1061 as described.11 In our construct, the C-terminal
residues, VTKES, were replaced by GSRSHHHHHH
(residues 499–508), in order to accommodate the pQE70
vector used. Native protein was expressed in LB at
27 °C, by induction with 1 mM IPTG at A600 ~0.6, and
cells were harvested when the A600 started to plateau, at
approximately six hours. SeMet substituted protein was
expressed in M9 minimal media supplemented with
amino acids as described by Van Duyne et al.46
For SeMet substituted expression, induction was again
with 1 mM IPTG at A600 ~0.6–0.8 and the cells were
harvested after four hours. Cells were harvested by centrifugation at 5000g for 20 minutes and resuspended in
~50 ml buffer (for the pellet from a total of 4 l of media)
consisting of 100 mM TEA, 1000 mM NaCl, 10% (v/v)
glycerol, 1 mM TCEP/HCl, 2 mM imidazole, to which
one COMPLETE EDTA-free protease inhibitor tablet
(Roche Diagnostics) was added. The final pH of this
buffer was adjusted to pH 8.0 at 4 °C. The cells were disrupted by multiple passes through a French press, and the
supernatant was harvested by ultracentrifugation at
90,000g for one hour. The protein was then purified by
IMAC (TALON, Clontech), followed by ion exchange
chromatography (MonoQ, Pharmacia), and hydrophobic
interaction chromatography (Phenyl Sepharose, Pharmacia), and a final dialysis step against 100 mM TEA, 0.5 M
NaCl, 1 mM EDTA, 1 mM DTT, 1 mM PMSF adjusted to
pH 8.0 at 4 °C. The final protein solution was concentrated to 5 mg/ml. The protein solution was filtered
using a 0.22 micron filter (Millipore) prior to setting up
crystallization experiments.
Crystallization
The initial crystallization conditions for native and
SeMet EpsEΔ90 were obtained by mixing equal volumes
(1–4 µl) of protein solution (0.1 M TEA, 0.5 M
NaCl, 10% glycerol, 1 mM TCEP, 1 mM EDTA) with
well solutions from commercially available screens
(Crystal Screen 1, 2, Hampton Research; Wizard I, II,
Cryo I, II, Emerald BioStructures) using the sitting drop
vapor diffusion technique. Two conditions were identified and optimized. The first
was optimized to 12–18% (v/v) PEG 5000 MME, 0.15–0.20 M
ammonium sulfate, 0.1 M MES pH 6.3.
SeMet crystals prepared in this fashion, and cryoprotected with 25% glycerol mixed with well solution,
diffracted to 3.4 Å resolution, although these were
subject to considerable radiation decay. Native crystals
prepared in a similar manner diffracted to ~3.1 Å
resolution. These crystals were difficult to work with
due to often very weak diffraction; typically fewer than
one in ten crystals cryoprotected with glycerol yielded
diffraction better than 5 Å resolution. Neither various methods
to introduce the crystals to glycerol more gradually nor
co-crystallization in the presence of cryoprotecting quantities of glycerol was successful, nor
was the use of MPD, ethylene glycol, low molecular
mass PEGs, or highly soluble salts such as
lithium formate or lithium acetate. Cryoprotection with
Paratone-N (Exxon) was marginally more successful
than glycerol in more frequently producing sub-5 Å resolution data. Growth at 14 °C or 4 °C seemed to improve
the size and appearance of the crystals, but
gave only slightly better diffraction than growth at room
temperature.
One other condition was identified through repeated
screening; this condition (30% (v/v) PEG200, 0.1 M
Hepes pH 8.0) initially produced extremely small and
delicate crystals which were difficult to reproduce either
with or without several seeding strategies. It was found
that using a protein solution without glycerol improved
the size and reproducibility of these crystals. The best
“unliganded” datasets were obtained with crystals
grown at room temperature with 10 mM AMPPNP
present in the protein solution prior to setting up crystallization experiments, and precipitant solution of 18–22%
PEG200, 0.1 M Mops, pH 7.2, 10 mM ammonium acetate.
Crystals of liganded EpsEΔ90 were grown at 14 °C, with
10 mM AMPPNP present in the protein solution prior to
setting up drops, with a precipitant consisting of
18–22% PEG200, 0.1 M Mops, pH 7.2 (without other
additives).
The “PEG200 crystals” were mounted directly into
loops from the cryoprecipitant-containing drops and
flash-cooled in liquid nitrogen. EXAFS scans were
performed to help determine the optimal
peak and high-energy remote wavelengths. X-ray diffraction data were collected from individual crystals
at ALS, Berkeley, CA (beamlines 8.2.1, 8.2.2, 5.0.2) and
APS, Argonne, IL (beamlines 19ID, 19BM).
The diffraction data were processed and integrated
using the ELVES (J. Holton, unpublished) package,
primarily to optimize the use of MOSFLM.47 Scaling and
merging of the datasets was done with the SCALA
program from the CCP4 suite. Initially, the best resolution datasets available were a number of datasets from
separate crystals all diffracting to ~2.90 Å. The crystals
have space group P6₁22 or the enantiomorphic P6₅22;
this ambiguity was resolved when SOLVE23 initially
found a solution with ten sites (later 12 sites) in P6₁22
but no good solution in P6₅22. With one molecule per
asymmetric unit, the V_M is 2.8 Å³/Da and the estimated
solvent content is 55.3%. Attempts to use multiple wavelength datasets were not successful, but SeMet SAD
datasets were easily solved by the SOLVE package.
Solvent flattening and histogram matching using
RESOLVE48 produced a noisy map. At this point,
additional data diffracting to 2.5 Å became available,
and the autotracing facility of RESOLVE 2.0248 was able
to convincingly place 105 residues with side-chains and
an additional 112 residues with only main-chain atoms.
This model was used as a starting point for multiple
rounds of manual tracing and refinement using
XtalView49 and REFMAC5.50
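The Matthews coefficient and solvent content quoted above are related by a standard approximation, sketched below. The Matthews coefficient is the unit cell volume divided by the total protein mass in the cell, and the solvent fraction is commonly estimated as 1 − 1.23/V_M, where the constant 1.23 assumes a typical protein partial specific volume of about 0.74 cm³/g. The function names are hypothetical, and the unit cell volume and molecular mass are left as inputs because they are not given in this section. Space groups P6₁22 and P6₅22 have 12 asymmetric units per cell, so one molecule per asymmetric unit corresponds to 12 molecules per cell.

```python
def matthews_coefficient(cell_volume_A3, molecules_per_cell, mol_mass_Da):
    """V_M in cubic angstroms per dalton: cell volume / (number of molecules * mass)."""
    return cell_volume_A3 / (molecules_per_cell * mol_mass_Da)


def solvent_fraction(v_m, protein_factor=1.23):
    """Estimated solvent fraction, 1 - 1.23 / V_M (standard Matthews approximation)."""
    return 1.0 - protein_factor / v_m
```

With the rounded value V_M = 2.8 Å³/Da this estimate gives about 56%, close to the 55.3% quoted. The small difference is consistent with V_M having been rounded for presentation (a value near 2.75 Å³/Da gives 55.3%), although the exact figure used is not stated.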
As multiple SeMet SAD datasets were collected,
the experimental phases for each of these datasets were
originally refined independently and subsequently used
for multidomain multicrystal density modification using
the program DMMULTI51,52 from the CCP4 suite. The
working model was used to generate separate masks for
the N2 domain and the grouped C-terminal domains
(C1, CM, C2). The use of multiple domains was motivated by the marked paucity of intramolecular contacts
between the N-terminal and C-terminal regions of the
protein. Attempts to use only a single subunit mask
led to lower map correlation coefficients; use of more
than two domains did not lead to clear improvements.
DMMULTI was also used with datasets where experimental phasing was not available.
Protein Data Bank accession codes
Coordinates and structure factors have been deposited
with the Protein Data Bank, ID 1P9R (unliganded) and
1P9W (with AMPPNP).
Acknowledgements
M.R. greatly appreciates support under the
NIAID supported T32 Host Defense Training
Grant during the initial stages of this work, and
the encouragement of Dr Walter Stamm of the
University of Washington Division of Infectious
Diseases. We thank Stewart Turley for advice on
crystal freezing and data collection. We gratefully
acknowledge the use of ALS beamlines 5.0.1, 5.0.2
and 5.0.3 and APS beamlines 19ID and 19BM for
earlier, less accommodating crystals and the use of
ALS beamline 8.2.1 for the final collected datasets.
W.G.J.H. and M.S. acknowledge support from
NIH grants No. AI34501-10 and AI49294,
respectively.
References
1. Pugsley, A. P. (1993). The complete general secretory
pathway in Gram-negative bacteria. Microbiol. Rev.
57, 50–108.
2. Overbye, L. J., Sandkvist, M. & Bagdasarian, M.
(1993). Genes required for extracellular secretion of
enterotoxin are clustered in Vibrio cholerae. Gene, 132,
101–106.
3. Sandkvist, M., Morales, V. & Bagdasarian, M. (1993).
A protein required for secretion of cholera toxin
through the outer membrane of Vibrio cholerae. Gene,
123, 81–86.
4. Sandkvist, M., Michel, L. O., Hough, L. P., Morales,
V. M., Bagdasarian, M., Koomey, M. & DiRita, V. J.
(1997). General secretion pathway (eps) genes
required for toxin secretion and outer membrane biogenesis in Vibrio cholerae. J. Bacteriol. 179, 6994–7003.
5. Hirst, T. R. & Holmgren, J. (1987). Transient entry of
enterotoxin subunits into the periplasm occurs
during their secretion from Vibrio cholerae. J. Bacteriol.
169, 1037–1045.
6. Hirst, T. R. & Holmgren, J. (1987). Conformation
of protein secreted across bacterial outer membranes:
a study of enterotoxin translocation from Vibrio
cholerae. Proc. Natl Acad. Sci. USA, 84, 7418–7422.
7. Spangler, B. D. (1992). Structure and function of
cholera toxin and the related Escherichia coli heat-labile enterotoxin. Microbiol. Rev. 56, 622–647.
8. Sandkvist, M. (2001). Type II secretion and pathogenesis. Infect. Immun. 69, 3523–3535.
9. Marsh, J. W. & Taylor, R. K. (1998). Identification of
the Vibrio cholerae type 4 prepilin peptidase required
for cholera toxin secretion and pilus formation. Mol.
Microbiol. 29, 1481–1492.
10. Fullner, K. J. & Mekalanos, J. J. (1999). Genetic
characterization of a new type IV-A pilus gene cluster found in both classical and El Tor biotypes of
Vibrio cholerae. Infect. Immun. 67, 1393–1404.
11. Sandkvist, M., Bagdasarian, M., Howard, S. P. &
DiRita, V. J. (1995). Interaction between the autokinase EpsE and EpsL in the cytoplasmic membrane
is required for extracellular secretion in Vibrio
cholerae. EMBO J. 14, 1664–1673.
12. Sandkvist, M., Hough, L. P., Bagdasarian, M. M. &
Bagdasarian, M. (1999). Direct interaction of the
EpsL and EpsM proteins of the general secretion
apparatus in Vibrio cholerae. J. Bacteriol. 181,
3129–3135.
13. Possot, O. M., Vignon, G., Bomchil, N., Ebel, F. &
Pugsley, A. P. (2000). Multiple interactions between
pullulanase secreton components involved in stabilization and cytoplasmic membrane association of
PulE. J. Bacteriol. 182, 2142–2152.
14. Py, B., Loiseau, L. & Barras, F. (1999). Assembly of the
type II secretion machinery of Erwinia chrysanthemi:
direct interaction and associated conformational
change between OutE, the putative ATP-binding
component and the membrane protein OutL. J. Mol.
Biol. 289, 659–670.
15. Py, B., Loiseau, L. & Barras, F. (2001). An inner
membrane platform in the type II secretion
machinery of Gram-negative bacteria. EMBO Rep. 2,
244–248.
16. Kagami, Y., Ratliff, M., Surber, M., Martinez, A. &
Nunn, D. N. (1998). Type II protein secretion by
P. aeruginosa: genetic suppression of a conditional
mutation in the pilin-like component XcpT by the
cytoplasmic component XcpR. Mol. Microbiol. 27,
221–233.
17. Planet, P. J., Kachlany, S. C., DeSalle, R. & Figurski,
D. H. (2001). Phylogeny of genes for secretion
NTPases: identification of the widespread tadA subfamily and development of a diagnostic key for
gene classification. Proc. Natl Acad. Sci. USA, 98,
2503–2508.
18. Nunn, D. (1999). Bacterial type II protein export and
pilus biogenesis: more than just homologies? Trends
Cell Biol. 9, 402–408.
19. Dubnau, D. (1999). DNA uptake in bacteria. Annu.
Rev. Microbiol. 53, 217–244.
20. Yeo, H. J., Savvides, S. N., Herr, A. B., Lanka, E. &
Waksman, G. (2000). Crystal structure of the hexameric traffic ATPase of the Helicobacter pylori type IV
secretion system. Mol. Cell, 6, 1461–1472.
21. Odenbreit, S., Puls, J., Sedlmaier, B., Gerland, E.,
Fischer, W. & Haas, R. (2000). Translocation of
Helicobacter pylori CagA into gastric epithelial cells
by type IV secretion. Science, 287, 1497–1500.
22. Altschul, S. F., Gish, W., Miller, W., Myers, E. W. &
Lipman, D. J. (1990). Basic local alignment search
tool. J. Mol. Biol. 215, 403–410.
23. Terwilliger, T. C. & Berendzen, J. (1999). Automated
MAD and MIR structure solution. Acta Crystallog.
sect. D, Biol. Crystallog. 55, 849–861.
24. Murzin, A. G., Brenner, S. E., Hubbard, T. & Chothia,
C. (1995). SCOP: a structural classification of proteins
database for the investigation of sequences and
structures. J. Mol. Biol. 247, 536–540.
25. Possot, O. & Pugsley, A. P. (1994). Molecular characterization of PulE, a protein required for pullulanase
secretion. Mol. Microbiol. 12, 287–299.
26. Hol, W. G., van Duijnen, P. T. & Berendsen, H. J.
(1978). The alpha-helix dipole and the properties of
proteins. Nature, 273, 443–446.
27. Castagnetto, J. M., Hennessy, S. W., Roberts, V. A.,
Getzoff, E. D., Tainer, J. A. & Pique, M. E. (2002).
MDB: the Metalloprotein Database and Browser at
the Scripps Research Institute. Nucl. Acids Res. 30,
379–382.
28. Holm, L. & Sander, C. (1993). Protein structure comparison by alignment of distance matrices. J. Mol.
Biol. 233, 123–138.
29. Story, R. M. & Steitz, T. A. (1992). Structure of the
recA protein–ADP complex. Nature, 355, 374–376.
30. Sawaya, M. R., Guo, S., Tabor, S., Richardson, C. C. &
Ellenberger, T. (1999). Crystal structure of the helicase domain from the replicative helicase-primase of
bacteriophage T7. Cell, 99, 167–177.
31. Yu, R. C., Hanson, P. I., Jahn, R. & Brünger, A. T.
(1998). Structure of the ATP-dependent oligomerization domain of N-ethylmaleimide sensitive factor
complexed with ATP. Nature Struct. Biol. 5, 803–811.
32. Lenzen, C. U., Steinmann, D., Whiteheart, S. W. &
Weis, W. I. (1998). Crystal structure of the hexamerization domain of N-ethylmaleimide-sensitive fusion
protein. Cell, 94, 525–536.
33. Velankar, S. S., Soultanas, P., Dillingham, M. S.,
Subramanya, H. S. & Wigley, D. B. (1999). Crystal
structures of complexes of PcrA DNA helicase with
a DNA substrate indicate an inchworm mechanism.
Cell, 97, 75–84.
34. Sakai, D., Horiuchi, T. & Komano, T. (2001). ATPase
activity and multimer formation of PilQ protein are
required for thin pilus biogenesis in plasmid R64.
J. Biol. Chem. 276, 17968–17975.
35. Turner, L. R., Lara, J. C., Nunn, D. N. & Lory, S.
(1993). Mutations in the consensus ATP-binding
sites of XcpR and PilB eliminate extracellular protein
secretion and pilus biogenesis in Pseudomonas
aeruginosa. J. Bacteriol. 175, 4962–4969.
36. Possot, O. M. & Pugsley, A. P. (1997). The conserved
tetracysteine motif in the general secretory pathway
component PulE is required for efficient pullulanase
secretion. Gene, 192, 45–50.
37. Borden, K. L. (2000). RING domains: master builders
of molecular scaffolds? J. Mol. Biol. 295, 1103–1112.
38. Yu, X. & Egelman, E. H. (1997). The RecA hexamer is
a structural homologue of ring helicases. Nature
Struct. Biol. 4, 101–104.
39. Guo, F., Maurizi, M. R., Esser, L. & Xia, D. (2002).
Crystal structure of ClpA, an Hsp100 chaperone and
regulator of ClpAP protease. J. Biol. Chem. 277,
46743–46752.
40. Turner, L. R., Olson, J. W. & Lory, S. (1997). The XcpR
protein of Pseudomonas aeruginosa dimerizes via its N
terminus. Mol. Microbiol. 26, 877–887.
41. Sandkvist, M., Keith, J. M., Bagdasarian, M. &
Howard, S. P. (2000). Two regions of EpsL involved
in species-specific protein–protein interactions with
EpsE and EpsM of the general secretion pathway in
Vibrio cholerae. J. Bacteriol. 182, 742–748.
42. Mattick, J. S., Whitchurch, C. B. & Alm, R. A. (1996).
The molecular genetics of type-4 fimbriae in Pseudomonas aeruginosa—a review. Gene, 179, 147–155.
43. Shevchik, V. E., Robert-Baudouy, J. & Condemine, G.
(1997). Specific interaction between OutD, an Erwinia
chrysanthemi outer membrane protein of the general
secretory pathway, and secreted proteins. EMBO J.
16, 3007–3016.
44. Filloux, A., Michel, G. & Bally, M. (1998). GSP-dependent protein secretion in Gram-negative bacteria:
the Xcp system of Pseudomonas aeruginosa. FEMS
Microbiol. Rev. 22, 177–198.
45. Sandkvist, M. (2001). Biology of type II secretion.
Mol. Microbiol. 40, 271–283.
46. Van Duyne, G. D., Standaert, R. F., Karplus, P. A.,
Schreiber, S. L. & Clardy, J. (1993). Atomic structures
of the human immunophilin FKBP-12 complexes
with FK506 and rapamycin. J. Mol. Biol. 229,
105–124.
47. Leslie, A. G. W., Brick, P. & Wonacott, A. (1986).
Mosflm. Daresbury Lab. Inform. Quart. Protein Crystallog. 18, 33–39.
48. Terwilliger, T. C. (2000). Maximum-likelihood density modification. Acta Crystallog. sect. D, Biol.
Crystallog. 56, 965–972.
49. McRee, D. E. (1999). XtalView/Xfit—a versatile
program for manipulating atomic coordinates and
electron density. J. Struct. Biol. 125, 156–165.
50. Murshudov, G. N., Vagin, A. A. & Dodson, E. J.
(1997). Refinement of macromolecular structures
by the maximum-likelihood method. Acta Crystallog.
sect. D, 53, 240–255.
51. Cowtan, K. & Main, P. (1998). Miscellaneous algorithms for density modification. Acta Crystallog. sect.
D, Biol. Crystallog. 54, 487–493.
52. Cowtan, K. D. & Zhang, K. Y. J. (1999). Density modification for macromolecular phase improvement.
Prog. Biophys. Mol. Biol. 72, 245–270.
53. Gouet, P., Courcelle, E., Stuart, D. I. & Metoz, F.
(1999). ESPript: analysis of multiple sequence alignments in PostScript. Bioinformatics, 15, 305–308.
54. Kraulis, P. J. (1991). Molscript—a program to produce both detailed and schematic plots of protein
structures. J. Appl. Crystallog. 24, 946–950.
55. Merritt, E. A. & Bacon, D. J. (1997). Raster3D: photorealistic molecular graphics. Macromol. Crystallog.
277, 505–524.
56. Nicholls, A., Sharp, K. A. & Honig, B. (1991). Protein
folding and association: insights from the interfacial
and thermodynamic properties of hydrocarbons.
Proteins: Struct. Funct. Genet. 11, 281–296.
Edited by R. Huber
(Received 12 May 2003; received in revised form 13 July 2003; accepted 21 July 2003)