Main

The structure of the HEF trimer is shown in Fig. 1. The HEF monomer is composed of three domains: an elongated stem (red in Fig. 2a) active in membrane fusion (F), a receptor-destroying esterase domain (E) (green in Fig. 2a), and a receptor-binding domain (R) (blue in Fig. 2a). Two of these compact domains are made from non-contiguous segments of amino-acid sequence (Fig. 2b): the stem domain F consists of the amino-terminal amino acids 1–40 and carboxy-terminal residues 367–432 of HEF1 and all of HEF2 (labelled F1, F2, F3 in Fig. 2a, b); the esterase domain E consists of HEF1 segments comprising residues 41–150, which precede the receptor domain, and residues 311–366, which follow R (labelled E1, E′ and E2 in Fig. 2a, b). The single-segment R domain is inserted into a surface loop of the esterase domain, and the esterase domain is inserted into a surface loop near the top of the stem domain F (Fig. 2c). The R and E domains of HEF are both compact, having their N and C termini within a few ångströms of each other, so that they can be accommodated into a pre-existing protein at surface loops without disrupting either protein's structure or function.
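The segment boundaries quoted above can be written down as a small lookup table. The following Python sketch is illustrative only and is not part of the structure analysis: the names `HEF1_SEGMENTS`, `domain_of` and `is_contiguous` are hypothetical, the E′ segment is not separated from the rest of E because its boundaries are not given in the text, and the R range 151–310 is an assumption inferred from the gap between the two esterase ranges.

```python
# Residue ranges (1-based, inclusive) of HEF1 as quoted in the text.
# The R range is NOT stated explicitly; 151-310 is inferred from the gap
# between the two esterase ranges and should be treated as an assumption.
HEF1_SEGMENTS = [
    (1, 40, "F"),     # F1: N-terminal part of the stem
    (41, 150, "E"),   # esterase residues preceding R
    (151, 310, "R"),  # receptor-binding domain (assumed range)
    (311, 366, "E"),  # esterase residues following R
    (367, 432, "F"),  # F2: C-terminal part of the stem
]


def domain_of(residue: int) -> str:
    """Return the domain letter (F, E or R) for a HEF1 residue number."""
    for start, end, domain in HEF1_SEGMENTS:
        if start <= residue <= end:
            return domain
    raise ValueError(f"residue {residue} is outside HEF1 (1-432)")


def is_contiguous(domain: str) -> bool:
    """True if the domain is built from a single HEF1 sequence segment."""
    return sum(1 for _, _, d in HEF1_SEGMENTS if d == domain) == 1
```

In this representation `is_contiguous("R")` is true, whereas E and F are each split across two HEF1 segments, which is the nesting of R within E within F described above.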

Figure 1: Haemagglutinin-esterase-fusion glycoprotein structure.

a, The structure of the HEF trimer. HEF1 (blue), HEF2 (red), receptor analogue and enzyme inhibitor ligands, yellow; N-linked carbohydrate ball-and-stick (purple). HEF1 is linked to HEF2 by a disulfide bond from Cys 6 of HEF1 to Cys 137 of HEF2. b, Monomer surface of HEF (GRASP27) showing 9-O-acetylsialoside receptor binding site (top) and 9-O-acetylesterase site (bottom). Inset, the esterase removes the acetyl group of 9-O-acetylsialic acid (see arrow).

Figure 2: Comparison of HEF and HA monomers.

a, HEF monomer structure and exploded views of the sequence segments (coloured by domain). Superpositions of corresponding HA (X-31 strain; PDB code 1hge) segments (grey) are shown with r.m.s. values below the segment name. Fusion peptides are yellow. b, Linear order of the sequence segments in HEF, coloured by domain. Red segments: F1, F2, F3; green: E1, E2; light green: E′; blue: R. c, Topological relationship of the compact domains in HEF. d, HA monomer structure with sequence segments coloured by domain. e, Linear order of the sequence segments in HA, coloured by domain. Red segments: F1, F2, F3; light green: E′; blue: R.

Alignment of the amino-acid sequences of HEF and HA based on their three-dimensional structures indicates that they have 12% sequence identity (the alignment is available from the authors). Nevertheless, both the overall structure (compare Fig. 2a and d) and the detailed folds of individual segments (Fig. 2a) of HEF and HA are remarkably similar. This is true for the globular R domain, as well as for the highly extended segments F1 and F2 (Fig. 2a), and for the similar helical-hairpin and membrane-proximal five-stranded β-sheet forming HEF2 and HA2 (F3 in Fig. 2a). Two of the segments of the enzyme domain, E1 and E2, are not found in HA (compare Fig. 2b and e), which is consistent with its lack of enzyme activity. If the E′ domain of HA is a vestigial fragment of E, then HA may have evolved by the deletion of segments E1 and E2 from an ancestral gene similar to HEF. Segments E1, E′, R, E2, and F2 are present, with 30% sequence identity, in the haemagglutinin-esterase (HE) found in coronaviruses4. (Data concerning the antigenicity of HEF and the structural relation between HEF and HE will be published elsewhere.)
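For readers unfamiliar with how a figure such as 12% identity is obtained from an alignment, the following Python sketch counts identical residues over aligned positions. It is a minimal illustration, not the procedure used here: the function name `percent_identity` is hypothetical, the choice of denominator (positions where neither sequence has a gap) is an assumption because the text does not specify one, and the example strings are invented toy sequences rather than HEF or HA.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of aligned (non-gap) positions with identical residues.

    Both inputs must be rows of the same alignment, i.e. equal in length,
    with `gap` marking positions that have no structural equivalent.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    aligned = identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip positions where either sequence has a gap
        aligned += 1
        if a == b:
            identical += 1
    if aligned == 0:
        raise ValueError("no aligned positions")
    return 100.0 * identical / aligned


# Toy example (invented sequences): 4 identities over 6 aligned positions.
toy_identity = percent_identity("ACD-EFGH", "ACNQEF-K")
```

Other conventions divide by the length of the shorter sequence or of the whole alignment, so identity values quoted from different sources are comparable only when the denominator is the same.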

The R domains of both HEF and HA are similar eight-stranded ‘Swiss rolls’ of β-sheet (Fig. 2a, d) containing the receptor-binding sites bounded by an α-helix, a loop, and an extended strand (superimposed in Fig. 3b) (except for one residue, Tyr 127 from E′). The structures of the complexes of HEF with two receptor analogues5,6, 9-acetamidosialic acid α-methylglycoside (Fig. 3a) and 9-acetamidosialic acid α-thiomethylmercury glycoside7, show that sialosides bind similarly to HEF and to HA, although they are recognized by different amino-acid side chains (see ref. 8 and references therein). Sialic acid linkage specificity (α(2,3) versus α(2,6)) has been attributed to residues near amino acid 226 in HA8,9,10. The homologous loop (near HEF 270) is truncated in HEF (Fig. 2b), consistent with the lack of linkage specificity of influenza C virus11. The HEF receptor differs from the HA receptor by the addition of an acetyl group at the 9-O position of the glycerol side chain. The acetyl methyl group binds in a nonpolar pocket unique to HEF (Fig. 3a).

Figure 3: HEF receptor binding.

a, Ligand bound to the receptor-binding site. Potential hydrogen bonds are indicated in green or red for those conserved in HA ligand binding. Four polar contacts are formed with the ligand identically in HEF and HA: two from the hydroxyl group of HEF1 Tyr 127 (Y98 in HA1) to the 8-hydroxyl and 9-amide of the ligands and two from main-chain atoms: the carbonyl oxygen of HEF1 residue 170 (135 in HA1) to the 5-amide of the ligand and the amide of HEF1 172 (137 in HA1) to the carboxylate of the ligand. The acetyl methyl group binds in a nonpolar pocket unique to HEF, formed by Phe 225 and 293, and Pro 271; the acetyl carbonyl oxygen contacts the hydroxyl group of Tyr 224 and the guanidino group of Arg 236. b, Comparison of architecture of the HEF and HA binding sites. The figure was constructed by superimposing the common parts of the ligands in HA and HEF.

Complexes of HEF with the two non-hydrolysable receptor analogues described above also show how substrate interacts with the novel 9-O-acetylesterase active site (Fig. 4a). Ser 57 in the catalytic triad (Ser 57 from E1, His 355 and Asp 352 from E2) is positioned for nucleophilic attack on the carbonyl carbon of the 9-O-acetyl group of the bound sialoside. The carbonyl oxygen points into an ‘oxyanion hole’ formed by the side chain of Asn 117, and the NH groups of Gly 85 and Ser 57 (Fig. 4a). The ligand interactions and the structure of the enzyme site are completely different from those of the receptor-binding site.

Figure 4: HEF enzyme active site.

a, 9-Acetamidosialic acid α-methyl glycoside (Ki = 2 mM) or 9-acetamidosialic acid α-thiomethylmercury glycoside (Ki = 4 mM) shown in yellow. Catalytic triad is shown in green. Oxyanion-hole hydrogen bonds are on the right. Arg 322 forms two hydrogen bonds with the sialoside carboxylate group. b, Esterases with structural similarity to the HEF enzyme domain (E1, E′, E2; Fig. 2a). PAF-AH (platelet-activating factor acetylhydrolase Ib) (Z-score = 8.0, sequence identity 13%, 129 aligned residues) and SsEst (esterase from Streptomyces scabies) (Z-score = 7.9, sequence identity 10%, 146 aligned residues). Side chains for the catalytic triads are shown in green. N and C termini of the domains are numbered (black). The HEF receptor-binding domain (R) inserts at residues marked in red. Blue ribbon indicates similar structure identified by the Dali program24. HEF and PAF-AH lack the extended Ω-loops of SsEst (bottom).

A search for proteins with structural similarity to the HEF enzyme domain identified the α1 subunit of platelet-activating factor acetylhydrolase (Ib) from bovine brain (PAF-AH)2 and the esterase from Streptomyces scabies (SsEst)1 (Fig. 4b). All three proteins have a similar topology, despite sharing only 13% sequence identity, and their core residues superimpose (r.m.s. 3 Å), including the central five-stranded β-sheet and long flanking helices (blue in Fig. 4b). When the core folds are superimposed, the catalytic triads (green in Fig. 4b) and oxyanion-hole residues of each protein overlap.
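The r.m.s. value quoted for the superimposed cores is a root-mean-square distance over paired atoms. The Python sketch below shows only that final calculation, under the assumption that the two coordinate sets have already been brought into a common frame and paired one-to-one; it does not perform the least-squares fit itself and is not the Dali or Lsqman algorithm. The function name `rmsd` and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square distance (same units as the input, e.g. Å)
    between two equally long lists of paired (x, y, z) coordinates.

    Assumes the structures are ALREADY superimposed; no fitting is done.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy example: every paired atom is displaced by 3 units along z.
toy_rmsd = rmsd([(0, 0, 0), (1, 0, 0)], [(0, 0, 3), (1, 0, 3)])
```

Because the value depends on which atoms are paired, an r.m.s. figure is meaningful only together with the number and identity of the aligned residues, as given for each comparison in Fig. 4b.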

Interactions between the α-helices in the stem of the HEF trimer and the packing of the N-terminal fusion peptide of F3 have implications for the low-pH-induced conformational change in HEF and for the mechanism by which this change induces membrane fusion. The F3 segments of HA and HEF are very similar in structure (Fig. 2a): each monomer has a central α-helix along the threefold axis and a smaller N-terminal helix packed antiparallel on the outside and connected by an interhelical loop (Fig. 2a). In HEF2, although the central helices interact closely in the middle as in HA, they diverge from the trimer axis at both ends (Fig. 5a). At the top, the interhelical loops interpose between the first five turns of the long helices (residues 80–97), where loop residues HEF2 Arg 69 and central helix residues HEF2 Glu 95 form salt bridges and contact an unidentified ion (possibly a sulphate) on the trimer axis. Although HEF2 has a sequence deletion of seven residues in the loop region, this difference would preserve the register of the heptads during the formation of an extended triple-stranded coil in the low-pH conformation, like that in HA12. Unlike HA, however, the top third of the triple-stranded helical bundle must first make interactions on the trimer axis after the removal of the interhelical loop, before the N-terminal coiled-coil extension can form.
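The statement that a seven-residue deletion preserves the heptad register follows from simple modular arithmetic: coiled-coil positions repeat every seven residues (conventionally labelled a to g), so removing a whole multiple of seven leaves every downstream residue at the same heptad position. The Python sketch below illustrates this with a toy helix; the function names and the helix length are hypothetical and the indices do not correspond to HEF2 or HA2 numbering.

```python
HEPTAD = "abcdefg"  # conventional labels for the seven coiled-coil positions


def heptad_letters(n_residues: int, first: str = "a") -> list:
    """Heptad label of each residue in a continuous helix of given length."""
    start = HEPTAD.index(first)
    return [HEPTAD[(start + i) % 7] for i in range(n_residues)]


def register_preserved(n_residues: int, del_start: int, n_deleted: int) -> bool:
    """True if deleting `n_deleted` residues starting at 0-based index
    `del_start` leaves the flanking residues at their original heptad
    positions once the shortened chain forms one continuous helix."""
    before = heptad_letters(n_residues)
    kept = before[:del_start] + before[del_start + n_deleted:]
    after = heptad_letters(n_residues - n_deleted)
    return kept == after


# Toy helix of 40 residues: a 7-residue deletion keeps the register,
# whereas a 6-residue deletion shifts every downstream residue.
seven_ok = register_preserved(40, 15, 7)
six_ok = register_preserved(40, 15, 6)
```

Here `seven_ok` is true and `six_ok` is false, which is the sense in which the shorter HEF2 loop remains compatible with an extended coiled coil like that of low-pH HA.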

Figure 5: α-Helical interactions and the fusion peptide in HEF2.

a, Stereo diagram of the triple-stranded α-helical bundle of HEF (79–126 of HEF2). Only the region 98–113 interacts across the trimer axis. The red monomer shows residues 4–126 (labelled N, C). b, Trimeric fusion domain of HEF consisting of segments F1, F2, and F3. N terminus of F1 and C terminus of F3 are indicated.

Three tryptophans (HEF2 116) form the last interaction on the trimer axis (Fig. 5a), below which the helices diverge as in HA. Unlike HA, in which residues 2 (Leu) and 3 (Phe) of the HA2 N-terminal ‘fusion peptides’ interact across the trimer axis, HEF2 residues Val 11 and Leu 12 are closest to the trimer axis but further penetration is blocked by tryptophans at position 116. Residues N-terminal to residue 10 fold back out to the surface of the protein, where aspartic acids at positions 5 and 6 of HEF2 can interact with Arg 29 and Lys 30 of HEF2 and Lys 4 of HEF1. Residues 1–4 of HEF2 appear to be disordered on the surface of the molecule. In HEF, the residues that are functionally analogous to the fusion peptide of HA, namely the buried residues whose exposure would convert a soluble protein into a lipophilic one, are displaced along the sequence by six residues. HEF may therefore be regarded as having an internal fusion peptide, similar to virus fusion proteins that do not require cleavage activation. The conformation of the N terminus of the HEF2 fusion peptide may indicate that fusion peptides can insert into membranes as loops.

Because the influenza C virus esterase domain E is folded like other esterases (SsEst and PAF-AH), and the R domain present in HEF and HA is a ubiquitous folding unit also found as a receptor-binding domain in the orbivirus BTV13, we conclude that HEF must have evolved from functional domains (Fig. 2c). A precursor to HEF may have evolved by recombination events that resulted in the insertion of R into a surface loop of E, and E into the interhelical loop of the stem domain F (Fig. 2c). In such a scheme, the trimeric F domain (Fig. 5b) may have been an ancestral membrane-fusion protein analogous to the single-function fusion proteins of paramyxoviruses such as Sendai14. Similar modular structures of the envelope glycoproteins of retroviruses are suggested by biochemical data. The first 62 and the last 20 residues of gp120 from HIV-1 can be removed, retaining receptor binding of the fragment15, indicating that those terminal segments might be analogous to F1 and F2, forming with gp41 (F3) the stem of gp160 (refs 15–17).

The HEF structure implies that the membrane fusion domains of HEF and HA consist of F1 and F2, in addition to the segment F3 = HEF2, suggesting that F1 and F2 may play a part in membrane fusion, either by controlling the low-pH-induced conformational change required for fusion or during the formation of a fusion pore. In both HEF and HA, F2 packs against the interhelical loop of F3, which refolds to a helix at low pH, but the position of F2 after refolding is unknown. The β-hairpin of F1 (residues 15–32 of HEF1; Fig. 2a) has already been implicated in the low-pH-induced conformational change of HA by proteolytic susceptibility18, and by its location adjacent to the site of the helix-to-β-turn refolding and chain-direction reversal on the long HA2 helix12.

Methods

Structure determination. HEF was obtained by bromelain digestion of C/Johannesburg/1/66 virions and two crystal forms were characterized as described previously19. Crystals in harvest buffer (60% saturated ammonium sulphate, 50 mM Tris-HCl pH 7.1, 140 mM NaCl) were transferred to a ligand soak solution (40 mM 9-acetamidosialic acid α-thiomethylmercury glycoside, 300 mM MOPS pH 7.1, 140 mM NaCl) for 2 h and then cryoprotected by serial transfer through ligand soak solution containing 5–25% glycerol in 5% steps. Complexes of 9-acetamidosialic acid α-methyl glycoside (Ki = 2 mM) with HEF were prepared by soaking form I crystals in 40 mM ligand using the same procedure but with a slightly different soak solution (40 mM 9-acetamidosialic acid α-methyl glycoside, 100 mM Tris-HCl pH 7.1, 140 mM NaCl). 9-acetamidosialic acid α-methyl glycoside was a gift from J. Hanson.

Data collection and processing. X-ray diffraction data were collected at the Cornell High Energy Synchrotron Source by flash-cooling crystals and phosphorimage plate detection as described19. A 6.5 Å derivative dataset of form I crystals complexed with 9-acetamidosialic acid α-thiomethylmercury glycoside, collected on a Mar scanner and Elliott GX-13 rotating anode, provided initial phases. Data were processed using Denzo, Scalepack20, and programs from the CCP4 suite21.

Phasing, model building and refinement. 6.5 Å resolution SIR phases were extended to 3.5 Å resolution in form I crystals by iterative solvent flattening, histogram matching, and non-crystallographic symmetry averaging about the molecular three-fold axis using the program DM22. Details of the phase extension, model building, two-crystal-averaging, and refinement to 3.2 Å resolution will be described elsewhere (X.Z. et al., manuscript submitted). The current model (Rfree = 26.7%, Rwork = 22.3%) contains HEF1 residues 1–427 (out of 432), residues 4–165 (out of 175) of HEF2, and no solvent molecules.
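The Rwork and Rfree values quoted above are agreement indices between observed and calculated structure-factor amplitudes, R = Σ||Fo| − |Fc|| / Σ|Fo|, evaluated over the reflections used in refinement (work set) and over a set withheld from refinement (free set). The Python sketch below illustrates that standard definition only. It assumes the amplitudes are already on a common scale, the function names and toy numbers are hypothetical, and it says nothing about the refinement program or test-set fraction used for HEF, which the text does not specify.

```python
def r_factor(f_obs, f_calc):
    """Crystallographic R = sum(| |Fo| - |Fc| |) / sum(|Fo|).

    `f_obs` and `f_calc` are equally long lists of amplitudes for the
    same reflections, assumed to be on a common scale.
    """
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty amplitude lists of equal length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    if denominator == 0:
        raise ValueError("sum of observed amplitudes is zero")
    return numerator / denominator


def r_work_and_free(f_obs, f_calc, free_flags):
    """Return (R_work, R_free); `free_flags[i]` is True for reflections
    that were withheld from refinement (the cross-validation set)."""
    work_o, work_c, free_o, free_c = [], [], [], []
    for fo, fc, is_free in zip(f_obs, f_calc, free_flags):
        if is_free:
            free_o.append(fo)
            free_c.append(fc)
        else:
            work_o.append(fo)
            work_c.append(fc)
    return r_factor(work_o, work_c), r_factor(free_o, free_c)


# Toy data (invented amplitudes): three work reflections, one free reflection.
toy_r_work, toy_r_free = r_work_and_free(
    [100.0, 200.0, 300.0, 400.0],
    [90.0, 210.0, 330.0, 380.0],
    [False, False, False, True],
)
```

Multiplying by 100 gives the percentages in which such values are usually reported.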

Core oligosaccharide (MAN-NAG-NAG) was built at 5 of 8 potential N-linked glycosylation sites (indicated in Fig. 2). No evidence for carbohydrate was found at HEF1 Asn 117, occurring at the rarely glycosylated sequence NWSP. For the liganded HEF structures, the HEF model was subjected to rigid body and positional refinement against ligand data to 3.5 Å. Ligands were built into iteratively averaged 2Fo − Fc density phased with the HEF model, omitting residues within 5 Å of the binding sites.

Structure analysis. A Go plot23 was used to help identify domains in HEF. The Dali program and database were used to find structurally similar domains24. Least-squares superpositions were performed using the program Lsqman25. RIBBONS26 was used to produce Fig. 1a; GRASP27 for Fig. 1b; BOBSCRIPT28 and Raster3D29 for Figs 3a and 4a; and SETOR30 for Figs 2a, 3b, 4b and 5.