Abstract
The genomes of human gut bacteria in the genus Bacteroides include numerous operons for biosynthesis of diverse capsular polysaccharides (CPSs). The first two genes of each CPS operon encode a locus-specific paralog of transcription elongation factor NusG (called UpxY), which enhances transcript elongation, and a UpxZ protein that inhibits noncognate UpxYs. This process, together with promoter inversions, ensures that a single CPS operon is transcribed in most cells. Here, we use in-vivo nascent-RNA sequencing and promoter-less in-vitro transcription (PIVoT) to show that UpxY recognizes a paused RNA polymerase via sequences in both the exposed non-template DNA and the upstream duplex DNA. UpxY association is aided by ‘pause-then-escape’ nascent RNA hairpins. UpxZ binds non-cognate UpxYs to directly inhibit UpxY association. This UpxY-UpxZ hierarchical regulatory program allows Bacteroides to generate subpopulations of cells producing diverse CPSs for optimal fitness.
Introduction
Bacteroides are abundant and crucial members of the modern human gut microbiota. A key evolved feature of these bacteria is the ability of each strain to produce numerous (eight or more) distinct capsular polysaccharides (CPS)1,2 that are tightly regulated so that only one CPS is typically produced per bacterial cell. This bet-hedging strategy generates Bacteroides populations with great surface variability that protect from phage3,4,5 and mediate immune modulation, biofilm formation, antibiotic resistance, and inflammation6,7,8,9,10,11.
CPS diversity is achieved by regulating both transcription initiation and elongation of CPS biosynthesis operons. Bacteroides fragilis (Bfr) has eight distinct CPS operons, producing PSA–PSH. All but PSC use invertible promoters and all encode upxY (YX) and upxZ (ZX) paralogs as the first genes in each operon12,13. The fraction of each promoter oriented ON versus OFF varies with environmental conditions14. CPS promoter inversions are stochastic and multiple CPS promoters are oriented ON in most cells simultaneously15,16,17. Bacteroides prioritize expression of one promoter-ON CPS operon over others by regulating RNA polymerase (RNAP) elongation via the operon-specific YX elongation activator and ZX inhibitor of non-cognate YX. ZX inhibits a subset of non-cognate YX, possibly via direct binding (e.g., ZA from PSA may inhibit YE from PSE). Bfr YX paralogs must distinguish among eight target CPS loci to enable operon-specific regulation, but how this discrimination is accomplished is unknown.
YX family proteins are specialized (i.e., locus-specific) paralogs of NusG/Spt5, the only universal transcription factor found in archaea, eukaryotes, and bacteria18. NusG-family regulators bind RNAPs during transcript elongation and modulate RNAP activity through interactions with the RNAP and the surface-exposed ntDNA strand19,20,21. Globally acting Escherichia coli NusG and its single specialized paralog RfaH increase elongation rate and decrease pausing22,23,24,25. In contrast, Bacillus subtilis, Mycobacterium tuberculosis, and Thermus thermophilus NusGs enhance both pausing and intrinsic termination26,27,28,29,30. Pausing during transcript elongation is a universal regulatory feature of RNAPs that allows site-specific recruitment of transcription factors (TFs)31 and guides RNA synthesis.
Among the known NusGSP families, RfaH of Proteobacteria is the best understood. RfaH targets operons that contain a DNA element called ops (operon polarity suppressor) in their leader regions (DNA between the transcription start site and the translation start codon of the first gene). RNAP pauses at the 12-nucleotide ops, allowing RfaH to associate via sequence-specific interactions with a non-template strand DNA hairpin (ntDNAhp) exposed by the paused RNAP22,23,32,33. Other NusGSP include LoaP in Firmicutes21, TaA in Myxococcota34, and plasmid-encoded ActX in Proteobacteria35.
The CPS operon leader regions are required for Y-mediated regulation12, consistent with sequence-specific YX recruitment to RNAP paused in this region (Fig. 1a). In principle, YX could recognize ntDNA (like RfaH), nascent RNA (like LoaP), or both to discriminate among multiple, similar CPS operon targets. We used both in vivo and in vitro analyses to identify pauses in CPS operon leader regions, establish that these pause sites function as recruitment sites for Y, and discover NusGSP–DNA interactions and mechanisms that mediate Y–CPS operon specificity. We found that Z directly binds noncognate Ys to block Y action and that differential YX–ZX affinities enable CPS hierarchical control of transcript elongation. These results define mechanisms that explain the exquisite specificity of multiple NusGSP and that allow Bacteroides to program CPS diversity in the highly dynamic human gut environment.
Results
Bacteroides fragilis RNAP pauses in CPS operon leader regions in vivo and in vitro at candidate YX-recruitment sites (opsX)
Specific YX recruitment sites likely exist in CPS leader regions because these leader sequences are variable and are required for YX activity12. Since EcoRfaH is recruited to RNAP at leader region ops pause sites, we first asked if BfrRNAP pauses in the leader regions of CPS operons. To identify candidate YX-recruiting pause sites directly in vivo, we used nascent elongating transcript sequencing (NET-seq) (Fig. 1a, b and Supplementary Fig. 1a). NET-seq allows genome-scale identification of precise nascent RNA 3′ ends, which are enriched at pause sites36,37.
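As a rough illustration of how enrichment of nascent RNA 3′ ends could be scored, the following Python sketch flags positions whose 3′-end count greatly exceeds the median of flanking positions. It is a minimal sketch and not the pause-calling procedure used in this study; the function name, the thresholds (window, min_reads, fold), and the toy counts are all hypothetical.

```python
from statistics import median

def call_pause_sites(counts, window=10, min_reads=20, fold=5.0):
    """Return 0-based positions whose 3'-end count stands out from neighbors.

    counts    : nascent-RNA 3'-end read counts, one per template position
    window    : number of flanking positions on each side used as background
    min_reads : minimum count for a position to be considered at all
    fold      : required ratio of the count to the median flanking count
    All thresholds are illustrative assumptions, not values from the study.
    """
    pauses = []
    for i, c in enumerate(counts):
        if c < min_reads:
            continue
        lo, hi = max(0, i - window), min(len(counts), i + window + 1)
        flank = counts[lo:i] + counts[i + 1:hi]
        if not flank:
            continue
        background = max(median(flank), 1)  # floor avoids division by zero
        if c / background >= fold:
            pauses.append(i)
    return pauses

# Toy leader region (made-up data): low background, one pileup at index 12.
toy = [2, 3, 1, 4, 2, 3, 2, 5, 3, 2, 4, 6, 180, 5, 3, 2, 4, 3, 2, 1]
print(call_pause_sites(toy))
```

A median background is used here so that a single strong pileup does not inflate its own reference level; real NET-seq analyses typically also normalize for expression level and sequencing depth.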
NET-seq revealed single prominent pause sites in most CPS operon leader regions (Fig. 1a, b and Supplementary Fig. 1b)37. Eight CPS leader pauses exhibited an obvious consensus sequence that resembles strong E. coli pauses (Fig. 1b) as well as apparent nascent RNA pause hairpins (PHs) that resemble those known to enhance pausing allosterically in concert with NusA in other bacteria (e.g., the so-called type-1 E. coli his and B. subtilis trp leader region pauses; Supplementary Fig. 1c)37,38,39,40. Pausing in the PSC leader region (the only Bfr CPS operon with a non-invertible, constitutively ON promoter)41 occurred at multiple sites; weak pausing occurred at a site resembling the other seven in sequence and location (Fig. 1b and Supplementary Fig. 1b). We designated the CPS leader pause sites opsX (‘X’ designates the CPS operon) based on analogy to the RfaH ops site.
To test whether the opsX pause recruits YX, we generated recombinant Bacteroides fragilis RNAP (rBfrRNAP) and assayed CPS leader regions using promoter-less in vitro transcription (PIVoT) (Fig. 1a and Supplementary Fig. 2a, b)42,43. PIVoT bypasses the need for σA-dependent initiation. We first asked if rBfrRNAP recognizes the consensus elemental pause signal defined for EcoRNAP (Fig. 1b)37. Signals resembling this consensus direct pausing by a wide variety of RNAPs from bacteria to human37,44,45. Bacterial pause sequences are reported to differ in some species46,47 and have not been tested for Bacteroidota. We found that rBfrRNAP pauses strongly at the consensus sequence but not anti-consensus sequence (Supplementary Fig. 2c), suggesting its pause signals resemble those of EcoRNAP and most other tested RNAPs.
We next assayed pausing in six of the eight CPS leader regions (PSA, B, C, E, F, and H). Strikingly, the PSA, B, E, F, and H leader segments encoded prominent pause sites that corresponded exactly to the sites found by NET-seq (Supplementary Figs. 2d, 3). Pausing was less prominent but detectable at opsC, consistent with the heterogeneous pausing observed by NET-seq. We conclude that CPS operon leader regions encode strong pause sites for RNAP with similar but not identical sequences, as might be expected for YX recruitment sites that must distinguish among YX paralogs.
To ask if the CPS leader pauses function as targets for YX recruitment and test whether they are modulated by regulators like NusA and YX, we purified recombinant BfrNusA and YX for these six CPS operons (YA, YB, YC, YE, YF, and YH; Methods) and tested their effects on pausing using PIVoT. In Eco and Bsu, NusA stimulates pausing in part via contacts to PHs37,39,43,48,49,50. All six CPS leader pauses were greatly enhanced by NusA (Fig. 1c and Supplementary Fig. 4). Intriguingly, YA,B,E inhibited the cognate leader pause, whereas YC,F,H enhanced the cognate leader pause (Fig. 1c, d and Supplementary Fig. 4). YE additionally trapped a fraction of RNAP just downstream from the pause site, as seen previously with EcoRfaH (‘capture’ in Fig. 1c, d and Supplementary Fig. 4). Thus, YX association with paused elongation complexes (PECs) may manifest as either pro-pausing or anti-pausing activity.
Importantly, the effects of YX are likely to be specific to the NET-seq identified leader pauses, consistent with opsX sites functioning as specific YX-recruitment sites. YX only modulated pausing at cognate opsX but not non-cognate opsX or other positions (Supplementary Fig. 4). We conclude that the NET-seq-identified leader pauses are bona fide target sites for YX association with BfrRNAP. Notably, opsA,B,E encode putative ntDNAhps at [−11 to +1] that resemble the ops ntDNAhp known to recruit EcoRfaH (5′-GCG–AGC stems; Fig. 1b and Supplementary Fig. 1c). The Bfr opsX ntDNAhp sequences differ, consistent with specific recruitment of cognate YX. However, opsF,H are identical in the ntDNAhp region, suggesting that some other element contributes to specificity.
RNAP capture by YX–opsX interaction, which is evident by accumulation of RNAs a few nucleotides longer than the primary pause RNA for opsE but not opsA or opsB (Supplementary Fig. 4), suggests some but not all opsX sites exhibit pause cycling31,33,51. Pause cycling occurs when the ntDNA is captured by a regulator that also contacts RNAP (e.g., Ecoσ70 or RfaH), anchoring the PEC and hindering extension beyond 2–3 nt52,53. Trapped PECs can be rescued by RNA cleavage factors GreA,B33, creating a cycle that repeats until ntDNA contacts rearrange to allow normal elongation51.
Importantly, even in the presence of globally acting BfrNusG, YF still enhances opsF pausing (Supplementary Fig. 5a). Thus, YX appears to outcompete BfrNusG even though both NusG and its specialized paralog YX use the same primary binding site on RNAP (Supplementary Fig. 5b).
ZX inhibits YX at opsX through direct ZX–YX interaction
We next sought to test whether YX binding requires sequence upstream of the putative ntDNAhp region using in vitro binding, in silico interaction, and in vivo gene expression assays. YE is predicted to be inhibited by ZA but not by ZE or ZC in a strain with only the PSA, PSE, and PSC promoters oriented ON (expression hierarchy PSA > E > C)13,17. We call this strain [AE]ON for simplicity because the PSC promoter is constitutive13. To test our prediction, we measured ZA–YE and ZE–YE binding constants by biolayer interferometry (BLI) (Fig. 2a, b). ZA but not ZE bound tightly to YE (KD ~ 0.9 nM vs ~88 nM). We conclude that ZX acts through direct YX binding.
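For intuition about what the roughly 100-fold difference in KD implies, a simple 1:1 binding isotherm can be evaluated as in the Python sketch below. This is an illustrative calculation under stated assumptions (equilibrium 1:1 binding, free ZX approximately equal to total ZX, and hypothetical ZX concentrations), not an analysis performed in this study; only the two approximate KD values are taken from the BLI measurements.

```python
def fraction_bound(ligand_nM, kd_nM):
    """Equilibrium fraction of Y occupied by Z for simple 1:1 binding.

    Assumes free Z is approximately equal to total Z (Z in excess over Y).
    """
    return ligand_nM / (ligand_nM + kd_nM)

KD_ZA_YE = 0.9   # nM, approximate value reported from BLI
KD_ZE_YE = 88.0  # nM, approximate value reported from BLI

# Ratio of dissociation constants: how much tighter ZA binds YE than ZE does.
selectivity = KD_ZE_YE / KD_ZA_YE

for z_nM in (1, 10, 100):  # hypothetical Z concentrations, for illustration
    print(z_nM,
          round(fraction_bound(z_nM, KD_ZA_YE), 2),
          round(fraction_bound(z_nM, KD_ZE_YE), 2))
print(round(selectivity))
```

Under these assumptions, a ZX concentration that nearly saturates YE with ZA would leave most YE free of ZE, which is the qualitative behavior expected for a hierarchy based on differential affinities.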
To understand how ZA might interact with YE, we predicted their association using AlphaFold 354 (Fig. 2c and Supplementary Fig. 6). The ZA–YE complex, which was predicted with high confidence, placed ZX on the RNAP-binding interface of YE. When modeled into an EcoRNAP-RfaH-ops-PEC (PDB 8PHK)33 by alignment of the YE NGN domain with the RfaH NGN, ZA clashed with two major PEC features: (i) the RNAP clamp helices (CH), which provide the primary RNAP binding site for all NusG-family regulators (Fig. 2c, orange); and (ii) the proximal upstream DNA duplex (usDNA). Thus, ZX likely inhibits YX by preventing its recruitment to RNAP at opsX pause sites.
We next used PIVoT to test whether ZA or ZE blocked YE inhibition of pausing at the candidate opsE pause site as predicted by the AlphaFold model. ZE blocked YE action only at high concentrations (KI approximating the KD measured by BLI; Fig. 2d). In contrast, ZA inhibited YE at all tested concentrations. We conclude that differential YX–ZX affinities enable CPS hierarchical control of transcript elongation (Fig. 2e).
YX targets extended opsX sites in vivo
Using these insights into ZX–YX interaction, we tested whether opsX pause sites function as YX recruitment sites in vivo and which sequences govern cognate YX function. Using a constitutive [AE]ON strain17, we replaced opsE segments with the corresponding opsA segments. We predicted that the opsE–opsA swapped strain should activate PSE expression because YA should bind opsA in PSE. To ask if the PH-encoding region of opsX is required for YX recruitment, we also constructed a hybrid opsE–A strain in which only the ntDNAhp region corresponding to the RfaH ops but not the PH-encoding region of opsE was replaced with opsA sequence (Fig. 2f). Using antibodies confirmed to detect PSE in a WT strain but not in a PSE– mutant, we tested for PSE expression in [AE]ON and derivative strains: ∆ZA, hybrid opsE–A, and full opsE→A (Fig. 2f). PSE was (i) not expressed in [AE]ON; (ii) expressed in ∆ZA; (iii) not expressed in the hybrid opsE–A strain; and (iv) expressed in the full opsE→A swapped strain.
To confirm that the upstream PH-encoding region is required for YX action, we also tested YA and YE effects similarly using PIVoT (Supplementary Fig. 7a, b). Neither YA nor YE modulated pausing or PEC capture at WT levels unless the full cognate opsX including the upstream PH-encoding region was present. Thus, both in vivo and in vitro, the cognate upstream PH-encoding region is required for full YX activity.
We conclude that opsX is comprised of both the ntDNAhp region and the upstream PH-encoding region. These regions are necessary and sufficient to program YX recruitment and enhancement of CPS-operon transcription. The inactivity of YX at hybrid sites establishes that the ~40 bp Bacteroides CPS opsX sequences differ fundamentally from the RfaH ops that requires only a 12-bp ntDNAhp sequence. Additional recognition of the upstream PH-encoding region likely aids YX discrimination among target sites. However, determining whether these upstream sequences contact YX as a nascent RNA hairpin, as proposed for LoaP55, or as duplex DNA required further experimentation.
YX–opsX pairs can be divided into distinct classes
To ask if the variability in opsX sequences could be related to variability in YX paralogs, we compared their apparent evolutionary relationships to sequence and structural alignments of YX, RfaH, and NusGs (Fig. 3a and Supplementary Fig. 8). Strikingly, both YX protein and opsX DNA sequences clustered into two distinct classes with two outliers (anti-pausing Class-1, PSA,B,E; pro-pausing Class-2, PSD,F,H; Outliers PSG,C) (Fig. 3b and Supplementary Fig. 8). We use the opsX pause site defined as position −1 as a reference in this analysis. Class-1 DNA–RNA sequences exhibited several key features: (i) an apparent ntDNAhp (orange arrows); (ii) an apparent PH that extends to −12 to −9 (red arrows; relative to −1 pause RNA 3′ nucleotide position); and (iii) the YX gene start codon is at +41, +42. Class-1 YX protein sequences (Fig. 3a) exhibited (i) an identical β2–β3 hairpin sequence in the NGN domain (LPTQFVIRQLYKRR[R/K]RVEVP); (ii) variable sequences (pink) in NGN α1 and α2 that contact the ops ntDNAhp (yellow), RNAP protrusion, and RNAP gate loop; and (iii) variability in the C-terminal KOW domain (Fig. 3a and Supplementary Fig. 8). The variable YX sequences in contacts to the ntDNAhp, protrusion, and gate loop are consistent with YX recognition and potential effects on pausing27,56,57, whereas variability in KOW may enable target specificity or coupling of transcription to other cellular processes.
The Class-1 PSA,B PHs have greater potential to extend towards the pause RNA 3′ end (teal highlight) relative to the PSE PH. Extension of PHs past −10 is thought to destabilize PECs at intrinsic terminators58, but we did not observe termination at these sites. An alternative role of PHs extending past −10 could be to aid PEC escape from pause cycles if auxiliary factors like GreA,B are insufficient. Thus, we postulated that base-pairing of the PSA,B PHs at −11, −10, and −9 could explain why, in contrast to YE, YA and YB did not capture PECs in pause cycles (Supplementary Fig. 4 and Fig. 3b red highlight) (see next section). Based on an apparent ability to prevent PEC capture by YX, we call this PH extension the escape duplex (ED).
Pro-pausing Class-2 (PSD,F,H) sequences exhibited features that differed from Class-1 (Fig. 3b and Supplementary Fig. 8). For Class-2 DNA-RNA: (i) opsX lacks an obvious ntDNAhp; (ii) the apparent PH extends only to −14; and (iii) the YX gene start codon is at +9 relative to opsX. For Class-2 YX: (i) the β2–β3 hairpin sequence is variable with pattern of basic residues distinct from Class-1; (ii) NGN α1 and α2 also are variable but distinct from Class-1 and thus consistent with differential recognition and different effects on pausing; and (iii) the KOW domain exhibits greatly increased positive charge relative to Class-1 (Supplementary Fig. 8).
PSC,G were outliers whose YX and opsX clustered differently relative to Class-1,2. Their apparent PHs extended to −12 or −16, respectively. The YX start codons were at +111, +25 and both YX sequences were relatively divergent compared to Class-1,2. YC enhanced rather than inhibited the opsC pause (Supplementary Fig. 4). Class-2 YX and PSC YC exhibited charge similarity to the LoaP KOW proposed to bind RNA hairpins (Supplementary Fig. 8).
We conclude that YX regulators diverged during evolution to form at least two distinct classes within which the interactions that determine YX–opsX specificity and pro- vs. anti-pausing action appear to have followed different trajectories.
opsX PHs stabilize PECs but also can aid escape of PECs captured by YX-DNA contacts
We next sought to assess the function of the putative opsX PHs (Fig. 3b). We focused on Class-1 opsX to investigate the impact of the PH and ED (Fig. 3b and Supplementary Fig. 9). The strong effect of NusA on Class-1 pauses (Fig. 1d and Supplementary Fig. 4) made it likely the PHs stimulate pausing39,43,48,49,50,59. Further, removal of the PH-encoding region from an opsE scaffold eliminated NusA-stimulation of pausing (Supplementary Fig. 10a). To probe the functions of the conventional opsE PH and the unconventional opsB PH + ED, we used complementary antisense oligonucleotides (asDNAs or asRNAs) to progressively disrupt the 5′ arm of the PSE,B PHs (Fig. 4a, c).
asDNAs that disrupt the PSE PH by pairing with the 5′ arm, but not those that pair just upstream, reduced pausing (Fig. 4b). Thus, the PH alone stimulates pausing at opsX and BfrNusA significantly stimulates pausing in a PH-dependent manner. We conclude that opsX sites are type-1 pauses that encode NusA-stabilized PHs, in notable contrast to the type-2 RfaH ops that lacks a PH38.
To test the idea that the apparent ED could aid escape of PECs, we measured the effect on capture of antisense RNAs (asRNAs) that disrupt the ED by pairing to the distal bases of 5′ arm of the opsB PH. opsB but not opsE encodes an ED, and YB does not cause PEC capture in contrast to YE (Fig. 4c, d and Supplementary Fig. 4). Addition of asRNAs that progressively disrupted the ED caused YB to capture PECs in pause cycles. Thus, opsB, and by analogy opsA, PHs not only stimulate opsX pausing synergistically with NusA to allow time for YX recruitment, but also use an ED to drive forward translocation at the pause. The ED breaks extensive contacts by YX necessary for its initial recruitment but problematic for subsequent EC escape.
YX distinguishes PECs via multipartite NGN interactions with exposed ntDNA and upstream duplex DNA
We next sought to determine how Class-1 YX proteins distinguish cognate vs. non-cognate opsX sites via the PH-encoding region (Fig. 2). Since the ntDNA of opsE and opsB are most similar, particularly at the key −6 ntDNAhp position (Fig. 3b and Supplementary Fig. 10b), we reasoned that the contribution of sequences upstream from the ntDNAhp might be most apparent by swapping regions between opsE and opsB. We used PIVoT to measure YX effects on NusA-stimulated pausing and capture using templates with opsE–B swapped sequences or YE–YB hybrid proteins that separate potential NGN vs. KOW contributions (Fig. 5a). To enable the direct comparison between opsE and opsB, we used a variant of opsB that lacked the ED (opsB, –ED).
To ask if YX recognizes the upstream DNA or the PH RNA encoded by it, we first tested whether the upstream DNA sequences affected YX action in the absence of a PH (Fig. 5b). With the PH removed, YB stimulated RNAP capture at the opsB pause site by a factor of ~4.5 (Fig. 5c). When 3-bp segments of the opsB usDNA were replaced with opsE sequence, YB capture of RNAP decreased either modestly (substitutions 1 and 2) or nearly completely (substitution 3). We next asked if changing both the upstream DNA and the PH from opsB to opsE sequences had more effect on YB action than changing just the upstream DNA, as predicted if the PH functions in YB action. However, even combining the 1 + 2 + 3 substitutions in the upstream DNA and PH had no greater defect in YB action than introducing substitution 3 to the upstream DNA alone (Fig. 5d). We conclude that YB recognition of the extended opsB site depends on the usDNA and not on the PH RNA.
We next investigated the contributions of the upstream sequences in progressively interconverted opsE and opsB to PEC capture by YX (Fig. 5e and Supplementary Fig. 11). To simplify this comparison, we used a variant of opsB in which capture was activated by removing the ED (Supplementary Figs. 9, 11). Strikingly, YE continued to function even when the opsE ntDNAhp was changed to the opsB ntDNAhp. However, the YE effect was mostly lost and YB capture progressively increased as the usDNA was increasingly converted to opsB sequence (Fig. 5e). Thus, multiple segments of usDNA contribute to YB recognition of opsB. Consistent with our in vivo experiments (Fig. 2f), we conclude that opsX sequences are multipartite ntDNA and usDNA signals of ~40 nucleotides whose constituent parts variably contribute to YX recruitment in different CPS operons.
We next asked if the NGN alone recognizes opsX as it does for RfaH–ops interaction23,32 or if the KOW domain might also participate, as proposed for LoaP55. Attempts to purify a Class-1 NGN alone yielded only insoluble protein. Instead, we compared NGN–KOW YE–B hybrids to YE and YB on opsE, opsB, and an opsE-B hybrid scaffold (Fig. 5f). For both YE and YB, the effect on capture or pausing was determined completely by the NGN domains. We conclude that recognition of opsX by at least Class-1 YX is mediated by the NGN and not the KOW domain.
Class-1 YX protects upstream DNA from exonucleolytic cleavage
For the YX NGN to contact upstream duplex DNA, the DNA must distort from a canonical B-form trajectory departing the PEC (Fig. 6a). Although protein interactions can easily bend duplex DNA60, we sought direct physical evidence for usDNA–YX-NGN interaction. Exonuclease III (ExoIII) has been used extensively to detect PEC boundaries on DNA61,62,63. Since YE,B variably depend on distal usDNA in our activity assays, we assayed opsE,B with cognate YX.
Over the full time course, YE,B strongly stabilized a −21 footprint, 6–7 base pairs upstream of RNAP (Supplementary Fig. 12). However, YB but not YE also slowed ExoIII digestion at −24, and −31 to −34. Further, these same upstream protections were caused by a YB,E NGN–KOW hybrid (Fig. 6b and Supplementary Fig. 12). We conclude that YB NGN likely contacts usDNA at least near −21 to −24, and −31 to −34.
As an additional test of the upstream YB contacts, we performed ExoIII assays on scaffolds containing opsB-to-opsE sequence changes to distal usDNA (−36 to −34 and −26 to −24) and proximal usDNA (−18 to −16). These substitutions strongly reduced upstream protection from ExoIII (Fig. 6c and Supplementary Fig. 13). Together, our results suggest a set of YX specificity determinants reflected in both physical contacts detected with ExoIII and sequence effects on YX activity.
To understand these contacts in a structural context, we modeled YE and YB into an RfaH-ops-PEC structure (PDB 8PHK)33. Both YE and YB are predicted to have a much larger positively charged surface approximately in the path of the usDNA (Supplementary Fig. 14). This charge is created largely by basic residues in the beta hairpin mini-domain of YE,B and could position the usDNA for sequence-specific readout by NGN.
Discussion
Human gut Bacteroides strains synthesize numerous surface CPS that are highly regulated to create subpopulations in which primarily a single PS locus is transcribed, providing phenotypic plasticity to environmental challenges. To coordinate CPS gene expression in a manner that maximizes CPS diversity, Bacteroides have developed a complex hierarchy involving locus-specific cognate YX activation and noncognate ZX inhibition.
We have elucidated the biochemical mechanisms of Bacteroides CPS hierarchical control (Fig. 7): (i) BfrRNAP pauses prominently at single CPS leader-region pause sites (opsX); (ii) opsX programs NusA-enhanced, RNA hairpin-stabilized transcriptional pauses that create time windows for YX recruitment; (iii) ZX inhibits non-cognate YX directly via differential binding affinities, forming a heterodimer that precludes YX recruitment by steric clash of ZX with RNAP and opsX; (iv) YX locus-specific recruitment depends on multipartite interactions of the YX NGN domain with the exposed opsX ntDNA and upstream duplex DNA; (v) YXs evolved into functionally distinct classes; and (vi) YX-bound PECs use different mechanisms to escape opsX. This combination of multiple functions at a single RNAP pause site has little precedent and may reflect the strong evolutionary pressure associated with the challenges of discriminating among multiple similar NusGSPs.
Bacteroides belong to the greater phylum Bacteroidota, evolutionarily distant from the commonly studied model proteobacterium E. coli and firmicute B. subtilis. Despite the importance of these bacteria to human health, there is a limited understanding of Bacteroides transcription regulation. Our recombinant BfrRNAP overexpression system enables facile production and genetic manipulation of BfrRNAP. Multiple questions can now be addressed, including the roles of uncharacterized RNAP sequence insertions64, the molecular interactions of RNAP with TFs (e.g., σA) and small molecules (e.g., ppGpp), and sequence-dependent effects on transcriptional activities (e.g., backtracking, translocation, etc.). Recombinant RNAPs enable studies of both lineage-specific transcription mechanisms and evolutionary comparisons. rBfrRNAP will enhance mechanistic understanding in the entire field of transcription, as demonstrated by numerous recent studies in M. tuberculosis, C. difficile, and B. subtilis26,27,28,65,66.
We found that opsX recruitment sites for YX are ~40 bp multipartite DNA elements with both upstream duplex and transcription bubble ntDNA components, in striking contrast to the 12-nucleotide ntDNAhp (ops) necessary for RfaH-recruitment and the proposed nascent RNA hairpin necessary for LoaP recruitment32,55. The ntDNAhps formed by ops and opsX differ in apparent structure and position relative to the pause site. All eight ops sites in E. coli targeted by the single RfaH encode the same ntDNAhp sequence: 5′-GCGGTAGC67. The longer Bacteroides 5′-YGCGNAGCR ntDNAhps exhibit both similar (GCG..AGC stems) and distinct (loop) features compared to ops. These differences highlight how Bacteroides evolved to manage numerous NusGSP. Extensive YX–opsX interactions may also accelerate Bacteroides adaptation by expanding the sequence space available for functional bifurcation following gene duplication.
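The degenerate 5′-YGCGNAGCR pattern can be checked against the ops segments listed in Methods with a short Python sketch. The helper name and the IUPAC lookup table are illustrative choices, and the two test sequences are the 12-nt PSE and PSA ops segments given in the strain-construction section; this is a convenience check, not part of the analyses reported here.

```python
import re

# IUPAC degenerate nucleotide codes needed for this motif (subset only).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}

def find_motif(seq, motif="YGCGNAGCR"):
    """Return (start, match) pairs for every occurrence of an IUPAC motif.

    A lookahead is used so that overlapping occurrences are also reported.
    Positions are 0-based offsets into the input sequence.
    """
    pattern = "".join(IUPAC[base] for base in motif.upper())
    regex = re.compile(f"(?=({pattern}))")
    return [(m.start(), m.group(1)) for m in regex.finditer(seq.upper())]

OPS_E = "CTGCGAAGCATA"  # PSE ops segment as listed in Methods
OPS_A = "ccgcgtagcgca"  # PSA ops segment as listed in Methods

print(find_motif(OPS_E))  # [(1, 'TGCGAAGCA')]
print(find_motif(OPS_A))  # [(1, 'CGCGTAGCG')]
```

Both segments match the pattern once, with GCG and AGC stem halves flanking a single variable loop position (A in PSE, T in PSA).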
We found that ZX inhibits YX recruitment to opsX–PECs directly, likely by blocking YX interaction with the conserved β′ clamp helices (CH) and the opsX usDNA. ZX could also tune heterologous operon PSX expression or limit self-expression through negative feedback. Ultimately, YX–ZX interactions define the cell surface architecture of Bacteroides. Our findings provide a foundation for understanding them.
The closer Class-2 start-codon proximity to opsX (9 bp) suggests that Class-2 YX may play a stronger role in ribosome association for coupled transcription–translation of the YX gene. Translation is not well studied in Bacteroides68,69,70,71, but both the similarity of anti-pausing by BfrNusG to EcoNusG (Supplementary Fig. 5) and the location of stop codons relative to intrinsic terminators72 suggest transcription and translation may be coupled in Bacteroides – like E. coli but unlike B. subtilis72,73,74,75,76,77,78,79,80,81,82. RfaH is thought to recruit ribosomes for coupled translation in E. coli83,84. Start codon GUG is thought to initiate ribosomes 5–10 times more weakly than AUG in E. coli85. Taken together, these differences are consistent with evolution of Class-2 YX–opsX pairs for tight linkage of YX and ribosome recruitment at opsX sites immediately adjacent to the translation start site. Both these potential distinctions (relative to Class 1) in Class-2 YX–opsX function and interesting differences evident for YC–opsC and YG–opsG require future experimental investigation.
We also discovered a regulatory RNA element—the opsX PH ED—involved in the regulation of PSA and PSB. The conserved role of PHs at opsX is to enhance pausing with NusA. The opsA,B ED provides a driving force to propel RNAP out of pause-cycling traps created by extensive interactions that occur at these sites. Possibly, opsE does not encode an ED because YE interacts with less sequence (Supplementary Fig. 12) and Gre factor may be sufficient for its escape as it is for RfaH33. Alternatively, the strong kinetic difference in escape mechanisms could be exploited by Bacteroides in CPS expression control. We propose that the ED evolved in response to evolutionary pressure to expand YX specificity.
Our results provide new mechanistic insights into transcriptional regulation by a large class of NusGSP, YX (UpxY). We find that determinants of transcriptional pausing in the phylum Bacteroidota resemble those found for other bacteria, but that recruitment sites for these NusGSPs differ notably both in being multipartite and much more extensive (~40 bp) than found for E. coli RfaH (~12 bp). Two aspects of the YX recruitment mechanisms provide precedent for new types of transcriptional regulation: (1) the upstream DNA is a sequence-specific platform for PEC regulation, and (2) pause hairpins can include escape duplexes that can drive escape from regulator-stabilized pauses. These discoveries highlight the importance of studying transcriptional mechanisms in diverse bacteria.
Methods
Plasmids, oligonucleotides, and strains used in this study are listed in Supplementary Tables S1–4. Nucleic acid scaffolds used in PIVoT assays are organized by figure in Supplementary Fig. 16. All reported measurements were taken from distinct samples.
E. coli strain construction
E. coli strain RL3569 was created by P1 transduction of RL1674 with donor strain RL357086 harboring the rifampicin-resistance mutation S522F in rpoB. Briefly, 5 mL of donor strain RL3570 was grown to saturation (overnight) in LB + 5 mM CaCl2. The next day, 50 µl of the donor strain was mixed with 100 µl of a 10⁻⁵ dilution (in LB + 5 mM CaCl2) of a freshly made P1 stock, then incubated at 37 °C for 20 min without shaking. Next, 2.5 mL of R top agar equilibrated to 45 °C (0.8% agar, 1% tryptone, 0.8% NaCl, 0.1% yeast extract, supplemented to a final concentration of 2 mM CaCl2 and 0.1% glucose after autoclaving) was added to the bacteria-phage mixture, flicked to mix, then poured evenly onto a thick, moist, freshly made R plate (1.2% agar, 1% tryptone, 0.8% NaCl, and 0.1% yeast extract, supplemented to a final concentration of 2 mM CaCl2 and 0.2% glucose after autoclaving). The plates were incubated at 37 °C overnight in a plastic bag with wet paper towels. The next day, the plate was transferred to a 4 °C room and overlaid with 5 mL of MC solution (10 mM MgSO4 + 5 mM CaCl2). After a 5 h incubation at 4 °C, the overlaid solution containing fresh P1 lysate was collected, 0.2 µm filter-sterilized, then stored in the dark at 4 °C until use. The recipient strain (RL1674) was grown to saturation (overnight) in LB + 5 mM CaCl2 + 20 µg chloramphenicol/mL. The next day, 100 µl of donor P1 phage serial dilutions were separately mixed with 100 µl of recipient strain overnight culture, then incubated at 37 °C for 20 min without shaking. The mixture was plated on LB agar + 20 µg chloramphenicol/mL + 100 µg rifampicin/mL. Candidates were sequence-verified.
B. fragilis strain construction
Bacterial growth
B. fragilis NCTC 9343 (ATCC25285; Genbank assembly ASM2598v1) strains were grown in basal medium87 or on BHI plates supplemented with 5 mg hemin/L and 2.5 µg vitamin K1/L. Mutants ΔmpiM4417, ΔmpiM44ΔupaZ13 and ΩPSE41 were previously constructed. For selection of cointegrants, gentamycin (200 µg/ml) and erythromycin (5 µg/ml) were added to the plates when indicated.
Construction of mutant PSE ops and HP-ops regions in 9343ΔmpiM44
Two different alterations to the PSE 5′ UTR were made in the ΔmpiM44 strain. In the first mutant, the ops sequence of the PSE locus (CTGCGAAGCATA) was replaced with the ops sequence of the PSA locus (ccgcgtagcgca). In the second mutant, a larger replacement was made and included the hairpin region adjacent to the ops sequence. The sequence from the PSE 5′ UTR (ttggctgagaaaaagagtctcacccaaCTGCGAAGCATA) was replaced with the sequence from the PSA 5′UTR (cggtttgaatgggaaaagatgtctcgtccaaaccgcgtagcgca). The recombinant plasmids were created by PCR amplifying two (ops) or three (HP-ops) DNA segments using Phusion polymerase (NEB) with ΔmpiM44 as template with the primers listed in Table S2. These segments were cloned into BamHI-digested pLGB1388 using NEBuilder (NEB). Plasmids were sequenced to confirm the correct assembly of the segments. Plasmids were conjugally transferred from E. coli S17 λpir to ΔmpiM44 and after overnight co-incubation, were plated on BHIS with gentamycin and erythromycin. The resulting cointegrants were passaged in basal medium for several hours and plated on BHIS with 50 ng anhydrotetracycline to select for double cross-over recombinants. These strains were tested by PCR for replacement of the PSE sequences with the respective PSA sequences and the genomes of these two strains were sequenced to confirm the correct replacements.
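The two replacements described above can be checked in silico before cloning. The following is a minimal sketch, assuming the sequences are written 5′ to 3′ exactly as quoted in the text; the helper name `swap_region` and the flanking bases in `toy_utr` are hypothetical and serve only as an illustration.

```python
# Sequences quoted in the text (case preserved as written).
PSE_OPS = "CTGCGAAGCATA"
PSA_OPS = "ccgcgtagcgca"
PSE_HP_OPS = "ttggctgagaaaaagagtctcacccaaCTGCGAAGCATA"
PSA_HP_OPS = "cggtttgaatgggaaaagatgtctcgtccaaaccgcgtagcgca"


def swap_region(utr: str, old: str, new: str) -> str:
    """Replace the single case-insensitive occurrence of `old` with `new`."""
    haystack, needle = utr.upper(), old.upper()
    hits = haystack.count(needle)
    if hits != 1:
        raise ValueError(f"expected exactly one match, found {hits}")
    start = haystack.index(needle)
    return utr[:start] + new + utr[start + len(old):]


# Hypothetical flanking bases; the real 5' UTR context is not reproduced here.
toy_utr = "AAAA" + PSE_HP_OPS + "GGGG"

ops_mutant = swap_region(toy_utr, PSE_OPS, PSA_OPS)           # ops only
hp_ops_mutant = swap_region(toy_utr, PSE_HP_OPS, PSA_HP_OPS)  # hairpin + ops

print(ops_mutant)
print(hp_ops_mutant)
```

Requiring exactly one match guards against editing an unintended second copy of a short motif.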
Western immunoblot analysis
Bacterial strains were grown overnight to an apparent OD600 of ~1.2. Bacteria were pelleted and resuspended in 1× LDS loading buffer (Invitrogen) and boiled for 5 min. Cell lysates (equivalent to 3.5 µl of the original culture) were loaded onto 4–12% NuPAGE (Invitrogen) and run with MES buffer until the 17 kDa molecular weight standard had run to the bottom of the gel to allow for migration of the high molecular weight PSE further into the gel. The contents of the gel were transferred to PVDF and blocked with 5% skim milk in TBS with 0.5% Tween (TBST). The blot was probed with a mouse monoclonal antibody specific to PSE (Supplementary Fig. 15) used at 1:100 dilution, washed with TBST, and probed with a 1:2000 dilution of alkaline phosphatase-conjugated goat anti-rabbit IgG (Invitrogen Catalog # 31340 Lot YA366475). After washing with TBST, the blot was developed with BCIP/NBT (KPL).
NET-seq
B. fragilis NCTC 9343 rpoC-3xFLAG was streaked onto BHIS plates and incubated at 37 °C anaerobically for 2 days. A swab from a dense area on the plate was used to inoculate overnight cultures. The next day, 10 mL of the overnight culture was used to inoculate 500 mL SBM (starting apparent OD600 0.04 as measured by a Denville® CO8000 Personal Cell Density Meter). When the apparent OD600 measured 0.65, cultures were removed from the anaerobic chamber and 300 mL was used for subsequent steps.
To harvest nascent transcripts for the NET-seq workflow, cultures were filtered between two vacuum filtration systems using a 0.45 µm pore nitrocellulose filter (GVS Micron Sep, 1215305). Cells were scraped off each filter using a spatula and plunged immediately into liquid nitrogen (i.e., cells from the same culture were combined into the same 50 mL conical tube containing ~25 mL liquid nitrogen). Collected cells were cryo-lysed using a RETSCH mixer mill (MM 400) as previously described37, with the exception that 50 mL stainless steel canisters and a 25 mm stainless steel ball were used to perform the cryomilling.
To isolate nascent transcripts, we performed a modified 3xFLAG-IP protocol with previously described buffers37. Specifically, the thawed grindate volume was scaled to 5.5 mL with lysis buffer (1× lysis stock [20 mM Tris, pH 8.0, 0.4% Triton X-100, and 0.1% NP-40 substitute], 100 mM NH4Cl, 1× EDTA-free cOmplete Mini protease inhibitor cocktail [Roche Diagnostics GmbH, 11836170001], 10 mM MnCl2, 50 U/mL RNasin [Promega, N211B], and 0.4 mg/mL puromycin), DNA was partially digested for 20 min with RQ1 DNase (0.054 U/mL [0.02 U/mL for the E. coli-only NET-seq pilot experiment]) [Promega, M6101], and digestion reactions were stopped by addition of EDTA to 28 mM (final concentration). RNAP-nascent transcript complexes were directly immunoprecipitated using Anti-FLAG M2 affinity gel (Sigma, A2220) (i.e., without buffer exchange), and the precipitated RNAP-nascent transcript complexes were subsequently washed four times (1× lysis stock, 100 mM NH4Cl, 300 mM KCl, 1 mM EDTA, and 50 U/mL RNasin [Promega, N2515]). RNAP-nascent transcript complexes were eluted twice with 3xFLAG peptide (Sigma, F4799) (1× lysis stock, 100 mM NH4Cl, 2 mg/mL 3xFLAG peptide, 1 mM EDTA, and 50 U/mL RNasin). Nascent transcripts were purified using a miRNeasy kit [Qiagen, 217084] as previously described37. However, to reduce phenol and chaotropic salt contamination, nascent transcripts were subjected to an additional overnight isopropanol-GlycoBlue (Invitrogen, AM9516) precipitation at −20 °C.
For nascent transcript library generation, we followed a modification of a previous NET-seq workflow36,37. Specifically, our workflow included using custom adapters compatible with an Illumina NovaSeq X instrument. Likewise, the DNA adapter used for nascent transcript 3′ end ligation was adenylated using components from a NEB 5′ DNA Adenylation kit (E2610; 6 µM DNA linker [RL15032], 80 µM ATP, 6 µM Mth RNA ligase, and 1× Adenylation Reaction Buffer). The adenylation reaction was incubated for 4 h at 65 °C, inactivated at 85 °C for 5 min, and precipitated overnight at −20 °C with isopropanol and GlycoBlue (Invitrogen AM9516). The precipitated, adenylated DNA linker was ligated to 750 ng of precipitated nascent transcripts, in duplicate, using components of a NEB T4 RNA Ligase 2, truncated (T4 Rnl2tr) kit (M0242; 10% DMSO, 22% PEG8000, 3 µM adenylated DNA linker, T4 Rnl2tr [14.7 U/µL], RNasin [2 U/µL], and 1× T4 RNA Ligase Reaction Buffer). These ligation reactions were incubated at 37 °C for 4 h. After this incubation, T4 Rnl2tr was inactivated by incubation with Proteinase K (0.04 U/µL) (NEB, P8107) at 37 °C for 1 h. RNAs were fragmented, resolved, gel extracted, and precipitated as previously described36,37, with the exception that the gel extraction incubation at 70 °C was increased to 25 min. cDNAs were synthesized using a custom adapter (RL14637) and a previously described protocol36,37, with the exception that the reaction time was increased to 1 h. Circularization of gel extracted and precipitated cDNAs was performed using a protocol previously described36,37, with the exception that the circularization reaction incubation period was increased to 3 h and the gel extraction incubation period was increased as above. After circularization, cDNA libraries were PCR amplified using minimal cycles and custom adapters, gel extracted, and precipitated as previously described36,37.
Library concentration and amplified product size distribution were determined using an Agilent TapeStation 4150. NET-seq libraries were sequenced by the University of Wisconsin-Madison Biotechnology Center on an Illumina NovaSeq X instrument.
NET-seq data were processed using a combination of custom scripts and standard tools. Briefly, adapters, linker, and control oligos potentially contaminating each sample were trimmed from raw reads using cutadapt (v3.4). Reads with a minimum length of 14 nts were mapped to the B. fragilis genome (NC_003228.3) using Bowtie (v1.3.0) allowing both one mismatch and random assignment of reads mapping to multiple loci based on alignment stratum (Bowtie options --best -a -M 1 -v 1). Alignments were converted to BAM and BED files using samtools (v1.16.1) and bedtools (v2.30.0). The specific 3′ end counts for each genome position were determined using bedtools (options -d -strand - -5 [plus strand] or -d -strand + -5 [minus strand]).
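The final counting step can be illustrated with a short sketch. It is not the authors' script; it assumes six-column BED input and, following the strand pairing in the bedtools options above, treats the 5′ end of each alignment as the RNA 3′ end on the opposite strand. The function name and the toy records are hypothetical.

```python
from collections import Counter


def three_prime_end_counts(bed_lines):
    """Count nascent-RNA 3' ends per (chrom, 0-based position, RNA strand).

    Assumption: the 5' end of each alignment marks the RNA 3' end, and the
    transcript lies on the strand opposite to the alignment.
    """
    counts = Counter()
    for line in bed_lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end, _name, _score, strand = line.rstrip("\n").split("\t")[:6]
        if strand == "+":
            pos, rna_strand = int(start), "-"      # 5' end of a plus alignment
        elif strand == "-":
            pos, rna_strand = int(end) - 1, "+"    # 5' end of a minus alignment
        else:
            raise ValueError(f"unexpected strand field: {strand!r}")
        counts[(chrom, pos, rna_strand)] += 1
    return counts


# Toy alignments (hypothetical coordinates).
toy_bed = [
    "chr\t10\t30\tread1\t0\t-",
    "chr\t12\t30\tread2\t0\t-",
    "chr\t5\t25\tread3\t0\t+",
]
end_counts = three_prime_end_counts(toy_bed)
print(end_counts)
```

Positions here are 0-based as in BED, whereas per-base depth output from bedtools is 1-based, so an offset of one is needed when comparing the two.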
rBfrRNAP cloning and purification
B. fragilis RNAP coding regions were codon-optimized using Gene Designer from DNA2.0 (now ATUM) using E. coli codon frequencies89 and amplified from synthetic DNA (IDT) of B. fragilis NCTC 9343, then cloned into a pRM756 backbone90, incorporating a His10-ppx tag at the C-terminus of β′ and a Strep tag at the N-terminus of β. RBS sites were optimized using denovodna.com91,92. This plasmid enables T7 overexpression of all subunits under IPTG control.
rBfrRNAP was purified similarly to E. coli RNAP93, with changes described below. Following transformation of RL3569 with pJS015, a colony was picked and inoculated into 3 mL LB + 25 µg kanamycin/mL + 20 µg chloramphenicol/mL. Two milliliters of overnight culture were used to inoculate 2 L LB + 25 µg kanamycin/mL + 10 drops Sigma Antifoam Y-30 Emulsion in baffled Fernbach flasks and incubated at 37 °C. When the apparent OD600 reached 0.4, the temperature was dropped to 16 °C, overexpression was induced by addition of 200 µM IPTG, and incubation was continued with shaking at 200 RPM overnight (~18 h). Cell cultures were placed on ice for 20 min, then pelleted by centrifugation at 3000 × g for 15 min at 4 °C.
Moving forward, all steps were performed at 4 °C or on ice, and all buffers were filtered through 0.2 µm filters. Pellets were resuspended in 30 mL lysis buffer (50 mM Tris-HCl pH 8.0, 5% glycerol, 100 mM NaCl, 2 mM EDTA, 10 mM BME, 10 mM DTT, 0.1 mg/mL phenylmethylsulfonyl fluoride, with one dissolved tablet of Roche cOmplete™ ULTRA EDTA-Free Protease Inhibitor Cocktail). The resuspended cell solution was sonicated for 20 min total (alternating sonication on/off times of 5 min) with settings Power 8, Duty Cycle 20%. The lysate was then transferred to round-bottom polycarbonate tubes and spun at 27,000 × g for 15 min. The supernatant was transferred to a 100 mL beaker with stir bar, then 6.5% PEI was slowly added to a final concentration of 0.6% while stirring. The solution was stirred for one hour, then transferred to open-top, round-bottom polycarbonate tubes and spun at 11,000 × g for 15 min. After decanting supernatant, a tissue homogenizer was used to resuspend the pellet in 25 mL of TGEDZ (10 mM Tris-HCl pH 8.0, 5% glycerol, 0.1 mM EDTA, 5 µM ZnCl2, 1 mM dithiothreitol) with added 0.3 M NaCl. The solution was spun at 11,000 × g for 15 min. After decanting supernatant, a tissue homogenizer was used to resuspend the pellet in 25 mL of TGEDZ with added 1 M NaCl. The solution was spun at 11,000 × g for 15 min. The supernatant was transferred into a 100 mL beaker with stir bar, then finely ground ammonium sulfate was added to the stirring solution to a final concentration of ~0.37 g/mL and precipitated overnight. The solution was transferred to Oak Ridge round-bottom tubes and spun at 27,000 × g for 15 min.
The pellet was dissolved in 35 mL of HisTrap Binding Buffer (20 mM Tris-HCl pH 8.0, 500 mM NaCl, 5 mM imidazole, 5 mM beta-mercaptoethanol [BME]), then spun at 27,000 × g for 15 min in the same Oak Ridge round-bottom tube. The supernatant was filtered through 0.2 µm filters and applied at 1 mL/min to a HisTrap HP 5 mL column, pre-equilibrated with HisTrap Binding Buffer. The column was washed with HisTrap Binding Buffer at 5 mL/min until A280 reached baseline, then washed at 5 mL/min with 2% HisTrap Elution Buffer (20 mM Tris-HCl pH 8.0, 500 mM NaCl, 1 M imidazole, 5 mM beta-mercaptoethanol [BME]) until A280 reached baseline. rBfrRNAP was eluted at 5 mL/min with a 2–50% gradient of HisTrap Elution Buffer (translating to a 20–500 mM imidazole gradient). Three-milliliter elution fractions containing rBfrRNAP were pooled, filtered through 0.2 µm filters, then the NaCl concentration was reduced to 150 mM for the following purification step by dilution with TGEDZ buffer.
HisTrap elution fractions were pooled then diluted with 100 mM Tris-HCl, pH 8.0, 1 mM EDTA, 10 mM DTT to adjust the salt concentration to 150 mM NaCl. The sample was then applied to a 5 mL Strep-Tactin® XT High Capacity column pre-equilibrated with 2 CV Buffer W (100 mM Tris-HCl, pH 8.0, 150 mM NaCl, 1 mM EDTA, 10 mM DTT) at 2 mL/min. The flow-through was reapplied to the column at 0.037 mL/min. The column was then washed with 5 CV of Buffer W. rBfrRNAP was eluted with Buffer BXT (100 mM Tris-HCl, pH 8.0, 150 mM NaCl, 1 mM EDTA, 10 mM DTT, 50 mM D+Biotin (Acros Organics)).
Pooled fractions from the previous step were applied at 1.5 mL/min to a HiTrap HP column pre-equilibrated with TGEDZ + 200 mM NaCl. The column was then washed with TGEDZ + 200 mM NaCl until A280 reached baseline, then rBfrRNAP was eluted with TGEDZ + 500 mM NaCl at 2.5 mL/min.
Pooled fractions from the previous step were dialyzed overnight in RNAP storage buffer (10 mM Tris-HCl, pH 8.0, 25% glycerol, 100 mM NaCl, 100 µM EDTA, 1 mM MgCl2, 20 µM ZnCl2, 10 mM DTT) using a 10 kDa MWCO cassette, then concentrated using Ultra-4, MWCO 100 kDa (Sigma-Aldrich Z648043-24EA) to a final concentration of 8 µM. The solution was then aliquoted, flash-frozen, and stored at –80 °C.
Cloning and purification of transcription factors
All TFs (NusG, NusA, YA, YB, YC, YE, YF, YH, YB(NGN)–YB(KOW), YE(NGN)–YB(KOW)) were cloned into a pTYB2 backbone (Addgene catalog N6702S) after PCR amplification from Bacteroides fragilis ATCC 25285 (NCTC 9343) genomic DNA by NEB HiFi DNA assembly (Gibson Assembly). This vector enables IPTG-inducible over-expression of proteins fused at the C-terminus to the Saccharomyces cerevisiae VMA intein and chitin-binding domain. Importantly, to ensure efficient self-cleavage via the intein, an Ala residue was incorporated at the C-terminus of all transcription factor coding sequences.
After plasmid sequence verification, RL1674 (E. coli BL21 Rosetta™ (DE3)) was transformed by electroporation with pTYB2-derived constructs, then plated on LB agar with 100 µg ampicillin/mL and 20 µg chloramphenicol/mL (for retention of pRARE2 plasmid). For each expression construct, a single colony was picked and used to inoculate a 3 mL overnight LB culture grown at 37 °C containing the same concentration of antibiotics. The next day, 1 mL of overnight culture was used to inoculate a 200 mL LB culture containing antibiotics (3% ethanol was added for all YX constructs) and grown at 37 °C. When the OD reached 0.2–0.3, the incubation temperature was dropped to 16 °C and shaking continued for 30 min. Subsequently, a final concentration of 200 µM IPTG was added and incubation continued overnight (16–18 h). The next day, cultures were placed on ice for 20 min, then pelleted at 3000 × g for 15 min at 4 °C.
Pellets were resuspended in 40 mL of Chitin Wash Buffer (CWB; 30 mM Tris-HCl, pH 7.5–8.0 depending on protein pI, 0.5 M NaCl, 1 mM EDTA, 0.05% Tween® 20) plus one dissolved tablet of Roche cOmplete™ ULTRA EDTA-Free Protease Inhibitor Cocktail. The cell suspension was sonicated for 10 min at 20% duty cycle, Power 8. The lysate was pelleted at 30,000 × g for 30 min at 4 °C, then the supernatant was passed through 0.2 µm filters.
The subsequent steps were performed at room temperature closely following manufacturer’s instructions. Briefly, 3 mL of a homogenous suspension of NEB Chitin Resin (Catalog S6651L) were loaded into a 25 mL Poly-Prep Gravity Chromatography Column (Biorad), washed with 5 mL of mQH2O, then equilibrated by washing 3 times each with 10 mL of CWB. The lysate was subsequently loaded onto the column, then washed three times each with 10 mL of CWB. Cleavage Buffer (CB) was made by adding 500 µl of 1 M DTT (prepared fresh from solid reagent) to 10 mL of CWB, then a quick flush was performed by adding 3 mL of CB. SDS-PAGE revealed no premature elution in the quick flush fraction. Immediately after dripping stopped, the bottom and top of the column were capped, parafilmed, and the column was incubated at room temperature overnight (16–18 h) to allow sufficient time for cleavage. The next day, cleaved protein was eluted by addition of 1.5 mL CWB + 10 mM DTT, then dialyzed overnight in 10 mM Tris-HCl, pH 7.5–8.0 depending on pI, 2% glycerol, 100 mM NaCl, 100 µM EDTA, 10 mM DTT using a 10 K MWCO cassette. After removal from the dialysis cassette, additional glycerol was added to a final concentration of 25%. The solution was aliquoted, flash-frozen, then stored at −80 °C until use.
PIVoT assays
A direct reconstitution approach was used to assemble elongation complexes (ECs). Briefly, RNA and template DNA oligonucleotides were mixed at a ratio of 1:1.2 (5 µM: 6 µM) in transcription buffer (TB; 20 mM Tris-OAc, pH 7.7, 40 mM KOAc, 5 mM Mg(OAc)2, 1 mM DTT), then annealed by slow cooling in a thermocycler. To assemble 10× ECs, first the annealed RNA:tDNA scaffold and RNAP were mixed in TB and incubated for 15 min at 37 °C. Then, non-template DNA oligonucleotide was added and incubation continued for an additional 15 min at 37 °C. The solution was diluted with TB to prepare 2× EC (subtracting volume of further additions) and incubated for 1 min at 37 °C. Then, 5 µCi of [α-32P]NTP (depending on the scaffold) was added and incubated for 3 min at 37 °C. Additional GTP was added such that the final concentration of GTP in the solution was 10 µM, and incubation continued for 3 min at 37 °C.
2× ECs were aliquoted and all comparisons made were therefore performed with identically formed ECs. The assay was performed at 37 °C: transcription was restarted by addition of 2× NTPs minus/plus TFs or storage buffer. For Fig. 5c, YB was pre-incubated with halted ECs following reconstitution at −3 and incorporation labeling to −2 prior to restarting transcription. Timepoints were taken by mixing 5 µl reaction aliquots with 5 µl of 2× Stop Buffer (25 mM EDTA, 8 M Urea, 1× TBE, 0.1% bromophenol blue, 0.1% xylene cyanol). The ratio and concentrations of EC components in the 1× EC solution were 1:1.2:1.4:1.6 (R:T:RNAP:NT; 50 nM, 60 nM, 70 nM, 80 nM). The final reaction concentrations of TFs are indicated in each figure legend. Unless otherwise indicated, NTPs were added to a final reaction concentration of 100 µM. RNAs were resolved by 8% or 15% Urea-PAGE with 0.5× TBE running buffer until the leading dye ran off the gel. Gels were exposed to PhosphorImager screens and scanned using a Typhoon Phosphorimager. To quantify effects in ImageQuant, boxes were drawn around the pause band opsX, the capture band(s) (if applicable), and beyond. After subtracting background, the fractions of RNA at opsX or at capture positions were averaged and errors reflect standard deviation from at least three replicates (unless indicated otherwise).
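The component arithmetic and the band quantification described above can be sketched as follows. This is an illustrative reading of the text rather than the ImageQuant workflow itself: it assumes a single background value subtracted from every box, and all band volumes are hypothetical.

```python
from statistics import mean, stdev


def component_concentrations(base_nM, ratio):
    """Scale a molar ratio (relative to RNA) to concentrations in nM."""
    return {name: base_nM * r for name, r in ratio.items()}


def band_fractions(band_volumes, background):
    """Background-subtract each box and express it as a fraction of the lane."""
    corrected = {k: max(v - background, 0.0) for k, v in band_volumes.items()}
    total = sum(corrected.values())
    if total == 0:
        raise ValueError("no signal above background")
    return {k: v / total for k, v in corrected.items()}


# 1x EC composition stated in the text: R:T:RNAP:NT = 1:1.2:1.4:1.6 at 50 nM RNA.
ec_1x = component_concentrations(50, {"R": 1.0, "T": 1.2, "RNAP": 1.4, "NT": 1.6})

# Hypothetical box volumes from three replicate lanes (arbitrary units).
replicates = [
    {"ops": 400, "capture": 200, "beyond": 700},
    {"ops": 500, "capture": 200, "beyond": 600},
    {"ops": 450, "capture": 200, "beyond": 650},
]
ops_fractions = [band_fractions(lane, background=100)["ops"] for lane in replicates]
ops_mean, ops_sd = mean(ops_fractions), stdev(ops_fractions)
print(ec_1x, round(ops_mean, 3), round(ops_sd, 3))
```

`stdev` is the sample standard deviation, which is the usual choice for three replicates; the text does not state which estimator was used.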
For the Z-titration assay in Fig. 2d, data were fit in Kaleidagraph to a sigmoidal function of the form $y = a + (b - a)/\left(1 + (x/c)^{d}\right)$, where a is ymin, b is ymax, c is the ZX concentration at the midpoint, and d is the slope at the midpoint; fits were weighted by the standard deviation (error bars) from three assays.
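The behavior of this function can be checked with a minimal sketch. It evaluates the four-parameter form given above and a standard-deviation-weighted sum of squares, which is one common way to weight a fit (an assumption; Kaleidagraph's internal weighting is not described in the text). All parameter values are hypothetical.

```python
def sigmoid(x, a, b, c, d):
    """y = a + (b - a) / (1 + (x / c) ** d); a = ymin, b = ymax, c = midpoint."""
    return a + (b - a) / (1.0 + (x / c) ** d)


def weighted_sse(params, xs, ys, sds):
    """Sum of squared residuals, each scaled by its standard deviation."""
    a, b, c, d = params
    return sum(((y - sigmoid(x, a, b, c, d)) / s) ** 2
               for x, y, s in zip(xs, ys, sds))


# Hypothetical parameters: ymin 0.1, ymax 0.9, midpoint 50 nM, exponent 2.
params = (0.1, 0.9, 50.0, 2.0)
xs = [0.0, 12.5, 25.0, 50.0, 100.0, 200.0]
ys = [sigmoid(x, *params) for x in xs]   # noise-free toy data
sds = [0.05] * len(xs)

print([round(y, 3) for y in ys])
print(weighted_sse(params, xs, ys, sds))
```

With a positive exponent the curve starts at b when x = 0, passes through (a + b)/2 at x = c, and approaches a at high x, which matches an inhibition titration.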
Biolayer interferometry
Preparation of biotinylated-YE: pJS060 was cloned similarly to other pTYB2-derived constructs (see above), with the exception that two oligos were included in the Gibson assembly to introduce the 16-codon Avi-tag™ onto the N-terminus of upeY. Expression, cell harvesting, and lysis conditions are as described above. Avi-YE was biotinylated on a gravity column as described below:
The subsequent steps were performed at room temperature closely following NEB instructions. Briefly, 3 mL of a homogenous suspension of NEB Chitin Resin (Catalog S6651L) were loaded into a 25 mL Poly-Prep Gravity Chromatography Column (Biorad), washed with 5 mL of mQH2O, then equilibrated by washing 3 times each with 10 mL of CWB. The lysate was subsequently loaded onto the column, then washed three times each with 10 mL of CWB. The column was then washed three times with Avi Chitin Wash Buffer (AviCWB = 10 mM Tris 8.0, 0.5 M KGlu, 0.1% Tween20). Components from the Avidity BirA500 Kit were used in the subsequent biotinylation reaction: a biotinylating solution (500 µL AviCWB, 70 µL of BiomixA, 70 µL Biomix B, 10 µL of 1 mg/mL BirA) was added to the column and the reaction was allowed to continue for 2.5 h. The column was subsequently washed three times each with 10 mL of CWB. Cleavage Buffer (CB) was made by adding 500 µl of 1 M DTT (prepared fresh from solid reagent) to 10 mL of CWB, then a quick flush was performed by adding 3 mL of CB. Immediately after dripping stopped, the bottom then the top of the column were capped, parafilmed, and the column was incubated at room temperature overnight (16–18 h) to allow sufficient time for cleavage. The next day, cleaved protein was eluted by addition of 1.5 mL CWB + 10 mM DTT, then dialyzed overnight in 10 mM Tris-HCl pH 7.5, 2% glycerol, 100 mM NaCl, 100 µM EDTA, 1 mM DTT using a 10 K MWCO cassette. After removal from the dialysis cassette, additional glycerol was added to a final concentration of 20%. The solution was aliquoted, flash-frozen, then stored at −80 °C until use. Importantly, Biotin-YE retained activity in vitro.
For each titration, 1 mL of 0.3 µM biotinylated-YE was prepared in Octet Binding Buffer 4.1 (OBB4.1 = PBS + 400 mM NaCl + 0.01% Triton X-100 + 0.25% BSA). ZA solution was prepared at 100 nM in OBB4.1 with twofold serial dilutions down to 1.56 nM. ZE solution was prepared at 500 nM in OBB4.1 with serial dilutions down to 31.3 nM. Plates were prepared for binding assays: in plate 1, 200 µL of OBB4.1 was placed in each well of column 1 containing a biosensor (up to 8 biosensors per experiment); in plate 2 (containing ‘half-area’ wells permitting 100 µL volumes), column 1 contained 100 µL/well of OBB4.1, column 2 contained 100 µL/well of 0.3 µM biotinylated-YE, and column 3 contained 100 µL/well of ZX serial dilutions or buffer (as a blank/reference) prepared above.
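The dilution series implied by these endpoints can be enumerated with a short sketch. The twofold step is stated for ZA; applying it to ZE is an assumption made here because it reproduces the quoted 31.3 nM endpoint. The function name and the 1% rounding tolerance are illustrative choices.

```python
def twofold_series(top_nM, lowest_nM, tolerance=0.01):
    """List concentrations from top_nM down to about lowest_nM in twofold steps.

    `tolerance` absorbs rounding in the reported endpoint (e.g. 1.5625 -> 1.56).
    """
    series = []
    conc = float(top_nM)
    while conc >= lowest_nM * (1.0 - tolerance):
        series.append(conc)
        conc /= 2.0
    return series


za_series = twofold_series(100, 1.56)   # stated as twofold in the text
ze_series = twofold_series(500, 31.3)   # twofold step assumed
print(za_series)
print(ze_series)
```

Under these assumptions the ZA series has seven concentrations and the ZE series five, so either fits within the eight biosensors available per experiment together with a buffer reference.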
A basic kinetics assay was performed using standard acquisition rates at 30 °C on a ForteBio Octet RED96 system. Octet® Streptavidin (SA) Biosensors were pre-equilibrated for 10 min at 30 °C. Step times: Baseline (Plate 2 Column 1 [P2C1]) = 60 sec; Loading (P2C2) = 320 sec (or until 2 nm loading density reached); Baseline (P2C1) = 60 sec; Association (P2C3) = ≥ 300 sec; Dissociation (P2C1) = ≥ 300 sec.
Data were processed using Octet Data Analysis Software. The reference biosensor curve (bio-YE + buffer in place of ZX) was subtracted from all binding curves. Traces were subsequently aligned along the Y axis at the pre-association baseline, with interstep correction performed at the dissociation step. Noise filtering (Savitzky-Golay filtering, smoothing function) was performed. Data from each experiment were independently globally fit. For each binding pair tested, two out of three global fits have R2 values around 0.95 or greater and chi-squared values less than 3 as recommended by ForteBio. Given the two orders of magnitude difference in binding constants, the limited conclusions we draw, and the parsimonious agreement of these constants among replicates and with our PIVoT assays, we deemed the fits overall acceptable. The average and standard deviation of the kinetic parameters from the global fits are reported. Equilibrium constants are calculated from models. The value ‘Req/Rmax’ is reported as fraction YE bound.
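The relationship between fitted rate constants, the equilibrium constant, and Req/Rmax can be sketched under an explicit assumption: a simple 1:1 binding model, which the text does not name. The rate constants below are hypothetical and are not values from this study.

```python
from statistics import mean, stdev


def dissociation_constant(k_on, k_off):
    """KD (M) from k_on (M^-1 s^-1) and k_off (s^-1), assuming 1:1 binding."""
    return k_off / k_on


def fraction_bound(conc_M, kd_M):
    """Req/Rmax for a 1:1 interaction at analyte concentration conc_M."""
    return conc_M / (conc_M + kd_M)


# Hypothetical global-fit results from three independent experiments.
fits = [(1.0e5, 1.0e-3), (1.2e5, 1.1e-3), (0.9e5, 1.0e-3)]
kds = [dissociation_constant(k_on, k_off) for k_on, k_off in fits]

kd_mean, kd_sd = mean(kds), stdev(kds)
print(f"KD = {kd_mean * 1e9:.1f} +/- {kd_sd * 1e9:.1f} nM")
print(round(fraction_bound(10e-9, kds[0]), 3))
```

In this model the fraction bound is one half when the analyte concentration equals KD, which gives a quick sanity check on any reported pair of values.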
Exonuclease footprinting
Nucleic acid scaffolds used in exonuclease footprinting assays were each comprised of: (i) a 32P-labeled template DNA oligo, (ii) a non-template DNA oligo with four consecutive phosphorothioate bonds at the 3′ end, and (iii) an RNA oligo with 3′ end at the position of pausing in opsX and having noncomplementary bases upstream of the RNA-DNA hybrid to prohibit backtracking.
Template DNA oligo (20 μM) was labeled in a T4 PNK reaction with 1 μCi of [γ-32P]ATP and allowed to proceed for 15 min at 37 °C. ATP (1 μL of 1 mM) was subsequently added to the reaction and allowed to proceed for 30 min at 37 °C. Reactions were stopped by heating at 65 °C for 20 min and oligos were subsequently purified using G-50 columns pre-equilibrated with TE and following the manufacturer’s instructions.
TECs were reconstituted essentially as described for the in vitro transcription assays, except that the molar ratio of T:R:Pol:NT was 1:2:3:5 (50 nM 32P-T: 100 nM R: 150 nM RNAP: 250 nM NT). TECs were subsequently split into 35 μL aliquots and incubated with either storage buffer or YX variants for 3 min at 37 °C. Tubes were shifted to 30 °C and allowed to incubate for 3 min before removing a 5 μl aliquot (time 0) and mixing with an equal volume of 2× Stop Buffer. Exonuclease reactions were initiated by adding 100 U of ExoIII, and aliquots were removed from reactions and mixed with stop buffer at times indicated in figures.
To quantify both transient and stable protection from exonucleolytic cleavage, pseudodensitometry traces were generated for the first timepoint lane. Regions of interest were identified by comparison to a sequencing ladder. Areas under the peaks of these regions were determined by manual integration in Microsoft Excel, then divided by the sum of the areas under all peaks to the right of it. These values were determined in the absence or presence of YB, and their ratio is reported as fold change (+YB/−YB) for each sequence variant.
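The normalization and fold-change calculation described above can be written out as a small sketch. It assumes peak areas are listed in the left-to-right order of the lane trace, so that "to the right" means later list entries; the areas are hypothetical.

```python
def normalized_area(peak_areas, index):
    """Area of one peak divided by the summed areas of all peaks to its right."""
    right = sum(peak_areas[index + 1:])
    if right == 0:
        raise ValueError("no signal to the right of the selected peak")
    return peak_areas[index] / right


def fold_change(areas_plus, areas_minus, index):
    """Ratio of normalized areas with and without the factor (+YB / -YB)."""
    return normalized_area(areas_plus, index) / normalized_area(areas_minus, index)


# Hypothetical integrated peak areas for one sequence variant.
minus_yb = [10.0, 40.0, 60.0]
plus_yb = [40.0, 40.0, 60.0]

protection = fold_change(plus_yb, minus_yb, index=0)
print(round(protection, 2))
```

Normalizing to the downstream peaks makes the value insensitive to loading differences between lanes, which is presumably why the ratio rather than the raw area is reported.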
Structural models
A model of YB was made using Modeller94,95 and fitted to 8PHK33. Additional upstream and downstream DNA were modeled using PyMOL. The YE–ZA complex structure was predicted using AlphaFold 354, yielding an interface predicted template modeling (iPTM) score of 0.89 and predicted template modeling (pTM) score of 0.9 (values above 0.8 represent confident high-quality predictions). Additional confidence metrics are illustrated in Supplementary Fig. 6. RNA secondary structures were predicted using RNAfold95.
The BfrRNA polymerase PEC model was generated using Modeller96, the M. tuberculosis PEC formed on the B. subtilis trpL pause sequence (8E74)27, NusA and NusG NGN models from SWISS-MODEL97, and Porphyromonas gingivalis RNAP (8DKC)98.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
The NET-seq data generated in this study have been deposited in the NCBI GEO database under accession code GSE281607. The YE–ZA model generated by AlphaFold3 is available on Zenodo at https://doi.org/10.5281/zenodo.14110860. Source data are provided with this paper for all other experiments.
Code availability
Scripts for analyzing NET-seq data are available on Zenodo at https://doi.org/10.5281/zenodo.14110860.
References
Deng, H. et al. Bacteroides fragilis prevents clostridium difficile infection in a mouse model by restoring gut barrier and microbiome regulation. Front. Microbiol. 9, 2976 (2018).
Li, X. et al. A strain of Bacteroides thetaiotaomicron attenuates colonization of Clostridioides difficile and affects intestinal microbiota and bile acids profile in a mouse model. Biomed. Pharmacother. 137, 111290 (2021).
Carasso, S. et al. Inflammation and bacteriophages affect DNA inversion states and functionality of the gut microbiota. Cell Host Microbe 32, 322–334.e329 (2024).
Hryckowian, A. J. et al. Bacteroides thetaiotaomicron-infecting bacteriophage isolates inform sequence-based host range predictions. Cell Host Microbe 28, 371–379.e375 (2020).
Porter, N. T. et al. Phase-variable capsular polysaccharides and lipoproteins modify bacteriophage susceptibility in Bacteroides thetaiotaomicron. Nat. Microbiol. 5, 1170–1181 (2020).
Bechon, N. et al. Capsular polysaccharide cross-regulation modulates bacteroides thetaiotaomicron biofilm formation. mBio 11 (2020). https://doi.org/10.1128/mBio.00729-20.
Jiang, X. et al. Invertible promoters mediate bacterial phase variation, antibiotic resistance, and host adaptation in the gut. Science 363, 181–187 (2019).
Mazmanian, S. K., Liu, C. H., Tzianabos, A. O. & Kasper, D. L. An immunomodulatory molecule of symbiotic bacteria directs maturation of the host immune system. Cell 122, 107–118 (2005).
Mazmanian, S. K., Round, J. L. & Kasper, D. L. A microbial symbiosis factor prevents intestinal inflammatory disease. Nature 453, 620–625 (2008).
Porter, N. T., Canales, P., Peterson, D. A. & Martens, E. C. A subset of polysaccharide capsules in the human symbiont bacteroides thetaiotaomicron promote increased competitive fitness in the mouse gut. Cell Host Microbe 22, 494–506.e498 (2017).
Ramakrishna, C. et al. Bacteroides fragilis polysaccharide A induces IL-10 secreting B and T cells that prevent viral encephalitis. Nat. Commun. 10, 2153 (2019).
Chatzidaki-Livanis, M., Coyne, M. J. & Comstock, L. E. A family of transcriptional antitermination factors necessary for synthesis of the capsular polysaccharides of Bacteroides fragilis. J. Bacteriol. 191, 7288–7295 (2009).
Chatzidaki-Livanis, M., Weinacht, K. G. & Comstock, L. E. Trans locus inhibitors limit concomitant polysaccharide synthesis in the human gut symbiont Bacteroides fragilis. Proc. Natl. Acad. Sci. USA 107, 11976–11980 (2010).
Troy, E. B., Carey, V. J., Kasper, D. L. & Comstock, L. E. Orientations of the Bacteroides fragilis capsular polysaccharide biosynthesis locus promoters during symbiosis and infection. J. Bacteriol. 192, 5832–5836 (2010).
Lan, F. et al. Single-cell analysis of multiple invertible promoters reveals differential inversion rates as a strong determinant of bacterial population heterogeneity. Sci. Adv. 9, eadg5476 (2023).
Lan, F. et al. Massively parallel single-cell sequencing of diverse microbial populations. Nat. Methods 21, 228–235 (2024).
Coyne, M. J., Weinacht, K. G., Krinos, C. M. & Comstock, L. E. Mpi recombinase globally modulates the surface architecture of a human commensal bacterium. Proc. Natl. Acad. Sci. USA 100, 10446–10451 (2003).
Werner, F. A nexus for gene expression-molecular mechanisms of Spt5 and NusG in the three domains of life. J. Mol. Biol. 417, 13–27 (2012).
Bailey, M. J., Koronakis, V., Schmoll, T. & Hughes, C. Escherichia coli HlyT protein, a transcriptional activator of haemolysin synthesis and secretion, is encoded by the rfaH (sfrB) locus required for expression of sex factor and lipopolysaccharide genes. Mol. Microbiol. 6, 1003–1012 (1992).
Bies-Etheve, N. et al. RNA-directed DNA methylation requires an AGO4-interacting member of the SPT5 elongation factor family. EMBO Rep. 10, 649–654 (2009).
Goodson, J. R., Klupt, S., Zhang, C., Straight, P. & Winkler, W. C. LoaP is a broadly conserved antiterminator protein that regulates antibiotic gene clusters in Bacillus amyloliquefaciens. Nat. Microbiol. 2, 17003 (2017).
Artsimovitch, I. & Landick, R. The transcriptional regulator RfaH stimulates RNA chain synthesis after recruitment to elongation complexes by the exposed nontemplate DNA strand. Cell 109, 193–203 (2002).
Kang, J. Y. et al. Structural basis for transcript elongation control by NusG/RfaH universal regulators. Cell 173, 1650–1662.e1614 (2018).
Leeds, J. A. & Welch, R. A. RfaH enhances elongation of Escherichia coli hlyCABD mRNA. J. Bacteriol. 178, 1850–1857 (1996).
Yakhnin, A. V. et al. Robust regulation of transcription pausing in Escherichia coli by the ubiquitous elongation factor NusG. Proc. Natl. Acad. Sci. USA 120, e2221114120 (2023).
Czyz, A., Mooney, R. A., Iaconi, A. & Landick, R. Mycobacterial RNA polymerase requires a U-tract at intrinsic terminators and is aided by NusG at suboptimal terminators. mBio 5, e00931 (2014).
Delbeau, M. et al. Structural and functional basis of the universal transcription factor NusG pro-pausing activity in Mycobacterium tuberculosis. Mol. Cell 83, 1474–1488.e1478 (2023).
Mandell, Z. F. et al. NusG is an intrinsic transcription termination factor that stimulates motility and coordinates gene expression with NusA. Elife 10 (2021). https://doi.org/10.7554/eLife.61880.
Mondal, S., Yakhnin, A. V., Sebastian, A., Albert, I. & Babitzke, P. NusA-dependent transcription termination prevents misregulation of global gene expression. Nat. Microbiol. 1, 15007 (2016).
Sevostyanova, A. & Artsimovitch, I. Functional analysis of Thermus thermophilus transcription factor NusG. Nucleic Acids Res. 38, 7432–7445 (2010).
Landick, R. Transcriptional pausing as a mediator of bacterial gene regulation. Annu. Rev. Microbiol. 75, 291–314 (2021).
Zuber, P. K. et al. The universally-conserved transcription factor RfaH is recruited to a hairpin structure of the non-template DNA strand. eLife 7 (2018). https://doi.org/10.7554/eLife.36349.
Zuber, P. K. et al. Concerted transformation of a hyper-paused transcription complex and its reinforcing protein. Nat. Commun. 15, 3040 (2024).
Paitan, Y., Orr, E., Ron, E. Z. & Rosenberg, E. A NusG-like transcription anti-terminator is involved in the biosynthesis of the polyketide antibiotic TA of Myxococcus xanthus. FEMS Microbiol. Lett. 170, 221–227 (1999).
Nunez, B., Avila, P. & de la Cruz, F. Genes involved in conjugative DNA processing of plasmid R6K. Mol. Microbiol. 24, 1157–1168 (1997).
Churchman, L. S. & Weissman, J. S. Nascent transcript sequencing visualizes transcription at nucleotide resolution. Nature 469, 368–373 (2011).
Larson, M. H. et al. A pause sequence enriched at translation start sites drives transcription dynamics in vivo. Science 344, 1042–1047 (2014).
Artsimovitch, I. & Landick, R. Pausing by bacterial RNA polymerase is mediated by mechanistically distinct classes of signals. Proc. Natl. Acad. Sci. USA 97, 7090–7095 (2000).
Guo, X. et al. Structural basis for NusA stabilized transcriptional pausing. Mol. Cell 69, 816–827.e14 (2018).
Kang, J. Y. et al. RNA polymerase accommodates a pause RNA hairpin by global conformational rearrangements that prolong pausing. Mol. Cell 69, 802–815.e5 (2018).
Krinos, C. M. et al. Extensive surface diversity of a commensal microorganism by multiple DNA inversions. Nature 414, 555–558 (2001).
Daube, S. S. & von Hippel, P. H. Functional transcription elongation complexes from synthetic RNA-DNA bubble duplexes. Science 258, 1320–1324 (1992).
Toulokhonov, I., Artsimovitch, I. & Landick, R. Allosteric control of RNA polymerase by a site that contacts nascent RNA hairpins. Science 292, 730–733 (2001).
Bao, Y., Cao, X. & Landick, R. RNA polymerase SI3 domain modulates global transcriptional pausing and pause-site fluctuations. Nucleic Acids Res. (2024). https://doi.org/10.1093/nar/gkae209.
Gajos, M. et al. Conserved DNA sequence features underlie pervasive RNA polymerase pausing. Nucleic Acids Res. 49, 4402–4420 (2021).
Kireeva, M. L. & Kashlev, M. Mechanism of sequence-specific pausing of bacterial RNA polymerase. Proc. Natl. Acad. Sci. USA 106, 8900–8905 (2009).
Yakhnin, A. V. et al. NusG controls transcription pausing and RNA polymerase translocation throughout the Bacillus subtilis genome. Proc. Natl. Acad. Sci. USA 117, 21628–21636 (2020).
Ha, K. S., Toulokhonov, I., Vassylyev, D. G. & Landick, R. The NusA N-terminal domain is necessary and sufficient for enhancement of transcriptional pausing via interaction with the RNA exit channel of RNA polymerase. J. Mol. Biol. 401, 708–725 (2010).
Jayasinghe, O. T., Mandell, Z. F., Yakhnin, A. V., Kashlev, M. & Babitzke, P. Transcriptome-wide effects of NusA on RNA polymerase pausing in Bacillus subtilis. J. Bacteriol. 204, e0053421 (2022).
Kolb, K. E., Hein, P. P. & Landick, R. Antisense oligonucleotide-stimulated transcriptional pausing reveals RNA exit channel specificity of RNA polymerase and mechanistic contributions of NusA and RfaH. J. Biol. Chem. 289, 1151–1163 (2014).
Strobel, E. J. & Roberts, J. W. Two transcription pause elements underlie a sigma70-dependent pause cycle. Proc. Natl. Acad. Sci. USA 112, E4374–E4380 (2015).
Revyakin, A., Liu, C., Ebright, R. H. & Strick, T. R. Abortive initiation and productive initiation by RNA polymerase involve DNA scrunching. Science 314, 1139–1143 (2006).
Roberts, J. W. Biochemistry. RNA polymerase, a scrunching machine. Science 314, 1097–1098 (2006).
Abramson, J. et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature (2024). https://doi.org/10.1038/s41586-024-07487-w.
Elghondakly, A., Wu, C. H., Klupt, S., Goodson, J. & Winkler, W. C. A NusG specialized paralog that exhibits specific, high-affinity RNA-binding activity. J. Mol. Biol. 433, 167100 (2021).
Eckartt, K. A. et al. Compensatory evolution in NusG improves fitness of drug-resistant M. tuberculosis. Nature (2024). https://doi.org/10.1038/s41586-024-07206-5.
Sevostyanova, A., Belogurov, G. A., Mooney, R. A., Landick, R. & Artsimovitch, I. The beta subunit gate loop is required for RNA polymerase modification by RfaH and NusG. Mol. Cell 43, 253–262 (2011).
You, L. et al. Structural basis for intrinsic transcription termination. Nature 613, 783–789 (2023).
Zhu, C. et al. Transcription factors modulate RNA polymerase conformational equilibrium. Nat. Commun. 13, 1546 (2022).
Harteis, S. & Schneider, S. Making the bend: DNA tertiary structure and protein-DNA interactions. Int. J. Mol. Sci. 15, 12335–12363 (2014).
Landick, R. & Yanofsky, C. Isolation and structural analysis of the Escherichia coli trp leader paused transcription complex. J. Mol. Biol. 196, 363–377 (1987).
Nedialkov, Y., Svetlov, D., Belogurov, G. A. & Artsimovitch, I. Locking the non-template DNA to control transcription. Mol. Microbiol. 109, 445–457 (2018).
Samkurashvili, I. & Luse, D. S. Translocation and transcriptional arrest during transcript elongation by RNA polymerase II. J. Biol. Chem. 271, 23495–23505 (1996).
Lane, W. J. & Darst, S. A. Molecular evolution of multisubunit RNA polymerases: sequence analysis. J. Mol. Biol. 395, 671–685 (2010).
Cao, X. et al. Basis of narrow-spectrum activity of fidaxomicin on Clostridioides difficile. Nature 604, 541–545 (2022).
Vishwakarma, R. K., Qayyum, M. Z., Babitzke, P. & Murakami, K. S. Allosteric mechanism of transcription inhibition by NusG-dependent pausing of RNA polymerase. Proc. Natl. Acad. Sci. USA 120, e2218516120 (2023).
Hustmyer, C. M., Wolfe, M. B., Welch, R. A. & Landick, R. RfaH counter-silences inhibition of transcript elongation by H-NS-StpA nucleoprotein filaments in pathogenic Escherichia coli. mBio 13, e0266222 (2022).
Accetto, T. & Avgustin, G. Inability of Prevotella bryantii to form a functional Shine-Dalgarno interaction reflects unique evolution of ribosome binding sites in Bacteroidetes. PLoS ONE 6, e22914 (2011).
Mastropaolo, M. D., Thorson, M. L. & Stevens, A. M. Comparison of Bacteroides thetaiotaomicron and Escherichia coli 16S rRNA gene expression signals. Microbiology 155, 2683–2693 (2009).
Mimee, M., Tucker, A. C., Voigt, C. A. & Lu, T. K. Programming a human commensal bacterium, Bacteroides thetaiotaomicron, to sense and respond to stimuli in the murine gut microbiota. Cell Syst. 1, 62–71 (2015).
Wegmann, U., Horn, N. & Carding, S. R. Defining the Bacteroides ribosomal binding site. Appl. Environ. Microbiol. 79, 1980–1989 (2013).
Johnson, G. E., Lalanne, J. B., Peters, M. L. & Li, G. W. Functionally uncoupled transcription-translation in Bacillus subtilis. Nature 585, 124–128 (2020).
Adhya, S. & Gottesman, M. Control of transcription termination. Annu. Rev. Biochem. 47, 967–996 (1978).
Burmann, B. M. et al. A NusE:NusG complex links transcription and translation. Science 328, 501–504 (2010).
Byrne, R., Levin, J. G., Bladen, H. A. & Nirenberg, M. W. The in vitro formation of a DNA-Ribosome complex. Proc. Natl. Acad. Sci. USA 52, 140–148 (1964).
Castro-Roa, D. & Zenkin, N. In vitro experimental system for analysis of transcription-translation coupling. Nucleic Acids Res. 40, e45 (2012).
Landick, R., Carey, J. & Yanofsky, C. Translation activates the paused transcription complex and restores transcription of the trp operon leader region. Proc. Natl. Acad. Sci. USA 82, 4663–4667 (1985).
McGary, K. & Nudler, E. RNA polymerase and the ribosome: the close relationship. Curr. Opin. Microbiol. 16, 112–117 (2013).
Miller, O. L. Jr, Hamkalo, B. A. & Thomas, C. A. Jr Visualization of bacterial genes in action. Science 169, 392–395 (1970).
Proshkin, S., Rahmouni, A. R., Mironov, A. & Nudler, E. Cooperation between translating ribosomes and RNA polymerase in transcription elongation. Science 328, 504–508 (2010).
Saxena, S. et al. Escherichia coli transcription factor NusG binds to 70S ribosomes. Mol. Microbiol. 108, 495–504 (2018).
Stevenson-Jones, F., Woodgate, J., Castro-Roa, D. & Zenkin, N. Ribosome reactivates transcription by physically pushing RNA polymerase out of transcription arrest. Proc. Natl. Acad. Sci. USA 117, 8462–8467 (2020).
Burmann, B. M. et al. An alpha helix to beta barrel domain switch transforms the transcription factor RfaH into a translation factor. Cell 150, 291–303 (2012).
Lorber, C. G. [Differential diagnosis of maxillofacial neuralgia]. ZWR 85, 514–518 (1976).
O’Donnell, S. M. & Janssen, G. R. The initiation codon affects ribosome binding and translational efficiency in Escherichia coli of cI mRNA with or without the 5’ untranslated leader. J. Bacteriol. 183, 1277–1283 (2001).
Jin, D. J. & Gross, C. A. Mapping and sequencing of mutations in the Escherichia coli rpoB gene that lead to rifampicin resistance. J. Mol. Biol. 202, 45–58 (1988).
Pantosti, A., Tzianabos, A. O., Onderdonk, A. B. & Kasper, D. L. Immunochemical characterization of two surface polysaccharides of Bacteroides fragilis. Infect. Immun. 59, 2075–2082 (1991).
Garcia-Bayona, L. & Comstock, L. E. Streamlined genetic manipulation of diverse Bacteroides and Parabacteroides isolates from the human gut microbiota. mBio 10 (2019). https://doi.org/10.1128/mBio.01762-19.
Welch, M. et al. Design parameters to control synthetic gene expression in Escherichia coli. PLoS ONE 4, e7002 (2009).
Windgassen, T. A. et al. Trigger-helix folding pathway and SI3 mediate catalysis and hairpin-stabilized pausing by Escherichia coli RNA polymerase. Nucleic Acids Res. 42, 12707–12721 (2014).
Reis, A. C. & Salis, H. M. An automated model test system for systematic development and improvement of gene expression models. ACS Synth. Biol. 9, 3145–3156 (2020).
Salis, H. M., Mirsky, E. A. & Voigt, C. A. Automated design of synthetic ribosome binding sites to control protein expression. Nat. Biotechnol. 27, 946–950 (2009).
Saba, J. et al. The elemental mechanism of transcriptional pausing. eLife 8 (2019). https://doi.org/10.7554/eLife.40981.
Sali, A. & Blundell, T. L. Comparative protein modelling by satisfaction of spatial restraints. J. Mol. Biol. 234, 779–815 (1993).
Gruber, A. R., Lorenz, R., Bernhart, S. H., Neubock, R. & Hofacker, I. L. The Vienna RNA Websuite. Nucleic Acids Res. 36, W70–W74 (2008).
Webb, B. & Sali, A. Comparative protein structure modeling using MODELLER. Curr. Protoc. Bioinforma. 54, 5.6.1–5.6.37 (2016).
Waterhouse, A. et al. SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids Res. 46, W296–W303 (2018).
Bu, F. et al. Cryo-EM structure of Porphyromonas gingivalis RNA polymerase. J. Mol. Biol. 436, 168568 (2024).
Coyne, M. J. et al. Polysaccharide biosynthesis locus required for virulence of Bacteroides fragilis. Infect. Immun. 69, 4342–4350 (2001).
Sultana, A. & Lee, J. E. Measuring protein-protein and protein-nucleic acid interactions by biolayer interferometry. Curr. Protoc. Protein Sci. 79, 19.25.1–19.25.26 (2015).
Madeira, F. et al. The EMBL-EBI job dispatcher sequence analysis tools framework in 2024. Nucleic Acids Res. (2024). https://doi.org/10.1093/nar/gkae241.
Acknowledgements
We thank members of the Landick and Comstock labs for helpful discussions and comments on the manuscript. This work was supported by NIH R01 GM038660 and USDA Hatch WIS05004 to R.L., NIH R01 AI093771 to L.C., the Duchossois Family Institute, and the DOE Office of Science, Biological and Environmental Research Program Great Lakes Bioenergy Research Center (DE-SC0018409). A.G. was supported by the NIH Predoctoral Training Program in Genetics (T32 GM007133). J.S. was supported by the NIH Biotechnology Training Grant (T32 GM135066 and T32 GM008349), an NIH F31 Graduate Fellowship (F31 GM142153), and a SciMed Graduate Research Scholars Fellowship from the UW–Madison Graduate School and Wisconsin Alumni Research Foundation.
Author information
Contributions
R.L. and J.S. conceived the study. J.S. conceived and developed assays, cloned most plasmids, purified all proteins, performed most experiments, and analyzed data. K.F. constructed plasmids for B. fragilis genetic manipulation, created Bacteroides strains, and performed Western blots. M.E., B.M., and J.S. wrote custom scripts. J.S. and R.L. interpreted data. M.E., Y.P., and A.G. performed experiments. R.L. and J.S. constructed structural models. J.S. and R.L. wrote the original manuscript and designed figures. J.S., R.L., and L.C. revised the manuscript. R.L., L.C., and J.S. secured funding. R.L. and L.C. supervised the study.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks the anonymous reviewers for their contribution to the peer review of this work. A peer review file is available.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Saba, J., Flores, K., Marshall, B. et al. Bacteroides expand the functional versatility of a conserved transcription factor and transcribed DNA to program capsule diversity. Nat Commun 15, 10862 (2024). https://doi.org/10.1038/s41467-024-55215-9