  • Letter

Local regulation of gene expression by lncRNA promoters, transcription and splicing

Abstract

Mammalian genomes are pervasively transcribed [1,2] to produce thousands of long non-coding RNAs (lncRNAs) [3,4]. A few of these lncRNAs have been shown to recruit regulatory complexes through RNA–protein interactions to influence the expression of nearby genes [5,6,7], and it has been suggested that many other lncRNAs can also act as local regulators [8,9]. Such local functions could explain the observation that lncRNA expression is often correlated with the expression of nearby genes [2,10,11]. However, these correlations have been challenging to dissect [12] and could alternatively result from processes that are not mediated by the lncRNA transcripts themselves. For example, some gene promoters have been proposed to have dual functions as enhancers [13,14,15,16], and the process of transcription itself may contribute to gene regulation by recruiting activating factors or remodelling nucleosomes [10,17,18]. Here we use genetic manipulation in mouse cell lines to dissect 12 genomic loci that produce lncRNAs and find that 5 of these loci influence the expression of a neighbouring gene in cis. Notably, none of these effects requires the specific lncRNA transcripts themselves and instead involves general processes associated with their production, including enhancer-like activity of gene promoters, the process of transcription, and the splicing of the transcript. Furthermore, such effects are not limited to lncRNA loci: we find that four out of six protein-coding loci also influence the expression of a neighbour. These results demonstrate that cross-talk among neighbouring genes is a prevalent phenomenon that can involve multiple mechanisms and cis-regulatory signals, including a role for RNA splice sites. These mechanisms may explain the function and evolution of some genomic loci that produce lncRNAs and broadly contribute to the regulation of both coding and non-coding genes.

Figure 1: Many lncRNA and mRNA loci influence the expression of neighbouring genes.
Figure 2: Enhancer-like function of the Bendr promoter.
Figure 3: Transcription and splicing of Blustr activates Sfmbt2 expression.
Figure 4: Evolutionary conservation of mES cell lncRNAs and their promoters.

References

  1. Okazaki, Y. et al. Analysis of the mouse transcriptome based on functional annotation of 60,770 full-length cDNAs. Nature 420, 563–573 (2002)

  2. Kapranov, P. et al. RNA maps reveal new RNA classes and a possible function for pervasive transcription. Science 316, 1484–1488 (2007)

  3. Guttman, M. et al. Chromatin signature reveals over a thousand highly conserved large non-coding RNAs in mammals. Nature 458, 223–227 (2009)

  4. Carninci, P. et al. The transcriptional landscape of the mammalian genome. Science 309, 1559–1563 (2005)

  5. Lee, J. T. Lessons from X-chromosome inactivation: long ncRNA as guides and tethers to the epigenome. Genes Dev. 23, 1831–1842 (2009)

  6. Nagano, T. et al. The Air noncoding RNA epigenetically silences transcription by targeting G9a to chromatin. Science 322, 1717–1720 (2008)

  7. Wang, K. C. et al. A long noncoding RNA maintains active chromatin to coordinate homeotic gene expression. Nature 472, 120–124 (2011)

  8. Ørom, U. A. et al. Long noncoding RNAs with enhancer-like function in human cells. Cell 143, 46–58 (2010)

  9. Guil, S. & Esteller, M. Cis-acting noncoding RNAs: friends and foes. Nat. Struct. Mol. Biol. 19, 1068–1075 (2012)

  10. Ebisuya, M., Yamamoto, T., Nakajima, M. & Nishida, E. Ripples from neighbouring transcription. Nat. Cell Biol. 10, 1106–1113 (2008)

  11. Cabili, M. N. et al. Integrative annotation of human large intergenic noncoding RNAs reveals global properties and specific subclasses. Genes Dev. 25, 1915–1927 (2011)

  12. Bassett, A. R. et al. Considerations when investigating lncRNA function in vivo. eLife 3, e03058 (2014)

  13. Li, G. et al. Extensive promoter-centered chromatin interactions provide a topological basis for transcription regulation. Cell 148, 84–98 (2012)

  14. Rajagopal, N. et al. High-throughput mapping of regulatory DNA. Nat. Biotechnol. 34, 167–174 (2016)

  15. Yin, Y. et al. Opposing roles for the lncRNA haunt and its genomic locus in regulating HOXA gene activation during embryonic stem cell differentiation. Cell Stem Cell 16, 504–516 (2015)

  16. Paralkar, V. R. et al. Unlinking an lncRNA from its associated cis element. Mol. Cell 62, 104–110 (2016)

  17. Martens, J. A., Laprade, L. & Winston, F. Intergenic transcription is required to repress the Saccharomyces cerevisiae SER3 gene. Nature 429, 571–574 (2004)

  18. Shearwin, K. E., Callen, B. P. & Egan, J. B. Transcriptional interference—a crash course. Trends Genet. 21, 339–345 (2005)

  19. Purmann, A. et al. Genomic organization of transcriptomes in mammals: Coregulation and cofunctionality. Genomics 89, 580–587 (2007)

  20. Kosak, S. T. et al. Coordinate gene regulation during hematopoiesis is related to genomic organization. PLoS Biol. 5, e309 (2007)

  21. Brinster, R. L., Allen, J. M., Behringer, R. R., Gelinas, R. E. & Palmiter, R. D. Introns increase transcriptional efficiency in transgenic mice. Proc. Natl Acad. Sci. USA 85, 836–840 (1988)

  22. Fong, Y. W. & Zhou, Q. Stimulatory effect of splicing factors on transcriptional elongation. Nature 414, 929–933 (2001)

  23. Calo, E. & Wysocka, J. Modification of enhancer chromatin: what, how, and why? Mol. Cell 49, 825–837 (2013)

  24. Andersson, R., Sandelin, A. & Danko, C. G. A unified architecture of transcriptional regulatory elements. Trends Genet. 31, 426–433 (2015)

  25. Kim, T.-K. & Shiekhattar, R. Architectural and functional commonalities between enhancers and promoters. Cell 162, 948–959 (2015)

  26. Necsulea, A. et al. The evolution of lncRNA repertoires and expression patterns in tetrapods. Nature 505, 635–640 (2014)

  27. Hezroni, H. et al. Principles of long noncoding RNA evolution derived from direct comparison of transcriptomes in 17 species. Cell Reports 11, 1110–1122 (2015)

  28. Chen, J. et al. Evolutionary analysis across mammals reveals distinct classes of long non-coding RNAs. Genome Biol. 17, 19 (2016)

  29. Bhatt, D. M. et al. Transcript dynamics of proinflammatory genes revealed by sequence analysis of subcellular RNA fractions. Cell 150, 279–290 (2012)

  30. Engreitz, J. M. et al. RNA–RNA interactions enable specific targeting of noncoding RNAs to nascent pre-mRNAs and chromatin sites. Cell 159, 188–199 (2014)

  31. Hsu, P. D. et al. DNA targeting specificity of RNA-guided Cas9 nucleases. Nat. Biotechnol. 31, 827–832 (2013)

  32. Wang, T., Wei, J. J., Sabatini, D. M. & Lander, E. S. Genetic screens in human cells using the CRISPR-Cas9 system. Science 343, 80–84 (2014)

  33. Keane, T. M. et al. Mouse genomic variation and its effect on phenotypes and gene regulation. Nature 477, 289–294 (2011)

  34. Cong, L. et al. Multiplex genome engineering using CRISPR/Cas systems. Science 339, 819–823 (2013)

  35. Chen, B. et al. Dynamic imaging of genomic loci in living human cells by an optimized CRISPR/Cas system. Cell 155, 1479–1491 (2013)

  36. Shishkin, A. A. et al. Simultaneous generation of many RNA-seq libraries in a single reaction. Nat. Methods 12, 323–325 (2015)

  37. Engreitz, J., Lander, E. S. & Guttman, M. RNA antisense purification (RAP) for mapping RNA interactions with chromatin. Methods Mol. Biol. 1262, 183–197 (2015)

  38. Engreitz, J. M. et al. The Xist lncRNA exploits three-dimensional genome architecture to spread across the X chromosome. Science 341, 1237973 (2013)

  39. Gnirke, A. et al. Solution hybrid selection with ultra-long oligonucleotides for massively parallel targeted sequencing. Nat. Biotechnol. 27, 182–189 (2009)

  40. Huang, S., Holt, J., Kao, C.-Y., McMillan, L. & Wang, W. A novel multi-alignment pipeline for high-throughput sequencing data. Database (Oxford) 2014, bau057 (2014)

  41. Guttman, M. et al. Ab initio reconstruction of cell type-specific transcriptomes in mouse reveals the conserved multi-exonic structure of lincRNAs. Nat. Biotechnol. 28, 503–510 (2010)

  42. Levitt, N., Briggs, D., Gil, A. & Proudfoot, N. J. Definition of an efficient synthetic poly(A) site. Genes Dev. 3, 1019–1025 (1989)

  43. Kwak, H., Fuda, N. J., Core, L. J. & Lis, J. T. Precise maps of RNA polymerase reveal how promoters direct initiation and pausing. Science 339, 950–953 (2013)

  44. Core, L. J., Waterfall, J. J. & Lis, J. T. Nascent RNA sequencing reveals widespread pausing and divergent initiation at human promoters. Science 322, 1845–1848 (2008)

  45. Mahat, D. B. et al. Base-pair-resolution genome-wide mapping of active RNA polymerases using precision nuclear run-on (PRO-seq). Nat. Protocols 11, 1455–1476 (2016)

  46. Adelman, K. & Lis, J. T. Promoter-proximal pausing of RNA polymerase II: emerging roles in metazoans. Nat. Rev. Genet. 13, 720–731 (2012)

  47. Buenrostro, J. D., Giresi, P. G., Zaba, L. C., Chang, H. Y. & Greenleaf, W. J. Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat. Methods 10, 1213–1218 (2013)

  48. Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012)

  49. Busby, M. et al. Systematic comparison of monoclonal versus polyclonal antibodies for mapping histone modifications by ChIP–seq. Preprint at http://dx.doi.org/10.1101/054387 (2016)

  50. Mouse ENCODE Consortium et al. An encyclopedia of mouse DNA elements (Mouse ENCODE). Genome Biol. 13, 418 (2012)

  51. Kagey, M. H. et al. Mediator and cohesin connect gene expression and chromatin architecture. Nature 467, 430–435 (2010)

  52. Dixon, J. R. et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 485, 376–380 (2012)

  53. Fort, A. et al. Deep transcriptome profiling of mammalian stem cells supports a regulatory role for retrotransposons in pluripotency maintenance. Nat. Genet. 46, 558–566 (2014)

  54. Kent, W. J. et al. The human genome browser at UCSC. Genome Res. 12, 996–1006 (2002)

  55. Garber, M. et al. Identifying novel constrained elements by exploiting biased substitution patterns. Bioinformatics 25, i54–i62 (2009)

  56. Lindblad-Toh, K. et al. A high-resolution map of human evolutionary constraint using 29 mammals. Nature 478, 476–482 (2011)

  57. Gentleman, R. C. et al. Bioconductor: open software development for computational biology and bioinformatics. Genome Biol. 5, R80 (2004)

  58. Lawrence, M. et al. Software for computing and annotating genomic ranges. PLOS Comput. Biol. 9, e1003118 (2013)

  59. Lawrence, M., Gentleman, R. & Carey, V. rtracklayer: an R package for interfacing with genome browsers. Bioinformatics 25, 1841–1842 (2009)

  60. Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010)

  61. Robinson, J. T. et al. Integrative genomics viewer. Nat. Biotechnol. 29, 24–26 (2011)

  62. Danecek, P. et al. The variant call format and VCFtools. Bioinformatics 27, 2156–2158 (2011)

Acknowledgements

We thank S. Grossman, J. Rinn, M. Yassour, P. Sharp, L. Boyer, M. Ray, C. Fulco, M. Munschauer, T. Wang and N. Friedman for discussions; A. Goren and Broad Technology Labs for ChIP; J. Lis, D. Mahat and A. Shishkin for technical advice and reagents; and J. Flannick for computational tools. J.M.E. is supported by the Fannie and John Hertz Foundation and the National Defense Science and Engineering Graduate Fellowship. M.G. is supported by the NIH Director’s Early Independence Award (DP5OD012190), the Edward Mallinckrodt Foundation, the Sontag Foundation, and the Searle Scholars Program. Work in the Lander Laboratory is supported by the Broad Institute.

Author information

Contributions

J.M.E., M.G. and E.S.L. conceived and designed the study. J.M.E., J.E.H., G.M., M.K. and P.E.M. developed knockout protocols and performed genetic manipulations. E.M.P. and J.M.E. performed all other experiments. J.M.E. developed computational tools and analysed data. J.M.E. and J.C. performed evolutionary analysis. J.M.E. and E.S.L. wrote the manuscript with input from all authors. E.S.L. supervised the work and obtained funding.

Corresponding author

Correspondence to Eric S. Lander.

Ethics declarations

Competing interests

The Broad Institute holds patents and has filed patent applications on technologies related to other aspects of CRISPR.

Extended data figures and tables

Extended Data Figure 1 Expression and subcellular localization of knocked-out lncRNAs and mRNAs.

a, Expression of lncRNAs and mRNAs in F1 129/castaneus female mES cells, reported in fragments per kilobase per million (FPKM) in whole-cell poly(A)+ RNA-seq. Cumulative fraction is plotted for all mRNAs expressed in mES cells. Large dots represent transcripts whose promoters we deleted in this study. LncRNAs and mRNAs span a >20-fold range of abundance levels. b, Relative subcellular localization of lncRNAs and mRNAs. We sequenced poly(A)+ RNA from chromatin, soluble nuclear, and cytoplasmic fractions (see Methods) and plotted the relative abundance of mature transcripts in each fraction. We selected lncRNAs that showed localization biased towards the nuclear fractions relative to most mRNAs. For comparison, we plotted 1,000 randomly selected mRNAs (light grey).
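
Panel a reports abundance in FPKM. As a reminder of what that unit means, a minimal sketch follows. The function name and the example numbers are illustrative assumptions, not code or values from the study.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    kilobases = transcript_length_bp / 1_000
    millions = total_mapped_fragments / 1_000_000
    return fragments / kilobases / millions


# Hypothetical transcript: 500 fragments on a 2-kb transcript
# in a library of 10 million mapped fragments.
example_fpkm = fpkm(500, 2_000, 10_000_000)
print(example_fpkm)
```

Normalizing by both transcript length and library size is what lets transcripts of different lengths, measured in libraries of different depths, be placed on the same cumulative plot.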

Extended Data Figure 2 Generation of knockout clones and measurement of allele-specific RNA expression.

a, Overview of knockout and measurement protocol. b, Distribution of allelic expression ratios (number of informative reads mapping to 129S1 allele divided by the number mapping to either the 129S1 or the castaneus allele) across active genes in mES cells. c, Scatter plot of allelic expression ratios for genes with RPKM ≥ 2 that have more than 100 allele-informative reads across all libraries. Allelic expression ratios are consistent in RNA sequencing data before and after hybrid selection (HS). d, e, Allelic expression ratios as measured by two independent methods for Blustr (d) and Sfmbt2 (e) expression in 15 clonal cell lines containing genetic modifications in the Blustr locus. Each dot represents the mean of two ddPCR technical replicates (x axis) and the value from one RNA-seq technical replicate (y axis). f, Example locus showing hybrid selection strategy and RNA-seq coverage for cell lines with the indicated genotype for deletion of the Bendr promoter. The y axis scales represent normalized read counts and are the same for all hybrid selection tracks. The absolute level of expression for any given gene varies among clonal cell lines; throughout this work, we instead consider the relative level of expression between the two alleles in heterozygous knockout cells. For similar plots of each gene studied, see http://pubs.broadinstitute.org/neighboring-genes/.
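
The allelic expression ratio defined in panel b, and the thresholds quoted for panel c, can be written as a short calculation. This is a sketch under stated assumptions: the function names, the read counts, and the way the two thresholds are combined are hypothetical and are not taken from the authors' pipeline.

```python
def allelic_ratio(reads_129, reads_cast):
    """Informative reads on the 129S1 allele divided by reads on either allele."""
    total = reads_129 + reads_cast
    if total == 0:
        raise ValueError("no allele-informative reads")
    return reads_129 / total


def passes_filter(rpkm, informative_reads, min_rpkm=2.0, min_reads=100):
    """Panel c thresholds: RPKM >= 2 and more than 100 allele-informative reads."""
    return rpkm >= min_rpkm and informative_reads > min_reads


# Hypothetical counts for one gene (not data from the paper).
ratio = allelic_ratio(reads_129=60, reads_cast=40)
print(ratio, passes_filter(rpkm=5.0, informative_reads=100))
```

A ratio near 0.5 indicates balanced expression from the two alleles. In a heterozygous knockout clone, a shift away from the ratio seen in unmodified clones is the signal of a cis effect, which is why the caption stresses relative rather than absolute expression.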

Extended Data Figure 3 Read-through transcription at Meg3 and Snhg3 loci.

a, Snhg3 promoter knockout reduces the levels of Rcc1 mRNA by 23%. However, sequencing of chromatin-associated RNA shows that transcription continues past the annotated 3′ end of Snhg3 into the downstream Rcc1 gene (see Methods). This read-through transcription creates a fusion transcript containing exons of both Snhg3 and Rcc1, as well as intergenic RNA. We note that this fusion transcript is also annotated in the syntenic human locus as an alternative isoform of RCC1. Bars, relative poly(A)+ RNA expression on modified versus unmodified alleles. Error bars, 95% confidence interval for the mean (n ≥ 2 alleles, see Supplementary Table 1). b, Meg3 promoter knockout eliminates the expression not only of Meg3 but also of two additional lncRNAs encoded downstream in a tandem orientation (Rian and Mirg). Although these three lncRNAs are annotated as separate genes, they appear to be derived from a single transcript driven by the Meg3 promoter. This is consistent with the presence of continuous chromatin-associated RNA throughout the locus and a lack of CAGE reads at the 5′ ends of Rian and Mirg3.

Extended Data Figure 4 Promoter knockouts for five intergenic lncRNAs affect the expression of a neighbouring gene.

Significance (z-score) of allele-specific expression ratios at all genes within 1 Mb of each of five lncRNA loci. Each dot represents a different heterozygous promoter knockout clone for a given gene. Dots are shown only for genes that are sufficiently highly expressed to assess allele-specific expression (see Methods). The y axis is capped at –10 to +10 standard deviations from the mean. Black, knocked-out lncRNA; blue, gene with significant allele-specific change in gene expression (FDR < 10%). Independent clones are not expected to yield the same significance value (z-score), in part because read depth differs between samples.
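
To make the quantities in this caption concrete, the sketch below computes a z-score for one clone against a set of control clones, caps it at ±10 as on the plotted y axis, and applies a 10% false discovery rate cut-off. The actual statistical model is described in the Methods and is not reproduced here. The plain z-score against control clones and the Benjamini-Hochberg procedure are stand-in assumptions, and all names and numbers are hypothetical.

```python
import statistics


def z_score(clone_ratio, control_ratios):
    """Standard deviations separating one clone's ratio from the control mean."""
    mean = statistics.mean(control_ratios)
    sd = statistics.stdev(control_ratios)
    return (clone_ratio - mean) / sd


def cap(z, limit=10.0):
    """Clip a z-score to the plotted range of -limit to +limit."""
    return max(-limit, min(limit, z))


def benjamini_hochberg(p_values, fdr=0.10):
    """Return one boolean per test: True if it passes at the given FDR."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0  # largest rank whose p-value is under its threshold
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * fdr:
            k_max = rank
    passed = [False] * m
    for rank, i in enumerate(order, start=1):
        passed[i] = rank <= k_max
    return passed


# Hypothetical allelic ratios: three control clones and one knockout clone.
z = z_score(0.30, [0.48, 0.50, 0.52])
print(cap(z), cap(-25.0))
print(benjamini_hochberg([0.001, 0.04, 0.5, 0.03]))
```

Because a z-score of this kind depends on the spread of the measurements, two clones with the same underlying effect can yield different values when read depth differs, which is the point made in the final sentence of the caption.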

Extended Data Figure 5 Promoter knockouts for four mRNAs affect the expression of a neighbouring gene.

Significance (z-score) of allele-specific expression ratios at all genes within 1 Mb of each of four mRNA loci. Each dot represents a different heterozygous promoter knockout clone for a given gene. Dots are shown only for genes that are sufficiently highly expressed to assess allele-specific expression (see Methods). The y axis is capped at –10 to +10 standard deviations from the mean. Black, knocked-out mRNA; blue, gene with significant allele-specific change in gene expression (FDR < 10%). Independent clones are not expected to yield the same significance value (z-score), in part because read depth differs between samples.

Extended Data Figure 6 Dissecting mechanisms for how gene loci regulate a neighbour.

a, Three categories of possible mechanisms by which a gene locus might regulate the expression of a neighbour. b, We used two strategies to insert polyadenylation signals (pAS) downstream of gene promoters. In the first strategy, we inserted a 49-bp synthetic pAS (spA) using a single-stranded DNA oligo with 75-bp homology arms (see Methods). c, In the second pAS insertion strategy, we cloned a donor plasmid containing a selection cassette and three different pAS sequences (see Methods). Homology arms of 300–800 bp were used to integrate the cassette. After isolating clones with successful insertions, we used a second round of transfections to remove the selection cassette, leaving behind three tandem pASs. EFS, elongation factor 1 promoter; Puro, puromycin resistance gene (pac); HSV-tk, herpes simplex virus thymidine kinase.

Extended Data Figure 7 Promoters of lncRNAs and mRNAs have enhancer-like functions.

a, Allele-specific GRO-seq signal for clones with the indicated modifications at the Bendr locus. Only reads specifically mapping to one of the two alleles are shown. The y axis scale represents normalized read count and is the same for all tracks. b, Allele-specific poly(A)+ RNA expression for genetic modifications at the linc1405, Snhg17, Gpr19, and Slc30a9 loci. Bars, average RNA expression on modified compared to unmodified (wild-type) alleles. Error bars, 95% confidence intervals for the mean (n ≥ 2 alleles, see Supplementary Table 1). Grey arrows indicate distance from the targeted locus promoter to the affected neighbouring gene. We note that, based on their location, the Snhg17 and Gpr19 pAS insertions probably allow more substantial splicing and transcription; for these loci, it is clear that the majority of the transcript is dispensable but it is possible that transcription close to the promoter may be involved in the cis-regulatory function. c, Presence (grey) or absence (white) of various chromatin marks and transcription factors in mES cells in a 1.5-kb window centred on the TSS of each targeted gene. d, Distance from each knocked-out gene to its neighbouring target gene (x axis) versus the magnitude of the effect on the expression of the neighbouring gene (per cent compared to wild-type, y axis). Blue genes represent those discussed in main text; grey genes are discussed in Supplementary Note 5. e, Proximity-based contacts between the linc1405 and Eomes loci. The y axis shows enrichment in a sequencing-based proximity assay in which we used antisense oligos to capture linc1405 DNA and any interacting, cross-linked proximal DNA (see Methods). TAD annotations are derived from Hi-C experiments in mES cells (see Methods). Blue arrow, focal contact between the linc1405 and Eomes loci.

Extended Data Figure 8 Characterization of genetic modifications in the Blustr locus.

a, Allele-specific GRO-seq signal for clones with the indicated modifications at the Blustr locus. Only reads specifically mapping to one of the two alleles are shown. The y axis scale represents normalized read count and is the same for all tracks, and is magnified five times at the indicated location to better visualize the reads in the Sfmbt2 locus. b, Quantification of allele-specific GRO-seq signal in the Sfmbt2 locus on alleles modified as indicated. TSS, region including the two alternative TSSs of Sfmbt2 and 2 kb downstream; gene body, region containing the remainder of the Sfmbt2 gene locus; pause index, ratio of TSS to gene body. Dashed grey lines indicate the 95% confidence intervals for the mean of eight wild-type clones. Bars, n = 8 for wild-type and n = 1 for others. c, Schematic of the 5′ end of the Blustr locus and genotypes of two knockout clones. The 5′ splice site is located 78 bp downstream of the Blustr transcription start site (in this panel, Blustr is transcribed from left to right). One of the alleles from the two clones contains insertion of the oligo mediated by homologous recombination; the remaining three alleles contain insertions or deletions resulting from non-homologous end joining repair of sgRNA-mediated double-strand breaks, some of which also disrupt the 5′ splice site. Bar plots show allele-specific RNA expression for knockout clones and control clones (n = 18 for +/+, 1 for others). Error bars, 95% confidence interval for the mean. d, Schematic of the observed splice structures of Blustr RNA transcripts in poly(A)+ RNA sequencing of the exon deletion clones. Each deletion removes a region including ~50–200 bp on either side of the exon, thereby removing both the exon and its splice sites. The Exon 4 deletion removes the endogenous pAS, leading to new isoforms of the lncRNA transcript that splice into two cryptic splice acceptors downstream. e, GRO-seq, H3K4me3 ChIP–seq, and chromatin accessibility (ATAC-seq FPKM) at the Blustr and Sfmbt2 promoters in cell lines with the indicated genotypes. Deletion of the first 5′ splice site leads to a significant reduction in H3K4me3, RNA polymerase occupancy, and chromatin accessibility at the Blustr promoter, as well as H3K4me3 and RNA polymerase occupancy (but not accessibility) at the Sfmbt2 promoter. f, H3K27me3 ChIP–seq at the Blustr and Sfmbt2 loci in cell lines with the indicated genotypes. Deletion of the Blustr promoter or 5′ splice site leads to spreading of the repression-associated H3K27me3 modification across a ~30 kb region.
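
The pause index in panel b is defined as the ratio of TSS signal to gene-body signal. A minimal sketch of that ratio follows. The caption does not say whether raw counts or length-normalized densities are compared, so the use of reads per kilobase here is an assumption, as are the function names and example numbers.

```python
def read_density(read_count, region_length_bp):
    """Reads per kilobase of region (assumed normalization, see lead-in)."""
    if region_length_bp <= 0:
        raise ValueError("region length must be positive")
    return read_count / (region_length_bp / 1_000)


def pause_index(tss_density, gene_body_density):
    """Ratio of signal in the TSS region to signal in the gene body."""
    if gene_body_density == 0:
        raise ValueError("gene body has no signal; ratio is undefined")
    return tss_density / gene_body_density


# Hypothetical counts: a 2-kb TSS region and a 50-kb gene body.
tss = read_density(200, 2_000)
body = read_density(500, 50_000)
print(pause_index(tss, body))
```

A higher pause index means that more polymerase signal sits near the start site relative to the rest of the gene, so comparing the index across genotypes separates changes in initiation or pausing from changes in elongation through the gene body.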

Extended Data Figure 9 Mechanisms for cross-talk between neighbouring lncRNAs and mRNAs.

Proposed mechanisms based on pAS insertion experiments and other genetic manipulations (see text). For proposed mechanisms of lncRNAs marked with daggers see Supplementary Note 5.

Extended Data Figure 10 Classification of lncRNAs based on conservation and promoter location.

a, Classification of 307 lncRNAs expressed in mES cells. ‘Conserved’ transcripts are those that show significant evidence of cap analysis of gene expression (CAGE) data and/or poly(A)+ RNA in syntenic loci (see Methods). Divergent, initiating within 500 bp of an mRNA TSS, on the opposite strand; ERV, endogenous retroviral repetitive element (see Supplementary Note 9). Box plot shows sequence-level conservation of the promoters of subsets of lncRNAs expressed in mES cells. Random intergenic regions are matched to lncRNA promoters by GC content. Positive SiPhy score indicates evolutionary constraint on functional sequences. Orange category corresponds to mouse-specific lncRNAs that appear to have evolved from ancestral regulatory elements (REs) and correspond to sequences that show evidence for DNase I hypersensitivity in human embryonic stem cells. Significance is calculated compared to random intergenic regions using a Mann–Whitney U-test. ***P < 0.001. Box represents first and third quartiles; centre line represents median; whiskers represent data within 1.5× the interquartile range. b, Chromatin and RNA data for 11 mouse-specific lncRNAs that appear to have evolved from ancestral regulatory elements. In mouse, these elements show evidence for CAGE, H3K4me3, and DNase I hypersensitivity, consistent with their roles as promoters. The syntenic sequences in human do not show evidence for CAGE but nonetheless are DNase I hypersensitive and are frequently marked by H3K4me1 and/or CTCF. c, Model for evolution of lncRNAs from pre-existing enhancers, which often initiate weak bidirectional transcription. Spliced transcripts may neutrally appear through the appearance of splice signals and loss of polyadenylation signals. In some cases, transcription, splicing, or other RNA processing mechanisms may feed back and contribute to the cis-regulatory function of the promoter, producing a lncRNA as a by-product.
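
The box plot conventions and the Mann–Whitney U-test named in panel a can be illustrated with a small standard-library sketch. It computes only the U statistic, not a P value, which would normally come from a statistics library. The quartile convention, the function names, and the example values are assumptions and do not come from the study.

```python
import statistics


def mann_whitney_u(x, y):
    """U statistic for x versus y: pairs where x wins, with ties counted as 0.5."""
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


def box_summary(values):
    """Quartiles, median, and whiskers limited to 1.5x the interquartile range."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    whisker_low = min(v for v in data if v >= q1 - 1.5 * iqr)
    whisker_high = max(v for v in data if v <= q3 + 1.5 * iqr)
    return {"q1": q1, "median": median, "q3": q3,
            "whisker_low": whisker_low, "whisker_high": whisker_high}


# Hypothetical conservation scores for two groups of promoters.
print(mann_whitney_u([3, 4, 5], [1, 2, 3]))
print(box_summary([1, 2, 3, 4, 5, 20]))
```

The U statistic counts how often a value from one group exceeds a value from the other, so it compares the two distributions by rank and does not assume that the scores are normally distributed.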

Supplementary information

Supplementary Information

This file contains Supplementary Notes 1-9 and additional references. (PDF 418 kb)

Supplementary Table 1

This file contains Supplementary Table 1. (XLSX 51 kb)

Supplementary Table 2

This file contains Supplementary Table 2. (XLSX 66 kb)

Supplementary Table 3

This file contains Supplementary Table 3. (XLSX 1473 kb)

About this article

Cite this article

Engreitz, J., Haines, J., Perez, E. et al. Local regulation of gene expression by lncRNA promoters, transcription and splicing. Nature 539, 452–455 (2016). https://doi.org/10.1038/nature20149

  • DOI: https://doi.org/10.1038/nature20149
