Abstract
Differential analysis of gene and transcript expression using high-throughput RNA sequencing (RNA-seq) is complicated by several sources of measurement variability and poses numerous statistical challenges. We present Cuffdiff 2, an algorithm that estimates expression at transcript-level resolution and controls for variability evident across replicate libraries. Cuffdiff 2 robustly identifies differentially expressed transcripts and genes and reveals differential splicing and promoter-preference changes. We demonstrate the accuracy of our approach through differential analysis of lung fibroblasts in response to loss of the developmental transcription factor HOXA1, which we show is required for lung fibroblast and HeLa cell cycle progression. Loss of HOXA1 results in significant expression level changes in thousands of individual transcripts, along with isoform switching events in key regulators of the cell cycle. Cuffdiff 2 performs robust differential analysis in RNA-seq experiments at transcript resolution, revealing a layer of regulation not readily observable with other high-throughput technologies.
Acknowledgements
We are grateful to D. Kelley for a careful reading of the manuscript, and B. Wold for sharing the hESC RNA-seq data. We are also thankful for the ongoing development efforts of A. Roberts, B. Langmead, D. Kim, G. Pertea, H. Pimentel and S. Salzberg. C.T. and D.G.H. are Damon Runyon Postdoctoral Fellows. J.L.R. is a Damon Runyon-Rachleff Innovator Fellow. This work was supported by US National Institutes of Health grants DP2OD006670, P01GM099117, P50HG006193 and R01ES020260 (to J.L.R.) and R01 HG006129 and R01 DK094699 (to L.P.).
Author information
Contributions
C.T. and L.P. developed the mathematics and statistics. D.G.H. and M.S. performed the experiments. D.G.H. and C.T. designed the experiments and performed the analysis. C.T. and L.G. implemented the software. L.P., J.L.R., D.G.H. and C.T. conceived the research. All authors wrote and approved the manuscript.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–87 and Supplementary Tables 1–3 (PDF 21617 kb)
Cite this article
Trapnell, C., Hendrickson, D., Sauvageau, M. et al. Differential analysis of gene regulation at transcript resolution with RNA-seq. Nat Biotechnol 31, 46–53 (2013). https://doi.org/10.1038/nbt.2450
DOI: https://doi.org/10.1038/nbt.2450