Towards Recovering Allele-Specific Cancer Genome Graphs
Pages 224 - 240
Abstract
Integrated analysis of structural variants (SVs) and copy number alterations (CNAs) in aneuploid cancer genomes is key to understanding tumor genome complexity. A recently developed algorithm, Weaver, can estimate, for the first time, the allele-specific copy number of SVs and their interconnectivity in aneuploid cancer genomes. However, one major limitation is that not all SVs identified by Weaver are phased. In this paper, we develop a general convex programming framework that predicts the interconnectivity of unphased SVs, taking possibly noisy allele-specific copy number estimates as input. We demonstrate through applications to both simulated data and HeLa whole-genome sequencing data that our method is robust to noise in the input copy numbers and can predict SV phasings with high specificity. We find that our method makes predictions consistent with those of Weaver even when a large proportion of the input variants are unphased. We also apply our method to TCGA ovarian cancer whole-genome sequencing samples to phase the SVs that Weaver left unphased. Our work provides an important new algorithmic framework for recovering more complete allele-specific cancer genome graphs.
Publisher
Springer-Verlag
Berlin, Heidelberg
Publication History
Published: 03 May 2017