DOI: 10.1145/3307339.3342181
short-paper

Copy Number Variation Detection Using Total Variation

Published: 04 September 2019

Abstract

Next-generation sequencing (NGS) technologies offer new opportunities for precise and accurate identification of genomic aberrations, including copy number variations (CNVs). For high-throughput NGS data, depth of coverage has become a major approach to identifying CNVs, especially in whole exome sequencing (WES) data. Because read-count data are noisy and biased and WES data are complex, existing CNV detection tools report many false CNV segments. In addition, NGS generates a huge amount of data, which calls for methods that are both effective and efficient. In this work, we propose a novel segmentation algorithm based on the total variation approach to detect CNVs more precisely and efficiently from WES data. The proposed method also filters out outlier read counts and identifies significant change points to reduce false positives. We evaluate the method on real and simulated data and compare it with other commonly used CNV detection methods. On both kinds of data, the proposed method outperforms the existing CNV detection methods in accuracy and false discovery rate, and it runs faster than the circular binary segmentation method.
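
The three steps named in the abstract (outlier filtering, total variation segmentation, change-point identification) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a Hampel-style median filter for the outlier step, which the abstract does not specify, and it solves the one-dimensional total variation problem with a simple projected-gradient iteration on the dual rather than the taut string algorithm listed in the author tags. The function names (`hampel_filter`, `tv_denoise`, `change_points`) and all parameters (`half_window`, `n_sigmas`, `lam`, `n_iter`, `tol`) are hypothetical.

```python
import numpy as np


def hampel_filter(y, half_window=3, n_sigmas=3.0):
    """Replace points far from the local median with that median.

    Assumed outlier step; the abstract does not say which filter is used.
    """
    y = np.asarray(y, dtype=float)
    out = y.copy()
    for i in range(len(y)):
        lo, hi = max(0, i - half_window), min(len(y), i + half_window + 1)
        window = y[lo:hi]
        med = np.median(window)
        # 1.4826 scales the MAD to estimate the standard deviation
        # under Gaussian noise.
        mad = 1.4826 * np.median(np.abs(window - med))
        if abs(y[i] - med) > n_sigmas * mad:
            out[i] = med
    return out


def tv_denoise(y, lam, n_iter=5000):
    """Approximately solve min_x 0.5*||y - x||^2 + lam * sum|x[i+1] - x[i]|.

    Uses projected gradient steps on the dual variable u (one entry per
    adjacent pair of samples), with D the first-difference operator.
    """
    y = np.asarray(y, dtype=float)
    u = np.zeros(len(y) - 1)
    for _ in range(n_iter):
        # Primal estimate x = y - D^T u, written with a zero-padded diff.
        x = y + np.diff(np.concatenate(([0.0], u, [0.0])))
        # Step 0.25 is safe: the largest eigenvalue of D D^T is below 4.
        u = np.clip(u + 0.25 * np.diff(x), -lam, lam)
    return y + np.diff(np.concatenate(([0.0], u, [0.0])))


def change_points(x, tol):
    """Indices where a new segment starts (jump between neighbors > tol)."""
    return np.flatnonzero(np.abs(np.diff(x)) > tol) + 1
```

A typical call chain would be `change_points(tv_denoise(hampel_filter(counts), lam), tol)`, where `counts` stands for a normalized read-count signal. The threshold `tol` is needed because an iterative solver leaves small nonzero differences inside a segment. Larger values of `lam` merge more segments and so trade sensitivity for fewer false change points.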

Cited By

  • (2021) Copy number variation detection using single cell sequencing data. Proceedings of the 12th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics, 1-6. DOI: 10.1145/3459930.3469556. Online publication date: 1-Aug-2021.

    Published In

    BCB '19: Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics
    September 2019
    716 pages
    ISBN:9781450366663
    DOI:10.1145/3307339
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. copy number variation
    2. next generation sequencing
    3. signal processing
    4. taut string
    5. total variation
    6. whole exome sequencing

    Qualifiers

    • Short-paper

    Funding Sources

    • National Institutes of Health

    Conference

    BCB '19

    Acceptance Rates

    BCB '19 Paper Acceptance Rate 42 of 157 submissions, 27%;
    Overall Acceptance Rate 254 of 885 submissions, 29%
