DOI: 10.1145/3314367.3314383
research-article

CNN-SVR for CRISPR-Cpf1 Guide RNA Activity Prediction with Data Augmentation

Published: 07 January 2019

Abstract

CRISPR from Prevotella and Francisella 1 (Cpf1), an RNA-guided DNA endonuclease that belongs to a novel class II CRISPR system, has recently become a popular tool for genome editing. Improving the on-target efficiency and specificity of this system is an important and challenging problem. This paper presents a method for predicting CRISPR-Cpf1 guide RNA activity that combines a convolutional neural network (CNN) with support vector regression (SVR). In the proposed framework, a single-base substitution mutation data augmentation technique generates additional guide RNAs with indel frequencies, thereby increasing the amount of labeled data. In the hybrid CNN-SVR model, the CNN serves as a trainable feature extractor and the SVR as the regression operator. Specifically, a merged CNN-based regression model is first pre-trained to predict Cpf1 activity from target sequence composition. The SVR then generates the final predictions, taking chromatin accessibility information into account. Experiments on commonly used datasets show that our algorithm outperforms the available state-of-the-art tools.
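
The augmentation step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which positions are mutated or how labels are assigned to the generated guides, so the sketch assumes that each single-base variant inherits the indel frequency of its parent sequence and that the caller chooses which positions may be mutated. The function names (`single_base_substitutions`, `augment`, `one_hot`) are hypothetical, and the one-hot encoding is only a common way to feed sequences to a CNN, not something the abstract specifies.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

BASES = "ACGT"


def single_base_substitutions(
    seq: str, positions: Optional[Iterable[int]] = None
) -> Iterator[str]:
    """Yield every sequence that differs from `seq` by exactly one base.

    `positions` restricts which indices may be mutated (all by default).
    """
    seq = seq.upper()
    if positions is None:
        positions = range(len(seq))
    for i in positions:
        for base in BASES:
            if base != seq[i]:
                yield seq[:i] + base + seq[i + 1:]


def augment(
    dataset: List[Tuple[str, float]],
    positions: Optional[Iterable[int]] = None,
) -> List[Tuple[str, float]]:
    """Return the original (sequence, indel frequency) pairs plus variants."""
    allowed = None if positions is None else list(positions)
    augmented = list(dataset)
    for seq, indel_freq in dataset:
        for variant in single_base_substitutions(seq, allowed):
            # ASSUMPTION: the variant inherits its parent's label unchanged.
            augmented.append((variant, indel_freq))
    return augmented


def one_hot(seq: str) -> List[List[int]]:
    """Encode a DNA sequence as rows of four 0/1 values in A, C, G, T order."""
    return [[1 if base == b else 0 for b in BASES] for base in seq.upper()]
```

A sequence of length n yields 3n variants when every position is allowed. Passing `positions` keeps regions such as the PAM untouched, where a substitution would plausibly change activity and the inherited label would no longer be a reasonable approximation.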

    Published In

    ICBBB '19: Proceedings of the 2019 9th International Conference on Bioscience, Biochemistry and Bioinformatics
    January 2019
    115 pages
    ISBN:9781450366540
    DOI:10.1145/3314367
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    In-Cooperation

    • Natl University of Singapore: National University of Singapore

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. CRISPR-Cpf1
    2. convolutional neural network (CNN)
    3. data augmentation
    4. on-target
    5. support vector regression (SVR)

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    ICBBB '19
