Protein acetylation sites with complex-valued polynomial model

Wenzheng Bao¹ &
Bin Yang²

Abstract

Protein acetylation is the process of adding acetyl groups (CH3CO-) to lysine residues on protein chains. As one of the most common protein post-translational modifications, lysine acetylation plays an important role in different organisms. In this study, we developed a human-specific method that uses a cascade classifier of complex-valued polynomial models (CVPM), combined with sequence and structural feature descriptors, to address the imbalance between positive and negative samples. Complex-valued gene expression programming and differential evolution are used to search for the optimal CVPM model. We also carried out a systematic and comprehensive analysis of the acetylation data and the prediction results. The proposed method achieves 79.15% in specificity (S_p), 78.17% in sensitivity (S_n), 78.66% in accuracy (ACC), 78.76% in F1, and 0.5733 in Matthews correlation coefficient (MCC), outperforming other state-of-the-art methods.
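
The five measures reported in the abstract are standard functions of a binary confusion matrix. The following is a minimal sketch of how they are usually computed, not the authors' evaluation code. The function name `classification_metrics` and the counts in the usage lines are hypothetical, chosen only for illustration, and the sketch assumes both classes contain at least one sample.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Compute Sn, Sp, ACC, F1 and MCC from confusion-matrix counts.

    Assumes tp + fn > 0, tn + fp > 0 and tp + fp > 0 (illustrative sketch).
    """
    sn = tp / (tp + fn)                      # sensitivity (recall on positives)
    sp = tn / (tn + fp)                      # specificity (recall on negatives)
    acc = (tp + tn) / (tp + fp + tn + fn)    # overall accuracy
    precision = tp / (tp + fp)
    f1 = 2 * precision * sn / (precision + sn)
    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Sn": sn, "Sp": sp, "ACC": acc, "F1": f1, "MCC": mcc}


# Hypothetical counts for a balanced test set of 100 positives and 100 negatives.
metrics = classification_metrics(tp=80, fp=20, tn=80, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Unlike ACC, MCC uses all four cells of the confusion matrix, which is why it is commonly reported alongside accuracy when positive and negative samples are imbalanced.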

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant No. 61902337), the Xuzhou Science and Technology Plan Project (KC21047), the Jiangsu Provincial Natural Science Foundation (No. SBK2019040953), the Natural Science Fund for Colleges and Universities in Jiangsu Province (No. 19KJB520016), Young Talents of Science and Technology in Jiangsu, the Key Research Program of the Science Foundation of Shandong Province (ZR2020KE001), the talent project of “Qingtan Scholar” of Zaozhuang University, the PhD research startup foundation of Zaozhuang University (No. 2014BS13), and the Zaozhuang University Foundation (No. 2015YY02).

Author information

Authors and Affiliations

School of Information Engineering, Xuzhou University of Technology, Xuzhou, 221018, China
Wenzheng Bao
School of Information Science and Engineering, Zaozhuang University, Zaozhuang, 277160, China
Bin Yang

Corresponding author

Correspondence to Bin Yang.

Additional information

Wenzheng Bao received his PhD degree in Computer Science from Tongji University, China, in 2018. He is an associate professor and master’s supervisor at the School of Information Engineering, Xuzhou University of Technology, China. His research interests include bioinformatics and machine learning.

Bin Yang received his PhD degree in Computer Science from Shandong University, China, in 2014. He is a professor and master’s supervisor at the School of Information Science and Engineering, Zaozhuang University, China. His research interests include bioinformatics and machine learning.

Electronic supplementary material