Abstract
Protein-protein interactions are essential to most biological processes. Many computational methods have been developed to predict them from proteins’ sequence, structural and genomic data. This motivated us to carry out a comparative study of several machine learning methods, training them on a set of known protein-protein interactions described by the proteins’ global and local attributes. The classifiers were evaluated through cross-validation, and several performance measures were computed. The results show that the support vector machine outperformed the other classifiers. This finding was confirmed by a statistical test, the Wilcoxon rank sum test, at the 5% significance level.
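The final comparison step can be illustrated with a minimal sketch of a two-sided Wilcoxon rank sum test applied to per-fold scores of two classifiers. This is not the chapter's code: the function names, the normal approximation with a tie correction, and the accuracy values below are all assumptions made for illustration, and the numbers are invented rather than taken from the study.

```python
import math
from collections import Counter


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum test (normal approximation, tie-corrected).

    Returns (W, z, p), where W is the sum of the ranks of sample x.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    w = sum(ranks[:n1])
    mean_w = n1 * (n + 1) / 2
    # Tie correction: sum of (t^3 - t) over groups of t tied values.
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    var_w = n1 * n2 / 12 * ((n + 1) - ties / (n * (n - 1)))
    z = (w - mean_w) / math.sqrt(var_w)
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return w, z, p


# Hypothetical 10-fold accuracies, invented for this example only.
svm_scores = [0.91, 0.93, 0.92, 0.94, 0.90, 0.95, 0.92, 0.93, 0.94, 0.91]
rf_scores = [0.86, 0.88, 0.85, 0.87, 0.89, 0.84, 0.88, 0.86, 0.87, 0.85]

w, z, p = rank_sum_test(svm_scores, rf_scores)
print(f"W = {w}, z = {z:.2f}, p = {p:.5f}")
print("significant at the 5% level" if p < 0.05 else "not significant at the 5% level")
```

With only a handful of folds per classifier the normal approximation is rough, so an exact test from a statistics library may be preferable in practice.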
An erratum to this chapter can be found at http://dx.doi.org/10.1007/978-3-319-02309-0_71
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Saha, I. et al. (2014). Evaluation of Machine Learning Algorithms on Protein-Protein Interactions. In: Gruca, D., Czachórski, T., Kozielski, S. (eds) Man-Machine Interactions 3. Advances in Intelligent Systems and Computing, vol 242. Springer, Cham. https://doi.org/10.1007/978-3-319-02309-0_22
DOI: https://doi.org/10.1007/978-3-319-02309-0_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-02308-3
Online ISBN: 978-3-319-02309-0
eBook Packages: Engineering, Engineering (R0)