Abstract
Hot spot residues play a crucial role in protein-protein interactions, and identifying them is conducive to drug discovery and rational drug design. Only a few amino acid residues contribute most of the binding free energy at a protein interface; these residues are called hot spots. This work predicts hot spot residues with an ensemble machine learning method, Gradient Boosting Decision Tree (GBDT), on the Alanine Scanning Energetics Database (ASEdb) and the Structural Kinetic and Energetic database of Mutant Protein Interactions (SKEMPI). Using properties of each amino acid and of the protein complex chain in which it lies, we design a greedy procedure that, in every iteration, discards the feature ranked least important by GBDT and continues until the last feature has been discarded. In the comparison of results, the greedy GBDT method gives a better prediction of hot spot residues; the F-score, one of the evaluation criteria, reaches 0.808 on the ASEdb dataset.
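The greedy elimination described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The feature names, the `toy_rank` and `toy_evaluate` stand-ins, and the choice to stop when one feature remains are all assumptions made for the example. In practice, `rank` would presumably return the importances of a GBDT model fitted on the current feature subset, and `evaluate` would return a cross-validated F-score.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def f_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F-score (harmonic mean of precision and recall) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def greedy_elimination(
    features: List[str],
    rank: Callable[[List[str]], Dict[str, float]],
    evaluate: Callable[[List[str]], float],
) -> Tuple[List[str], float]:
    """Repeatedly drop the least important feature; keep the best subset seen.

    rank(subset) returns an importance per feature name (assumed to come
    from a model refitted on that subset); evaluate(subset) returns a
    score where higher is better.
    """
    current = list(features)
    best_subset, best_score = list(current), evaluate(current)
    while len(current) > 1:
        importances = rank(current)
        weakest = min(current, key=lambda name: importances[name])
        current.remove(weakest)
        score = evaluate(current)
        if score > best_score:
            best_subset, best_score = list(current), score
    return best_subset, best_score


# Hypothetical feature names and fixed importances, for illustration only.
TOY_IMPORTANCE = {
    "rel_asa": 0.50,
    "depth_index": 0.30,
    "protrusion": 0.15,
    "noise_a": 0.03,
    "noise_b": 0.02,
}
USEFUL = {"rel_asa", "depth_index", "protrusion"}


def toy_rank(subset: List[str]) -> Dict[str, float]:
    # Stand-in for importances taken from a fitted GBDT model.
    return {name: TOY_IMPORTANCE[name] for name in subset}


def toy_evaluate(subset: List[str]) -> float:
    # Stand-in for a cross-validated F-score: rewards useful features,
    # penalises uninformative ones.
    useful = sum(1 for name in subset if name in USEFUL)
    noise = len(subset) - useful
    return useful / 3 - 0.05 * noise


if __name__ == "__main__":
    subset, score = greedy_elimination(list(TOY_IMPORTANCE), toy_rank, toy_evaluate)
    print(subset, score)
```

Keeping the best-scoring subset, rather than the final one, is a design choice of this sketch: the loop runs until a single feature remains, so the last subset is rarely the most useful one.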
Acknowledgment
The authors thank the members of the Machine Learning and Artificial Intelligence Laboratory, School of Computer Science and Technology, Wuhan University of Science and Technology, for helpful discussions in seminars. This work is supported by the National Natural Science Foundation of China (No. 61702385).
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Gan, H., Hu, J., Zhang, X., Huang, Q., Zhao, J. (2018). Accurate Prediction of Hot Spots with Greedy Gradient Boosting Decision Tree. In: Huang, DS., Jo, KH., Zhang, XL. (eds) Intelligent Computing Theories and Application. ICIC 2018. Lecture Notes in Computer Science, vol 10955. Springer, Cham. https://doi.org/10.1007/978-3-319-95933-7_43
DOI: https://doi.org/10.1007/978-3-319-95933-7_43
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-95932-0
Online ISBN: 978-3-319-95933-7
eBook Packages: Computer Science, Computer Science (R0)