Exploring Target Identification for Drug Design with K-Nearest Neighbors’ Algorithm

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14126)

Included in the following conference series:

International Conference on Artificial Intelligence and Soft Computing

Abstract

Identifying the possible targets of a known compound solely from its molecular representation is one of the most important tasks in drug design and development. This work proposes a methodology for target identification using supervised machine learning. To predict drug binding targets, classification models across targets were built with the k-nearest neighbors (k-NN) algorithm, integrating multiple data types. Two groups of descriptors were used: (1) Morgan fingerprints and (2) general molecular properties of interest. The k-NN classification models achieved a higher F1-score with the descriptors based on molecular properties of interest (0.7) than with the Morgan fingerprint descriptors (0.57) or with the fusion of both descriptor groups (0.58).
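As a rough illustration of the approach summarized in the abstract, the following minimal sketch classifies compounds into targets with a k-NN majority vote over fingerprint-like bit sets and then scores the predictions with a macro-averaged F1. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy bit sets stand in for real Morgan fingerprints, the target labels are hypothetical, and the Tanimoto distance, k = 3, and macro averaging are common choices rather than the authors' reported settings.

```python
from collections import Counter


def tanimoto_distance(a, b):
    """Return 1 - |a & b| / |a | b| for two sets of on-bit indices."""
    union = len(a | b)
    if union == 0:
        return 0.0  # two empty fingerprints are treated as identical
    return 1.0 - len(a & b) / union


def knn_predict(train, query, k=3):
    """Predict a target label by majority vote among the k nearest neighbors.

    train is a list of (fingerprint_set, target_label) pairs.
    """
    neighbors = sorted(train, key=lambda item: tanimoto_distance(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: sets of on-bit indices standing in for Morgan fingerprints.
train = [
    ({1, 2, 3, 4}, "kinase"),
    ({1, 2, 3, 5}, "kinase"),
    ({2, 3, 4, 6}, "kinase"),
    ({10, 11, 12}, "GPCR"),
    ({10, 11, 13}, "GPCR"),
    ({11, 12, 14}, "GPCR"),
]
test = [
    ({1, 2, 3}, "kinase"),
    ({10, 11, 12, 13}, "GPCR"),
    ({2, 3, 4, 5}, "kinase"),
]

y_true = [label for _, label in test]
y_pred = [knn_predict(train, fp, k=3) for fp, _ in test]
print("macro F1:", macro_f1(y_true, y_pred))
```

In a real pipeline the bit sets would come from a cheminformatics toolkit and the property-based descriptors would be numeric vectors compared with a different distance (for example Euclidean after scaling); comparing the F1 obtained with each descriptor group is the kind of experiment the abstract reports.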

Author information

Authors and Affiliations

Bio-Cheminformatics Research Group, Universidad de Las Américas, Quito, 170504, Ecuador
Karina Jimenes-Vargas, Yunierkis Perez-Castillo & Eduardo Tejera
RNASA Group, Department of Computer Science and Information Technologies, Computer Science Faculty, CITIC, University of A Coruna, 15071, A Coruña, Spain
Karina Jimenes-Vargas & Cristian R. Munteanu

Authors

Karina Jimenes-Vargas
Yunierkis Perez-Castillo
Eduardo Tejera
Cristian R. Munteanu

Corresponding authors

Correspondence to Karina Jimenes-Vargas or Eduardo Tejera.

Editor information

Editors and Affiliations

Systems Research Institute of the Polish Academy of Sciences, Warsaw, Poland
Leszek Rutkowski
Częstochowa University of Technology, Częstochowa, Poland
Rafał Scherer
Częstochowa University of Technology, Częstochowa, Poland
Marcin Korytkowski
University of Alberta, Edmonton, AB, Canada
Witold Pedrycz
AGH University of Krakow, Kraków, Poland
Ryszard Tadeusiewicz
University of Louisville, Louisville, KY, USA
Jacek M. Zurada

About this paper

Cite this paper

Jimenes-Vargas, K., Perez-Castillo, Y., Tejera, E., Munteanu, C.R. (2023). Exploring Target Identification for Drug Design with K-Nearest Neighbors’ Algorithm. In: Rutkowski, L., Scherer, R., Korytkowski, M., Pedrycz, W., Tadeusiewicz, R., Zurada, J.M. (eds) Artificial Intelligence and Soft Computing. ICAISC 2023. Lecture Notes in Computer Science, vol 14126. Springer, Cham. https://doi.org/10.1007/978-3-031-42508-0_20

DOI: https://doi.org/10.1007/978-3-031-42508-0_20
Published: 14 September 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-42507-3
Online ISBN: 978-3-031-42508-0
eBook Packages: Computer Science, Computer Science (R0)