Perspective
Published: 08 December 2023

Integrating QSAR modelling and deep learning in drug discovery: the emergence of deep QSAR

Nature Reviews Drug Discovery volume 23, pages 141–155 (2024)

Abstract

Quantitative structure–activity relationship (QSAR) modelling, an approach that was introduced 60 years ago, is widely used in computer-aided drug design. In recent years, progress in artificial intelligence techniques such as deep learning, the rapid growth of databases of molecules for virtual screening, and dramatic improvements in computational power have supported the emergence of a new field of QSAR applications that we term ‘deep QSAR’. Marking a decade from the pioneering applications of deep QSAR to tasks involved in small-molecule drug discovery, we herein describe key advances in the field, including deep generative and reinforcement learning approaches in molecular design, deep learning models for synthetic planning and the application of deep QSAR models in structure-based virtual screening. We also reflect on the emergence of quantum computing, which promises to further accelerate deep QSAR applications, and on the need for open-source and democratized resources to support computer-aided drug design.

**Fig. 1: Contrasting traditional and deep QSAR models.**

**Fig. 2: Generative molecular design.**

**Fig. 4: Molecular simulations enhanced by deep learning potentials in the calculation of ligand binding affinity.**

Acknowledgements

The authors acknowledge support of their studies by the National Institutes of Health (grant R01GM140154) to A.T. and the National Science Foundation (grant CHE-2154447) to O.I.

Author information

Authors and Affiliations

University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Alexander Tropsha
Carnegie Mellon University, Pittsburgh, PA, USA
Olexandr Isayev
University of Strasbourg, Strasbourg, France
Alexandre Varnek
ETH, Zurich, Switzerland
Gisbert Schneider
University of British Columbia, Vancouver, BC, Canada
Artem Cherkasov
Photonic Inc., Coquitlam, BC, Canada
Artem Cherkasov

Authors

Alexander Tropsha
Olexandr Isayev
Alexandre Varnek
Gisbert Schneider
Artem Cherkasov

Corresponding authors

Correspondence to Alexander Tropsha or Artem Cherkasov.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Reviews Drug Discovery thanks Esben Jannik Bjerrum, Eric Martin and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Tropsha, A., Isayev, O., Varnek, A. et al. Integrating QSAR modelling and deep learning in drug discovery: the emergence of deep QSAR. Nat Rev Drug Discov 23, 141–155 (2024). https://doi.org/10.1038/s41573-023-00832-0

Accepted: 21 October 2023
Published: 08 December 2023
Issue Date: February 2024
DOI: https://doi.org/10.1038/s41573-023-00832-0

This article is cited by

QSPRpred: a Flexible Open-Source Quantitative Structure-Property Relationship Modelling Tool
- Helle W. van den Maagdenberg
- Martin Šícho
- Gerard J. P. van Westen
Journal of Cheminformatics (2024)
Understanding predictions of drug profiles using explainable machine learning models
- Caroline König
- Alfredo Vellido
BioData Mining (2024)
Utilizing machine learning-based QSAR model to overcome standalone consensus docking limitation in beta-lactamase inhibitors screening: a proof-of-concept study
- Thanet Pitakbut
- Jennifer Munkert
- Gregor Fuhrmann
BMC Chemistry (2024)
HBCVTr: an end-to-end transformer with a deep neural network hybrid model for anti-HBV and HCV activity predictor from SMILES
- Ittipat Meewan
- Jiraporn Panmanee
- Pichaya Lertvilai
Scientific Reports (2024)
Machine learning in preclinical drug discovery
- Denise B. Catacutan
- Jeremie Alexander
- Jonathan M. Stokes
Nature Chemical Biology (2024)
