Compound Property Prediction Based on Multiple Different Molecular Features and Ensemble Learning

  • Conference paper
CCKS 2022 - Evaluation Track (CCKS 2022)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1711)

Abstract

The China Conference on Knowledge Graph and Semantic Computing (CCKS 2022) proposed a task on chemical element knowledge graph construction and compound property prediction.

For this task, we generated vector representations of chemical molecules in two ways: from molecular descriptors and pharmacophore fingerprints, and from unsupervised training on large-scale chemical molecular data. We then compared how the representations produced by the different methods perform in molecular property prediction. The representations were concatenated and fed into an ensemble model for prediction. The approach achieved a score of 0.8985 on the test dataset and won first place.
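The concatenate-then-ensemble step described above can be illustrated with a minimal sketch. The abstract does not say which base models were used or how their outputs were combined, so everything here is an assumption: the helper names `concat_features` and `soft_vote`, the toy feature values, and the three stand-in models are hypothetical, and simple averaging is only one possible way to combine predictions.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def concat_features(*blocks: Sequence[float]) -> Vector:
    """Concatenate per-molecule feature blocks into one flat vector."""
    out: Vector = []
    for block in blocks:
        out.extend(float(x) for x in block)
    return out


def soft_vote(models: Sequence[Callable[[Vector], float]], x: Vector) -> float:
    """Average the scores of the base models (an assumed combination rule)."""
    if not models:
        raise ValueError("need at least one base model")
    return sum(m(x) for m in models) / len(models)


# Hypothetical feature blocks for a single molecule; real values would come
# from a descriptor calculator, a fingerprint generator, and a pretrained model.
descriptors = [0.5, 1.0]        # stand-in for molecular descriptors
pharmacophore_fp = [1, 0, 1]    # stand-in for pharmacophore fingerprint bits
pretrained_emb = [0.25, 0.75]   # stand-in for an unsupervised embedding

features = concat_features(descriptors, pharmacophore_fp, pretrained_emb)


# Toy base models that map a feature vector to a score in [0, 1].
def model_a(x: Vector) -> float:
    return min(1.0, sum(x) / len(x))


def model_b(x: Vector) -> float:
    return x[0]


def model_c(x: Vector) -> float:
    return 1.0 if x[2] > 0.5 else 0.0


score = soft_vote([model_a, model_b, model_c], features)
print(len(features), score)
```

In practice the stand-in models would be replaced by trained classifiers or regressors, and the averaging rule could be swapped for weighted voting or stacking without changing the concatenation step.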

Author information

Corresponding author

Correspondence to Wenming Yang.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Yang, W., Zou, J., Yin, L. (2022). Compound Property Prediction Based on Multiple Different Molecular Features and Ensemble Learning. In: Zhang, N., Wang, M., Wu, T., Hu, W., Deng, S. (eds) CCKS 2022 - Evaluation Track. CCKS 2022. Communications in Computer and Information Science, vol 1711. Springer, Singapore. https://doi.org/10.1007/978-981-19-8300-9_7

  • DOI: https://doi.org/10.1007/978-981-19-8300-9_7

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-19-8299-6

  • Online ISBN: 978-981-19-8300-9

  • eBook Packages: Computer Science, Computer Science (R0)
