Quantitative Biology > Biomolecules

arXiv:2405.10343 (q-bio)

[Submitted on 15 May 2024]

Title: UniCorn: A Unified Contrastive Learning Approach for Multi-view Molecular Representation Learning

Authors: Shikun Feng, Yuyan Ni, Minghao Li, Yanwen Huang, Zhi-Ming Ma, Wei-Ying Ma, Yanyan Lan

Abstract: Recently, a noticeable trend has emerged toward developing pre-trained foundation models in computer vision (CV) and natural language processing (NLP). For molecular pre-training, however, no universal model yet applies effectively across the various categories of molecular tasks, since prevalent pre-training methods are each effective only for specific types of downstream tasks. Furthermore, the lack of a thorough understanding of existing pre-training methods, including 2D graph masking, 2D-3D contrastive learning, and 3D denoising, hampers the advancement of molecular foundation models. In this work, we provide a unified understanding of existing pre-training methods through the lens of contrastive learning. Under this view, their distinctions lie in which views of molecules they cluster, and each choice is shown to benefit specific downstream tasks. To achieve a complete and general-purpose molecular representation, we propose a novel pre-training framework, named UniCorn, that inherits the merits of the three methods and depicts molecular views at three different levels. State-of-the-art (SOTA) performance across quantum, physicochemical, and biological tasks, along with a comprehensive ablation study, validates the universality and effectiveness of UniCorn.

Subjects:	Biomolecules (q-bio.BM); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2405.10343 [q-bio.BM]
	(or arXiv:2405.10343v1 [q-bio.BM] for this version)
	https://doi.org/10.48550/arXiv.2405.10343

Submission history

From: Yuyan Ni
[v1] Wed, 15 May 2024 09:20:02 UTC (38,936 KB)
