arXiv:2306.10066 (physics)

[Submitted on 15 Jun 2023]

Title: On the Interplay of Subset Selection and Informed Graph Neural Networks

Authors: Niklas Breustedt, Paolo Climaco, Jochen Garcke, Jan Hamaekers, Gitta Kutyniok, Dirk A. Lorenz, Rick Oerder, Chirag Varun Shukla

Abstract: Machine learning techniques paired with the availability of massive datasets dramatically enhance our ability to explore the chemical compound space by providing fast and accurate predictions of molecular properties. However, learning on large datasets is strongly limited by the availability of computational resources and can be infeasible in some scenarios. Moreover, the instances in the datasets may not yet be labelled, and generating the labels can be costly, as in the case of quantum chemistry computations. Thus, there is a need to select small training subsets from large pools of unlabelled data points and to develop reliable ML methods that can effectively learn from small training sets. This work focuses on predicting the atomization energy of the molecules in the QM9 dataset. We investigate the advantages of combining domain-knowledge-based data sampling methods for efficient training set selection with informed ML techniques. In particular, we show how maximizing molecular diversity in the training set selection process increases the robustness of linear and nonlinear regression techniques such as kernel methods and graph neural networks. We also check the reliability of the predictions made by the graph neural network with a model-agnostic explainer based on the rate-distortion explanation framework.
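The abstract does not say which sampling scheme is used to maximize molecular diversity. As a rough illustration of the general idea only, the sketch below uses greedy farthest point sampling, a common max-min diversity heuristic, on toy 2-D points. The function name `farthest_point_sampling`, the `descriptors` list, and the use of Euclidean distance are all assumptions made for this example and are not taken from the paper; real molecular descriptors would be higher-dimensional.

```python
import math


def farthest_point_sampling(points, k, start=0):
    """Greedily pick k indices so that each new point is as far as possible
    from the points already selected (a max-min diversity heuristic).

    Hypothetical helper for illustration; not the authors' implementation.
    """
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    selected = [start]
    # min_dist[i] = distance from point i to its nearest selected point
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(selected) < k:
        # The next pick is the point farthest from the current selection.
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        selected.append(nxt)
        # Update nearest-selected distances with the newly added point.
        for i, p in enumerate(points):
            d = math.dist(p, points[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return selected


# Toy stand-ins for molecular descriptors (assumed data):
# two tight clusters plus one outlier.
descriptors = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
    (5.0, 5.0), (5.1, 5.0),
    (10.0, 0.0),
]
subset = farthest_point_sampling(descriptors, k=3)
print(subset)  # one index from each region: [0, 5, 3]
```

With a budget of three points, the selection takes one representative from each region instead of three near-duplicates from the first cluster, which is the kind of coverage a diversity-driven training set selection aims for.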

Subjects:	Chemical Physics (physics.chem-ph); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2306.10066 [physics.chem-ph]
	(or arXiv:2306.10066v1 [physics.chem-ph] for this version)
	https://doi.org/10.48550/arXiv.2306.10066

Submission history

From: Chirag Varun Shukla
[v1] Thu, 15 Jun 2023 09:09:27 UTC (571 KB)
