As machine learning models are becoming mainstream tools for molecular and materials research, there is an urgent need to improve the nature, quality, and accessibility of atomistic data. In turn, there are opportunities for a new generation of generally applicable datasets and distillable models.
Acknowledgements
We thank Z. Faure Beaulieu for useful discussions. J.L.A.G. acknowledges a UKRI Linacre - The EPA Cephalosporin Scholarship, support from an EPSRC DTP award (grant no. EP/T517811/1), and from the Department of Chemistry, University of Oxford. V.L.D. acknowledges a UK Research and Innovation Frontier Research grant (grant no. EP/X016188/1).
Author information
Contributions
All authors contributed to the writing of this Comment.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Computational Science thanks Ekin Cubuk and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
About this article
Cite this article
Ben Mahmoud, C., Gardner, J.L.A. & Deringer, V.L. Data as the next challenge in atomistic machine learning. Nat Comput Sci 4, 384–387 (2024). https://doi.org/10.1038/s43588-024-00636-1
DOI: https://doi.org/10.1038/s43588-024-00636-1