Efficient Training and Implementation of Gaussian Process Potentials

Bibliographic Details
Main Author: Broad, Jack W.
Format: Thesis (University of Nottingham only)
Language:English
Published: 2022
Subjects: Machine Learning; Applied Mathematics; Machine-learned potentials; Gaussian processes
Online Access:https://eprints.nottingham.ac.uk/69868/
building Nottingham Research Data Repository
collection Online Access
description Molecular simulations are a powerful tool for translating information about the intermolecular interactions within a system into thermophysical properties via statistical mechanics. However, the accuracy of any simulation is limited by the potentials that model the microscopic interactions. Most first-principles methods are too computationally expensive to use at every time-step or cycle of a simulation, which typically requires thousands of energy evaluations. Meanwhile, cheaper semi-empirical potentials yield only qualitatively accurate simulations. Consequently, methods for efficient first-principles prediction in simulations are of interest. Machine-learned potentials (MLPs) have shown promise in this area, offering first-principles predictions at a fraction of the cost of an ab initio calculation. Of particular interest are Gaussian process (GP) potentials, which achieve accuracy equivalent to other MLPs with smaller training sets. They therefore offer the best route to exploiting information from expensive ab initio calculations, for which building a large data set is time-consuming. GP potentials, however, are among the most computationally intensive MLPs, and are thus far more costly to use in simulations than semi-empirical potentials. This work addresses the computational expense of GP potentials both by reducing the training-set size at a given accuracy and by developing a method to invoke GP potentials efficiently for first-principles prediction in simulations. By varying the cross-over distance between the GP and a long-range function according to the accuracy of the GP, training by sequential design requires up to 40% fewer training points at fixed accuracy. This method was applied successfully to the CO-Ne, HF-Ne, HF-Na+, CO2-Ne, 2CO, 2HF and 2HCl systems, and can easily be extended to other interactions and methods of prediction.
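The cross-over idea described above can be sketched in a few lines: below a cross-over distance the potential is the GP posterior mean, and beyond it a cheap long-range function takes over. This is a minimal illustrative sketch only — the squared-exponential kernel on a single distance coordinate, the -C6/r^6 tail, and all names and values here are assumptions for illustration, not the kernels, transforms, or cross-over criterion used in the thesis.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0):
    """Squared-exponential kernel between two sets of 1-D distances."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def train_gp(r_train, e_train, noise=1e-8):
    """Precompute the GP weights alpha = (K + noise*I)^-1 y."""
    K = rbf_kernel(r_train, r_train) + noise * np.eye(len(r_train))
    return np.linalg.solve(K, e_train)

def predict_energy(r, r_train, alpha, r_cross=8.0, c6=1.0):
    """GP posterior mean below the cross-over distance; -C6/r^6 beyond it."""
    if r >= r_cross:
        return -c6 / r ** 6          # long-range dispersion form
    k = rbf_kernel(np.array([r]), r_train)
    return float(k @ alpha)          # GP posterior mean

# Toy training data: a Lennard-Jones-like curve as a stand-in for ab initio
# energies (purely illustrative).
r_train = np.linspace(3.0, 10.0, 30)
e_train = 4.0 * ((3.4 / r_train) ** 12 - (3.4 / r_train) ** 6)
alpha = train_gp(r_train, e_train)

e_short = predict_energy(4.0, r_train, alpha)    # handled by the GP
e_long = predict_energy(12.0, r_train, alpha)    # handled by the long-range tail
```

In this picture, moving `r_cross` inward as the GP becomes accurate shrinks the region the GP must cover, which is one way to read the abstract's claim that fewer training points are needed at fixed accuracy.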
Meanwhile, a significant reduction in the time taken for Monte Carlo displacement and volume-change moves is achieved by parallelising the requisite GP calculations. Though this exploits in part the framework of GP regression, the distribution of the calculations themselves generalises to other methods of prediction. The work also shows that current kernels and input transforms for modelling intermolecular interactions are not easily improved.
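One way the GP framework lends itself to this kind of batching: the posterior mean is a matrix-vector product, so all pair energies affected by a Monte Carlo move can be evaluated together rather than one GP call at a time. The sketch below is a hypothetical illustration of that point using the same toy kernel as above; it is not the thesis's parallelisation scheme, which distributes the calculations more generally.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0):
    """Squared-exponential kernel between two sets of 1-D distances."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

# Illustrative training set (same toy data as the cross-over sketch).
r_train = np.linspace(3.0, 10.0, 30)
e_train = 4.0 * ((3.4 / r_train) ** 12 - (3.4 / r_train) ** 6)
K = rbf_kernel(r_train, r_train) + 1e-8 * np.eye(len(r_train))
alpha = np.linalg.solve(K, e_train)

# Distances from a displaced particle to every other particle after an MC move.
rng = np.random.default_rng(0)
r_pairs = rng.uniform(3.0, 10.0, size=200)

# Serial: one GP evaluation per pair energy.
serial = np.array([float(rbf_kernel(np.array([r]), r_train) @ alpha)
                   for r in r_pairs])

# Batched: one kernel matrix and a single matrix-vector product gives
# every pair energy at once; the product itself is easy to distribute.
batched = rbf_kernel(r_pairs, r_train) @ alpha

total_move_energy = batched.sum()
```

The batched form replaces 200 small products with one `(200, 30) @ (30,)` product, and splitting its rows across workers is what makes the distribution "general to other methods of prediction": any predictor that maps a batch of configurations to a batch of energies can be sharded the same way.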
first_indexed 2025-11-14T20:53:59Z
format Thesis (University of Nottingham only)
id nottingham-69868
institution University of Nottingham Malaysia Campus
institution_category Local University
language English
last_indexed 2025-11-14T20:53:59Z
publishDate 2022
recordtype eprints
repository_type Digital Repository
spelling nottingham-698682022-10-15T04:40:04Z https://eprints.nottingham.ac.uk/69868/ Efficient Training and Implementation of Gaussian Process Potentials Broad, Jack W. 2022-10-15 Thesis (University of Nottingham only) NonPeerReviewed application/pdf en cc_by https://eprints.nottingham.ac.uk/69868/1/thesis.pdf Broad, Jack W. (2022) Efficient Training and Implementation of Gaussian Process Potentials. PhD thesis, University of Nottingham. Machine Learning Applied Mathematics Machine-learned potentials Gaussian processes
title Efficient Training and Implementation of Gaussian Process Potentials
topic Machine Learning
Applied Mathematics
Machine-learned potentials
Gaussian processes
url https://eprints.nottingham.ac.uk/69868/