Efficient Training and Implementation of Gaussian Process Potentials
Molecular simulations are a powerful tool for translating information about the intermolecular interactions within a system to thermophysical properties via statistical mechanics. However, the accuracy of any simulation is limited by the potentials that model the microscopic interactions. Most first principles methods are too computationally expensive for use at every time-step or cycle of a simulation, which typically requires thousands of energy evaluations. Meanwhile, cheaper semi-empirical potentials give rise to only qualitatively accurate simulations. Consequently, methods for efficient first principles predictions in simulations are of interest.
| Main Author: | Broad, Jack W. |
|---|---|
| Format: | Thesis (University of Nottingham only) |
| Language: | English |
| Published: | 2022 |
| Subjects: | Machine Learning; Applied Mathematics; Machine-learned potentials; Gaussian processes |
| Online Access: | https://eprints.nottingham.ac.uk/69868/ |
| author | Broad, Jack W. |
| building | Nottingham Research Data Repository |
| collection | Online Access |
| description | Molecular simulations are a powerful tool for translating information about the intermolecular interactions within a system to thermophysical properties via statistical mechanics. However, the accuracy of any simulation is limited by the potentials that model the microscopic interactions. Most first principles methods are too computationally expensive for use at every time-step or cycle of a simulation, which typically requires thousands of energy evaluations. Meanwhile, cheaper semi-empirical potentials give rise to only qualitatively accurate simulations. Consequently, methods for efficient first principles predictions in simulations are of interest.
Machine-learned potentials (MLPs) have shown promise in this area, offering first principles predictions at a fraction of the cost of ab initio calculation. Of particular interest are Gaussian process (GP) potentials, which achieve equivalent accuracy to other MLPs with smaller training sets. They therefore offer the best route to employing information from expensive ab initio calculations, for which building a large data set is time-consuming.
GP potentials, however, are among the most computationally intensive MLPs. Thus, they are far more costly to employ in simulations than semi-empirical potentials. This work addresses the computational expense of GP potentials by both reducing the training set size at a given accuracy and developing a method to invoke GP potentials efficiently for first principles prediction in simulations.
By varying the cross-over distance between the GP and a long-range function with the accuracy of the former, training by sequential design requires up to 40% fewer training points at fixed accuracy. This method was applied successfully to the CO-Ne, HF-Ne, HF-Na+, CO2-Ne, 2CO, 2HF and 2HCl systems, and can easily be extended to other interactions and methods of prediction. Meanwhile, a significant reduction in the time taken for Monte Carlo displacement and volume-change moves is achieved by parallelising the requisite GP calculations. Though this exploits in part the framework of GP regression, the distribution of the calculations themselves is general to other methods of prediction. The work also shows that current kernels and input transforms for modelling intermolecular interactions are not easily improved. |
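As a minimal illustration of the hand-over described in the abstract, the sketch below trains a 1-D Gaussian process on short-range energies and switches to an asymptotic -C6/r^6 form beyond a cross-over distance. Everything here is an assumption for illustration: the squared-exponential kernel, its hyperparameters, the toy Lennard-Jones-style stand-in for ab initio energies, and the `C6` and `r_cross` values are not the thesis's models or data.

```python
import numpy as np

def rbf_kernel(a, b, length=0.3, sigma=1.0):
    """Squared-exponential kernel between two 1-D arrays of distances."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return sigma**2 * np.exp(-0.5 * d2 / length**2)

def reference_energy(r):
    """Toy Lennard-Jones-style stand-in for expensive ab initio energies."""
    return (1.0 / r) ** 12 - 2.0 * (1.0 / r) ** 6

# Small training set of "expensive" reference energies at short range.
r_train = np.linspace(0.9, 2.5, 12)
y_train = reference_energy(r_train)

# Standard GP regression: predictive mean k(r, R) K^{-1} y,
# with a small jitter on the diagonal for numerical stability.
K = rbf_kernel(r_train, r_train) + 1e-8 * np.eye(len(r_train))
alpha = np.linalg.solve(K, y_train)

def potential(r, r_cross=2.5, C6=2.0):
    """GP prediction inside r_cross, analytic -C6/r**6 tail beyond it."""
    r = np.atleast_1d(r).astype(float)
    v = np.empty_like(r)
    short = r < r_cross
    v[short] = rbf_kernel(r[short], r_train) @ alpha
    v[~short] = -C6 / r[~short] ** 6
    return v
```

In this sketch `r_cross` is fixed; the thesis's point is precisely that letting the cross-over distance vary with the GP's accuracy reduces the number of training points the sequential design needs.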
| spelling | nottingham-69868 2022-10-15T04:40:04Z https://eprints.nottingham.ac.uk/69868/ Efficient Training and Implementation of Gaussian Process Potentials Broad, Jack W. 2022-10-15 Thesis (University of Nottingham only) NonPeerReviewed application/pdf en cc_by https://eprints.nottingham.ac.uk/69868/1/thesis.pdf Broad, Jack W. (2022) Efficient Training and Implementation of Gaussian Process Potentials. PhD thesis, University of Nottingham. Machine Learning Applied Mathematics Machine-learned potentials Gaussian processes |
| title | Efficient Training and Implementation of Gaussian Process Potentials |
| topic | Machine Learning Applied Mathematics Machine-learned potentials Gaussian processes |
| url | https://eprints.nottingham.ac.uk/69868/ |
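The abstract notes that Monte Carlo displacement and volume-change moves were accelerated by parallelising the requisite GP calculations. The reason such moves parallelise is that the pair energies entering a trial move are mutually independent, as the sketch below illustrates. All names here (`pair_energy`, `displacement_delta`) and the cheap Lennard-Jones stand-in for a GP prediction are assumptions for illustration, not the thesis's implementation.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def pair_energy(r):
    """Toy Lennard-Jones pair energy; stand-in for a costly GP prediction."""
    return 4.0 * ((1.0 / r) ** 12 - (1.0 / r) ** 6)

def displacement_delta(positions, i, trial, pool):
    """Energy change for moving particle i to `trial`.

    The per-pair energies before and after the move are independent,
    so they are farmed out to a worker pool and summed.
    """
    others = np.delete(positions, i, axis=0)
    r_old = np.linalg.norm(others - positions[i], axis=1)
    r_new = np.linalg.norm(others - trial, axis=1)
    e_old = sum(pool.map(pair_energy, r_old))
    e_new = sum(pool.map(pair_energy, r_new))
    return e_new - e_old

# Tiny 3-particle configuration and a trial displacement of particle 0.
positions = np.array([[0.0, 0.0, 0.0],
                      [1.1, 0.0, 0.0],
                      [0.0, 1.2, 0.0]])
with ThreadPoolExecutor(max_workers=4) as pool:
    dE = displacement_delta(positions, 0, np.array([0.05, 0.0, 0.0]), pool)
```

For a pair function this cheap, Python threads add overhead rather than speed (the GIL serialises the scalar work); in practice one would batch the separations into a single vectorised GP prediction or use a process pool. The sketch only shows the structural point: the evaluations are independent and so can be distributed, regardless of the prediction method.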