Automated design of energy functions for protein structure prediction by means of genetic programming and improved structure similarity assessment

Bibliographic Details
Main Author: Widera, Paweł
Format: Thesis (University of Nottingham only)
Language: English
Published: 2010
Subjects: proteins; protein structure; protein free energy; genetic programming
Online Access: https://eprints.nottingham.ac.uk/11394/
author Widera, Paweł
building Nottingham Research Data Repository
collection Online Access
description The process of protein structure prediction is a crucial part of understanding the function of the building blocks of life. It is based on an approximation of the protein free energy that is used to guide the search through the space of protein structures towards the thermodynamic equilibrium of the native state. A function that gives a good approximation of the protein free energy should be able to estimate the structural distance of the evaluated candidate structure to the protein native state. This correlation between the energy and the similarity to the native structure is the key to high-quality predictions. State-of-the-art protein structure prediction methods use very simple techniques to design such energy functions. The individual components of the energy functions are created by human experts using statistical analysis of common structural patterns that occur in the known native structures. The energy function itself is then defined as a simple weighted sum of these components. The exact values of the weights are set by maximising the correlation between the energy and the similarity to the native structure, measured by the root mean square deviation between coordinates of the protein backbone. In this dissertation I argue that this process is oversimplified and could be improved on at least two levels. Firstly, a more complex functional combination of the energy components might be able to reflect the similarity more accurately and thus improve the prediction quality. Secondly, a more robust similarity measure that combines different notions of protein structural similarity might provide a much more realistic baseline for the energy function optimisation. To test these two hypotheses I have proposed a novel approach to the design of energy functions for protein structure prediction, using a genetic programming algorithm to evolve the energy functions and a structural similarity consensus to provide a reference similarity measure. The best evolved energy functions were found to reflect the similarity to the native structure better than the optimised weighted sum of terms, thereby opening a new and interesting area of research for machine learning techniques.
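The baseline described in the abstract (a weighted sum of energy terms, judged by how well it correlates with the RMSD to the native structure) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names, the toy decoy values and the square-root combination are all hypothetical, and the RMSD helper assumes the two structures are already superimposed.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally long lists of (x, y, z) points.

    Assumption: the structures are already optimally superimposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


def weighted_sum_energy(terms, weights):
    """Baseline energy: a simple weighted sum of the energy components."""
    return sum(w * t for w, t in zip(weights, terms))


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical toy decoys: each has two energy terms and a distance to the native.
decoy_terms = [(0.5, 1.0), (0.5, 4.0), (0.5, 9.0), (0.5, 16.0)]
decoy_distances = [1.0, 2.0, 3.0, 4.0]

# Baseline: weighted sum with hand-picked (hypothetical) weights.
weights = (1.0, 1.0)
linear_energies = [weighted_sum_energy(t, weights) for t in decoy_terms]
linear_corr = pearson(linear_energies, decoy_distances)

# A non-linear combination of the same terms, of the kind a genetic
# programming search could in principle evolve (chosen by hand here).
nonlinear_energies = [t1 + math.sqrt(t2) for t1, t2 in decoy_terms]
nonlinear_corr = pearson(nonlinear_energies, decoy_distances)
```

On this contrived data the non-linear combination tracks the distance to the native more closely than the weighted sum, which is the kind of improvement the first hypothesis of the thesis is concerned with; the numbers themselves carry no scientific meaning.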
first_indexed 2025-11-14T18:25:47Z
format Thesis (University of Nottingham only)
id nottingham-11394
institution University of Nottingham Malaysia Campus
institution_category Local University
language English
last_indexed 2025-11-14T18:25:47Z
publishDate 2010
recordtype eprints
repository_type Digital Repository
spelling nottingham-11394 2025-02-28T11:13:08Z https://eprints.nottingham.ac.uk/11394/ Automated design of energy functions for protein structure prediction by means of genetic programming and improved structure similarity assessment Widera, Paweł 2010-07-20 Thesis (University of Nottingham only) NonPeerReviewed application/pdf en arr https://eprints.nottingham.ac.uk/11394/1/thesis.pdf Widera, Paweł (2010) Automated design of energy functions for protein structure prediction by means of genetic programming and improved structure similarity assessment. PhD thesis, University of Nottingham. proteins protein structure protein free energy genetic programming
title Automated design of energy functions for protein structure prediction by means of genetic programming and improved structure similarity assessment
topic proteins
protein structure
protein free energy
genetic programming
url https://eprints.nottingham.ac.uk/11394/