Prediction of Protein Phosphorylation Sites by Using the Composition of k-Spaced Amino Acid Pairs

As one of the most widespread protein post-translational modifications, phosphorylation is involved in many biological processes, such as the cell cycle and apoptosis. Identifying phosphorylated substrates and their corresponding sites will facilitate the understanding of the molecular mechanism of ph...


Bibliographic Details
Main Authors: Zhao, Xiaowei; Zhang, Wenyi; Xu, Xin; Ma, Zhiqiang; Yin, Minghao
Format: Online
Language: English
Published: Public Library of Science 2012
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3478286/
id pubmed-3478286
recordtype oai_dc
spelling pubmed-3478286 2012-10-29 Research Article
Public Library of Science 2012-10-22 /pmc/articles/PMC3478286/ /pubmed/23110047 http://dx.doi.org/10.1371/journal.pone.0046302 Text en © 2012 Zhao et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
repository_type Open Access Journal
institution_category Foreign Institution
institution US National Center for Biotechnology Information
building NCBI PubMed
collection Online Access
language English
format Online
author Zhao, Xiaowei
Zhang, Wenyi
Xu, Xin
Ma, Zhiqiang
Yin, Minghao
title Prediction of Protein Phosphorylation Sites by Using the Composition of k-Spaced Amino Acid Pairs
description As one of the most widespread protein post-translational modifications, phosphorylation is involved in many biological processes, such as the cell cycle and apoptosis. Identifying phosphorylated substrates and their corresponding sites will facilitate the understanding of the molecular mechanism of phosphorylation. Compared with labor-intensive and time-consuming experimental approaches, computational prediction of phosphorylation sites is highly desirable because of its convenience and speed. In this paper, a new bioinformatics tool named CKSAAP_PhSite was developed that ignores kinase information and uses only primary sequence information to predict protein phosphorylation sites. CKSAAP_PhSite uses the composition of k-spaced amino acid pairs as the encoding scheme and a support vector machine as the predictor. CKSAAP_PhSite achieved a sensitivity of 84.81%, a specificity of 86.07% and an accuracy of 85.43% for serine; a sensitivity of 78.59%, a specificity of 82.26% and an accuracy of 80.31% for threonine; and a sensitivity of 74.44%, a specificity of 78.03% and an accuracy of 76.21% for tyrosine. Experimental results obtained from cross-validation and an independent benchmark suggest that our method is promising for predicting phosphorylation sites and can serve as a useful supplementary tool for the community. For public access, CKSAAP_PhSite is available at http://59.73.198.144/cksaap_phsite/.
publisher Public Library of Science
publishDate 2012
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3478286/
_version_ 1611918025506684928
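
The description above names the composition of k-spaced amino acid pairs (CKSAAP) as the encoding scheme but does not give its details. The following is a minimal sketch of how such an encoding is commonly computed, not the authors' implementation. The function name `cksaap_encode`, the restriction to the 20 standard residues, the default `k_max=2`, and the normalization by the number of pair positions are all assumptions made for illustration.

```python
from itertools import product

# Assumption: only the 20 standard residues are counted; any other
# character (gap or unknown symbol) is ignored.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs


def cksaap_encode(fragment, k_max=2):
    """Return pair frequencies for spacings k = 0..k_max, concatenated.

    For each k, the pair (fragment[i], fragment[i + k + 1]) is counted at
    every valid position i, then divided by the number of such positions.
    """
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(PAIRS, 0)
        n_positions = len(fragment) - k - 1
        for i in range(max(n_positions, 0)):
            pair = fragment[i] + fragment[i + k + 1]
            if pair in counts:
                counts[pair] += 1
        denom = max(n_positions, 1)  # avoid division by zero on short input
        features.extend(counts[p] / denom for p in PAIRS)
    return features


# Hypothetical sequence fragment, used only to show the output shape.
vector = cksaap_encode("MKSPTSAYSPE", k_max=2)
print(len(vector))
```

Each spacing contributes 400 values, so the vector length is 400 × (`k_max` + 1). A vector of this kind could then be passed to a support vector machine classifier, which is the predictor the description mentions; the choice of SVM library and kernel is not stated in the record.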