Extending Protein Domain Boundary Predictors to Detect Discontinuous Domains

A variety of protein domain predictors have been developed in recent years to predict protein domain boundaries, but most of them cannot predict discontinuous domains. Because nearly 40% of multidomain proteins contain one or more discontinuous domains, we have developed DomEx to enable domain bounda...

Bibliographic Details
Main Authors: Xue, Zhidong, Jang, Richard, Govindarajoo, Brandon, Huang, Yichu, Wang, Yan
Format: Online
Language: English
Published: Public Library of Science 2015
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4621036/
id pubmed-4621036
recordtype oai_dc
spelling pubmed-4621036 2015-10-29 Research Article
Public Library of Science 2015-10-26 /pmc/articles/PMC4621036/ /pubmed/26502173 http://dx.doi.org/10.1371/journal.pone.0141541 Text en © 2015 Xue et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
repository_type Open Access Journal
institution_category Foreign Institution
institution US National Center for Biotechnology Information
building NCBI PubMed
collection Online Access
author Xue, Zhidong
Jang, Richard
Govindarajoo, Brandon
Huang, Yichu
Wang, Yan
title Extending Protein Domain Boundary Predictors to Detect Discontinuous Domains
description A variety of protein domain predictors have been developed in recent years to predict protein domain boundaries, but most of them cannot predict discontinuous domains. Because nearly 40% of multidomain proteins contain one or more discontinuous domains, we have developed DomEx to enable domain boundary predictors to detect discontinuous domains by assembling continuous domain segments. Discontinuous domains are predicted by matching the sequence profile of concatenated continuous domain segments against the profiles in a single-domain library derived from SCOP, CATH, and Pfam. The matches are then filtered by similarity to library templates, a symmetric index score, and a profile-profile alignment score. DomEx recalled 32.3% of discontinuous domains with 86.5% precision when tested on 97 non-homologous protein chains containing 58 continuous and 99 discontinuous domains, with predicted domain segments within ±20 residues of the boundary definitions in CATH 3.5. Compared with our recently developed predictor, ThreaDom, which is the state-of-the-art tool for detecting discontinuous domains, DomEx recalled 26.7% of discontinuous domains with 72.7% precision in a benchmark of 29 discontinuous-domain chains, where ThreaDom failed to predict any discontinuous domains. Furthermore, combined with ThreaDom, the method ranked number one among 10 predictors. The source code and datasets are available at https://github.com/xuezhidong/DomEx.
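The segment-assembly idea in the description can be illustrated with a minimal Python sketch. It is not the DomEx implementation: the function names, the restriction to pairs of non-adjacent segments, and the exact form of the ±20-residue check are assumptions made for illustration only. Profile construction, library matching, and the filtering scores are omitted.

```python
from itertools import combinations
from typing import Iterator, List, Tuple

Segment = Tuple[int, int]  # 1-based inclusive residue range (start, end)


def split_by_boundaries(length: int, boundaries: List[int]) -> List[Segment]:
    """Split a chain of `length` residues into continuous segments.

    Assumption: a boundary b ends one segment at residue b and the
    next segment starts at residue b + 1.
    """
    cuts = sorted(b for b in set(boundaries) if 1 <= b < length)
    starts = [1] + [b + 1 for b in cuts]
    ends = cuts + [length]
    return list(zip(starts, ends))


def candidate_discontinuous(
    sequence: str, boundaries: List[int]
) -> Iterator[Tuple[Tuple[Segment, Segment], str]]:
    """Yield (segment pair, concatenated sequence) for non-adjacent segments.

    Hypothetical simplification: only pairs are enumerated here.
    """
    segments = split_by_boundaries(len(sequence), boundaries)
    for i, j in combinations(range(len(segments)), 2):
        if j - i < 2:
            # Adjacent segments would form one continuous stretch.
            continue
        (s1, e1), (s2, e2) = segments[i], segments[j]
        yield (segments[i], segments[j]), sequence[s1 - 1:e1] + sequence[s2 - 1:e2]


def within_tolerance(pred: List[Segment], ref: List[Segment], tol: int = 20) -> bool:
    """True if both domains have the same number of segments and every
    segment start and end lies within `tol` residues of the reference."""
    if len(pred) != len(ref):
        return False
    return all(
        abs(ps - rs) <= tol and abs(pe - re_) <= tol
        for (ps, pe), (rs, re_) in zip(sorted(pred), sorted(ref))
    )


if __name__ == "__main__":
    # Toy 30-residue chain with predicted boundaries after residues 10 and 20.
    chain = "A" * 10 + "B" * 10 + "C" * 10
    for segs, concatenated in candidate_discontinuous(chain, [10, 20]):
        print(segs, concatenated)
```

In a pipeline of the kind the description outlines, each concatenated candidate would then be turned into a sequence profile and compared with the single-domain library; that step depends on external tools and is left out of this sketch.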