Protein sequences classification based on weighting scheme
We present a new technique for recognizing remote protein homologies that relies on combining probabilistic modeling and supervised learning in high-dimensional feature spaces. The main novelty of our technique is the method of constructing feature vectors using a Hidden Markov Model and the combination of...
| Main Authors: | Zaki, N. M.; Deris, Safaai; Md Illias, Rosli |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Assumption University, 2005 |
| Subjects: | T Technology (General) |
| Online Access: | http://eprints.utm.my/5576/ http://eprints.utm.my/5576/1/N.M.Zaki2005_ProteinSequencesClassificationBasedOn.pdf |
| _version_ | 1848891087194685440 |
|---|---|
| author | Zaki, N. M.; Deris, Safaai; Md Illias, Rosli |
| author_sort | Zaki, N. M. |
| building | UTeM Institutional Repository |
| collection | Online Access |
| description | We present a new technique for recognizing remote protein homologies that relies on combining probabilistic modeling and supervised learning in high-dimensional feature spaces. The main novelty of our technique is the method of constructing feature vectors using a Hidden Markov Model and the combination of this representation with a classifier capable of learning in very sparse high-dimensional spaces. Each feature vector records the sensitivity of each protein domain to a previously learned set of sub-sequences (strings). Unlike previous methods, our method takes into consideration both the conserved and the non-conserved regions. The system subsequently uses Support Vector Machine (SVM) classifiers to learn the boundaries between structural protein classes. Experiments show that this method, which we call the String Weighting Scheme-SVM (SWS-SVM) method, significantly improves on previous methods for the classification of protein domains based on remote homologies. Our method is then compared to five existing homology detection methods. |
| first_indexed | 2025-11-15T20:52:23Z |
| format | Article |
| id | utm-5576 |
| institution | Universiti Teknologi Malaysia |
| institution_category | Local University |
| language | English |
| last_indexed | 2025-11-15T20:52:23Z |
| publishDate | 2005 |
| publisher | Assumption University |
| recordtype | eprints |
| repository_type | Digital Repository |
| spelling | utm-5576 2010-06-01T15:32:30Z http://eprints.utm.my/5576/ Zaki, N. M. and Deris, Safaai and Md Illias, Rosli (2005) Protein sequences classification based on weighting scheme. International Journal of Computer, the Internet and Management, 13 (1). pp. 50-60. T Technology (General). Assumption University. Article. PeerReviewed. application/pdf. en. http://eprints.utm.my/5576/1/N.M.Zaki2005_ProteinSequencesClassificationBasedOn.pdf |
| title | Protein sequences classification based on weighting scheme |
| topic | T Technology (General) |
| url | http://eprints.utm.my/5576/ http://eprints.utm.my/5576/1/N.M.Zaki2005_ProteinSequencesClassificationBasedOn.pdf |
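
The description says that each feature vector records how strongly a protein domain responds to a previously learned set of sub-sequences, and that an SVM then learns the class boundaries. The record does not give the weighting formula or the HMM details, so the following is only a minimal sketch of that pipeline under stated assumptions: `match_score` is a hypothetical stand-in for the paper's HMM-based string weighting (it simply takes the best fraction of matching positions over all windows), the patterns and sequences are invented toy data, and the classifier is a small linear SVM trained by subgradient descent on the hinge loss rather than the authors' SVM setup.

```python
from typing import List, Sequence, Tuple


def match_score(sequence: str, pattern: str) -> float:
    """Best fraction of positions at which `pattern` matches any window of `sequence`.

    Hypothetical stand-in for the HMM-based string weight; not the paper's formula.
    """
    n, m = len(sequence), len(pattern)
    if m == 0 or n < m:
        return 0.0
    best = 0
    for start in range(n - m + 1):
        window = sequence[start:start + m]
        hits = sum(1 for a, b in zip(window, pattern) if a == b)
        best = max(best, hits)
    return best / m


def feature_vector(sequence: str, patterns: Sequence[str]) -> List[float]:
    """One feature per learned sub-sequence: the sequence's score against it."""
    return [match_score(sequence, p) for p in patterns]


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def train_linear_svm(
    X: Sequence[Sequence[float]],
    y: Sequence[int],
    lam: float = 0.01,
    lr: float = 0.1,
    epochs: int = 200,
) -> Tuple[List[float], float]:
    """Subgradient descent on the L2-regularised hinge loss (labels are +1 / -1)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (dot(w, xi) + b)
            # Regularisation step: shrink the weights slightly on every example.
            w = [wj * (1.0 - lr * lam) for wj in w]
            if margin < 1.0:
                # Example is inside the margin: move the boundary away from it.
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def predict(w: Sequence[float], b: float, x: Sequence[float]) -> int:
    return 1 if dot(w, x) + b >= 0.0 else -1


# Toy data, invented for illustration only.
PATTERNS = ["GKT", "WPW"]
TRAIN = [("MAGKTLLV", 1), ("AAGKSLLM", 1), ("MLWPWAAV", -1), ("TTWPYQQL", -1)]

X = [feature_vector(seq, PATTERNS) for seq, _ in TRAIN]
y = [label for _, label in TRAIN]
w, b = train_linear_svm(X, y)
```

A new sequence is classified by building its feature vector against the same pattern set and calling `predict(w, b, feature_vector(seq, PATTERNS))`. The partial-match score is what lets a sequence with a mutated motif (such as `GKS` instead of `GKT`) still contribute a non-zero feature, which loosely mirrors the idea of weighting both conserved and non-conserved regions.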