Data Mining for Building Neural Protein Sequence Classification Systems with Improved Performance

Traditionally, two protein sequences are classified into the same class if their feature patterns have high homology. These feature patterns were originally extracted by sequence alignment algorithms, which measure similarity between an unseen protein sequence and identified protein sequences. Neur...

Bibliographic Details
Main Authors: Wang, Dianhui, Lee, Nung Kion, Dillon, Tharam S.
Format: Proceeding
Language: English
Published: IEEE 2003
Subjects: QA75 Electronic computers. Computer science; T Technology (General)
Online Access: http://ir.unimas.my/id/eprint/11927/
http://ir.unimas.my/id/eprint/11927/1/Data%20Mining_abstract.pdf
_version_ 1848837090662416384
author Wang, Dianhui
Lee, Nung Kion
Dillon, Tharam S.
author_facet Wang, Dianhui
Lee, Nung Kion
Dillon, Tharam S.
author_sort Wang, Dianhui
building UNIMAS Institutional Repository
collection Online Access
description Traditionally, two protein sequences are classified into the same class if their feature patterns have high homology. These feature patterns were originally extracted by sequence alignment algorithms, which measure similarity between an unseen protein sequence and identified protein sequences. Neural network approaches, while reasonably accurate at classification, give no information about the relationship between the unseen case and the classified items that is useful to biologists. In contrast, in this paper we use a generalized radial basis function (GRBF) neural network architecture that generates fuzzy classification rules that could be used for further knowledge discovery. Our proposed techniques were evaluated using protein sequences with ten classes of super-families downloaded from a public domain database, and the results compared favorably with other standard machine learning techniques.
first_indexed 2025-11-15T06:34:08Z
format Proceeding
id unimas-11927
institution Universiti Malaysia Sarawak
institution_category Local University
language English
last_indexed 2025-11-15T06:34:08Z
publishDate 2003
publisher IEEE
recordtype eprints
repository_type Digital Repository
spelling unimas-11927 2016-05-12T04:32:07Z http://ir.unimas.my/id/eprint/11927/ Data Mining for Building Neural Protein Sequence Classification Systems with Improved Performance Wang, Dianhui Lee, Nung Kion Dillon, Tharam S. QA75 Electronic computers. Computer science T Technology (General) Traditionally, two protein sequences are classified into the same class if their feature patterns have high homology. These feature patterns were originally extracted by sequence alignment algorithms, which measure similarity between an unseen protein sequence and identified protein sequences. Neural network approaches, while reasonably accurate at classification, give no information about the relationship between the unseen case and the classified items that is useful to biologists. In contrast, in this paper we use a generalized radial basis function (GRBF) neural network architecture that generates fuzzy classification rules that could be used for further knowledge discovery. Our proposed techniques were evaluated using protein sequences with ten classes of super-families downloaded from a public domain database, and the results compared favorably with other standard machine learning techniques. IEEE 2003 Proceeding NonPeerReviewed text en http://ir.unimas.my/id/eprint/11927/1/Data%20Mining_abstract.pdf Wang, Dianhui and Lee, Nung Kion and Dillon, Tharam S. (2003) Data Mining for Building Neural Protein Sequence Classification Systems with Improved Performance. In: Neural Networks, 2003. Proceedings of the International Joint Conference on, 20-24 July 2003. http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=1223671 10.1109/IJCNN.2003.1223671
spellingShingle QA75 Electronic computers. Computer science
T Technology (General)
Wang, Dianhui
Lee, Nung Kion
Dillon, Tharam S.
Data Mining for Building Neural Protein Sequence Classification Systems with Improved Performance
title Data Mining for Building Neural Protein Sequence Classification Systems with Improved Performance
title_full Data Mining for Building Neural Protein Sequence Classification Systems with Improved Performance
title_fullStr Data Mining for Building Neural Protein Sequence Classification Systems with Improved Performance
title_full_unstemmed Data Mining for Building Neural Protein Sequence Classification Systems with Improved Performance
title_short Data Mining for Building Neural Protein Sequence Classification Systems with Improved Performance
title_sort data mining for building neural protein sequence classification systems with improved performance
topic QA75 Electronic computers. Computer science
T Technology (General)
url http://ir.unimas.my/id/eprint/11927/
http://ir.unimas.my/id/eprint/11927/1/Data%20Mining_abstract.pdf