Improved ENSPART for DNA Motif Prediction

In our previous work we proposed ENSPART, an ensemble method for DNA motif discovery that partitions the input dataset into several equal-sized subsets and runs several distinct tools on each subset to predict candidate motifs. The candidate motifs obtained from the different subsets are merged to obtain the final m...

Bibliographic Details
Main Authors: Choong, Allen Chieng Hoon, Lee, Nung Kion, Bong, Chih How, Norshafarina, Omar
Format: Article
Language: English
Published: Universiti Malaysia Sarawak (UNIMAS) 2017
Subjects: Q Science (General); T Technology (General)
Online Access: http://ir.unimas.my/id/eprint/19016/
http://ir.unimas.my/id/eprint/19016/1/SCT-073-revised-deposit%20%28abstrak%29.pdf
building UNIMAS Institutional Repository
collection Online Access
description In our previous work we proposed ENSPART, an ensemble method for DNA motif discovery that partitions the input dataset into several equal-sized subsets and runs several distinct tools on each subset to predict candidate motifs. The candidate motifs obtained from the different subsets are then merged to obtain the final motifs. However, the original ENSPART has limitations: (1) the same background sequences are used to calculate the Receiver Operating Characteristic (ROC) of motifs obtained from different datasets, which introduces bias because different datasets may have different background distributions; and (2) it does not account for duplication between a motif and its reverse complement, which leaves many redundant motifs in the result set that must be filtered out. In this work, we extend the original ENSPART to address these two issues. For the first issue, we use background sequences that are based on the distribution of bases in the input sequences. For the second issue, we employ a "triple" merging strategy to reduce redundant motifs. Our evaluation results indicate that, with these two improvements, ENSPART obtains better AUC values than the original implementation.
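The two issues named in the description can be illustrated with a small Python sketch. This is not the authors' implementation: the record does not specify the "triple" merging strategy or how the background sequences are built, so the code below only assumes (a) that a motif and its reverse complement should count as one motif, and (b) that background sequences are drawn independently per position from the base frequencies of the input sequences. All function names are hypothetical.

```python
import random
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(motif: str) -> str:
    """Return the reverse complement of a DNA string (A<->T, C<->G)."""
    return motif.upper().translate(_COMPLEMENT)[::-1]


def canonical(motif: str) -> str:
    """Pick one representative for a motif and its reverse complement."""
    m = motif.upper()
    return min(m, reverse_complement(m))


def drop_strand_duplicates(motifs):
    """Keep only the first motif seen from each {motif, reverse complement} pair."""
    seen = set()
    kept = []
    for m in motifs:
        key = canonical(m)
        if key not in seen:
            seen.add(key)
            kept.append(m)
    return kept


def base_frequencies(sequences):
    """Estimate the A/C/G/T distribution of the input sequences.

    Assumes the input contains at least one A, C, G, or T.
    """
    counts = Counter(b for s in sequences for b in s.upper() if b in "ACGT")
    total = sum(counts.values())
    return {b: counts[b] / total for b in "ACGT"}


def sample_background(sequences, n, length, seed=0):
    """Draw n random sequences whose base composition follows the input.

    Assumption: bases are sampled independently per position (a zeroth-order
    model); the paper may use a different background model.
    """
    freqs = base_frequencies(sequences)
    rng = random.Random(seed)
    bases = list(freqs)
    weights = [freqs[b] for b in bases]
    return ["".join(rng.choices(bases, weights=weights, k=length)) for _ in range(n)]
```

For example, `drop_strand_duplicates(["TTGACA", "TGTCAA", "TATAAT"])` keeps `TTGACA` and `TATAAT`, because `TGTCAA` is the reverse complement of `TTGACA`. Calling `sample_background` separately for each dataset gives every dataset its own background set, which is the kind of per-dataset background the description argues for.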
first_indexed 2025-11-15T06:58:37Z
format Article
id unimas-19016
institution Universiti Malaysia Sarawak
institution_category Local University
language English
last_indexed 2025-11-15T06:58:37Z
publishDate 2017
publisher Universiti Malaysia Sarawak (UNIMAS)
recordtype eprints
repository_type Digital Repository
spelling unimas-19016 2018-01-03T06:06:16Z http://ir.unimas.my/id/eprint/19016/ Improved ENSPART for DNA Motif Prediction Choong, Allen Chieng Hoon Lee, Nung Kion Bong, Chih How Norshafarina, Omar Q Science (General) T Technology (General) Universiti Malaysia Sarawak (UNIMAS) 2017-12 Article PeerReviewed text en http://ir.unimas.my/id/eprint/19016/1/SCT-073-revised-deposit%20%28abstrak%29.pdf Choong, Allen Chieng Hoon and Lee, Nung Kion and Bong, Chih How and Norshafarina, Omar (2017) Improved ENSPART for DNA Motif Prediction. International Journal of Business and Society, 18 (S4). pp. 1-6. ISSN 15116670 http://www.ijbs.unimas.my/
topic Q Science (General)
T Technology (General)