Optimization of MISCORE-based Motif Identification Systems


Bibliographic Details
Main Authors: Lee, Nung Kion; Wang, Dianhui
Format: Proceeding
Language: English
Published: IEEE 2009
Online Access: http://ir.unimas.my/id/eprint/11946/
http://ir.unimas.my/id/eprint/11946/1/Optimization%20of%20MISCORE_abstract.pdf
Description
Summary: Identifying motifs in DNA sequences with classification techniques is one computational approach to discovering novel binding sites. In previous work [16], we proposed a simple and effective method for motif detection using a single crisp rule governed by a mismatch-based matrix similarity score (MISCORE). In this paper, we consider the problem of finding a suitable motif cut-off value for MISCORE-based motif identification systems using a cost-sensitivity metric. We use phylogenetic footprinting data to estimate the parameters of the cost function. We also extend MISCORE with an entropy term that weights each position of the motif model, in order to minimize the false positive rate. Performance is evaluated on artificial and real DNA sequences. The results demonstrate the feasibility and usefulness of the proposed approach for model-based cut-off value estimation.
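
The summary names three ingredients (a mismatch-based score, entropy weighting of motif positions, and a cost-based cut-off) but does not give their formulas. The Python sketch below is only a rough illustration of how such pieces could fit together; it is not the authors' definition of MISCORE. The scoring function, the weight `1 - H/2`, the linear cost `cost_fn * FN + cost_fp * FP`, and all function names are assumptions made for this example.

```python
import math

BASES = "ACGT"

def position_frequencies(motifs):
    """Per-position base frequencies from equal-length, aligned motif instances."""
    length = len(motifs[0])
    freqs = []
    for i in range(length):
        column = [m[i] for m in motifs]
        freqs.append({b: column.count(b) / len(column) for b in BASES})
    return freqs

def entropy_weights(freqs):
    """Assumed weighting: 1 - H/2, with H the Shannon entropy in bits (max 2 for DNA).

    Conserved positions (low entropy) get weights near 1, noisy ones near 0.
    """
    weights = []
    for col in freqs:
        h = -sum(p * math.log2(p) for p in col.values() if p > 0)
        weights.append(1.0 - h / 2.0)
    return weights

def weighted_mismatch(kmer, freqs, weights):
    """Illustrative score (not MISCORE itself): weighted mean of 1 - freq(observed base).

    Lower values mean the k-mer agrees better with the motif model.
    """
    total = sum(weights)
    if total == 0:
        raise ValueError("all positions are uninformative; weights sum to zero")
    mismatch = sum(w * (1.0 - col[b]) for b, col, w in zip(kmer, freqs, weights))
    return mismatch / total

def best_cutoff(pos_scores, neg_scores, cost_fn=1.0, cost_fp=1.0):
    """Pick the cut-off minimising cost_fn * FN + cost_fp * FP.

    A k-mer is called a motif when its score is <= cut-off. Candidate
    cut-offs are the observed scores; ties go to the smallest candidate.
    """
    candidates = sorted(set(pos_scores) | set(neg_scores))

    def cost(cutoff):
        false_neg = sum(s > cutoff for s in pos_scores)
        false_pos = sum(s <= cutoff for s in neg_scores)
        return cost_fn * false_neg + cost_fp * false_pos

    return min(candidates, key=cost)

# Toy data, invented for illustration only.
motifs = ["ACGT", "ACGT", "ACGA", "ACGC"]
freqs = position_frequencies(motifs)
weights = entropy_weights(freqs)
positives = [weighted_mismatch(m, freqs, weights) for m in motifs]
negatives = [weighted_mismatch(k, freqs, weights) for k in ["TTTT", "GGCA", "CATG"]]
cutoff = best_cutoff(positives, negatives, cost_fn=1.0, cost_fp=2.0)
```

In this sketch, raising `cost_fp` relative to `cost_fn` pushes the chosen cut-off toward stricter values, which mirrors the summary's emphasis on controlling false positives. The summary states that the cost parameters are estimated from phylogenetic footprinting data; here they are simply passed in by hand.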