Polychotomiser for case-based reasoning beyond the traditional Bayesian classification approach

This work implements an enhanced Bayesian classifier with better performance as compared to the ordinary naïve Bayes classifier when used with domains and datasets of varying characteristics. Text classification is an active and on-going research field of Artificial Intelligence (AI). Text classific...

Bibliographic Details
Main Authors: Isa, Dino, Lee, Lam Hong, Kallimani, V.P., Prasad, R.
Format: Article
Language: English
Published: Canadian Center of Science and Education 2008
Subjects: Text classification; Bayes theorem; Bayesian filtering; Probability; Case-based Reasoning
Online Access: https://eprints.nottingham.ac.uk/3051/
_version_ 1848801175248306176
author Isa, Dino
Lee, Lam Hong
Kallimani, V.P.
Prasad, R.
author_sort Isa, Dino
building Nottingham Research Data Repository
collection Online Access
description This work implements an enhanced Bayesian classifier that performs better than the ordinary naïve Bayes classifier on domains and datasets of varying characteristics. Text classification is an active and ongoing research field of Artificial Intelligence (AI). It is defined as the task of learning methods for categorising collections of electronic text documents into their annotated classes, based on their contents. An increasing number of statistical approaches have been developed for text classification, including k-nearest neighbor classification, naïve Bayes classification, decision trees, rule induction, and the support vector machine (SVM), an algorithm that implements structural risk minimisation theory. Among these approaches, naïve Bayes classifiers have been widely used because of their simplicity. However, this generative method has been reported to be less accurate than discriminative methods such as SVM. Some studies have shown that the naïve Bayes classifier performs surprisingly well in many other domains with certain specialised characteristics. The main aim of this work is to quantify the weakness of traditional naïve Bayes classification and to introduce an enhanced Bayesian classification approach, with additional innovative techniques, that performs better than the traditional naïve Bayes classifier. Our research goal is to develop an enhanced Bayesian probabilistic classifier by introducing different tournament-structure ranking algorithms, along with a high-relevance keyword extraction facility and a facility for accurately calculating weighting factors. These additions are intended to improve the performance of classification tasks on specific datasets with different characteristics. Other researchers have used general datasets, such as Reuters-21578 and 20_newsgroups, to validate the performance of their classifiers.
Our approach is easily adapted to datasets with different characteristics in terms of the degree of similarity between classes, multi-categorised documents, and different dataset organisations. As mentioned above, we introduce several techniques: tournament-structure ranking algorithms, higher-relevance keyword extraction, and automatically computed document dependent (ACDD) weighting factors. Each technique responds differently when applied to datasets with different characteristics, but each has been shown to give outstanding performance in most cases. Based on our experimental results, we have successfully optimised our techniques for individual datasets with different characteristics.
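For readers unfamiliar with the baseline the description refers to, the following is a minimal sketch of an ordinary multinomial naïve Bayes text classifier with add-one (Laplace) smoothing. It is not the authors' enhanced classifier: it does not implement the tournament-structure ranking, keyword extraction, or ACDD weighting factors named above. The function names and the token-list input format are assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs):
    """Train on a list of (tokens, label) pairs.

    Returns (log_priors, log_likelihoods, vocab). Hypothetical helper,
    not taken from the catalogued article.
    """
    class_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        word_counts[label].update(tokens)
        vocab.update(tokens)

    total_docs = len(docs)
    # log P(class), estimated from document frequencies
    log_priors = {c: math.log(n / total_docs) for c, n in class_counts.items()}

    # log P(word | class) with add-one smoothing over the vocabulary
    log_likelihoods = {}
    for c in class_counts:
        denom = sum(word_counts[c].values()) + len(vocab)
        log_likelihoods[c] = {
            w: math.log((word_counts[c][w] + 1) / denom) for w in vocab
        }
    return log_priors, log_likelihoods, vocab


def classify(tokens, model):
    """Return the class with the highest posterior log-score."""
    log_priors, log_likelihoods, vocab = model
    scores = {}
    for c, prior in log_priors.items():
        score = prior
        for w in tokens:
            if w in vocab:  # words unseen in training are ignored
                score += log_likelihoods[c][w]
        scores[c] = score
    return max(scores, key=scores.get)
```

The "naïve" assumption is visible in `classify`: per-word log-probabilities are simply summed, which treats words as conditionally independent given the class.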
first_indexed 2025-11-14T18:20:37Z
format Article
id nottingham-3051
institution University of Nottingham Malaysia Campus
institution_category Local University
language English
last_indexed 2025-11-14T21:03:16Z
publishDate 2008
publisher Canadian Center of Science and Education
recordtype eprints
repository_type Digital Repository
spelling nottingham-3051 2025-09-10T14:44:35Z https://eprints.nottingham.ac.uk/3051/ Polychotomiser for case-based reasoning beyond the traditional Bayesian classification approach Isa, Dino Lee, Lam Hong Kallimani, V.P. Prasad, R. Canadian Center of Science and Education 2008-02 Article PeerReviewed application/pdf en cc_by https://eprints.nottingham.ac.uk/3051/1/1932-5864-1-PB.pdf Isa, Dino, Lee, Lam Hong, Kallimani, V.P. and Prasad, R. (2008) Polychotomiser for case-based reasoning beyond the traditional Bayesian classification approach. Computer and Information Science, 1 (1). pp. 57-68. ISSN 1913-8989 Text classification Bayes theorem Bayesian filtering Probability Case-based Reasoning http://ccsenet.org/journal/index.php/cis/article/view/1932 doi:10.5539/cis.v1n1P57
title Polychotomiser for case-based reasoning beyond the traditional Bayesian classification approach
title_sort polychotomiser for case-based reasoning beyond the traditional bayesian classification approach
topic Text classification
Bayes theorem
Bayesian filtering
Probability
Case-based Reasoning
url https://eprints.nottingham.ac.uk/3051/