Polychotomiser for case-based reasoning beyond the traditional Bayesian classification approach

This work implements an enhanced Bayesian classifier that outperforms the ordinary naïve Bayes classifier on domains and datasets of varying characteristics. Text classification is an active research field within Artificial Intelligence (AI). Text classific...

Bibliographic Details
Main Authors:	Isa, Dino; Lee, Lam Hong; Kallimani, V.P.; Prasad, R.
Format:	Article
Language:	English
Published:	Canadian Center of Science and Education 2008
Subjects:	Text classification; Bayes theorem; Bayesian filtering; Probability; Case-based reasoning
Online Access:	https://eprints.nottingham.ac.uk/3051/

author	Isa, Dino; Lee, Lam Hong; Kallimani, V.P.; Prasad, R.
author_sort	Isa, Dino
building	Nottingham Research Data Repository
collection	Online Access
description	This work implements an enhanced Bayesian classifier that outperforms the ordinary naïve Bayes classifier on domains and datasets of varying characteristics. Text classification is an active research field within Artificial Intelligence (AI). It is defined as the task of learning methods for categorising collections of electronic text documents into their annotated classes, based on their contents. An increasing number of statistical approaches have been developed for text classification, including k-nearest neighbor classification, naïve Bayes classification, decision trees, rule induction, and the support vector machine (SVM), an algorithm that implements structural risk minimisation theory. Among these approaches, naïve Bayes classifiers have been widely used because of their simplicity. However, this generative method has been reported to be less accurate than discriminative methods such as the SVM. Some studies have shown that the naïve Bayes classifier nevertheless performs surprisingly well in many other domains with certain specialised characteristics. The main aim of this work is to quantify the weaknesses of traditional naïve Bayes classification and to introduce an enhanced Bayesian classification approach whose additional techniques perform better than the traditional naïve Bayes classifier. Our research goal is to develop an enhanced Bayesian probabilistic classifier by introducing different tournament-structure ranking algorithms, along with a high-relevance keyword extraction facility and a facility for accurately calculating weighting factors. These additions are intended to improve classification performance on specific datasets with different characteristics. Other researchers have used general datasets, such as Reuters-21578 and 20_newsgroups, to validate the performance of their classifiers.
Our approach adapts easily to datasets that differ in the degree of similarity between classes, the presence of multi-categorised documents, and dataset organisation. As mentioned above, we introduce several techniques such as tournament-structure ranking algorithms, higher-relevance keyword extraction, and automatically computed document dependent (ACDD) weighting factors. Each technique responds differently when applied to datasets with different characteristics, but has been shown to give outstanding performance in most cases. We have successfully optimised our techniques for individual datasets with different characteristics based on our experimental results.
first_indexed	2025-11-14T18:20:37Z
format	Article
id	nottingham-3051
institution	University of Nottingham Malaysia Campus
institution_category	Local University
language	English
last_indexed	2025-11-14T21:03:16Z
publishDate	2008
publisher	Canadian Center of Science and Education
recordtype	eprints
repository_type	Digital Repository
spelling	nottingham-3051 2025-09-10T14:44:35Z https://eprints.nottingham.ac.uk/3051/ Canadian Center of Science and Education 2008-02 Article PeerReviewed application/pdf en cc_by https://eprints.nottingham.ac.uk/3051/1/1932-5864-1-PB.pdf Isa, Dino, Lee, Lam Hong, Kallimani, V.P. and Prasad, R. (2008) Polychotomiser for case-based reasoning beyond the traditional Bayesian classification approach. Computer and Information Science, 1 (1). pp. 57-68. ISSN 1913-8989 http://ccsenet.org/journal/index.php/cis/article/view/1932 doi:10.5539/cis.v1n1P57
title	Polychotomiser for case-based reasoning beyond the traditional Bayesian classification approach
topic	Text classification; Bayes theorem; Bayesian filtering; Probability; Case-based reasoning
url	https://eprints.nottingham.ac.uk/3051/

Similar Items