Case Slicing Technique for Feature Selection

One of the problems addressed by machine learning is data classification. Finding a good classification algorithm is an important component of many data mining projects. Since the 1960s, many algorithms for data classification have been proposed. Data mining researchers often use classifiers to id...


Bibliographic Details
Main Author: A. Shiba, Omar A.
Format: Thesis
Language: English
Published: 2004
Subjects:
Online Access: http://psasir.upm.edu.my/id/eprint/5838/
http://psasir.upm.edu.my/id/eprint/5838/1/FSKTM_2004_6%20IR.pdf
building UPM Institutional Repository
collection Online Access
description One of the problems addressed by machine learning is data classification. Finding a good classification algorithm is an important component of many data mining projects. Since the 1960s, many algorithms for data classification have been proposed. Data mining researchers often use classifiers to identify important classes of objects within a data repository. This research undertakes two main tasks. The first task is to introduce a slicing technique for feature subset selection. The second task is to enhance classification accuracy based on the first task, so that objects or cases can be classified using only the selected relevant features. This new approach is called the Case Slicing Technique (CST). Applying this technique to classification tasks can further enhance case classification accuracy. CST helps identify the subset of features used in computing the similarity measures needed by classification algorithms. CST was tested on nine datasets from the UCI machine learning repositories and domain theories. The maximum and minimum accuracies obtained are 99% and 96%, respectively, based on the evaluation approach. The most commonly used evaluation technique is k-fold cross-validation. This technique, with k = 10, has been used in this thesis to evaluate the proposed approach. CST was compared to other selected classification methods based on feature subset selection, such as Induction of Decision Tree Algorithm (ID3), Base Learning Algorithm K-Nearest Neighbour Algorithm (k-NN) and Naïve Bayes Algorithm (NB). All these approaches are implemented with the RELIEF feature selection approach.
The classification accuracy obtained from the CST method is also compared to other selected classification methods such as Value Difference Metric (VDM), Pre-Category Feature Importance (PCF), Cross-Category Feature Importance (CCF), Instance-Based Algorithm (IB4), Decision Tree Algorithms such as Induction of Decision Tree Algorithm (ID3) and Base Learning Algorithm (C4.5), Rough Set methods such as Standard Integer Programming (SIP) and Decision Related Integer Programming (DRIP), and Neural Network methods such as the Multilayer method.
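The evaluation setup described in the abstract (a similarity measure computed over a selected feature subset, scored with 10-fold cross-validation) can be illustrated with a minimal sketch. This is not the thesis's CST implementation: the feature subset is chosen by hand rather than by slicing, the classifier is a plain k-NN with squared Euclidean distance, and every function name and the toy dataset are hypothetical.

```python
import random
from collections import Counter


def k_fold_indices(n, k=10, seed=0):
    """Shuffle range(n) and split it into k nearly equal, disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def subset_distance(a, b, features):
    """Squared Euclidean distance using only the selected feature indices."""
    return sum((a[f] - b[f]) ** 2 for f in features)


def knn_predict(train, query, features, k_neighbours=3):
    """Majority vote among the k nearest training cases."""
    nearest = sorted(train, key=lambda case: subset_distance(case[0], query, features))
    votes = Counter(label for _, label in nearest[:k_neighbours])
    return votes.most_common(1)[0][0]


def cross_validate(cases, features, k=10, seed=0):
    """Return overall accuracy across k folds (each case is tested exactly once)."""
    correct = 0
    for fold in k_fold_indices(len(cases), k, seed):
        held_out = set(fold)
        train = [c for i, c in enumerate(cases) if i not in held_out]
        for i in fold:
            x, y = cases[i]
            correct += knn_predict(train, x, features) == y
    return correct / len(cases)


def make_toy_cases(n=100, seed=1):
    """Hypothetical data: feature 0 separates the classes, feature 1 is noise."""
    rng = random.Random(seed)
    cases = []
    for i in range(n):
        label = i % 2
        informative = label * 10.0 + rng.uniform(-1, 1)
        noise = rng.uniform(-100, 100)
        cases.append(((informative, noise), label))
    return cases


if __name__ == "__main__":
    cases = make_toy_cases()
    print("selected feature only:", cross_validate(cases, features=[0]))
    print("all features:", cross_validate(cases, features=[0, 1]))
```

On this toy data, restricting the distance to the informative feature keeps the irrelevant one from dominating the similarity measure, which is the general motivation for feature subset selection; it says nothing about the accuracies reported in the thesis.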
id upm-5838
institution Universiti Putra Malaysia
institution_category Local University
recordtype eprints
repository_type Digital Repository
spelling upm-5838 2022-01-05T02:32:46Z http://psasir.upm.edu.my/id/eprint/5838/ Case Slicing Technique for Feature Selection A. Shiba, Omar A. 2004-06 Thesis NonPeerReviewed text en http://psasir.upm.edu.my/id/eprint/5838/1/FSKTM_2004_6%20IR.pdf A. Shiba, Omar A. (2004) Case Slicing Technique for Feature Selection. Doctoral thesis, Universiti Putra Malaysia. Machine learning Classification Data mining
title Case Slicing Technique for Feature Selection
topic Machine learning
Classification
Data mining