Feature selection for traditional Malay musical instrument sound classification using rough set
Main Author: | |
---|---|
Format: | Thesis |
Published: | 2013 |
Subjects: | |
Online Access: | http://eprints.uthm.edu.my/3951/ http://eprints.uthm.edu.my/3951/1/NORHALINA_SENAN_1.pdf |
Summary: | With the growing volume of data and feature (attribute) schemes, feature selection
has become a vital aspect of many data mining tasks, including the classification of
musical instrument sounds. The purpose of feature selection is to
alleviate the effect of the 'curse of dimensionality', a problem that normally arises
from irrelevant and redundant features. Using the whole set of features is also
inefficient in terms of processing time and storage requirements. In addition, the full set may be
difficult to interpret and may decrease classification performance. To
solve this problem, various feature selection techniques have been proposed in this
area of research. One potential technique is based on rough set theory.
Rough set theory, proposed by Pawlak in the 1980s, is a mathematical tool for
dealing with vague and uncertain data. The concepts of reduct and core in
rough set theory are relevant to feature selection because they identify the important features among
the irrelevant and redundant ones. However, existing rough set-based feature selection
techniques share two common problems: there is no guarantee of
finding an optimal reduction, and finding the optimal ones has high complexity. Thus, in
this study, an alternative feature selection technique based on rough set theory was
proposed for classifying traditional Malay musical instrument sounds. The
technique was developed using rough set approximation based on the maximum
degree of dependency of attributes. The idea was to choose the most
significant features by ranking the relevant features by their degree of dependency
and then removing redundant features that share a similar dependency
value. Overall, the results showed that the proposed technique was able to select
17 important features out of the full set of 37 (a 54% reduction), achieve an
average accuracy of 98.84%, and significantly reduce the complexity of the process
(with a processing time of less than 1 second). |
---|
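The summary describes ranking features by their degree of dependency and then dropping features with a similar dependency value. The sketch below illustrates that idea with the standard rough set dependency degree, the size of the positive region divided by the number of objects. It is not the thesis's implementation: the function names, the toy table, the rule that "relevant" means a non-zero dependency, and the rule that "similar" means equal after rounding are all assumptions made for illustration.

```python
from collections import defaultdict


def dependency_degree(rows, cond_attrs, decision):
    """Return |POS_C(D)| / |U| for a decision table given as a list of dicts."""
    # Group objects into equivalence classes by their values on cond_attrs.
    classes = defaultdict(list)
    for row in rows:
        classes[tuple(row[a] for a in cond_attrs)].append(row[decision])
    # A class belongs to the positive region when all its objects share one decision.
    positive = sum(len(d) for d in classes.values() if len(set(d)) == 1)
    return positive / len(rows)


def select_features(rows, features, decision):
    """Rank features by single-attribute dependency, then drop ties.

    Assumptions (not taken from the thesis): a feature is 'relevant' when its
    dependency is non-zero, and two features are 'similar' when their
    dependency values are equal after rounding to 6 decimals.
    """
    scored = [(dependency_degree(rows, [f], decision), f) for f in features]
    scored.sort(key=lambda pair: -pair[0])  # stable: input order kept among ties
    kept, seen = [], set()
    for score, feat in scored:
        key = round(score, 6)
        if score > 0 and key not in seen:
            kept.append(feat)
            seen.add(key)
    return kept


# Hypothetical toy decision table: four objects, four discretised features.
toy_rows = [
    {"a": 0, "b": 0, "c": 0, "e": 0, "cls": "x"},
    {"a": 0, "b": 0, "c": 1, "e": 1, "cls": "x"},
    {"a": 1, "b": 1, "c": 0, "e": 1, "cls": "y"},
    {"a": 1, "b": 1, "c": 1, "e": 2, "cls": "y"},
]

selected = select_features(toy_rows, ["a", "b", "c", "e"], "cls")
print(selected)
```

In this toy table, `a` and `b` both determine the class completely, so only the first of them is kept. `e` determines the class for two of the four objects, and `c` carries no information and is discarded. Real audio features are continuous and would need to be discretised before a table like this could be built.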