The performance of K-Means and K-Modes clustering to identify cluster in numerical data

Cluster analysis is the formal study of methods and algorithms for grouping objects naturally according to their perceived intrinsic characteristics and the measured similarities within each group. The pattern of each cluster and the relationships between clusters are identified, then they are...

Bibliographic Details
Main Author: Mohd Zulhimi Bin Zolkefali
Other Authors: Nur Atiqah Hamzah
Format: Journal
Published: Journal of Science and Technology, Universiti Tun Hussein Onn Malaysia 2017
Subjects: Physics
Online Access: http://www.myjurnal.my/public/article-view.php?id=111650
id oai:www.myjurnal.my:111650
recordtype eprints
spelling oai:www.myjurnal.my:111650 2018-09-20T00:00:00Z application/pdf www.myjurnal.my/filebank/published_article/606075.pdf
repository_type Digital Repository
institution_category Local Institution
institution MyJournal
building MyJournal Repository
collection Online Access
topic Physics
description Cluster analysis is the formal study of methods and algorithms for grouping objects naturally according to their perceived intrinsic characteristics and the measured similarities within each group. The pattern of each cluster and the relationships between clusters are identified and then related to the frequency of occurrence in the data set. The mean and the mode are measures of central tendency in a distribution, and in clustering each can serve as the basis of a technique for discovering the clusters that exist in a data set. This study therefore compares the performance of the K-means and K-modes clustering techniques in finding the clusters that exist in numerical data. K-modes is usually applied to categorical data, whereas K-means is applied to numerical data; in this study, however, both methods are used to cluster numerical data. The performance of the two methods is demonstrated using output from R software, and the results are compared to determine which method gives the better output. In conclusion, the efficiency of each method is presented.
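The contrast the abstract draws between the two techniques can be sketched in a few lines. The block below is an illustrative Python sketch, not the authors' R code: it assumes squared Euclidean distance with mean centres for K-means, simple matching dissimilarity (the count of differing attributes) with mode centres for K-modes, and a hypothetical deterministic start that seeds the centres with the first k points. The toy data and all function names are made up for illustration.

```python
from collections import Counter


def sq_euclidean(a, b):
    """Squared Euclidean distance, the usual K-means dissimilarity."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mismatches(a, b):
    """Simple matching dissimilarity: number of attributes that differ."""
    return sum(x != y for x, y in zip(a, b))


def mean_center(members):
    """Column-wise mean of the cluster members."""
    n = len(members)
    return tuple(sum(col) / n for col in zip(*members))


def mode_center(members):
    """Column-wise mode (most frequent value) of the cluster members."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*members))


def cluster(points, k, distance, update, max_iter=100):
    """Alternate assignment and centre update until the labels stop changing."""
    centers = list(points[:k])  # assumption: seed with the first k points
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: distance(p, centers[j])) for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centre if a cluster becomes empty
                centers[j] = update(members)
    return labels, centers


# Hypothetical toy data: two well-separated groups of numeric points.
data = [(1.0, 1.0), (8.0, 8.0), (1.0, 2.0), (9.0, 8.0), (2.0, 1.0), (8.0, 9.0)]

km_labels, km_centers = cluster(data, 2, sq_euclidean, mean_center)
kmo_labels, kmo_centers = cluster(data, 2, mismatches, mode_center)

print("K-means labels:", km_labels, "centres:", km_centers)
print("K-modes labels:", kmo_labels, "centres:", kmo_centers)
```

On this toy data the two procedures recover the same grouping because attribute values repeat within each group. With continuous measurements that rarely repeat, matching dissimilarity treats any two unequal values as equally different regardless of how close they are, so the K-modes step carries much less information about the data; that is one reason to expect the two methods to behave differently on numerical data.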
author2 Nur Atiqah Hamzah
format Journal
author Mohd Zulhimi Bin Zolkefali
title The performance of K-Means and K-Modes clustering to identify cluster in numerical data
publisher Journal of Science and Technology, Universiti Tun Hussein Onn Malaysia
publishDate 2017
url http://www.myjurnal.my/public/article-view.php?id=111650
first_indexed 2018-09-20T16:01:59Z
last_indexed 2018-09-20T16:01:59Z