Improved normalization and standardization techniques for higher purity in K-means clustering

Clustering is one of the primary data mining tools; it helps researchers understand the natural grouping of attributes in datasets. Clustering is an unsupervised classification method that aims to partition data so that objects in the same cluster are similar, and objects be...

Bibliographic Details
Main Authors:	Dalatu, Paul Inuwa; Fitrianto, Anwar; Mustapha, Aida
Format:	Article
Language:	English
Published:	Pushpa Publishing House 2016
Subjects:	Normalization; Standardization; K-means algorithm; Clustering; Purity; Rand index
Online Access:	http://psasir.upm.edu.my/id/eprint/54519/ http://psasir.upm.edu.my/id/eprint/54519/1/Improved%20normalization%20and%20standardization%20techniques%20for%20higher%20purity%20in%20K-means%20clustering.pdf

_version_	1848852567651516416
author	Dalatu, Paul Inuwa; Fitrianto, Anwar; Mustapha, Aida
author_sort	Dalatu, Paul Inuwa
building	UPM Institutional Repository
collection	Online Access
description	Clustering is one of the primary data mining tools; it helps researchers understand the natural grouping of attributes in datasets. Clustering is an unsupervised classification method that aims to partition data so that objects in the same cluster are similar and objects in different clusters differ significantly with respect to their attributes. The K-means algorithm is a well-known and fast non-hierarchical clustering technique. Because of its simplicity, the K-means algorithm has been used in many fields. This paper proposes improved normalization and standardization techniques for higher purity in K-means clustering. In experiments on benchmark datasets from the UCI Machine Learning Repository, all the proposed techniques performed much better than conventional K-means and the three classic transformations, as shown by the purity and Rand index accuracy results.
first_indexed	2025-11-15T10:40:08Z
format	Article
id	upm-54519
institution	Universiti Putra Malaysia
institution_category	Local University
language	English
last_indexed	2025-11-15T10:40:08Z
publishDate	2016
publisher	Pushpa Publishing House
recordtype	eprints
repository_type	Digital Repository
spelling	upm-54519 2018-03-27T01:36:34Z. Dalatu, Paul Inuwa and Fitrianto, Anwar and Mustapha, Aida (2016) Improved normalization and standardization techniques for higher purity in K-means clustering. Far East Journal of Mathematical Sciences, 100 (6). pp. 859-871. ISSN 0972-0871. Pushpa Publishing House, 2016-09. Article; PeerReviewed; text; en. http://www.pphmj.com/abstract/10134.htm DOI: 10.17654/MS100060859
title	Improved normalization and standardization techniques for higher purity in K-means clustering
topic	Normalization; Standardization; K-means algorithm; Clustering; Purity; Rand index
url	http://psasir.upm.edu.my/id/eprint/54519/ http://psasir.upm.edu.my/id/eprint/54519/1/Improved%20normalization%20and%20standardization%20techniques%20for%20higher%20purity%20in%20K-means%20clustering.pdf
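
The description above names purity and the Rand index as the evaluation measures. As a rough illustration only, the sketch below computes both from their standard textbook definitions; it is not the authors' code, and the function names and the small `truth`/`found` label lists are hypothetical, made up for this example rather than taken from the paper.

```python
from collections import Counter
from itertools import combinations


def purity(true_labels, cluster_labels):
    """Fraction of points that fall in the majority class of their cluster."""
    members_by_cluster = {}
    for true_label, cluster in zip(true_labels, cluster_labels):
        members_by_cluster.setdefault(cluster, []).append(true_label)
    # For each cluster, count how many points carry its most common true label.
    majority_total = sum(
        Counter(members).most_common(1)[0][1]
        for members in members_by_cluster.values()
    )
    return majority_total / len(true_labels)


def rand_index(true_labels, cluster_labels):
    """Fraction of point pairs on which the two labelings agree.

    A pair counts as an agreement when both labelings put the two points
    together, or both labelings keep them apart.
    """
    pairs = list(combinations(range(len(true_labels)), 2))
    agreements = sum(
        (true_labels[i] == true_labels[j])
        == (cluster_labels[i] == cluster_labels[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


# Hypothetical toy data: six points, two true classes, one point misassigned.
truth = [0, 0, 0, 1, 1, 1]
found = [0, 0, 1, 1, 1, 1]

print(purity(truth, found))
print(rand_index(truth, found))
```

Both measures lie between 0 and 1, with 1 meaning the clustering reproduces the reference classes exactly. The pairwise loop in `rand_index` is quadratic in the number of points, which is acceptable for a small illustration but not for large datasets.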
