Statistical estimators as an alternative to standard deviation in weighted Euclidean distance cluster analysis

Clustering is one of the primary data mining tools. It helps researchers understand the natural grouping of attributes in datasets. Clustering is an unsupervised classification method whose main aim is partitioning, such that objects in the same cluster are similar, and ob...

Bibliographic Details
Main Authors: Dalatu, Paul Inuwa, Midi, Habshah
Format: Article
Language: English
Published: Universiti Putra Malaysia Press 2018
Online Access: http://psasir.upm.edu.my/id/eprint/66312/
http://psasir.upm.edu.my/id/eprint/66312/1/17%20JST-1003-2017.pdf
_version_ 1848855533517275136
author Dalatu, Paul Inuwa
Midi, Habshah
author_facet Dalatu, Paul Inuwa
Midi, Habshah
author_sort Dalatu, Paul Inuwa
building UPM Institutional Repository
collection Online Access
description Clustering is one of the primary data mining tools. It helps researchers understand the natural grouping of attributes in datasets. Clustering is an unsupervised classification method whose main aim is partitioning, such that objects in the same cluster are similar and objects in different clusters differ significantly with respect to their attributes. However, the classical Standardized Euclidean distance, which uses the standard deviation to down-weight the maximum points of the ith feature in the distance computation, has been criticized by many scholars on the grounds that the method produces outliers, lacks robustness, and has a 0% breakdown point. It also has low efficiency under the normal distribution. To remedy the problem, we suggest two statistical estimators with a 50% breakdown point, namely the Sn and Qn estimators, which have 58% and 82% efficiency, respectively. The proposed methods evidently outperformed the existing methods in down-weighting the maximum points of the ith feature in distance-based cluster analysis.
first_indexed 2025-11-15T11:27:16Z
format Article
id upm-66312
institution Universiti Putra Malaysia
institution_category Local University
language English
last_indexed 2025-11-15T11:27:16Z
publishDate 2018
publisher Universiti Putra Malaysia Press
recordtype eprints
repository_type Digital Repository
spelling upm-663122019-02-12T07:04:42Z http://psasir.upm.edu.my/id/eprint/66312/ Statistical estimators as an alternative to standard deviation in weighted Euclidean distance cluster analysis Dalatu, Paul Inuwa Midi, Habshah Clustering is basically one of the major sources of primary data mining tools. It makes researchers understand the natural grouping of attributes in datasets. Clustering is an unsupervised classification method with the major aim of partitioning, where objects in the same cluster are similar, and objects which belong to different clusters vary significantly, with respect to their attributes. However, the classical Standardized Euclidean distance, which uses standard deviation to down weight maximum points of the ith features on the distance clusters, has been criticized by many scholars that the method produces outliers, lack robustness, and has 0% breakdown points. It also has low efficiency in normal distribution. Therefore, to remedy the problem, we suggest two statistical estimators which have 50% breakdown points namely the Sn and Qn estimators, with 58% and 82% efficiency, respectively. The proposed methods evidently outperformed the existing methods in down weighting the maximum points of the ith features in distance-based clustering analysis. Universiti Putra Malaysia Press 2018 Article PeerReviewed text en http://psasir.upm.edu.my/id/eprint/66312/1/17%20JST-1003-2017.pdf Dalatu, Paul Inuwa and Midi, Habshah (2018) Statistical estimators as an alternative to standard deviation in weighted Euclidean distance cluster analysis. Pertanika Journal of Science & Technology, 26 (4). pp. 1823-1836. ISSN 0128-7680; ESSN: 2231-8526 http://www.pertanika.upm.edu.my/Pertanika%20PAPERS/JST%20Vol.%2026%20(4)%20Oct.%202018/17%20JST-1003-2017.pdf
spellingShingle Dalatu, Paul Inuwa
Midi, Habshah
Statistical estimators as an alternative to standard deviation in weighted Euclidean distance cluster analysis
title Statistical estimators as an alternative to standard deviation in weighted Euclidean distance cluster analysis
title_full Statistical estimators as an alternative to standard deviation in weighted Euclidean distance cluster analysis
title_fullStr Statistical estimators as an alternative to standard deviation in weighted Euclidean distance cluster analysis
title_full_unstemmed Statistical estimators as an alternative to standard deviation in weighted Euclidean distance cluster analysis
title_short Statistical estimators as an alternative to standard deviation in weighted Euclidean distance cluster analysis
title_sort statistical estimators as an alternative to standard deviation in weighted euclidean distance cluster analysis
url http://psasir.upm.edu.my/id/eprint/66312/
http://psasir.upm.edu.my/id/eprint/66312/1/17%20JST-1003-2017.pdf
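
The description contrasts the standard deviation with the Sn and Qn estimators as per-feature weights in a scaled Euclidean distance. The following is a minimal sketch of that idea, not the authors' implementation. It assumes the usual Rousseeuw and Croux definitions of Sn and Qn with their asymptotic consistency constants (1.1926 and 2.2219), omits finite-sample correction factors, and uses naive O(n^2) computations. The function names `sn_scale`, `qn_scale`, `feature_scales`, and `weighted_euclidean` are hypothetical and chosen only for illustration.

```python
import numpy as np


def sn_scale(x):
    """Naive O(n^2) Sn estimate: c * lomed_i { himed_j |x_i - x_j| }."""
    x = np.asarray(x, dtype=float)
    n = x.size
    diffs = np.abs(x[:, None] - x[None, :])  # n x n, diagonal is zero
    # High median over j: order statistic floor(n/2) + 1 (1-based).
    inner = np.sort(diffs, axis=1)[:, n // 2]
    # Low median over i: order statistic floor((n + 1) / 2) (1-based).
    outer = np.sort(inner)[(n + 1) // 2 - 1]
    return 1.1926 * outer  # assumed asymptotic consistency constant


def qn_scale(x):
    """Naive O(n^2) Qn estimate: d * k-th smallest |x_i - x_j|, i < j."""
    x = np.asarray(x, dtype=float)
    n = x.size
    i, j = np.triu_indices(n, k=1)  # all pairs with i < j
    diffs = np.sort(np.abs(x[i] - x[j]))
    h = n // 2 + 1
    k = h * (h - 1) // 2  # k = C(h, 2)
    return 2.2219 * diffs[k - 1]  # assumed asymptotic consistency constant


def feature_scales(X, estimator):
    """Apply a scale estimator to every column (feature) of X."""
    X = np.asarray(X, dtype=float)
    return np.array([estimator(X[:, col]) for col in range(X.shape[1])])


def weighted_euclidean(a, b, scale):
    """Euclidean distance with each feature difference divided by its scale."""
    a, b, scale = (np.asarray(v, dtype=float) for v in (a, b, scale))
    return float(np.sqrt(np.sum(((a - b) / scale) ** 2)))


if __name__ == "__main__":
    # Toy data (made up for illustration): the last row is an outlier.
    X = np.array([[1.0, 10.0], [2.0, 11.0], [3.0, 12.0], [4.0, 13.0], [100.0, 14.0]])
    for name, est in [("std", lambda v: np.std(v, ddof=1)),
                      ("Sn", sn_scale), ("Qn", qn_scale)]:
        s = feature_scales(X, est)
        print(name, s, weighted_euclidean(X[0], X[1], s))
```

In this toy example the single outlier inflates the standard deviation of the first feature, which shrinks that feature's contribution to every distance, while the Sn and Qn scales are driven by the bulk of the data. If a feature has many tied values, either estimator can return zero, so a real implementation would need to guard against division by zero.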