The multiple outliers detection for circular univariate data using different agglomerative clustering algorithms

Circular data, also called angular data, has attracted the interest of researchers in exploring and extending outlier detection procedures for many decades. Outliers are observations that deviate significantly from, or are dissimilar to, the rest of the dataset. In univariate circul...


Bibliographic Details
Main Authors: Nur Syahirah, Zulkipli; Siti Zanariah, Satari; Wan Nur Syahidah, Wan Yusoff
Format: Conference or Workshop Item
Language: English
Published: AIP Publishing 2024
Subjects: Q Science (General); QA Mathematics
Online Access: https://umpir.ump.edu.my/id/eprint/46001/
building UMP Institutional Repository
collection Online Access
description Circular data, also called angular data, has attracted the interest of researchers in exploring and extending outlier detection procedures for many decades. Outliers are observations that deviate significantly from, or are dissimilar to, the rest of the dataset. In univariate circular data, the presence of outliers is claimed to affect parameter estimates and inferences. This study proposes a procedure for detecting multiple outliers in univariate circular data based on agglomerative clustering algorithms. Three circular similarity measures are used to obtain the dendrogram from three agglomerative clustering algorithms. Outliers are detected by cutting the dendrogram at a specific height given by a stopping rule and classifying the observations that exceed the stopping rule as potential outliers. Simulation studies covering various data conditions with a certain level of contamination are conducted. The performance of the agglomerative clustering algorithms is then compared, and the best method for each data condition is chosen. The SL-Chang, CL-Chang and AL-Chang algorithms are found to work best, detecting outliers with low masking and swamping effects, when the sample size is small, for any percentage of outliers and any concentration parameter. The SL-Satari/Di, CL-Satari/Di and AL-Satari/Di algorithms are recommended for large sample sizes, since they detect outliers very well and have low masking and swamping effects at any percentage of outliers and any concentration parameter. The proposed procedures are successfully applied to real data using a historical dataset.
id ump-46001
institution Universiti Malaysia Pahang
institution_category Local University
recordtype eprints
repository_type Digital Repository
Citation: Nur Syahirah, Zulkipli and Siti Zanariah, Satari and Wan Nur Syahidah, Wan Yusoff (2024) The multiple outliers detection for circular univariate data using different agglomerative clustering algorithms. In: AIP Conference Proceedings. 3rd International Conference on Applied and Industrial Mathematics and Statistics 2022, ICoAIMS 2022, 24 - 26 August 2022, Pahang, Malaysia. pp. 1-16., 2895 (1). ISSN 0094-243X (Published)
Full text (pdf, en, PeerReviewed): https://umpir.ump.edu.my/id/eprint/46001/1/The%20multiple%20outliers%20detection%20for%20circular%20univariate%20data.pdf
DOI: https://doi.org/10.1063/5.0192148
topic Q Science (General)
QA Mathematics