The multiple outliers detection for circular univariate data using different agglomerative clustering algorithms
Circular data, also called angular data, has piqued the interest of researchers in exploring and extending outlier detection procedures for many decades. Outliers are observations that deviate significantly from, or are dissimilar to, the rest of the dataset. In univariate circul...
| Main Authors: | Nur Syahirah, Zulkipli; Siti Zanariah, Satari; Wan Nur Syahidah, Wan Yusoff |
|---|---|
| Format: | Conference or Workshop Item |
| Language: | English |
| Published: | AIP Publishing 2024 |
| Subjects: | Q Science (General); QA Mathematics |
| Online Access: | https://umpir.ump.edu.my/id/eprint/46001/ |
| _version_ | 1848827543312924672 |
|---|---|
| author | Nur Syahirah, Zulkipli; Siti Zanariah, Satari; Wan Nur Syahidah, Wan Yusoff |
| author_facet | Nur Syahirah, Zulkipli; Siti Zanariah, Satari; Wan Nur Syahidah, Wan Yusoff |
| author_sort | Nur Syahirah, Zulkipli |
| building | UMP Institutional Repository |
| collection | Online Access |
| description | Circular data, also called angular data, has piqued the interest of researchers in exploring and extending outlier detection procedures for many decades. Outliers are observations that deviate significantly from, or are dissimilar to, the rest of the dataset. In univariate circular data, the presence of outliers is claimed to affect parameter estimates and inferences. This study proposes a procedure for detecting multiple outliers in univariate circular data based on agglomerative clustering algorithms. Three circular similarity measures are used to obtain dendrograms from three agglomerative clustering algorithms. Outliers are detected by cutting the dendrogram at a specific height using a stopping rule and classifying observations that exceed the stopping rule as potential outliers. Simulation studies covering various data conditions with a certain level of contamination are conducted. The performance of the agglomerative clustering algorithms is then compared, and the best method for given data conditions is chosen. The SL-Chang, CL-Chang, and AL-Chang algorithms are found to work best in detecting outliers, with low masking and swamping effects, when the sample size is small, for any percentage of outliers and concentration parameter. Meanwhile, the SL-Satari/Di, CL-Satari/Di, and AL-Satari/Di algorithms are recommended for large sample sizes, since they perform very well in detecting outliers and have low masking and swamping effects at any percentage of outliers and concentration parameter. The proposed procedures are successfully applied to real data using a historical dataset. |
| first_indexed | 2025-11-15T04:02:23Z |
| format | Conference or Workshop Item |
| id | ump-46001 |
| institution | Universiti Malaysia Pahang |
| institution_category | Local University |
| language | English |
| last_indexed | 2025-11-15T04:02:23Z |
| publishDate | 2024 |
| publisher | AIP Publishing |
| recordtype | eprints |
| repository_type | Digital Repository |
| spelling | ump-46001 2025-10-22T04:39:56Z https://umpir.ump.edu.my/id/eprint/46001/ The multiple outliers detection for circular univariate data using different agglomerative clustering algorithms Nur Syahirah, Zulkipli; Siti Zanariah, Satari; Wan Nur Syahidah, Wan Yusoff Q Science (General) QA Mathematics AIP Publishing 2024 Conference or Workshop Item PeerReviewed pdf en https://umpir.ump.edu.my/id/eprint/46001/1/The%20multiple%20outliers%20detection%20for%20circular%20univariate%20data.pdf Nur Syahirah, Zulkipli and Siti Zanariah, Satari and Wan Nur Syahidah, Wan Yusoff (2024) The multiple outliers detection for circular univariate data using different agglomerative clustering algorithms. In: AIP Conference Proceedings. 3rd International Conference on Applied and Industrial Mathematics and Statistics 2022, ICoAIMS 2022, 24-26 August 2022, Pahang, Malaysia. pp. 1-16., 2895 (1). ISSN 0094-243X (Published) https://doi.org/10.1063/5.0192148 |
| spellingShingle | Q Science (General); QA Mathematics; Nur Syahirah, Zulkipli; Siti Zanariah, Satari; Wan Nur Syahidah, Wan Yusoff; The multiple outliers detection for circular univariate data using different agglomerative clustering algorithms |
| title | The multiple outliers detection for circular univariate data using different agglomerative clustering algorithms |
| title_sort | multiple outliers detection for circular univariate data using different agglomerative clustering algorithms |
| topic | Q Science (General); QA Mathematics |
| url | https://umpir.ump.edu.my/id/eprint/46001/ |
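
The procedure summarized in the description (circular distances, agglomerative clustering, a dendrogram cut by a stopping rule) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The record does not define the three circular similarity measures or the stopping rule, so the sketch assumes the shortest-arc distance π − |π − |θi − θj|| as one plausible circular measure and a cut height of mean plus `k` standard deviations of the merge heights. The function names `circular_distance_matrix` and `flag_outliers` and the default `k=3.0` are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def circular_distance_matrix(theta):
    """Pairwise shortest-arc distance between angles in radians.

    Assumed measure: pi - |pi - |theta_i - theta_j||, which lies in [0, pi].
    """
    diff = np.abs(theta[:, None] - theta[None, :]) % (2 * np.pi)
    return np.pi - np.abs(np.pi - diff)


def flag_outliers(theta, method="single", k=3.0):
    """Return a boolean mask marking potential outliers.

    method: "single", "complete" or "average" linkage (SL, CL, AL).
    k: multiplier in the assumed stopping rule (hypothetical default).
    """
    theta = np.asarray(theta, dtype=float)
    dist = circular_distance_matrix(theta)
    # linkage expects the condensed (upper-triangle) form of the matrix.
    tree = linkage(squareform(dist, checks=False), method=method)
    heights = tree[:, 2]
    # Assumed stopping rule: mean + k * sd of the dendrogram merge heights.
    cut = heights.mean() + k * heights.std(ddof=1)
    labels = fcluster(tree, t=cut, criterion="distance")
    # Observations outside the largest cluster are flagged as outliers.
    main_label = np.bincount(labels).argmax()
    return labels != main_label


if __name__ == "__main__":
    # Twenty angles concentrated around 0 rad plus two distant angles.
    angles = np.concatenate(
        [np.linspace(-0.3, 0.3, 20) % (2 * np.pi), [3.0, 3.1]]
    )
    print(np.flatnonzero(flag_outliers(angles, method="single")))
```

Swapping `method` between `"single"`, `"complete"`, and `"average"` mirrors the SL, CL, and AL variants named in the description. Reproducing the Chang, Satari, or Di variants would require the distance definitions from the paper itself.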