The effect of different similarity distance measures in detecting outliers using single-linkage clustering algorithm for univariate circular biological data
Clustering algorithms can be used to create an outlier detection procedure in univariate circular data. The circular distance between each point of angular observation in circular data is used to calculate the similarity measure to appropriately group observations. In this paper, we present a cluste...
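The procedure named in the title (pairwise circular distances, single-linkage clustering, a dendrogram cut, then flagging isolated observations) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. This record does not reproduce the Satari, Di or Chang-Chien formulas, so the shortest-arc distance `circular_distance` stands in as an assumption. The function names, the example angles, the cut height and the rule "flag clusters of at most `max_outlier_cluster_size` points" are likewise hypothetical. The sketch relies on the fact that cutting a single-linkage dendrogram at height h yields the connected components of the graph linking every pair whose distance is at most h.

```python
import math


def circular_distance(a, b):
    """Shortest arc between two angles in radians; result lies in [0, pi].

    Assumption: stands in for the paper's distance measures, whose
    formulas are not given in this record.
    """
    d = abs(a - b) % (2 * math.pi)
    return math.pi - abs(math.pi - d)


def single_linkage_clusters(angles, cut_height, dist=circular_distance):
    """Clusters obtained by cutting a single-linkage dendrogram at cut_height.

    Equivalent to the connected components of the graph that links every
    pair of observations with dist <= cut_height (union-find below).
    """
    n = len(angles)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if dist(angles[i], angles[j]) <= cut_height:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())


def flag_outliers(angles, cut_height, max_outlier_cluster_size=1):
    """Indices of points in small clusters (hypothetical flagging rule)."""
    if not angles:
        return []
    clusters = single_linkage_clusters(angles, cut_height)
    largest = max(len(c) for c in clusters)
    return sorted(
        i
        for c in clusters
        if len(c) <= max_outlier_cluster_size and len(c) < largest
        for i in c
    )


# Hypothetical data: six directions near 0 degrees and one near 180 degrees.
angles_deg = [10, 15, 20, 25, 355, 350, 180]
angles = [math.radians(x) for x in angles_deg]
outliers = flag_outliers(angles, cut_height=math.radians(20))
print(outliers)
```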
| Main Authors: | Nur Syahirah, Zulkipli; Siti Zanariah, Satari; Wan Nur Syahidah, Wan Yusoff |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | PJSOR |
| Subjects: | HA Statistics; QA Mathematics |
| Online Access: | http://umpir.ump.edu.my/id/eprint/35453/ http://umpir.ump.edu.my/id/eprint/35453/1/Zulkipli%20et%20al.%20PJSOR.pdf |
| _version_ | 1848824782219378688 |
|---|---|
| author | Nur Syahirah, Zulkipli; Siti Zanariah, Satari; Wan Nur Syahidah, Wan Yusoff |
| author_facet | Nur Syahirah, Zulkipli; Siti Zanariah, Satari; Wan Nur Syahidah, Wan Yusoff |
| author_sort | Nur Syahirah, Zulkipli |
| building | UMP Institutional Repository |
| collection | Online Access |
| description | Clustering algorithms can be used to build an outlier detection procedure for univariate circular data. The circular distance between each pair of angular observations is used to calculate a similarity measure that groups the observations appropriately. In this paper, we present a clustering-based procedure for detecting outliers in univariate circular biological data using various similarity distance measures. Three circular similarity distance measures (Satari distance, Di distance and Chang-Chien distance) were used to detect outliers with a single-linkage clustering algorithm. For univariate circular data, the Satari and Di distances have similar formulas. This study aims to develop the proposed clustering-based procedure and to demonstrate its effectiveness in detecting outliers with various similarity distance measures. The SL-Satari/Di procedure was compared with other similarity measures, including SL-Chang, at various dendrogram cutting points. A clustering-based procedure using a single-linkage algorithm with various similarity distances was found to be a practical and promising approach to detecting outliers in univariate circular data, particularly biological data. According to the results, the SL-Satari/Di distance outperformed the SL-Chang distance for certain data conditions. |
| first_indexed | 2025-11-15T03:18:30Z |
| format | Article |
| id | ump-35453 |
| institution | Universiti Malaysia Pahang |
| institution_category | Local University |
| language | English |
| last_indexed | 2025-11-15T03:18:30Z |
| publisher | PJSOR |
| recordtype | eprints |
| repository_type | Digital Repository |
| spelling | ump-35453 2022-10-17T05:00:26Z http://umpir.ump.edu.my/id/eprint/35453/ Nur Syahirah, Zulkipli and Siti Zanariah, Satari and Wan Nur Syahidah, Wan Yusoff. The effect of different similarity distance measures in detecting outliers using single-linkage clustering algorithm for univariate circular biological data. Pakistan Journal of Statistics and Operation Research, 18 (3). pp. 561-573. ISSN 2220-5810. (Published) PJSOR. Article. PeerReviewed. pdf. en. http://umpir.ump.edu.my/id/eprint/35453/1/Zulkipli%20et%20al.%20PJSOR.pdf http://dx.doi.org/10.18187/pjsor.v18i3.3982 |
| title | The effect of different similarity distance measures in detecting outliers using single-linkage clustering algorithm for univariate circular biological data |
| title_sort | effect of different similarity distance measures in detecting outliers using single-linkage clustering algorithm for univariate circular biological data |
| topic | HA Statistics QA Mathematics |
| url | http://umpir.ump.edu.my/id/eprint/35453/ http://umpir.ump.edu.my/id/eprint/35453/1/Zulkipli%20et%20al.%20PJSOR.pdf |