Comparison of Robust Estimators’ Performance for Detecting Outliers in Multivariate Data

In multivariate data, outliers are difficult to detect, especially as the dimension of the data increases. Mahalanobis distance (MD) is one of the classical methods for detecting outliers in multivariate data. However, the classical mean and covariance matrix in MD suffer from masking and swamp...

Bibliographic Details
Main Authors: Sharifah Sakinah, Syed Abd Mutalib, Siti Zanariah, Satari, Wan Nur Syahidah, Wan Yusoff
Format: Article
Language: English
Published: Universiti Teknologi MARA (UiTM) 2021
Subjects:
Online Access: http://umpir.ump.edu.my/id/eprint/32427/
http://umpir.ump.edu.my/id/eprint/32427/8/Comparison%20of%20Robust%20Estimators.pdf
author Sharifah Sakinah, Syed Abd Mutalib
Siti Zanariah, Satari
Wan Nur Syahidah, Wan Yusoff
building UMP Institutional Repository
collection Online Access
description In multivariate data, outliers are difficult to detect, especially as the dimension of the data increases. Mahalanobis distance (MD) is one of the classical methods for detecting outliers in multivariate data. However, the classical mean and covariance matrix used in MD suffer from masking and swamping effects when the data contain outliers. Because of this problem, many studies use a robust estimator in place of the classical estimators of the mean and covariance matrix. In this study, the performance of five robust estimators, namely Fast Minimum Covariance Determinant (FMCD), Minimum Vector Variance (MVV), Covariance Matrix Equality (CME), Index Set Equality (ISE), and Test on Covariance (TOC), is investigated and compared. FMCD is widely used and is regarded as among the best robust estimators. However, there are certain conditions under which FMCD still falls short. MVV, CME, ISE and TOC are innovations on FMCD: these four robust estimators improve the last step of the FMCD algorithm. Hence, the objective of this study is to observe the performance of these five estimators in detecting outliers in multivariate data, with particular attention to TOC, the latest of them. Simulation studies are conducted for two outlier scenarios under various conditions. Three performance measures, pout, pmask and pswamp, are used to assess the robust estimators. TOC gives better performance in pswamp for most conditions, and better results for pout and pmask for certain conditions.
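The masking effect mentioned in the description can be illustrated with a minimal sketch. It is not the authors' method and does not implement FMCD, MVV, CME, ISE or TOC. The toy data, the 0.975 chi-square cutoff, the function names, and the definitions of the masking and swamping rates are all assumptions made for illustration; the record does not define pout, pmask or pswamp. The "oracle" estimate, computed from the clean points only, merely stands in for what a robust estimator aims to recover.

```python
import math


def mean_and_cov(points):
    """Sample mean and covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def squared_md(point, center, cov):
    """Squared Mahalanobis distance, using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - center[0], point[1] - center[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


def flag_outliers(points, center, cov, prob=0.975):
    """Flag points whose squared MD exceeds the chi-square quantile.

    For 2 degrees of freedom the quantile is -2 * ln(1 - prob).
    The 0.975 level is an assumed, commonly used choice.
    """
    cutoff = -2.0 * math.log(1.0 - prob)
    return [squared_md(p, center, cov) > cutoff for p in points]


def masking_and_swamping(flags, is_outlier):
    """Assumed definitions: share of true outliers left unflagged (masking)
    and share of clean points wrongly flagged (swamping)."""
    n_out = sum(is_outlier)
    n_in = len(is_outlier) - n_out
    masked = sum(1 for f, o in zip(flags, is_outlier) if o and not f)
    swamped = sum(1 for f, o in zip(flags, is_outlier) if f and not o)
    return masked / n_out, swamped / n_in


# Hypothetical toy data: eight clean points and two planted outliers.
clean = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, 1), (-1, -1), (1, -1), (-1, 1)]
planted = [(6, 6), (7, 7)]
data = clean + planted
truth = [False] * len(clean) + [True] * len(planted)

# Classical estimates use every point, including the outliers.
classical_rates = masking_and_swamping(
    flag_outliers(data, *mean_and_cov(data)), truth)

# Oracle estimates use only the clean points (a stand-in for a robust fit).
oracle_rates = masking_and_swamping(
    flag_outliers(data, *mean_and_cov(clean)), truth)

print("classical (masking, swamping):", classical_rates)
print("oracle    (masking, swamping):", oracle_rates)
```

In this toy case the planted outliers inflate the classical covariance enough that neither of them exceeds the cutoff, while the estimates computed from the clean points flag both.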
first_indexed 2025-11-15T03:06:19Z
format Article
id ump-32427
institution Universiti Malaysia Pahang
institution_category Local University
language English
last_indexed 2025-11-15T03:06:19Z
publishDate 2021
publisher Universiti Teknologi MARA (UiTM)
recordtype eprints
repository_type Digital Repository
spelling ump-32427 2021-10-28T08:21:35Z http://umpir.ump.edu.my/id/eprint/32427/ Universiti Teknologi MARA (UiTM) 2021-10-15 Article PeerReviewed pdf en http://umpir.ump.edu.my/id/eprint/32427/8/Comparison%20of%20Robust%20Estimators.pdf Sharifah Sakinah, Syed Abd Mutalib and Siti Zanariah, Satari and Wan Nur Syahidah, Wan Yusoff (2021) Comparison of Robust Estimators’ Performance for Detecting Outliers in Multivariate Data. Journal of Statistical Modeling and Analytics, 3 (3). pp. 36-64. ISSN 2180-3102. (Published) https://ejournal.um.edu.my/index.php/JOSMA/article/view/32399
title Comparison of Robust Estimators’ Performance for Detecting Outliers in Multivariate Data
topic QA Mathematics
url http://umpir.ump.edu.my/id/eprint/32427/
http://umpir.ump.edu.my/id/eprint/32427/8/Comparison%20of%20Robust%20Estimators.pdf