Improved robust principal component analysis based on minimum regularized covariance determinant for the detection of high leverage points in high dimensional data (penambahbaikan analisis komponen utama berdasarkan penentu kovarian teratur minimum bagi mengecam titik tuasan tinggi untuk data dimensi tinggi)
This paper presents an extension of robust principal component analysis (ROBPCA), denoted IRPCA, that improves the accuracy of detecting high leverage points (HLPs) in high dimensional data (HDD). IRPCA employs principal component analysis (PCA) to reduce the dimension of the data...
| Main Authors: | Midi, Habshah; Suhaiza, Jaaz; Mohd Aslam; Hani Syahida; Emi Amielda |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Penerbit Universiti Kebangsaan Malaysia, 2025 |
| Online Access: | http://psasir.upm.edu.my/id/eprint/120869/ http://psasir.upm.edu.my/id/eprint/120869/1/120869.pdf |
| _version_ | 1848868233785901056 |
|---|---|
| author | Midi, Habshah Suhaiza, Jaaz Mohd Aslam, . Hani Syahida, . Emi Amielda, . |
| author_sort | Midi, Habshah |
| building | UPM Institutional Repository |
| collection | Online Access |
| description | This paper presents an extension of robust principal component analysis (ROBPCA), denoted IRPCA, that improves the accuracy of detecting high leverage points (HLPs) in high dimensional data (HDD). IRPCA employs principal component analysis (PCA) to reduce the dimension of the data set; robust location and scatter estimates of the PC scores are then obtained with the Minimum Regularized Covariance Determinant (MRCD). Where ROBPCA detects HLPs with the robust score distance, the proposed IRPCA uses the robust Mahalanobis distance (RMD). The performance of IRPCA is compared with that of ROBPCA and of the Minimum Regularized Covariance Determinant and PCA-based method (MRCD-PCA) for identifying HLPs in HDD. The results indicate that all three methods detect HLPs very successfully, with no masking effect. Nonetheless, ROBPCA suffers from serious swamping at HLP percentages below 30%. The proposed IRPCA and MRCD-PCA perform similarly, with a very small swamping effect. However, the MRCD-PCA algorithm is quite cumbersome and requires a longer computational running time. The attractive feature of IRPCA is that its algorithm is simpler and very fast. |
| first_indexed | 2025-11-15T14:49:08Z |
| format | Article |
| id | upm-120869 |
| institution | Universiti Putra Malaysia |
| institution_category | Local University |
| language | English |
| last_indexed | 2025-11-15T14:49:08Z |
| publishDate | 2025 |
| publisher | Penerbit Universiti Kebangsaan Malaysia |
| recordtype | eprints |
| repository_type | Digital Repository |
| spelling | upm-120869 2025-10-14T04:09:13Z http://psasir.upm.edu.my/id/eprint/120869/ Penerbit Universiti Kebangsaan Malaysia 2025 Article PeerReviewed text en http://psasir.upm.edu.my/id/eprint/120869/1/120869.pdf Midi, Habshah and Suhaiza, Jaaz and Mohd Aslam, . and Hani Syahida, . and Emi Amielda, . (2025) Improved robust principal component analysis based on minimum regularized covariance determinant for the detection of high leverage points in high dimensional data (penambahbaikan analisis komponen utama berdasarkan penentu kovarian teratur minimum bagi mengecam titik tuasan tinggi untuk data dimensi tinggi). Sains Malaysiana, 54 (8). pp. 2087-2097. ISSN 0126-6039; eISSN: 2735-0118 https://www.ukm.my/jsm/pdf_files/SM-PDF-54-8-2025/17.pdf DOI: 10.17576/jsm-2025-5408-17 |
| title | Improved robust principal component analysis based on minimum regularized covariance determinant for the detection of high leverage points in high dimensional data (penambahbaikan analisis komponen utama berdasarkan penentu kovarian teratur minimum bagi mengecam titik tuasan tinggi untuk data dimensi tinggi) |
| url | http://psasir.upm.edu.my/id/eprint/120869/ http://psasir.upm.edu.my/id/eprint/120869/1/120869.pdf |
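
The pipeline described in the abstract (PCA for dimension reduction, a robust location and scatter estimate of the PC scores, then a robust Mahalanobis distance to flag HLPs) can be illustrated with a minimal NumPy sketch. This is not the authors' IRPCA implementation. The function names, the number of components `k`, the subset fraction `h_frac`, the fixed regularization weight `rho`, the identity target matrix, and the median plus 3 MAD cutoff are all assumptions made for illustration, and the scatter estimator is a simplified MRCD-style stand-in (concentration steps on a regularized covariance), not the full MRCD algorithm.

```python
import numpy as np


def pca_scores(X, k):
    """Project mean-centered data onto the first k principal components."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T


def mrcd_like_distances(Z, h, rho=0.1, n_iter=50):
    """Robust distances from a simplified, MRCD-style scatter estimate.

    Assumptions (not taken from the paper): scores are standardized by
    median and MAD, the target matrix is the identity, rho is fixed, and
    a single starting subset is refined by concentration steps.
    """
    k = Z.shape[1]
    med = np.median(Z, axis=0)
    mad = 1.4826 * np.median(np.abs(Z - med), axis=0)
    U = (Z - med) / mad
    # Start from the h points closest to the coordinate-wise median.
    subset = np.argsort((U ** 2).sum(axis=1))[:h]
    for _ in range(n_iter):
        mu = U[subset].mean(axis=0)
        S = np.atleast_2d(np.cov(U[subset], rowvar=False))
        # Regularized covariance: convex combination of target and S.
        K = rho * np.eye(k) + (1.0 - rho) * S
        diff = U - mu
        d2 = np.einsum("ij,ij->i", diff @ np.linalg.inv(K), diff)
        new_subset = np.argsort(d2)[:h]
        if np.array_equal(np.sort(new_subset), np.sort(subset)):
            break
        subset = new_subset
    return np.sqrt(d2)


def flag_hlps(X, k=2, h_frac=0.75, rho=0.1):
    """Flag observations whose robust distance exceeds median + 3 * MAD."""
    Z = pca_scores(X, k)
    h = int(np.ceil(h_frac * X.shape[0]))
    rmd = mrcd_like_distances(Z, h, rho)
    med = np.median(rmd)
    mad = 1.4826 * np.median(np.abs(rmd - med))
    cutoff = med + 3.0 * mad  # hypothetical cutoff rule for this sketch
    return rmd > cutoff, rmd, cutoff


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(60, 200))  # n = 60 observations, p = 200 variables
    X[:6] += 10.0                   # plant six high leverage points
    flags, rmd, cutoff = flag_hlps(X, k=2)
    print("flagged indices:", np.flatnonzero(flags))
```

In this toy setting the planted points sit far from the bulk along the first principal component, so their robust distances are much larger than those of the clean observations. Any clean observation that also exceeds the cutoff would be an instance of the swamping effect the abstract refers to.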