Robust correlation feature selection based support vector machine approach for high dimensional datasets

Correlation-based feature selection methods are popular tools for selecting the most important variables to include in the true model when analyzing sparse, high-dimensional models. In applications, anomalous observations in both predictors and responses can seriously jeopardize...


Bibliographic Details
Main Authors:	Baba, Ishaq Abdullahi; Mohammed, Mohammed Bappah; Jillahi, Kamal Bakari; Umar, Aliyu; Hendi, Hasan Talib
Format:	Article
Language:	English
Published:	Elsevier B.V. 2025
Online Access:	http://psasir.upm.edu.my/id/eprint/120119/ http://psasir.upm.edu.my/id/eprint/120119/1/120119.pdf

_version_	1848868116339097600
author	Baba, Ishaq Abdullahi; Mohammed, Mohammed Bappah; Jillahi, Kamal Bakari; Umar, Aliyu; Hendi, Hasan Talib
author_sort	Baba, Ishaq Abdullahi
building	UPM Institutional Repository
collection	Online Access
description	Correlation-based feature selection methods are popular tools for selecting the most important variables to include in the true model when analyzing sparse, high-dimensional models. In applications, anomalous observations in both predictors and responses can seriously jeopardize the prediction accuracy of the model, which in turn leads to misleading interpretations and conclusions if not correctly addressed. Furthermore, the curse of dimensionality is another serious difficulty facing many existing feature selection algorithms. To achieve more reliable feature selection and prediction accuracy, a weighted sure independence screening-based support vector machine for high-dimensional datasets is proposed. The key contribution of our proposed method is that it minimizes the influence of outliers when differentiating between significant and insignificant features, and improves predictability and interpretability. Our method consists of three basic steps. In the first step, weights based on a modified reweighted fast, consistent, and high-breakdown-point estimator are computed. The second step uses the weight estimates from the first step to select the most important variables for the model. The third step employs the support vector machine algorithm to calculate predicted values. To demonstrate the effectiveness of the developed procedure, we used both simulated and real-life data examples. Our results show that the proposed method performs better, by a clear margin, than other procedures.
first_indexed	2025-11-15T14:47:16Z
format	Article
id	upm-120119
institution	Universiti Putra Malaysia
institution_category	Local University
language	English
last_indexed	2025-11-15T14:47:16Z
publishDate	2025
publisher	Elsevier B.V.
recordtype	eprints
repository_type	Digital Repository
spelling	upm-120119 (2025-09-23T07:29:05Z). Baba, Ishaq Abdullahi and Mohammed, Mohammed Bappah and Jillahi, Kamal Bakari and Umar, Aliyu and Hendi, Hasan Talib (2025) Robust correlation feature selection based support vector machine approach for high dimensional datasets. Results in Control and Optimization, 21. art. no. 100609. pp. 1-14. ISSN 2666-7207. Elsevier B.V., 2025-12. Article, PeerReviewed, text, en, cc_by_nc_nd_4. http://psasir.upm.edu.my/id/eprint/120119/1/120119.pdf https://linkinghub.elsevier.com/retrieve/pii/S2666720725000943 doi:10.1016/j.rico.2025.100609
title	Robust correlation feature selection based support vector machine approach for high dimensional datasets
url	http://psasir.upm.edu.my/id/eprint/120119/ http://psasir.upm.edu.my/id/eprint/120119/1/120119.pdf
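
The first two of the three steps outlined in the description field (outlier-resistant weights, then weighted correlation screening) can be illustrated with a minimal sketch. This is not the authors' implementation: the weights below come from a simple median/MAD rule that stands in for the modified reweighted fast, consistent, and high-breakdown-point estimator named in the abstract, and every function name, the cutoff of 3, and the toy data are assumptions made for illustration. The default number of retained features, n / log(n), is the size commonly used with sure independence screening.

```python
import math
import statistics


def robust_z(values):
    """Robust z-scores from the median and the MAD (scaled by 1.4826)."""
    values = list(values)
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values) * 1.4826
    if mad == 0:
        return [0.0 for _ in values]
    return [(v - med) / mad for v in values]


def outlier_weights(X, y, cutoff=3.0):
    """Hard-rejection weights: 1.0 keeps a row, 0.0 flags it as outlying.

    ASSUMPTION: a simple median/MAD stand-in, not the paper's estimator.
    """
    n, p = len(X), len(X[0])
    columns = [[row[j] for row in X] for j in range(p)] + [list(y)]
    z_cols = [robust_z(col) for col in columns]
    # Root-mean-square robust z-score of each row across all columns.
    dist = [math.sqrt(sum(z[i] ** 2 for z in z_cols) / len(z_cols))
            for i in range(n)]
    return [0.0 if z > cutoff else 1.0 for z in robust_z(dist)]


def weighted_corr(x, y, w):
    """Weighted Pearson correlation between x and y."""
    sw = sum(w)
    if sw == 0:
        return 0.0
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    syy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def screen_features(X, y, keep=None):
    """Rank features by absolute weighted correlation with y; keep the top ones."""
    n, p = len(X), len(X[0])
    if keep is None:
        keep = max(1, int(n / math.log(n)))  # common screening size
    keep = min(keep, p)
    w = outlier_weights(X, y)
    scores = [abs(weighted_corr([row[j] for row in X], y, w)) for j in range(p)]
    ranked = sorted(range(p), key=lambda j: scores[j], reverse=True)
    return ranked[:keep], w


# Hypothetical toy data: feature 0 drives y; the last row is a response outlier.
X = [[float(i), float((i * 7) % 5), float((i * 3) % 4)] for i in range(20)]
y = [2.0 * i + ((i % 3) - 1) * 0.5 for i in range(20)]
X.append([10.0, 2.0, 1.0])
y.append(500.0)

selected, weights = screen_features(X, y, keep=1)
print(selected, weights[-1])
```

For the third step, the columns listed in `selected` would be passed, together with the row weights, to any support vector machine implementation; that part is left out here to keep the sketch free of external dependencies.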


Similar Items