A novel approach for handling missing data to enhance network intrusion detection system

Bibliographic Details
Main Authors: Tahir, Mahjabeen; Abdullah, Azizol; Udzir, Nur Izura; Kasmiran, Khairul Azhar
Format: Article
Language: English
Published: KeAi Communications 2025
Online Access: http://psasir.upm.edu.my/id/eprint/120305/
http://psasir.upm.edu.my/id/eprint/120305/1/120305.pdf
building UPM Institutional Repository
collection Online Access
description Managing missing data is a critical challenge in Intrusion Detection System (IDS) datasets, significantly affecting the performance of deep learning models. To address this issue, we introduce DeepLearning_Based_MissingData_Imputation (DMDI), a novel method designed to enhance the quality of input data by efficiently handling missing values. Our approach employs the Random Missing Value (RMV) algorithm to simulate missing data, enabling thorough testing and comparison of various imputation techniques. The DMDI method integrates a stacked denoising autoencoder with Gradient Boosting to improve imputation accuracy. We evaluated the effectiveness of our approach through three experimental phases: generating missing data, imputing missing values, and assessing imputation models. Using the NSL-KDD and UNSW-NB15 datasets, our results demonstrate significant improvements in the performance of five different classifiers (SVM, KNN, Logistic Regression, Decision Tree, and Random Forest) after imputation. On average, our method achieved accuracy improvements ranging from 0.95 to 0.97 across these classifiers compared to baseline imputation methods. Detailed analysis using Python 3 validates our findings, demonstrating enhanced model performance and robustness. This study underscores the necessity of precise missing data imputation for enhancing deep learning tasks, particularly in anomaly detection systems. It provides a reliable solution for managing missing data in IDS datasets.
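The three experimental phases named in the description (generating missing data, imputing missing values, assessing the imputation) can be illustrated with a minimal sketch. This is not the DMDI method and not the authors' RMV algorithm, whose details are not given in this record. It assumes values are hidden completely at random, uses a plain column-mean imputer as a stand-in baseline, and runs on synthetic numbers rather than NSL-KDD or UNSW-NB15. All function names (`mask_at_random`, `impute_column_means`, `rmse_on_hidden`) are hypothetical.

```python
import math
import random


def mask_at_random(rows, missing_rate, seed=0):
    """Phase 1 (assumed): hide a fixed fraction of cells uniformly at random."""
    rng = random.Random(seed)
    cells = [(i, j) for i, row in enumerate(rows) for j in range(len(row))]
    n_missing = round(missing_rate * len(cells))
    hidden = set(rng.sample(cells, n_missing))
    masked = [[None if (i, j) in hidden else v for j, v in enumerate(row)]
              for i, row in enumerate(rows)]
    return masked, hidden


def impute_column_means(masked):
    """Phase 2 (baseline stand-in): fill each gap with its column's observed mean."""
    n_cols = len(masked[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in masked if row[j] is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if v is None else v for j, v in enumerate(row)]
            for row in masked]


def rmse_on_hidden(original, imputed, hidden):
    """Phase 3: root-mean-square error measured only on the cells that were hidden."""
    if not hidden:
        return 0.0
    squared = [(original[i][j] - imputed[i][j]) ** 2 for i, j in hidden]
    return math.sqrt(sum(squared) / len(squared))


# Synthetic numeric table standing in for preprocessed IDS features.
data_rng = random.Random(42)
data = [[data_rng.gauss(0, 1) for _ in range(5)] for _ in range(200)]

masked, hidden = mask_at_random(data, missing_rate=0.2, seed=1)
imputed = impute_column_means(masked)
error = rmse_on_hidden(data, imputed, hidden)
print(f"hidden cells: {len(hidden)}, baseline RMSE: {error:.3f}")
```

Because the original values of the hidden cells are known, any imputer can be scored the same way by swapping out `impute_column_means`, which is the kind of comparison the description attributes to simulated missingness.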
first_indexed 2025-11-15T14:47:57Z
format Article
id upm-120305
institution Universiti Putra Malaysia
institution_category Local University
language English
last_indexed 2025-11-15T14:47:57Z
publishDate 2025
publisher KeAi Communications
recordtype eprints
repository_type Digital Repository
datestamp 2025-09-30T02:58:38Z
citation Tahir, Mahjabeen and Abdullah, Azizol and Udzir, Nur Izura and Kasmiran, Khairul Azhar (2025) A novel approach for handling missing data to enhance network intrusion detection system. Cyber Security and Applications, 3. art. no. 100063. pp. 1-11. ISSN 2772-9184
peer_review PeerReviewed
license cc_by_nc_nd_4
doi 10.1016/j.csa.2024.100063
publisher_url https://linkinghub.elsevier.com/retrieve/pii/S2772918424000298