Systematic review of using machine learning in imputing missing values

Missing data are a universal data quality problem in many domains, leading to misleading analysis and inaccurate decisions. Much research has been done to investigate the different mechanisms of missing data and the proper techniques in handling various data types. In the last decade, machine learni...

Bibliographic Details
Main Authors: Alabadla, Mustafa; Fatimah, Sidi; Iskandar, Ishak; Hamidah D., Ibrahim; Lilly Suriani, Affendey; Zafienas, Che Ani; A. Jabar, Marzanah Ab; Bukar, Umar Ali; Devaraj, Navin Kumar; Ahmad Sobri, Muda; Tharek, Anas; Noritah, Omar; Mohd Izham, Mohd Jaya
Format: Article
Language: English
Published: Institute of Electrical and Electronics Engineers Inc. 2022
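Subjects: QA75 Electronic computers. Computer science; QA76 Computer software; T Technology (General); TA Engineering (General). Civil engineering (General)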
Online Access: http://umpir.ump.edu.my/id/eprint/38852/
http://umpir.ump.edu.my/id/eprint/38852/1/Systematic%20Review%20of%20Using%20Machine%20Learning%20in%20Imputing%20Missing%20Values.pdf
http://umpir.ump.edu.my/id/eprint/38852/2/Systematic%20review%20of%20using%20machine%20learning%20in%20imputing%20missing%20values_ABS.pdf
author Alabadla, Mustafa
Fatimah, Sidi
Iskandar, Ishak
Hamidah D., Ibrahim
Lilly Suriani, Affendey
Zafienas, Che Ani
A. Jabar, Marzanah Ab
Bukar, Umar Ali
Devaraj, Navin Kumar
Ahmad Sobri, Muda
Tharek, Anas
Noritah, Omar
Mohd Izham, Mohd Jaya
author_sort Alabadla, Mustafa
building UMP Institutional Repository
collection Online Access
description Missing data are a universal data quality problem in many domains, leading to misleading analysis and inaccurate decisions. Much research has been done to investigate the different mechanisms of missing data and the proper techniques in handling various data types. In the last decade, machine learning has been utilized to replace conventional methods to address the problem of missing values more efficiently. By studying and analyzing recently proposed methods using machine learning approaches, vital adoptions in accuracy, performance, and time consumed can be highlighted. This study aimed to help data analysts and researchers address the limitations of machine learning imputation methods by conducting a systematic literature review to provide a comprehensive overview of using such methods to impute missing values. Newly proposed machine learning approaches to data imputation are analyzed and summarized to assist researchers in selecting a proper machine learning method based on several factors and settings. The review covered research studies published between 2016 and 2021 on adopting machine learning to impute missing values, focusing on their strengths and limitations. A total of 684 research articles retrieved from various scientific databases using search engines were analyzed, and 94 of them were selected as primary studies. Finally, several recommendations were given to guide future researchers in applying machine learning to impute missing values.
first_indexed 2025-11-15T03:31:44Z
format Article
id ump-38852
institution Universiti Malaysia Pahang
institution_category Local University
language English
last_indexed 2025-11-15T03:31:44Z
publishDate 2022
publisher Institute of Electrical and Electronics Engineers Inc.
recordtype eprints
repository_type Digital Repository
spelling ump-38852 2023-11-08T02:50:19Z Article PeerReviewed Alabadla, Mustafa and Fatimah, Sidi and Iskandar, Ishak and Hamidah D., Ibrahim and Lilly Suriani, Affendey and Zafienas, Che Ani and A. Jabar, Marzanah Ab and Bukar, Umar Ali and Devaraj, Navin Kumar and Ahmad Sobri, Muda and Tharek, Anas and Noritah, Omar and Mohd Izham, Mohd Jaya (2022) Systematic review of using machine learning in imputing missing values. IEEE Access, 10. pp. 44483-44502. ISSN 2169-3536. (Published) https://doi.org/10.1109/ACCESS.2022.3160841
title Systematic review of using machine learning in imputing missing values
topic QA75 Electronic computers. Computer science
QA76 Computer software
T Technology (General)
TA Engineering (General). Civil engineering (General)
url http://umpir.ump.edu.my/id/eprint/38852/
http://umpir.ump.edu.my/id/eprint/38852/1/Systematic%20Review%20of%20Using%20Machine%20Learning%20in%20Imputing%20Missing%20Values.pdf
http://umpir.ump.edu.my/id/eprint/38852/2/Systematic%20review%20of%20using%20machine%20learning%20in%20imputing%20missing%20values_ABS.pdf