Missing values imputation tool using imputex algorithm

Bibliographic Details
Main Authors: Sidi, Fatimah, Abdullah, Lili Nurliyana, Alabadla, Mustafa, Ishak, Iskandar
Format: Article
Language: English
Published: Manash Kozybayev North Kazakhstan University 2024
Online Access: http://psasir.upm.edu.my/id/eprint/118068/
http://psasir.upm.edu.my/id/eprint/118068/1/118068.pdf
Description
Summary: Missing data is a prevalent issue affecting data quality across numerous fields. One frequent challenge arises when data is lost during the input stage. Numerous studies have proposed methods to impute missing values for data across multiple fields. However, certain domains present unique challenges because their attributes come from multiple scientific disciplines, such as biology, chemistry, and medicine, which complicates the imputation process. The purpose of this study is to design an application that addresses missing values and maintains accuracy in large datasets, with a focus on minimizing processing time. The application's performance is evaluated based on classification accuracy using various imputation methods. The proposed application outperforms current software tools such as R packages, the Statistical Package for the Social Sciences (SPSS), Stata, and Microsoft Excel. This study helps to improve data quality and contributes to data science by improving data cleaning, which is a step in the data pre-processing stage.