Evaluating A New Adaptive Group Lasso Imputation Technique For Handling Missing Values In Compositional Data

Bibliographic Details
Main Author: Tian, Ying
Format: Thesis
Language: English
Published: 2024
Subjects:
Online Access: http://eprints.usm.my/62560/
http://eprints.usm.my/62560/1/TIAN%20YING%20-%20TESIS%20cut.pdf
Description
Summary: The pie chart is a widely used statistical chart for representing the proportions of the components of an entity. The shares shown in a pie chart, known as compositional data, are non-negative values that carry only relative information. However, in many real-life domains, the collected data often contain a substantial number of missing values. The complexity of compositional data with missing values renders traditional estimation methods inadequate. In this thesis, a LASSO-based imputation method for compositional data is proposed that combines the group LASSO and adaptive LASSO methods. The estimation performance for high-dimensional and low-dimensional compositional data with missing values is compared through simulation studies and case analyses under different missing rates, dimensions, and correlation coefficients. To account for the impact of outliers on estimation accuracy, both simulation and case analysis are conducted to compare the proposed algorithm against four existing methods. The experimental results demonstrate that the proposed adaptive group LASSO method produces better imputation performance, with MSE, MADE, RMSE and NRMSE improving by up to 26.6% at selected missing rates. Future work will analyse the effect of imputation under continuous missing rates, the MAR (missing at random) mechanism, and additional model evaluation criteria.
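The four evaluation criteria named in the summary can be illustrated with a minimal Python sketch. It is not taken from the thesis: the function names closure and imputation_errors, the toy data, and the exact definitions (MADE as the mean absolute deviation error, NRMSE as RMSE divided by the range of the true values) are assumptions made for illustration, and the thesis may define these measures differently. The sketch scores an arbitrary set of imputed values and does not implement the adaptive group LASSO method itself.

```python
import numpy as np


def closure(x):
    """Rescale each row to sum to 1, the usual constraint for compositional data."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=1, keepdims=True)


def imputation_errors(true, imputed, missing_mask):
    """Compare imputed values with the true values on the missing entries only.

    Assumed definitions (hypothetical, not confirmed by the source):
      MADE  = mean absolute deviation error
      NRMSE = RMSE divided by the range of the true values
    """
    mask = np.asarray(missing_mask, dtype=bool)
    t = np.asarray(true, dtype=float)[mask]
    p = np.asarray(imputed, dtype=float)[mask]
    diff = t - p
    mse = float(np.mean(diff ** 2))
    rmse = float(np.sqrt(mse))
    made = float(np.mean(np.abs(diff)))
    value_range = float(t.max() - t.min())
    nrmse = rmse / value_range if value_range > 0 else float("nan")
    return {"MSE": mse, "RMSE": rmse, "MADE": made, "NRMSE": nrmse}


# Toy three-part compositions; each row is closed to sum to 1.
true = closure([[2, 3, 5], [1, 1, 2], [4, 4, 2]])

# One entry per row is treated as missing.
mask = np.array([[True, False, False],
                 [False, False, True],
                 [False, True, False]])

# Hypothetical imputed values standing in for the output of any imputation method.
imputed = true.copy()
imputed[mask] = [0.3, 0.4, 0.4]

errors = imputation_errors(true, imputed, mask)
print(errors)
```

All four measures are errors, so lower values indicate better imputation; a method comparison such as the one described in the summary would compute them for each method on the same masked entries.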