Optimizing tuberculosis treatment predictions: a comparative study of XGBoost with hyperparameter in Penang, Malaysia

Bibliographic Details
Main Authors: Yaniza Shaira Zakaria, Nur Afiqah Ariffin, Azizul Ahmad, Ruslan Rainis, Aidy M. Muslim, Wan Mohd Muhiyuddin Wan Ibrahim
Format: Article
Language: English
Published: Penerbit Universiti Kebangsaan Malaysia 2025
Online Access: http://journalarticle.ukm.my/25075/
http://journalarticle.ukm.my/25075/1/SSB%2022.pdf
Description
Summary: The bacterium Mycobacterium tuberculosis causes a bacterial infection that primarily affects the lungs, although it can also involve other organs such as the liver. Tuberculosis (TB) is a significant public health concern in developing countries, where it is often associated with poverty, poor living conditions, and limited access to healthcare services. According to the World Health Organization (2023), TB continues to pose a substantial risk to public health on a global scale, with millions of people affected each year and around 1.5 million deaths in 2020. Healthcare providers often encounter significant challenges in addressing TB, leading to uncertain treatment outcomes. This study introduces a novel method for improving TB treatment using machine learning techniques, with emphasis on XGBoost alongside other predictive models, to predict individual treatment outcomes from clinical data in Penang State, Malaysia. The clinical data were anonymized before analysis, and the models were trained on 2017 Penang data. The models were ranked by predictive accuracy, measured by comparing observed outcomes with the outcomes each model predicted. On the 2017 data, the decision tree reached 63.7% accuracy, Logistic Regression 63.3%, and XGBoost 66.3%. Hyperparameter-tuned XGBoost performed best at 68.1%. The results indicate that supervised learning can predict TB treatment outcomes with accuracies of roughly 63% to 68%, and that calibrated ensemble models such as XGBoost make the most reliable predictions among the models tested. Additional clinical characteristics may improve forecasts. The primary objective was to develop a reliable, clinically validated instrument that enhances TB treatments while optimizing resource efficiency across diverse healthcare environments.
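The accuracy comparison described in the summary can be illustrated with a minimal sketch. It assumes accuracy means the share of cases in which the predicted outcome matches the observed one; the record does not state the exact evaluation protocol. The `accuracy` helper and the toy labels are hypothetical and are not study data; only the four percentages come from the summary.

```python
def accuracy(observed, predicted):
    """Fraction of cases where the predicted outcome equals the observed one."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("at least one case is required")
    matches = sum(o == p for o, p in zip(observed, predicted))
    return matches / len(observed)


# Hypothetical toy labels (1 = treatment success, 0 = otherwise); NOT study data.
observed = [1, 0, 1, 1, 0, 1, 0, 1]
predicted = [1, 0, 0, 1, 0, 1, 1, 1]
toy_accuracy = accuracy(observed, predicted)

# Accuracies reported in the summary, in percent.
reported = {
    "Decision tree": 63.7,
    "Logistic Regression": 63.3,
    "XGBoost": 66.3,
    "Hyperparameter-tuned XGBoost": 68.1,
}

# Pick the model with the highest reported accuracy.
best_model = max(reported, key=reported.get)

print(f"toy accuracy: {toy_accuracy:.3f}")
print(f"best reported model: {best_model} ({reported[best_model]}%)")
```

Ranking by a single accuracy figure is the simplest possible comparison. It says nothing about class imbalance or calibration, which would need additional metrics that the summary does not report.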