Comparison of machine learning model performance for predicting the climate variables in Johor Bahru, Malaysia

Accurately predicting climate variables such as air temperature, humidity and precipitation plays a crucial role in air quality management. This research aims to provide preliminary information that can shed light on climate adaptation strategies for local stakeholders in Johor Bahru city, Malaysia...

Bibliographic Details
Main Authors: Che Rose, Farid Zamani; Rosili, Nur Aqilah Khadijah; Marsani, Muhammad Fadhil
Format: Article
Language: English
Published: Nature Research 2025
Online Access: http://psasir.upm.edu.my/id/eprint/120149/
http://psasir.upm.edu.my/id/eprint/120149/1/120149.pdf
building UPM Institutional Repository
collection Online Access
description Accurately predicting climate variables such as air temperature, humidity and precipitation plays a crucial role in air quality management. This research aims to provide preliminary information that can shed light on climate adaptation strategies for local stakeholders in Johor Bahru city, Malaysia. Five machine learning models, viz. Support Vector Regression (SVR), Random Forest (RF), Gradient Boosting Machine (GBM), Extreme Gradient Boosting (XGBoost) and Prophet, were employed to analyze 15,888 daily time series climate observations for Johor Bahru city, Malaysia. The six climate variables, obtained from NASA Prediction of Worldwide Energy Resources (POWER), are Temperature at 2 m (T2M), Dew/Frost Point at 2 m (T2MDEW), Wet Bulb Temperature at 2 m (T2MWET), Specific Humidity at 2 m (QV2M), Relative Humidity at 2 m (RH2M) and Precipitation (PREC). Results showed that RF outperforms the other ML models in prediction performance, exhibiting the lowest error for both training and testing data. RF gives superior results in fitting the training data for T2M, T2MDEW and T2MWET, with R² above 90%, demonstrating a strong predictive capability. RF exhibits the lowest error in predicting T2M (RMSE: 0.2182, MAE: 0.1679), T2MDEW (RMSE: 0.2291, MAE: 0.1750), T2MWET (RMSE: 0.1621, MAE: 0.1251), QV2M (RMSE: 0.3502, MAE: 0.2701) and RH2M (RMSE: 1.4444, MAE: 1.1090). RF shows particularly strong Nash–Sutcliffe efficiency (NSE) scores, up to 0.94 in the training phase, especially for temperature-related variables, indicating high explanatory power and stability. In contrast, SVR demonstrates superior generalization in the testing phase, with the highest Kling-Gupta Efficiency (KGE) value (0.88), confirming its reliability in out-of-sample forecasting. The findings of this research provide transparent, data-driven insights that can inform policymakers and guide the development of robust public policies and strategic investments in Johor Bahru.
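The error and efficiency scores named in the description (RMSE, MAE, NSE, KGE) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names are hypothetical, the sample values are invented for illustration, and the KGE shown is the commonly used 2009 formulation (correlation, variability ratio, bias ratio), which is an assumption because the record does not say which variant the article uses.

```python
import numpy as np


def _as_arrays(obs, sim):
    # Convert inputs to float arrays so the arithmetic below is element-wise.
    return np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)


def rmse(obs, sim):
    """Root mean squared error between observed and simulated values."""
    obs, sim = _as_arrays(obs, sim)
    return float(np.sqrt(np.mean((sim - obs) ** 2)))


def mae(obs, sim):
    """Mean absolute error between observed and simulated values."""
    obs, sim = _as_arrays(obs, sim)
    return float(np.mean(np.abs(sim - obs)))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus squared error over observed variance."""
    obs, sim = _as_arrays(obs, sim)
    return float(1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2))


def kge(obs, sim):
    """Kling-Gupta efficiency (2009 form, assumed here, not confirmed by the record)."""
    obs, sim = _as_arrays(obs, sim)
    r = np.corrcoef(obs, sim)[0, 1]   # linear correlation
    alpha = sim.std() / obs.std()     # variability ratio
    beta = sim.mean() / obs.mean()    # bias ratio
    return float(1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2))


if __name__ == "__main__":
    # Hypothetical daily T2M values in degrees Celsius, for illustration only.
    observed = [26.1, 26.8, 27.4, 27.0, 26.5]
    predicted = [26.3, 26.6, 27.5, 27.2, 26.4]
    for name, fn in [("RMSE", rmse), ("MAE", mae), ("NSE", nse), ("KGE", kge)]:
        print(name, round(fn(observed, predicted), 4))
```

RMSE and MAE are in the units of the variable (lower is better), while NSE and KGE are dimensionless with a best value of 1, which is why the description reports them separately.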
first_indexed 2025-11-15T14:47:24Z
id upm-120149
institution Universiti Putra Malaysia
institution_category Local University
last_indexed 2025-11-15T14:47:24Z
recordtype eprints
repository_type Digital Repository
citation Che Rose, Farid Zamani and Rosili, Nur Aqilah Khadijah and Marsani, Muhammad Fadhil (2025) Comparison of machine learning model performance for predicting the climate variables in Johor Bahru, Malaysia. Scientific Reports, 15 (1). art. no. 23465. pp. 1-20. ISSN 2045-2322. PeerReviewed. upm-120149 2025-09-24T02:02:43Z. https://www.nature.com/articles/s41598-025-08033-y DOI: 10.1038/s41598-025-08033-y