Comparison of machine learning model performance for predicting the climate variables in Johor Bahru, Malaysia
Accurately predicting climate variables such as air temperature, humidity and precipitation plays a crucial role in air quality management. This research aims to provide preliminary information that can shed light on climate adaptation strategies for local stakeholders in Johor Bahru city, Malaysia...
| Main Authors: | Che Rose, Farid Zamani; Rosili, Nur Aqilah Khadijah; Marsani, Muhammad Fadhil |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Research 2025 |
| Online Access: | http://psasir.upm.edu.my/id/eprint/120149/ http://psasir.upm.edu.my/id/eprint/120149/1/120149.pdf |
| _version_ | 1848868123776647168 |
|---|---|
| author | Che Rose, Farid Zamani; Rosili, Nur Aqilah Khadijah; Marsani, Muhammad Fadhil |
| author_facet | Che Rose, Farid Zamani; Rosili, Nur Aqilah Khadijah; Marsani, Muhammad Fadhil |
| author_sort | Che Rose, Farid Zamani |
| building | UPM Institutional Repository |
| collection | Online Access |
| description | Accurately predicting climate variables such as air temperature, humidity and precipitation plays a crucial role in air quality management. This research aims to provide preliminary information that can shed light on climate adaptation strategies for local stakeholders in Johor Bahru city, Malaysia. Five machine learning models, viz. Support Vector Regression (SVR), Random Forest (RF), Gradient Boosting Machine (GBM), Extreme Gradient Boosting Machine (XGBoost) and Prophet, were employed to analyze 15,888 daily time series climate observations for Johor Bahru city, Malaysia. The six climate variable datasets, obtained from NASA Prediction of Worldwide Energy Resources (POWER), are Temperature at 2 m (T2M), Dew/Frost Point at 2 m (T2MDEW), Wet Bulb Temperature at 2 m (T2MWET), Specific Humidity at 2 m (QV2M), Relative Humidity at 2 m (RH2M) and Precipitation (PREC). Results showed that RF outperforms the other ML models in prediction performance, exhibiting the lowest error for both training and testing data. RF gives superior results in fitting the training data for T2M, T2MDEW and T2MWET, with R² above 90% demonstrating a strong predictive capability. RF exhibits the lowest error in predicting T2M (RMSE: 0.2182, MAE: 0.1679), T2MDEW (RMSE: 0.2291, MAE: 0.1750), T2MWET (RMSE: 0.1621, MAE: 0.1251), QV2M (RMSE: 0.3502, MAE: 0.2701) and RH2M (RMSE: 1.4444, MAE: 1.1090). RF shows particularly strong Nash–Sutcliffe efficiency (NSE) scores of up to 0.94 in the training phase, especially for temperature-related variables, indicating high explanatory power and stability. In contrast, SVR demonstrates superior generalization in the testing phase, with the highest Kling–Gupta Efficiency (KGE) value (0.88) confirming its reliability in out-of-sample forecasting. The findings of this research provide transparent, data-driven insights that can inform policymakers and guide the development of robust public policies and strategic investments in Johor Bahru. |
| first_indexed | 2025-11-15T14:47:24Z |
| format | Article |
| id | upm-120149 |
| institution | Universiti Putra Malaysia |
| institution_category | Local University |
| language | English |
| last_indexed | 2025-11-15T14:47:24Z |
| publishDate | 2025 |
| publisher | Nature Research |
| recordtype | eprints |
| repository_type | Digital Repository |
| spelling | upm-120149 2025-09-24T02:02:43Z http://psasir.upm.edu.my/id/eprint/120149/ Nature Research 2025 Article PeerReviewed text en http://psasir.upm.edu.my/id/eprint/120149/1/120149.pdf Che Rose, Farid Zamani and Rosili, Nur Aqilah Khadijah and Marsani, Muhammad Fadhil (2025) Comparison of machine learning model performance for predicting the climate variables in Johor Bahru, Malaysia. Scientific Reports, 15 (1). art. no. 23465. pp. 1-20. ISSN 2045-2322 https://www.nature.com/articles/s41598-025-08033-y 10.1038/s41598-025-08033-y |
| spellingShingle | Che Rose, Farid Zamani; Rosili, Nur Aqilah Khadijah; Marsani, Muhammad Fadhil. Comparison of machine learning model performance for predicting the climate variables in Johor Bahru, Malaysia |
| title | Comparison of machine learning model performance for predicting the climate variables in Johor Bahru, Malaysia |
| title_sort | comparison of machine learning model performance for predicting the climate variables in johor bahru, malaysia |
| url | http://psasir.upm.edu.my/id/eprint/120149/ http://psasir.upm.edu.my/id/eprint/120149/1/120149.pdf |
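
The description above compares models by RMSE, MAE, Nash–Sutcliffe efficiency (NSE) and Kling–Gupta efficiency (KGE). The record does not give the formulas the authors used, so the following is only a minimal sketch of the commonly cited definitions, not the authors' code. It assumes the 2009 form of KGE (correlation, variability ratio and bias ratio), and the function names and sample values are hypothetical.

```python
import math
from statistics import fmean, pstdev


def rmse(obs, pred):
    """Root mean squared error between observed and predicted values."""
    return math.sqrt(fmean([(o - p) ** 2 for o, p in zip(obs, pred)]))


def mae(obs, pred):
    """Mean absolute error between observed and predicted values."""
    return fmean([abs(o - p) for o, p in zip(obs, pred)])


def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = fmean(obs)
    residual = sum((o - p) ** 2 for o, p in zip(obs, pred))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance


def kge(obs, pred):
    """Kling-Gupta efficiency (assumed 2009 form): 1 is a perfect fit."""
    mean_obs, mean_pred = fmean(obs), fmean(pred)
    sd_obs, sd_pred = pstdev(obs), pstdev(pred)
    cov = fmean([(o - mean_obs) * (p - mean_pred) for o, p in zip(obs, pred)])
    r = cov / (sd_obs * sd_pred)   # Pearson correlation
    alpha = sd_pred / sd_obs       # variability ratio
    beta = mean_pred / mean_obs    # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Hypothetical values for illustration only; not data from the study.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 3.0, 4.0, 5.0]
print(rmse(observed, predicted), mae(observed, predicted))
print(nse(observed, predicted), kge(observed, predicted))
```

RMSE and MAE are in the units of the variable, so they are not directly comparable across variables such as T2M and RH2M, whereas NSE and KGE are dimensionless.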