Artificial Intelligence Method For Filling Missing Meteorological Data In Long Term Monitoring Strategy

Complete meteorological data is vital for climate detection, planning, modelling and management purposes. However, missing data is a common occurrence in meteorological data due to defective equipment, erroneous calibration, improper maintenance, natural hazards and human-related discrepancies. The...

Bibliographic Details
Main Author: Yii, Sharon
Format: Final Year Project / Dissertation / Thesis
Published: 2021
Subjects: TA Engineering (General). Civil engineering (General)
Online Access: http://eprints.utar.edu.my/4242/
http://eprints.utar.edu.my/4242/1/1605934_FYP_Report_%2D_YII_SHARON.pdf
_version_ 1848886107930886144
author Yii, Sharon
building UTAR Institutional Repository
collection Online Access
description Complete meteorological data is vital for climate detection, planning, modelling and management purposes. However, missing values are common in meteorological records because of defective equipment, erroneous calibration, improper maintenance, natural hazards and human-related discrepancies. The purpose of this study is to evaluate the viability of hybrid long short-term memory (LSTM) models, namely LSTM hybridized with random forest (LSTM-RF) and LSTM hybridized with support vector machine (LSTM-SVM), for the imputation of missing meteorological data. The meteorological data used to train the LSTM models comprised maximum temperature (Tmax), minimum temperature (Tmin), mean temperature (Tmean), 24-hour mean relative humidity (RH24,mean), 24-hour mean wind speed (U24,mean) and evaporation (Ep). The data were obtained from twelve weather stations distributed across Peninsular and East Malaysia at Alor Setar, Bayan Lepas, Ipoh, KLIA Sepang, Kota Bharu, Kota Kinabalu, Kuantan, Kuching, Miri, Muadzam Shah, Pulau Langkawi and Subang, for a span of fourteen years from 2002 to 2016. The missing data percentage in this study was fixed at 10 %. The LSTM models were trained using MATLAB R2020a, and their performance was assessed using mean absolute error (MAE), mean absolute percentage error (MAPE) and root mean square error (RMSE). The results showed that the performance of the LSTM models was governed by the spatial locations of the stations. The hybrid LSTM-SVM model outperformed LSTM-RF and LSTM at a majority of the stations (eight out of twelve), especially in the coastal north-eastern, inland south-eastern and inland south-western regions of Peninsular Malaysia, as well as on the coasts of East Malaysia, owing to the ability of LSTM-SVM to analyse the relevance of input data by means of a hyperplane and support vectors. Conversely, LSTM-RF outperformed both LSTM-SVM and LSTM in the coastal north-western and south-eastern regions of Peninsular Malaysia, which experience minimal monsoon effects. This suggests that the ensemble of decision trees is able to streamline the prediction of missing values when the underlying data are stable. Overall, both the hybrid LSTM-RF and LSTM-SVM models performed better than the stand-alone LSTM. Nonetheless, LSTM-SVM was selected as the best model because it performed best at the majority of the weather stations.
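The evaluation procedure summarized in the description (hide a fixed 10 % of the observations, impute them, then score the imputed values with MAE, MAPE and RMSE) can be illustrated with a minimal Python sketch. This is not the thesis's MATLAB implementation: the temperature values are hypothetical, the function names are invented for illustration, and a simple mean fill stands in for the LSTM, LSTM-RF and LSTM-SVM models, which are not reproduced here.

```python
import math
import random


def mask_values(series, fraction=0.10, seed=0):
    """Hide a fixed fraction of the values; hidden entries become None."""
    rng = random.Random(seed)
    n_hidden = round(len(series) * fraction)
    hidden = sorted(rng.sample(range(len(series)), n_hidden))
    masked = list(series)
    for i in hidden:
        masked[i] = None
    return masked, hidden


def mae(observed, imputed):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, imputed)) / len(observed)


def mape(observed, imputed):
    """Mean absolute percentage error (undefined if an observed value is 0)."""
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(observed, imputed)) / len(observed)


def rmse(observed, imputed):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, imputed)) / len(observed))


def mean_fill(masked):
    """Placeholder imputer (assumption): fill gaps with the mean of known values."""
    known = [v for v in masked if v is not None]
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in masked]


# Hypothetical daily Tmax readings in degrees Celsius (not data from the study).
tmax = [31.2, 32.0, 30.5, 33.1, 31.8, 32.4, 30.9, 31.5, 32.7, 31.1,
        30.8, 32.2, 33.0, 31.6, 32.9, 30.7, 31.9, 32.5, 31.3, 32.1]

masked, hidden = mask_values(tmax, fraction=0.10, seed=42)
filled = mean_fill(masked)

# Score only the positions that were deliberately hidden.
observed = [tmax[i] for i in hidden]
imputed = [filled[i] for i in hidden]
print(f"MAE={mae(observed, imputed):.3f}  "
      f"MAPE={mape(observed, imputed):.2f}%  "
      f"RMSE={rmse(observed, imputed):.3f}")
```

Swapping `mean_fill` for a trained model's predictions would leave the masking and scoring steps unchanged, which is what allows several imputation models to be compared on the same hidden values.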
first_indexed 2025-11-15T19:33:15Z
format Final Year Project / Dissertation / Thesis
id utar-4242
institution Universiti Tunku Abdul Rahman
institution_category Local University
last_indexed 2025-11-15T19:33:15Z
publishDate 2021
recordtype eprints
repository_type Digital Repository
spelling utar-4242 2023-08-10T13:09:44Z Yii, Sharon (2021) Artificial Intelligence Method For Filling Missing Meteorological Data In Long Term Monitoring Strategy. Final Year Project, UTAR. NonPeerReviewed application/pdf
title Artificial Intelligence Method For Filling Missing Meteorological Data In Long Term Monitoring Strategy
topic TA Engineering (General). Civil engineering (General)