Forecasting Using K-means Clustering and RNN Methods with PCA Feature Selection

An artificial neural network is a computing system inspired by the human nervous system, and the field continues to develop rapidly. Like the nervous system, artificial neural networks learn from existing data in order to produce outputs for new data. An...


Bibliographic Details
Main Authors: Ferna, Marestiani; Sugiyarto, Surono
Format: Article
Language: English
Published: INTI International University 2022
Subjects: QA75 Electronic computers. Computer science; QC Physics; T Technology (General)
Online Access: http://eprints.intimal.edu.my/1632/
http://eprints.intimal.edu.my/1632/1/jods2022_04.pdf
author Ferna, Marestiani
Sugiyarto, Surono
building INTI Institutional Repository
collection Online Access
description An artificial neural network (ANN) is a computing system inspired by the human nervous system, and the field continues to develop rapidly. Like the nervous system, an ANN learns from existing data in order to produce outputs for new data. The Recurrent Neural Network (RNN) is one of the most popular ANN models in use today, especially for forecasting. In simple terms, forecasting with an RNN proceeds by splitting the data into training and test sets, running the forward pass, running the backward pass, updating the weights with an optimizer, and evaluating the forecasting model. The main obstacle for the RNN method is the vanishing gradient, which can lead to poor forecasting results. In this study, the authors propose using Principal Component Analysis (PCA) dimension reduction to obtain the most influential variables, which then serve as inputs to the prediction model in order to reduce its error. The authors also use K-means clustering to group data with similar trend variations; to improve the clustering, similarity is computed with the Euclidean distance. The procedure is therefore as follows: first, the most influential variables of the time series data are selected with PCA; next, the data are grouped with K-means and passed to the prediction model. The RNN prediction model is trained with Backpropagation Through Time (BPTT) and optimized with Stochastic Gradient Descent (SGD). Forecasting with the RNN with PCA yields an accuracy of 93%, while forecasting with the RNN without PCA yields an accuracy of 82%. The experimental results show that the RNN with PCA achieves higher predictive accuracy and flexibility than the RNN without PCA.
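The forecasting flow named in the description (train/test split, forward pass, backward pass with BPTT, SGD update, evaluation) can be illustrated with a minimal sketch. This is not the authors' model: the record does not give the architecture, data, or hyperparameters, so the scalar one-unit RNN, the sine-wave toy series, the window length, the learning rate, and every function name below are assumptions made for illustration. The PCA and K-means stages are left out to keep the example short.

```python
import math

def forward(params, xs):
    """Run a scalar RNN over one input window; return hidden states and the forecast."""
    w_x, w_h, b, w_y, c = params
    hs = [0.0]  # hs[0] is the initial hidden state h_0
    for x in xs:
        hs.append(math.tanh(w_x * x + w_h * hs[-1] + b))
    return hs, w_y * hs[-1] + c

def bptt(params, xs, target):
    """Return the squared-error loss and its gradient via Backpropagation Through Time."""
    w_x, w_h, b, w_y, c = params
    hs, y = forward(params, xs)
    err = y - target
    g_wx = g_wh = g_b = 0.0
    g_wy, g_c = err * hs[-1], err
    dh = err * w_y  # gradient of the loss w.r.t. the last hidden state
    for t in range(len(xs), 0, -1):
        da = dh * (1.0 - hs[t] ** 2)  # back through tanh
        g_wx += da * xs[t - 1]
        g_wh += da * hs[t - 1]
        g_b += da
        dh = da * w_h  # pass the gradient on to the previous hidden state
    return 0.5 * err ** 2, [g_wx, g_wh, g_b, g_wy, g_c]

def mean_loss(params, samples):
    """Evaluation step: mean squared-error loss over a set of (window, target) pairs."""
    return sum(0.5 * (forward(params, xs)[1] - y) ** 2 for xs, y in samples) / len(samples)

def train(samples, params, lr=0.05, epochs=200):
    """Optimization step: plain per-sample SGD (lr and epochs are assumed values)."""
    params = list(params)
    for _ in range(epochs):
        for xs, target in samples:
            _, grads = bptt(params, xs, target)
            params = [p - lr * g for p, g in zip(params, grads)]
    return params

# Toy series (assumption): a sine wave standing in for real time series data.
series = [math.sin(0.3 * i) for i in range(60)]
window = 4
samples = [(series[i:i + window], series[i + window])
           for i in range(len(series) - window)]

# Split into training and test sets in time order (80/20 is an assumed ratio).
split = int(0.8 * len(samples))
train_set, test_set = samples[:split], samples[split:]

init = [0.5, 0.1, 0.0, 0.5, 0.0]  # w_x, w_h, b, w_y, c (arbitrary starting values)
trained = train(train_set, init)
print("test loss before training:", mean_loss(init, test_set))
print("test loss after training: ", mean_loss(trained, test_set))
```

The backward loop multiplies the gradient by `w_h * (1 - h_t^2)` at every step, which is the mechanism behind the vanishing gradient mentioned in the description: with long windows and small factors, the contribution of early time steps shrinks toward zero.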
first_indexed 2025-11-14T11:56:45Z
format Article
id intimal-1632
institution INTI International University
institution_category Local University
language English
last_indexed 2025-11-14T11:56:45Z
publishDate 2022
publisher INTI International University
recordtype eprints
repository_type Digital Repository
spelling intimal-1632 2024-05-07T09:37:17Z http://eprints.intimal.edu.my/1632/ INTI International University 2022-06 Article PeerReviewed text en cc_by_4 http://eprints.intimal.edu.my/1632/1/jods2022_04.pdf Ferna, Marestiani and Sugiyarto, Surono (2022) Forecasting Using K-means Clustering and RNN Methods with PCA Feature Selection. Journal of Data Science, 2022 (04). pp. 1-14. ISSN 2805-5160 http://ipublishing.intimal.edu.my/jods.html
title Forecasting Using K-means Clustering and RNN Methods with PCA Feature Selection
topic QA75 Electronic computers. Computer science
QC Physics
T Technology (General)
url http://eprints.intimal.edu.my/1632/
http://eprints.intimal.edu.my/1632/1/jods2022_04.pdf