Optimizing sentiment analysis of Indonesian texts: Enhancing deep learning models with genetic algorithm-based feature selection
Automatic text classification techniques are employed in a multitude of real-world applications, including the filtering of unsolicited messages, the analysis of sentiment, and the categorization of news items. The primary challenge in text representation is the high dimensionality, which can increa...
| Main Authors: | Siti, Mujilahwati; Noor Zuraidin, Mohd Safar; Ku Muhammad Naim, Ku Khalif; Nasyitah, Ghazalli |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Penerbit UTHM 2024 |
| Subjects: | Q Science (General); QA Mathematics |
| Online Access: | http://umpir.ump.edu.my/id/eprint/43886/ http://umpir.ump.edu.my/id/eprint/43886/1/Optimizing%20sentiment%20analysis%20of%20indonesian%20texts.pdf |
| _version_ | 1848826983230734336 |
|---|---|
| author | Siti, Mujilahwati; Noor Zuraidin, Mohd Safar; Ku Muhammad Naim, Ku Khalif; Nasyitah, Ghazalli |
| author_sort | Siti, Mujilahwati |
| building | UMP Institutional Repository |
| collection | Online Access |
| description | Automatic text classification techniques are used in many real-world applications, including the filtering of unsolicited messages, sentiment analysis, and the categorization of news items. The primary challenge in text representation is high dimensionality, which can increase model complexity and the risk of overfitting. To address this challenge, feature selection (FS) is performed during data pre-processing to improve the model's learning accuracy and efficiency. This study examines the optimization of Indonesian text sentiment analysis by integrating genetic algorithm (GA) feature selection with deep learning models. Applying the GA, with fitness evaluated by an SVM, to reduce the data from 41,140 to 20,769 features increased accuracy by 8.10% for SVM, 36.1% for Naïve Bayes, 7.82% for LSTM, 5.47% for DNN, and 6.25% for CNN. Of the three deep learning models, LSTM achieved the highest accuracy, 91.41%, and its computation time fell by nearly 50%. |
| first_indexed | 2025-11-15T03:53:29Z |
| format | Article |
| id | ump-43886 |
| institution | Universiti Malaysia Pahang |
| institution_category | Local University |
| language | English |
| last_indexed | 2025-11-15T03:53:29Z |
| publishDate | 2024 |
| publisher | Penerbit UTHM |
| recordtype | eprints |
| repository_type | Digital Repository |
| spelling | ump-43886 2025-02-20T08:53:48Z http://umpir.ump.edu.my/id/eprint/43886/ Q Science (General) QA Mathematics Penerbit UTHM 2024-12-18 Article PeerReviewed pdf en cc_by_nc_sa_4 http://umpir.ump.edu.my/id/eprint/43886/1/Optimizing%20sentiment%20analysis%20of%20indonesian%20texts.pdf Siti, Mujilahwati and Noor Zuraidin, Mohd Safar and Ku Muhammad Naim, Ku Khalif and Nasyitah, Ghazalli (2024) Optimizing sentiment analysis of Indonesian texts: Enhancing deep learning models with genetic algorithm-based feature selection. Journal of Soft Computing and Data Mining, 5 (2). pp. 208-222. ISSN 2716-621X. (Published) https://doi.org/10.30880/jscdm.2024.05.02.016 |
| title | Optimizing sentiment analysis of Indonesian texts: Enhancing deep learning models with genetic algorithm-based feature selection |
| topic | Q Science (General) QA Mathematics |
| url | http://umpir.ump.edu.my/id/eprint/43886/ http://umpir.ump.edu.my/id/eprint/43886/1/Optimizing%20sentiment%20analysis%20of%20indonesian%20texts.pdf |
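
The description above says a genetic algorithm selects features and that fitness is evaluated with an SVM. As a rough illustration of that general idea, and not the authors' implementation, the sketch below evolves binary feature masks with tournament selection, one-point crossover, and bit-flip mutation. Everything in it is an assumption made for illustration: the function names `run_ga` and `toy_fitness`, the hyperparameter values, and above all the toy fitness function, which merely stands in for the classifier accuracy an SVM would supply in a real pipeline.

```python
import random


def run_ga(n_features, fitness, pop_size=20, generations=30,
           crossover_rate=0.8, mutation_rate=0.02, seed=0):
    """Evolve binary masks (1 = keep feature); return (best_mask, best_score).

    Hypothetical helper: all hyperparameters are illustrative defaults,
    not values taken from the paper.
    """
    rng = random.Random(seed)
    # Random initial population of feature masks.
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best_mask = max(population, key=fitness)
    best_score = fitness(best_mask)

    def tournament(scored, k=3):
        # Return the fittest mask among k randomly drawn individuals.
        return max(rng.sample(scored, k), key=lambda pair: pair[0])[1]

    for _ in range(generations):
        scored = [(fitness(mask), mask) for mask in population]
        next_population = []
        while len(next_population) < pop_size:
            parent_a, parent_b = tournament(scored), tournament(scored)
            if rng.random() < crossover_rate:
                # One-point crossover: head of one parent, tail of the other.
                cut = rng.randint(1, n_features - 1)
                child = parent_a[:cut] + parent_b[cut:]
            else:
                child = parent_a[:]
            # Bit-flip mutation, applied independently to each position.
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]
            next_population.append(child)
        population = next_population
        # Remember the best mask seen in any generation.
        gen_best = max(population, key=fitness)
        gen_score = fitness(gen_best)
        if gen_score > best_score:
            best_mask, best_score = gen_best, gen_score
    return best_mask, best_score


# Hypothetical stand-in for SVM accuracy: reward a fixed set of "useful"
# feature indices and lightly penalize every selected feature.
INFORMATIVE = {0, 3, 5, 8}


def toy_fitness(mask):
    hits = sum(mask[i] for i in INFORMATIVE)
    return hits - 0.1 * sum(mask)


if __name__ == "__main__":
    mask, score = run_ga(12, toy_fitness, seed=1)
    print("selected feature indices:", [i for i, bit in enumerate(mask) if bit])
    print("fitness:", score)
```

In a real setting, `fitness` would train and score a classifier on only the columns where the mask is 1, which is far more expensive than this toy function. For scale, the reduction reported in the record, from 41,140 to 20,769 features, keeps roughly half of the original columns (20,769 / 41,140 ≈ 0.505).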