Senti2Vec: an effective feature extraction technique for sentiment analysis based on Word2Vec


Bibliographic Details
Main Authors: M.Alshari, Eissa; Azman, Azreen; Doraisamy, Shyamala; Mustapha, Norwati; Alksher, Mostafa
Format: Article
Published: University of Malaya * Faculty of Computer Science and Information Technology 2020
Online Access: http://psasir.upm.edu.my/id/eprint/85796/
building UPM Institutional Repository
collection Online Access
description Finding an effective feature extraction technique has been the focus of many researchers seeking to improve the performance of classification methods, such as those used for sentiment analysis. Many have shown interest in using word embeddings, especially Word2Vec, as features for text classification tasks. Its ability to model high-quality distributional semantics among words has contributed to its success in many of these tasks. Despite this success, Word2Vec features are high-dimensional, which increases the complexity of the classifier. In this paper, an effective method for feature extraction based on Word2Vec is proposed for sentiment analysis. The method discovers polarity clusters of the terms in the vocabulary using Word2Vec and an opinion lexical dictionary. The feature vector for each text is constructed from the polarity clusters, which leads to a lower-dimensional representation of the text. This paper also investigates the effect of two opinion lexical dictionaries on the performance of sentiment analysis; one of the dictionaries is created based on SentiWordNet. The effectiveness of the proposed method is evaluated on the IMDB dataset with two classifiers, namely Logistic Regression and the Support Vector Machine. The results are promising, showing that the proposed method can be more effective than the baseline approaches.
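The idea in the description (group vocabulary terms into polarity clusters, then build one low-dimensional feature vector per text from those clusters) can be illustrated with a minimal Python sketch. The abstract does not specify the clustering procedure, so everything here is an assumption for illustration only: the toy three-dimensional vectors stand in for trained Word2Vec embeddings, the four-word seed lexicon stands in for an opinion dictionary, and the hypothetical helpers `build_polarity_clusters` and `text_features` assign each word to the nearest polarity centroid by cosine similarity. This is not the authors' implementation.

```python
import numpy as np

# Toy stand-ins for trained Word2Vec vectors (assumption: real vectors
# would have hundreds of dimensions and come from a trained model).
EMBEDDINGS = {
    "good":      np.array([0.9, 0.1, 0.0]),
    "great":     np.array([0.8, 0.2, 0.1]),
    "enjoyable": np.array([0.7, 0.3, 0.0]),
    "bad":       np.array([-0.9, 0.1, 0.0]),
    "awful":     np.array([-0.8, 0.2, 0.1]),
    "boring":    np.array([-0.7, 0.3, 0.0]),
    "movie":     np.array([0.0, 0.1, 0.9]),
}

# Toy stand-in for an opinion lexical dictionary (hypothetical seed words).
LEXICON = {"good": "pos", "great": "pos", "bad": "neg", "awful": "neg"}


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def build_polarity_clusters(embeddings, lexicon, min_sim=0.5):
    """Map each word to a polarity label.

    Lexicon words keep their label. Every other word joins the polarity
    whose centroid is most similar, provided the similarity reaches
    min_sim; otherwise the word is left unassigned.
    """
    labels = sorted(set(lexicon.values()))
    centroids = {
        label: np.mean(
            [embeddings[w] for w, l in lexicon.items()
             if l == label and w in embeddings],
            axis=0,
        )
        for label in labels
    }
    clusters = {}
    for word, vec in embeddings.items():
        if word in lexicon:
            clusters[word] = lexicon[word]
            continue
        best = max(labels, key=lambda label: cosine(vec, centroids[label]))
        if cosine(vec, centroids[best]) >= min_sim:
            clusters[word] = best
    return clusters


def text_features(tokens, clusters, labels=("pos", "neg")):
    """Return one feature per polarity cluster: the fraction of tokens in it."""
    counts = np.zeros(len(labels))
    for tok in tokens:
        label = clusters.get(tok)
        if label in labels:
            counts[labels.index(label)] += 1
    return counts / max(len(tokens), 1)


CLUSTERS = build_polarity_clusters(EMBEDDINGS, LEXICON)
FEATURES = text_features("a great enjoyable movie".split(), CLUSTERS)
print(CLUSTERS)
print(FEATURES)
```

In this sketch a text is reduced to two numbers regardless of the embedding dimensionality, which is the dimensionality-reduction effect the description refers to. The resulting vectors could then be passed to a classifier such as Logistic Regression or a Support Vector Machine.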
id upm-85796
institution Universiti Putra Malaysia
institution_category Local University
recordtype eprints
repository_type Digital Repository
Citation: M.Alshari, Eissa and Azman, Azreen and Doraisamy, Shyamala and Mustapha, Norwati and Alksher, Mostafa (2020) Senti2Vec: an effective feature extraction technique for sentiment analysis based on Word2Vec. Malaysian Journal of Computer Science, 33 (3). pp. 240-251. ISSN 0127-9084
Status: PeerReviewed
Record timestamp: 2023-09-07T00:24:04Z
Publisher URL: https://ejournal.um.edu.my/index.php/MJCS/article/view/25280
DOI: 10.22452/mjcs.vol33no3.5
url http://psasir.upm.edu.my/id/eprint/85796/