NgramPOS: a bigram-based linguistic and statistical feature process model for unstructured text classification

Bibliographic Details
Main Authors: Yazdani, Sepideh Foroozan, Tan, Zhiyuan, Kakavand, Mohsen, Mustapha, Aida
Format: Article
Language: English
Published: Springer 2018
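Subjects: QA76 Computer software; TA Engineering (General). Civil engineering (General); TA329-348 Engineering mathematics. Engineering analysis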
Online Access: http://eprints.uthm.edu.my/5136/
http://eprints.uthm.edu.my/5136/1/AJ%202018%20%28843%29%20NgramPOS%20a%20bigram-based%20linguistic%20and%20statistical%20feature%20process%20model%20for%20unstructured%20text%20classification.pdf
building UTHM Institutional Repository
collection Online Access
description Research in the financial domain has shown that the sentiment aspects of stock news have a profound impact on trading volume, volatility, stock prices and firm earnings. In-depth analysis of stock news now draws on financial reviews from various social networking and marketing sites to help improve decision making. Nonetheless, such reviews are unstructured text, which requires natural language processing (NLP) to extract the sentiments. Accordingly, in this study we investigate the use of NLP tasks in an effort to improve the performance of sentiment classification when evaluating the information content of financial news as an instrument in investment decision support systems. At present, feature extraction is mainly based on the occurrence frequency of words. Therefore, low-frequency linguistic features that could be critical in sentiment classification are typically ignored. In this research, we attempt to improve current sentiment analysis approaches for financial news classification by focusing on low-frequency but informative linguistic expressions. Our proposed combination of low- and high-frequency linguistic expressions contributes a novel set of features for sentiment classification. The experimental results show that an optimal Ngram feature selection (a combination of optimal unigram and bigram features) enhances sentiment classification accuracy compared with other types of feature sets.
first_indexed 2025-11-15T20:10:51Z
format Article
id uthm-5136
institution Universiti Tun Hussein Onn Malaysia
institution_category Local University
language English
last_indexed 2025-11-15T20:10:51Z
publishDate 2018
publisher Springer
recordtype eprints
repository_type Digital Repository
spelling uthm-5136 2022-01-06T02:29:16Z Springer 2018 Article PeerReviewed text en Yazdani, Sepideh Foroozan and Tan, Zhiyuan and Kakavand, Mohsen and Mustapha, Aida (2018) NgramPOS a bigram-based linguistic and statistical feature process model for unstructured text classification. WIRELESS NETWORKS. pp. 1-11. ISSN 1022-0038
title NgramPOS a bigram-based linguistic and statistical feature process model for unstructured text classification
topic QA76 Computer software
TA Engineering (General). Civil engineering (General)
TA329-348 Engineering mathematics. Engineering analysis