Semantic feature selection for spam filtering

Spam, or unsolicited e-mail, may take the form of advertisements, product promotions, and similar messages. It has become a key problem for e-mail users, and spam filtering has consequently attracted major research attention. This research explores spam filtering based on semantic feature selection. Here, the W...

Bibliographic Details
Main Author: Azlina, Narawi
Format: Thesis
Language: English
Published: Universiti Malaysia Sarawak, (UNIMAS) 2010
Subjects: T Technology (General)
Online Access: http://ir.unimas.my/id/eprint/14865/
http://ir.unimas.my/id/eprint/14865/2/Azlina%20Narawi%20ft.pdf
author Azlina, Narawi
building UNIMAS Institutional Repository
collection Online Access
description Spam, or unsolicited e-mail, may take the form of advertisements, product promotions, and similar messages. It has become a key problem for e-mail users, and spam filtering has consequently attracted major research attention. This research explores spam filtering based on semantic feature selection. A WordNet-based approach is employed, with statistical approaches used for comparison. To further enhance the task, a second technique using distributed clustering is proposed for identifying meaningful words for characterization. A series of experiments was conducted. The results show that the WordNet-based approach selects more meaningful features than the statistical approaches and achieves substantial dimensionality reduction: 72.9% for the non-spam category and 49.2% for the spam category. Pruning features by incorporating distributed clustering enhanced performance significantly. As a result, a new framework for semantic filtering was proposed, and distinct features of spam and non-spam e-mail documents were determined. These promising results suggest that the approach can be explored further on other datasets or applications.
first_indexed 2025-11-15T06:44:36Z
format Thesis
id unimas-14865
institution Universiti Malaysia Sarawak
institution_category Local University
language English
last_indexed 2025-11-15T06:44:36Z
publishDate 2010
publisher Universiti Malaysia Sarawak, (UNIMAS)
recordtype eprints
repository_type Digital Repository
spelling unimas-14865 2025-05-16T03:22:40Z http://ir.unimas.my/id/eprint/14865/ Semantic feature selection for spam filtering Azlina, Narawi T Technology (General) Universiti Malaysia Sarawak, (UNIMAS) 2010 Thesis NonPeerReviewed text en http://ir.unimas.my/id/eprint/14865/2/Azlina%20Narawi%20ft.pdf Azlina, Narawi (2010) Semantic feature selection for spam filtering. Masters thesis, Universiti Malaysia Sarawak.
title Semantic feature selection for spam filtering
topic T Technology (General)