Characterisation of web spambots using self organising maps

The growth of spam in Web 2.0 environments not only reduces the quality and trust of the content but it also degrades the quality of search engine results. By means of web spambots, spammers are able to distribute spam content more efficiently to more targeted websites. Current anti-spam filtering s...

Bibliographic Details
Main Authors: Hayati, Pedram, Potdar, Vidyasagar, Talevski, Alex, Chai, Kevin
Format: Journal Article
Published: C R L publishing Ltd 2011
Subjects: spam 2.0, behaviour analysis, web spambots, web usage mining, spam detection
Online Access: http://hdl.handle.net/20.500.11937/44532
_version_ 1848757028541956096
author Hayati, Pedram
Potdar, Vidyasagar
Talevski, Alex
Chai, Kevin
author_facet Hayati, Pedram
Potdar, Vidyasagar
Talevski, Alex
Chai, Kevin
author_sort Hayati, Pedram
building Curtin Institutional Repository
collection Online Access
description The growth of spam in Web 2.0 environments not only reduces the quality and trust of the content but also degrades the quality of search engine results. By means of web spambots, spammers are able to distribute spam content more efficiently to more targeted websites. Current anti-spam filtering solutions have not studied web spambots thoroughly, and the characterisation of spambots remains an open area of research. To fill this research gap, this paper utilises Kohonen’s Self-Organising Map (SOM) to characterise web spambots. We analyse web usage data to profile web spambots based on three novel sets of features, i.e. action set, action frequency and action time. Our experimental results uncovered important characteristics of web spambots: 1) they focus on specific and limited actions compared with humans, 2) they use multiple user accounts to spread spam content, hide their identity and bypass restrictions, 3) they bypass filling in submission forms and directly submit the content to the Web server in order to efficiently spread spam, 4) they can be categorised into four categories based on their actions: content submitters, profile editors, content viewers and mixed behaviour, 5) they change their IP address based on different actions to hide their tracks. Our results are promising and suggest that our technique is capable of identifying spam in Web 2.0 applications.
first_indexed 2025-11-14T09:21:35Z
format Journal Article
id curtin-20.500.11937-44532
institution Curtin University Malaysia
institution_category Local University
last_indexed 2025-11-14T09:21:35Z
publishDate 2011
publisher C R L publishing Ltd
recordtype eprints
repository_type Digital Repository
spelling curtin-20.500.11937-44532 2017-03-08T13:18:45Z Characterisation of web spambots using self organising maps Hayati, Pedram Potdar, Vidyasagar Talevski, Alex Chai, Kevin spam 2.0 behaviour analysis web spambots web usage mining spam detection The growth of spam in Web 2.0 environments not only reduces the quality and trust of the content but also degrades the quality of search engine results. By means of web spambots, spammers are able to distribute spam content more efficiently to more targeted websites. Current anti-spam filtering solutions have not studied web spambots thoroughly, and the characterisation of spambots remains an open area of research. To fill this research gap, this paper utilises Kohonen’s Self-Organising Map (SOM) to characterise web spambots. We analyse web usage data to profile web spambots based on three novel sets of features, i.e. action set, action frequency and action time. Our experimental results uncovered important characteristics of web spambots: 1) they focus on specific and limited actions compared with humans, 2) they use multiple user accounts to spread spam content, hide their identity and bypass restrictions, 3) they bypass filling in submission forms and directly submit the content to the Web server in order to efficiently spread spam, 4) they can be categorised into four categories based on their actions: content submitters, profile editors, content viewers and mixed behaviour, 5) they change their IP address based on different actions to hide their tracks. Our results are promising and suggest that our technique is capable of identifying spam in Web 2.0 applications. 2011 Journal Article http://hdl.handle.net/20.500.11937/44532 C R L publishing Ltd restricted
spellingShingle spam 2.0
behaviour analysis
web spambots
web usage mining
spam detection
Hayati, Pedram
Potdar, Vidyasagar
Talevski, Alex
Chai, Kevin
Characterisation of web spambots using self organising maps
title Characterisation of web spambots using self organising maps
title_full Characterisation of web spambots using self organising maps
title_fullStr Characterisation of web spambots using self organising maps
title_full_unstemmed Characterisation of web spambots using self organising maps
title_short Characterisation of web spambots using self organising maps
title_sort characterisation of web spambots using self organising maps
topic spam 2.0
behaviour analysis
web spambots
web usage mining
spam detection
url http://hdl.handle.net/20.500.11937/44532