Phishing webpage detection using weighted URL tokens for identity keywords retrieval

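Phishing is a form of online identity theft that has threatened Internet users for more than a decade. This paper proposes an anti-phishing technique based on a weighted URL tokens system, which extracts identity keywords from a query webpage. Using the identity keywords as search terms, a search engine is invoked to pinpoint the target domain name, which can be used to determine the legitimacy of the query webpage. Experiments were conducted over 1000 datasets, achieving a true positive rate of 99.20% and a true negative rate of 92.20%. Results suggest that the proposed system can detect phishing webpages effectively without using conventional language-dependent keyword extraction algorithms.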
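
The pipeline summarized in the abstract (score URL tokens, take the top-scoring tokens as identity keywords, search on them, then compare domains) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the weighting scheme, so the weights, the stop list, the two-label domain heuristic, and every function name below are assumptions made for illustration. The search engine is passed in as a caller-supplied function rather than tied to any real API.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Hypothetical weights: the abstract does not give the paper's actual scheme.
HOST_WEIGHT = 2.0
PATH_WEIGHT = 1.0
# Hypothetical stop list of generic URL tokens that carry no identity.
GENERIC_TOKENS = {"www", "com", "net", "org", "http", "https", "html", "php", "index"}


def url_tokens(url):
    """Split a URL into lowercase alphabetic tokens from its host and its path."""
    parsed = urlparse(url)
    host = re.findall(r"[a-z]+", parsed.netloc.lower())
    path = re.findall(r"[a-z]+", parsed.path.lower())
    return host, path


def identity_keywords(urls, top_n=2):
    """Score tokens across all URLs found on a page and return the top ones."""
    scores = Counter()
    for url in urls:
        host, path = url_tokens(url)
        for weight, tokens in ((HOST_WEIGHT, host), (PATH_WEIGHT, path)):
            for tok in tokens:
                if tok not in GENERIC_TOKENS and len(tok) > 2:
                    scores[tok] += weight
    return [tok for tok, _ in scores.most_common(top_n)]


def base_domain(url):
    """Crude assumption: the last two host labels (wrong for e.g. co.uk)."""
    host = urlparse(url).netloc.lower().split(":")[0]
    return ".".join(host.split(".")[-2:])


def is_legitimate(query_url, page_urls, search):
    """Treat the page as legitimate if its domain appears in the search results.

    `search` is a caller-supplied function mapping a query string to a list
    of result URLs; it stands in for whatever search engine is used.
    """
    keywords = identity_keywords(page_urls)
    result_domains = {base_domain(u) for u in search(" ".join(keywords))}
    return base_domain(query_url) in result_domains
```

In use, `page_urls` would be the links extracted from the query webpage. Because tokens are scored purely from URLs, the sketch needs no language-specific text processing, which is the property the abstract emphasizes.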


Bibliographic Details
Main Authors: Tan, Choon Lin; Chiew, Kang Leng; Sze, San Nah
Format: Article
Language: English
Published: Springer Verlag 2017
Subjects: TK Electrical engineering. Electronics Nuclear engineering
Online Access: http://ir.unimas.my/id/eprint/14955/
http://ir.unimas.my/id/eprint/14955/1/Phishingwebpage-detection-usingweighted-URL-tokens-for-identity-keywords-retrieval_2017_Lecture-Notes-in-Electrical-Engineering.html
_version_ 1848837756881469440
author Tan, Choon Lin
Chiew, Kang Leng
Sze, San Nah
building UNIMAS Institutional Repository
collection Online Access
description Phishing is a form of online identity theft that has threatened Internet users for more than a decade. This paper proposes an anti-phishing technique based on a weighted URL tokens system, which extracts identity keywords from a query webpage. Using the identity keywords as search terms, a search engine is invoked to pinpoint the target domain name, which can be used to determine the legitimacy of the query webpage. Experiments were conducted over 1000 datasets, achieving a true positive rate of 99.20% and a true negative rate of 92.20%. Results suggest that the proposed system can detect phishing webpages effectively without using conventional language-dependent keyword extraction algorithms.
first_indexed 2025-11-15T06:44:43Z
format Article
id unimas-14955
institution Universiti Malaysia Sarawak
institution_category Local University
language English
last_indexed 2025-11-15T06:44:43Z
publishDate 2017
publisher Springer Verlag
recordtype eprints
repository_type Digital Repository
spelling unimas-14955
2020-08-17T17:53:09Z
http://ir.unimas.my/id/eprint/14955/
TK Electrical engineering. Electronics Nuclear engineering
Springer Verlag 2017 Article PeerReviewed text en
http://ir.unimas.my/id/eprint/14955/1/Phishingwebpage-detection-usingweighted-URL-tokens-for-identity-keywords-retrieval_2017_Lecture-Notes-in-Electrical-Engineering.html
Tan, Choon Lin and Chiew, Kang Leng and Sze, San Nah (2017) Phishing webpage detection using weighted URL tokens for identity keywords retrieval. Lecture Notes in Electrical Engineering, 398. pp. 133-139. ISSN 1876-1100
https://www.scopus.com/inward/record.uri?eid=2-s2.0-84992695464&doi=10.1007%2f978-981-10-1721-6_15&partnerID=40&md5=96ef3cf0d19ef6dd4d4e9f40c5835030
DOI: 10.1007/978-981-10-1721-6_15
spellingShingle TK Electrical engineering. Electronics Nuclear engineering
title Phishing webpage detection using weighted URL tokens for identity keywords retrieval
topic TK Electrical engineering. Electronics Nuclear engineering
url http://ir.unimas.my/id/eprint/14955/
http://ir.unimas.my/id/eprint/14955/1/Phishingwebpage-detection-usingweighted-URL-tokens-for-identity-keywords-retrieval_2017_Lecture-Notes-in-Electrical-Engineering.html