Phishing webpage detection using weighted URL tokens for identity keywords retrieval

Bibliographic Details
Main Authors: Tan, Choon Lin, Chiew, Kang Leng, Sze, San Nah
Format: Article
Language: English
Published: Springer Verlag 2017
Subjects:
Online Access: http://ir.unimas.my/id/eprint/14955/
http://ir.unimas.my/id/eprint/14955/1/Phishingwebpage-detection-usingweighted-URL-tokens-for-identity-keywords-retrieval_2017_Lecture-Notes-in-Electrical-Engineering.html
Description
Summary: Phishing is a form of online identity theft that has threatened Internet users for more than a decade. This paper proposes an anti-phishing technique based on a weighted URL tokens system, which extracts identity keywords from a query webpage. Using the identity keywords as search terms, a search engine is invoked to pinpoint the target domain name, which is then used to determine the legitimacy of the query webpage. Experiments were conducted over 1000 datasets, achieving 99.20% true positives and 92.20% true negatives. The results suggest that the proposed system can detect phishing webpages effectively without relying on conventional language-dependent keyword extraction algorithms.
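
The pipeline described in the summary (tokenize the URL, weight the tokens, use the top tokens as search terms, compare domains) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the weights in PART_WEIGHTS, the stop-token list, the scoring formula, the two-label domain heuristic, and the fake_search stub standing in for a real search engine are all assumptions made for illustration, since the abstract does not specify them.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Hypothetical weights and stop tokens; the abstract does not give the paper's scheme.
PART_WEIGHTS = {"host": 3.0, "path": 2.0, "query": 1.0}
STOP_TOKENS = {"www", "http", "https", "com", "net", "org", "html", "php", "index"}


def url_tokens(url):
    """Split a URL into lowercase alphabetic tokens, tagged by the URL part they came from."""
    parts = urlparse(url)
    sources = {"host": parts.hostname or "", "path": parts.path, "query": parts.query}
    tagged = []
    for part, text in sources.items():
        for tok in re.findall(r"[a-z]+", text.lower()):
            if len(tok) > 2 and tok not in STOP_TOKENS:
                tagged.append((tok, part))
    return tagged


def identity_keywords(url, page_text, top_n=2):
    """Score each URL token as part weight times (1 + occurrences in the page text)."""
    words = Counter(re.findall(r"[a-z]+", page_text.lower()))
    scores = Counter()
    for tok, part in url_tokens(url):
        scores[tok] += PART_WEIGHTS[part] * (1 + words[tok])
    return [tok for tok, _ in scores.most_common(top_n)]


def registered_domain(url):
    """Naive last-two-labels domain; a real system would consult the Public Suffix List."""
    host = urlparse(url).hostname or ""
    return ".".join(host.split(".")[-2:])


def is_phishing(url, page_text, search):
    """Flag the page when its domain is absent from the domains of the search results."""
    keywords = identity_keywords(url, page_text)
    result_domains = {registered_domain(r) for r in search(" ".join(keywords))}
    return registered_domain(url) not in result_domains


def fake_search(query):
    """Illustrative stand-in for a search engine call; returns result URLs."""
    return ["https://www.paypal.com/signin"] if "paypal" in query else []


suspect = "http://paypal.secure-login.example.net/paypal/update.php"
verdict = is_phishing(suspect, "Log in to your PayPal account", fake_search)
```

Passing the search function in as a parameter keeps the sketch runnable offline and makes the search backend replaceable. Because the keywords come from URL tokens rather than from linguistic analysis of the page, the approach does not depend on the page's language, which is the property the summary highlights.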