PhishWHO: Phishing webpage detection via identity keywords extraction and target domain name finder

Bibliographic Details
Main Authors: Tan, Choon Lin, Chiew, Kang Leng, Wong, KokSheik, Sze, San Nah
Format: Article
Language: English
Published: Elsevier B.V. 2016
Online Access: http://ir.unimas.my/id/eprint/13363/
http://ir.unimas.my/id/eprint/13363/1/Phishing%20webpage%20detection%20via%20identity%20keywords%20%28abstract%29.pdf
Description
Summary: This paper proposes a phishing detection technique based on the difference between the target identity and the actual identity of a webpage. The approach, called PhishWHO, has three phases. The first phase extracts identity keywords from the textual content of the webpage using a novel weighted URL token system based on the N-gram model. The second phase submits these keywords to a search engine and selects the target domain name from the results based on identity-relevant features. The final phase applies a 3-tier identity matching system to determine the legitimacy of the query webpage. The overall experimental results suggest that the proposed system outperforms the conventional phishing detection methods considered.
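
The three phases described in the summary can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the weighting formula, the identity-relevant features, or what the three matching tiers are, so everything below is an assumption made for illustration. In particular, the function names, the `url_boost` weight, the `search` and `resolve` callables, and the choice of tiers (exact domain match, matching first label, shared IP address) are all hypothetical, and domains are assumed to be already reduced to their registered form (for example `paypal.com`).

```python
import re
from collections import Counter
from urllib.parse import urlparse


def url_ngrams(url, n_min=3):
    """Collect every character substring (length >= n_min) of the URL's tokens."""
    parsed = urlparse(url)
    tokens = re.findall(r"[a-z0-9]+", (parsed.netloc + parsed.path).lower())
    grams = set()
    for tok in tokens:
        for i in range(len(tok)):
            for j in range(i + n_min, len(tok) + 1):
                grams.add(tok[i:j])
    return grams


def extract_identity_keywords(text, url, top_k=5, url_boost=3.0):
    """Phase 1 (assumed scoring): term frequency, boosted when the word
    also occurs as an N-gram inside the URL."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(w for w in words if len(w) >= 3)
    grams = url_ngrams(url)
    scores = {w: c * (url_boost if w in grams else 1.0) for w, c in counts.items()}
    # Sort by descending score, then alphabetically for a stable order.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_k]


def find_target_domain(keywords, search):
    """Phase 2 (assumed selection rule): query a search engine and pick the
    most frequent result domain. `search` is a hypothetical callable that
    maps a query string to a list of domain names."""
    results = search(" ".join(keywords))
    if not results:
        return None
    return Counter(results).most_common(1)[0][0]


def is_legitimate(actual_domain, target_domain, resolve=None):
    """Phase 3 (assumed tiers): the page is treated as legitimate if any tier matches.
    `resolve` is a hypothetical callable mapping a domain to a list of IP addresses."""
    actual, target = actual_domain.lower(), target_domain.lower()
    # Tier 1: the two domain names are identical.
    if actual == target:
        return True
    # Tier 2: same first label, e.g. a country-specific variant of one brand.
    if actual.split(".")[0] == target.split(".")[0]:
        return True
    # Tier 3: both domains resolve to at least one common IP address.
    if resolve is not None:
        actual_ips, target_ips = resolve(actual), resolve(target)
        if actual_ips and target_ips and set(actual_ips) & set(target_ips):
            return True
    return False


if __name__ == "__main__":
    page_text = "Welcome to PayPal. Log in to your PayPal account."
    page_url = "http://paypal-secure-login.example.net/account"
    keywords = extract_identity_keywords(page_text, page_url, top_k=3)
    # Stubbed search results stand in for a real search engine.
    target = find_target_domain(keywords, lambda q: ["paypal.com", "paypal.com"])
    print(keywords, target, is_legitimate("example.net", target))
```

In the usage example, the search engine is replaced by a stub so the sketch runs offline. A real system would need a proper registered-domain parser, since the first-label comparison in tier 2 is deliberately naive.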