Automatic conceptual analysis for plagiarism detection


Bibliographic Details
Main Author: Dreher, Heinz
Format: Journal Article
Published: The Informing Science Institute 2007
Online Access: http://iisit.org/IssuesVol4v2.htm
http://hdl.handle.net/20.500.11937/33407
Description
Summary: In order to detect plagiarism, comparisons must be made between a target document (the suspect) and reference documents. Numerous automated systems exist which check at the text-string level. If the scope is kept constrained, as for example in within-cohort plagiarism checking, then performance is very reasonable. On the other hand, if one extends the focus to a very large corpus such as the WWW, then performance can be reduced to an impracticable level. The three case studies presented in this paper give insight into the text-string comparators, whilst the third case study considers the very new and promising conceptual analysis approach to plagiarism detection, which is now made achievable by the very computationally efficient Normalised Word Vector algorithm. The paper concludes with a caution on the use of high-tech in the absence of high-touch.