Automatic conceptual analysis for plagiarism detection

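In order to detect plagiarism, comparisons must be made between a target document (the suspect) and reference documents. Numerous automated systems exist which check at the text-string level. If the scope is kept constrained, as for example in within-cohort plagiarism checking, then performance is very reasonable. On the other hand, if one extends the focus to a very large corpus such as the WWW, then performance can be reduced to an impracticable level. The three case studies presented in this paper give insight into the text-string comparators, whilst the third case study considers the very new and promising conceptual analysis approach to plagiarism detection, which is now made achievable by the very computationally efficient Normalised Word Vector algorithm. The paper concludes with a caution on the use of high-tech in the absence of high-touch.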

Bibliographic Details
Main Author: Dreher, Heinz
Format: Journal Article
Published: The Informing Science Institute 2007
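Subjects: conceptual analysis; semantic footprint; NWV - Normalised Word Vector; plagiarism; conceptual footprint; academic malpractice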
Online Access: http://iisit.org/IssuesVol4v2.htm
http://hdl.handle.net/20.500.11937/33407
building Curtin Institutional Repository
collection Online Access
description In order to detect plagiarism, comparisons must be made between a target document (the suspect) and reference documents. Numerous automated systems exist which check at the text-string level. If the scope is kept constrained, as for example in within-cohort plagiarism checking, then performance is very reasonable. On the other hand, if one extends the focus to a very large corpus such as the WWW, then performance can be reduced to an impracticable level. The three case studies presented in this paper give insight into the text-string comparators, whilst the third case study considers the very new and promising conceptual analysis approach to plagiarism detection, which is now made achievable by the very computationally efficient Normalised Word Vector algorithm. The paper concludes with a caution on the use of high-tech in the absence of high-touch.
id curtin-20.500.11937-33407
institution Curtin University Malaysia
institution_category Local University
recordtype eprints
repository_type Digital Repository
spelling curtin-20.500.11937-33407 2017-01-30T13:36:55Z http://proceedings.informingscience.org/InSITE2007/IISITv4p601-614Dreh383.pdf fulltext
topic conceptual analysis
semantic footprint
NWV - Normalised Word Vector
plagiarism
conceptual footprint
academic malpractice