Lexical criminal identification for chatting corpus

This paper aims to identify the lexical items that express criminal elements in a chat corpus of suspect and victim conversation utterances. Lexical criminal identification requires three processes. The first is tokenization, which automatically assigns each lexical item a corresponding serial number in ever...

Bibliographic Details
Main Authors: Marjuni, Siti Hanom, Mahmod, Ramlan, Abd Ghani, Abdul Azim, Mohd Zain, Abdullah, Mustapha, Aida
Format: Conference or Workshop Item
Language: English
Published: IEEE 2009
Online Access: http://psasir.upm.edu.my/id/eprint/68487/
http://psasir.upm.edu.my/id/eprint/68487/1/Lexical%20criminal%20identification%20for%20chatting%20corpus.pdf
building UPM Institutional Repository
collection Online Access
description This paper aims to identify the lexical items that express criminal elements in a chat corpus of suspect and victim conversation utterances. Lexical criminal identification requires three processes. The first is tokenization, which automatically assigns each lexical item a corresponding serial number within every suspect and victim utterance. The second is tagging each lexical item with its part of speech to identify the verbs and nouns in the utterances. The third is identifying and analyzing the interrogative criminal construct to obtain the criminal evidence. The chat corpus consists of 3,067 suspect and victim utterances with 16,278 words, collected from 9 criminal chat cases. The results indicate that verbs and nouns are the most important parts of speech for representing the criminal constructs in chat utterances.
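The first two processes named in the description (numbering tokens within an utterance, then tagging them to isolate verbs and nouns) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the tag labels, and the tiny `TOY_LEXICON` are hypothetical stand-ins for a real part-of-speech tagger, and the third process (analyzing the interrogative criminal construct) is left out because the record does not describe how it works.

```python
import re

# Hypothetical toy lexicon, used only for illustration.
# A real system would rely on a trained part-of-speech tagger.
TOY_LEXICON = {
    "send": "VERB",
    "meet": "VERB",
    "give": "VERB",
    "money": "NOUN",
    "photo": "NOUN",
    "password": "NOUN",
    "me": "PRON",
    "your": "PRON",
    "the": "DET",
}


def tokenize(utterance):
    """Split an utterance into lowercase words, each paired with a serial number from 1."""
    words = re.findall(r"[a-z']+", utterance.lower())
    return list(enumerate(words, start=1))


def tag(tokens):
    """Attach a part-of-speech tag to each (serial, word) pair; unknown words get 'X'."""
    return [(serial, word, TOY_LEXICON.get(word, "X")) for serial, word in tokens]


def verbs_and_nouns(tagged):
    """Keep only the tokens tagged as verbs or nouns."""
    return [(s, w, t) for s, w, t in tagged if t in ("VERB", "NOUN")]


if __name__ == "__main__":
    tagged = tag(tokenize("Send me your password"))
    print(verbs_and_nouns(tagged))
```

Keeping the serial number alongside each token means that a verb or noun selected in the last step can still be traced back to its position in the original utterance.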
id upm-68487
institution Universiti Putra Malaysia
institution_category Local University
recordtype eprints
repository_type Digital Repository
citation Marjuni, Siti Hanom and Mahmod, Ramlan and Abd Ghani, Abdul Azim and Mohd Zain, Abdullah and Mustapha, Aida (2009) Lexical criminal identification for chatting corpus. In: 2009 2nd IEEE International Conference on Computer Science and Information Technology (ICCSIT 2009), 8-11 Aug. 2009, Beijing, China. (pp. 360-364). doi:10.1109/ICCSIT.2009.5234700. Peer reviewed.