Lexicon splitting in lexical disambiguation for Malay morphological analysis and stemming

Lexical ambiguity is one of the problems faced by morphological analysers and stemmers. It is caused by ambiguous word forms such as homonyms, which can lead these tools to produce incorrect output. A method that resolves such ambiguity may therefore improve their performance. Malay word affixation...


Bibliographic Details
Main Authors: Sharum, Mohd Yunus, Abdullah, Muhamad Taufik, Sulaiman, Md Nasir, Azmi Murad, Masrah Azrifah, Zainon Hamzah, Zaitul Azma
Format: Article
Language: English
Published: Advanced Institute of Convergence Information Technology 2013
Online Access: http://psasir.upm.edu.my/id/eprint/30565/
http://psasir.upm.edu.my/id/eprint/30565/1/Lexicon%20splitting%20in%20lexical%20disambiguation%20for%20Malay%20morphological%20analysis%20and%20stemming.pdf
author Sharum, Mohd Yunus
Abdullah, Muhamad Taufik
Sulaiman, Md Nasir
Azmi Murad, Masrah Azrifah
Zainon Hamzah, Zaitul Azma
building UPM Institutional Repository
collection Online Access
description Lexical ambiguity is one of the problems faced by morphological analysers and stemmers. It is caused by ambiguous word forms such as homonyms, which can lead these tools to produce incorrect output. A method that resolves such ambiguity may therefore improve their performance. Malay affixation distinguishes between monosyllabic and multisyllabic words. A disambiguation method is proposed for tools that rely on a lexicon for analysis and stemming: the lexicon is split into monosyllabic and multisyllabic words. We found that this split helps to resolve ambiguity involving monosyllabic words, improves the handling of the language's exceptions, and improves storage lookup. The method is useful for Malay morphological analysis and stemming because it does not require document-level context analysis of the analysed word.
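The idea of splitting the lexicon can be illustrated with a minimal Python sketch. It is not the authors' implementation: the syllable counter (one syllable per vowel letter), the tiny root list, the function names, and the rule that the prefix form "menge-" is looked up only in the monosyllabic lexicon while "meng-" is looked up only in the multisyllabic one are all assumptions made for illustration.

```python
# Illustrative sketch only; names, rules and data are hypothetical,
# not taken from the paper.

VOWELS = set("aeiou")


def count_syllables(word: str) -> int:
    """Crude estimate: one syllable per vowel letter (diphthongs ignored)."""
    return sum(1 for ch in word.lower() if ch in VOWELS)


def split_lexicon(roots):
    """Split a root lexicon into monosyllabic and multisyllabic sets."""
    mono = {r for r in roots if count_syllables(r) == 1}
    multi = set(roots) - mono
    return mono, multi


def stem(word, mono, multi):
    """Return candidate roots for a word carrying a meN- style prefix."""
    candidates = []
    # Assumption: "menge-" attaches only to monosyllabic roots,
    # so only the monosyllabic lexicon is consulted.
    if word.startswith("menge"):
        rest = word[len("menge"):]
        if rest in mono:
            candidates.append(rest)
    # Assumption: "meng-" attaches to multisyllabic roots, either
    # directly (vowel-initial root) or with an initial "k" dropped.
    if word.startswith("meng"):
        rest = word[len("meng"):]
        if rest in multi:
            candidates.append(rest)
        if "k" + rest in multi:
            candidates.append("k" + rest)
    return candidates


# Hypothetical four-word lexicon.
mono, multi = split_lexicon(["cat", "bom", "kecil", "ambil"])

print(stem("mengecat", mono, multi))
print(stem("mengecil", mono, multi))
print(stem("mengambil", mono, multi))
```

In this sketch each surface form is resolved by lexicon lookup alone, with no reference to the surrounding document, which is the property the description attributes to the method.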
first_indexed 2025-11-15T09:07:06Z
format Article
id upm-30565
institution Universiti Putra Malaysia
institution_category Local University
language English
last_indexed 2025-11-15T09:07:06Z
publishDate 2013
publisher Advanced Institute of Convergence Information Technology
recordtype eprints
repository_type Digital Repository
spelling upm-30565 2016-01-28T06:49:38Z http://psasir.upm.edu.my/id/eprint/30565/
Sharum, Mohd Yunus and Abdullah, Muhamad Taufik and Sulaiman, Md Nasir and Azmi Murad, Masrah Azrifah and Zainon Hamzah, Zaitul Azma (2013) Lexicon splitting in lexical disambiguation for Malay morphological analysis and stemming. Journal of Next Generation Information Technology, 4 (5). pp. 9-15. ISSN 2092-8637; ESSN: 2233-9388
2013-07 Article PeerReviewed application/pdf en
http://www.globalcis.org/jnit/global/paper_detail.html?jname=JNIT&q=175
10.4156/jnit.vol4.issue5.2
title Lexicon splitting in lexical disambiguation for Malay morphological analysis and stemming
url http://psasir.upm.edu.my/id/eprint/30565/