Component-based stemming engine for malay text / Juhari ljam

Word stemming is an important feature of present-day indexing and search systems. The idea is to improve recall by automatically handling word endings, reducing words to their roots at indexing and search time. Various algorithms for stemming have been developed for the...

Bibliographic Details
Main Author: Juhari, ljam
Format: Thesis
Published: 2003
Subjects: QA76 Computer software; T Technology (General)
Online Access: http://studentsrepo.um.edu.my/8925/
http://studentsrepo.um.edu.my/8925/4/juhari.pdf
_version_ 1848773789157949440
author Juhari, ljam
building UM Research Repository
collection Online Access
description Word stemming is an important feature of present-day indexing and search systems. The idea is to improve recall by automatically handling word endings, reducing words to their roots at indexing and search time. Various stemming algorithms have been developed for English and other languages, but the field is still new for Malay text. However, most of the existing work has had little impact on further development or applications, because it cannot be reused in other applications. This project studies those earlier efforts and proposes a new algorithm to improve the performance of the stemming process. Its most important contribution is a component-based approach: because a component is reusable, many applications can be derived from it. Developers who build a system related to information retrieval (IR) or word stemming no longer need to build their own stemming engine; they simply call the component and obtain its output. The project is, however, proposed for a specific domain, covering generic Malay words.
first_indexed 2025-11-14T13:47:59Z
format Thesis
id um-8925
institution University Malaya
institution_category Local University
last_indexed 2025-11-14T13:47:59Z
publishDate 2003
recordtype eprints
repository_type Digital Repository
spelling um-8925 2019-08-26T19:25:20Z Juhari, ljam (2003) Component-based stemming engine for malay text / Juhari ljam. Undergraduates thesis, University of Malaya. NonPeerReviewed application/pdf http://studentsrepo.um.edu.my/8925/
title Component-based stemming engine for malay text / Juhari ljam
topic QA76 Computer software
T Technology (General)
url http://studentsrepo.um.edu.my/8925/
http://studentsrepo.um.edu.my/8925/4/juhari.pdf
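
The abstract describes a reusable stemming component that is applied both when indexing and when searching. The sketch below illustrates that idea only; it is not the algorithm proposed in the thesis, which this record does not describe. The class name StemmerComponent, the helper functions, the short affix lists, and the root-word dictionary are all assumptions made for illustration, and a realistic Malay stemmer would need many more rules.

```python
from dataclasses import dataclass

# Illustrative affix lists only (assumed, not taken from the thesis).
# Real Malay morphology, e.g. meN-/peN- nasal assimilation, is not handled.
PREFIXES = ("ber", "ter", "di", "ke", "se")
SUFFIXES = ("kan", "nya", "lah", "an", "i")


@dataclass(frozen=True)
class StemmerComponent:
    """Hypothetical reusable component exposing a single stem() method."""

    roots: frozenset  # known root words, supplied by the caller

    def stem(self, word: str) -> str:
        word = word.lower()
        if word in self.roots:
            return word
        # Try every prefix/suffix pair, where "" means "no affix",
        # and accept the first candidate found in the root dictionary.
        for prefix in ("",) + PREFIXES:
            if not word.startswith(prefix):
                continue
            for suffix in ("",) + SUFFIXES:
                if suffix and not word.endswith(suffix):
                    continue
                candidate = word[len(prefix): len(word) - len(suffix)]
                if candidate in self.roots:
                    return candidate
        return word  # unknown word: leave unchanged rather than guess


def build_index(docs, stemmer):
    """Map each stemmed token to the set of document ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        for token in text.split():
            index.setdefault(stemmer.stem(token), set()).add(doc_id)
    return index


def search(index, query, stemmer):
    """Stem the query with the same component used at indexing time."""
    return index.get(stemmer.stem(query), set())


if __name__ == "__main__":
    stemmer = StemmerComponent(frozenset({"main", "makan", "bola"}))
    docs = {1: "dia bermain bola", 2: "nasi itu dimakan"}
    index = build_index(docs, stemmer)
    print(search(index, "main", stemmer))
    print(search(index, "makanlah", stemmer))
```

Because the indexer and the search function both receive the same component, neither has to implement stemming itself, which is the reuse argument the abstract makes. Checking candidates against a root dictionary is one simple way to limit over-stemming; it is a design choice of this sketch rather than a claim about the thesis.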