Improvement of Malay information retrieval using local stop words

This paper reports an experiment on a Malay information retrieval system using local stop word lists. Candidate stop words are drawn from a Malay Quranic text collection, with a list of existing stop words serving as the preliminary list. All words from the Quranic documents are extracted a...

Bibliographic Details
Main Authors: Abdullah, Muhamad Taufik, Ahmad, Fatimah, Mahmod, Ramlan, Tengku Sembok, Tengku Mohd
Format: Conference or Workshop Item
Language: English
Published: 2005
Online Access: http://psasir.upm.edu.my/id/eprint/38975/
http://psasir.upm.edu.my/id/eprint/38975/1/38975.pdf
author Abdullah, Muhamad Taufik
Ahmad, Fatimah
Mahmod, Ramlan
Tengku Sembok, Tengku Mohd
building UPM Institutional Repository
collection Online Access
description This paper reports an experiment on a Malay information retrieval system using local stop word lists. Candidate stop words are drawn from a Malay Quranic text collection, with a list of existing stop words serving as the preliminary list. All words from the Quranic documents are extracted and ranked by frequency of occurrence in decreasing order, and the 50 most frequently occurring words in the corpus are then selected. The new Malay stop word list is evaluated on the Quranic collection. Applying the new stop words, by comparing the words from the Quran against the stored stop word lists, leaves 39.7% to 43.6% of the terms remaining. An experiment was carried out to evaluate the performance of these stop word lists in terms of recall and precision for a Malay information retrieval system. Comparing retrieval without a stop word list against retrieval with the stop word lists shows that the new stop word lists increased the average precision by 24.8% to 35.0%. The results demonstrate that this list can be used successfully in a Malay information retrieval system.
first_indexed 2025-11-15T09:43:45Z
format Conference or Workshop Item
id upm-38975
institution Universiti Putra Malaysia
institution_category Local University
language English
last_indexed 2025-11-15T09:43:45Z
publishDate 2005
recordtype eprints
repository_type Digital Repository
spelling upm-38975 2015-08-24T02:10:26Z
http://psasir.upm.edu.my/id/eprint/38975/
2005 Conference or Workshop Item NonPeerReviewed application/pdf en
Abdullah, Muhamad Taufik and Ahmad, Fatimah and Mahmod, Ramlan and Tengku Sembok, Tengku Mohd (2005) Improvement of Malay information retrieval using local stop words. In: International Advanced Technology Congress: Conference on Computer Integrated Systems, 6-8 Dec. 2005, Putrajaya, Malaysia.
title Improvement of Malay information retrieval using local stop words
url http://psasir.upm.edu.my/id/eprint/38975/
http://psasir.upm.edu.my/id/eprint/38975/1/38975.pdf
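
The procedure outlined in the description (rank all corpus words by frequency, keep the most frequent ones, combine them with a preliminary stop word list, then measure how many terms remain) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the letter-only tokenizer, and the toy documents are assumptions made for illustration, and the sketch does not attempt to reproduce the figures reported in the paper.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return runs of ASCII letters.

    Assumption: a simple letter-only tokenizer; hyphenated Malay
    reduplications such as "orang-orang" are split into two tokens.
    """
    return re.findall(r"[a-z]+", text.lower())


def build_stop_list(documents, preliminary, top_n=50):
    """Merge a preliminary stop list with the top_n most frequent corpus words."""
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    # most_common() returns (word, count) pairs in decreasing order of count.
    frequent = [word for word, _ in counts.most_common(top_n)]
    return set(preliminary) | set(frequent)


def remaining_fraction(documents, stop_words):
    """Return the fraction of tokens that survive stop word removal."""
    total = 0
    kept = 0
    for doc in documents:
        for token in tokenize(doc):
            total += 1
            if token not in stop_words:
                kept += 1
    return kept / total if total else 0.0


if __name__ == "__main__":
    # Toy documents, purely hypothetical; not taken from the paper's collection.
    sample_docs = ["dan yang dan di", "yang beriman dan bertakwa"]
    stop_words = build_stop_list(sample_docs, preliminary={"di"}, top_n=2)
    print(sorted(stop_words))
    print(remaining_fraction(sample_docs, stop_words))
```

With a real collection, `top_n=50` would mirror the cutoff mentioned in the description, and the value returned by `remaining_fraction` corresponds to the "remaining terms" percentage once multiplied by 100.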