Improvement of Malay information retrieval using local stop words
This paper reports an experiment on a Malay information retrieval system that uses local stop word lists. Candidate stop words are drawn from a Malay Quranic text collection, with an existing stop word list serving as the preliminary list. All words from the Quranic documents are extracted a...
| Main Authors: | Abdullah, Muhamad Taufik; Ahmad, Fatimah; Mahmod, Ramlan; Tengku Sembok, Tengku Mohd |
|---|---|
| Format: | Conference or Workshop Item |
| Language: | English |
| Published: | 2005 |
| Online Access: | http://psasir.upm.edu.my/id/eprint/38975/ http://psasir.upm.edu.my/id/eprint/38975/1/38975.pdf |
| _version_ | 1848849019888992256 |
|---|---|
| author | Abdullah, Muhamad Taufik; Ahmad, Fatimah; Mahmod, Ramlan; Tengku Sembok, Tengku Mohd |
| building | UPM Institutional Repository |
| collection | Online Access |
| description | This paper reports an experiment on a Malay information retrieval system that uses local stop word lists. Candidate stop words are drawn from a Malay Quranic text collection, with an existing stop word list serving as the preliminary list. All words in the Quranic documents are extracted and ranked by frequency of occurrence in decreasing order, and the 50 most frequently occurring words in the corpus are taken as the list. The new Malay stop word list is evaluated on the Quranic collection. When the words from the Quran are compared against the stored stop word lists, 39.7% to 43.6% of the terms remain. A further experiment evaluates the stop word lists in terms of recall and precision for the Malay information retrieval system. Comparing retrieval without a stop word list against retrieval with the stop word lists shows that the new lists increased average precision by 24.8% to 35.0%. The results demonstrate that the list can be used successfully in a Malay information retrieval system. |
| first_indexed | 2025-11-15T09:43:45Z |
| format | Conference or Workshop Item |
| id | upm-38975 |
| institution | Universiti Putra Malaysia |
| institution_category | Local University |
| language | English |
| last_indexed | 2025-11-15T09:43:45Z |
| publishDate | 2005 |
| recordtype | eprints |
| repository_type | Digital Repository |
| spelling | upm-38975 2015-08-24T02:10:26Z http://psasir.upm.edu.my/id/eprint/38975/ 2005 Conference or Workshop Item NonPeerReviewed application/pdf en http://psasir.upm.edu.my/id/eprint/38975/1/38975.pdf Abdullah, Muhamad Taufik and Ahmad, Fatimah and Mahmod, Ramlan and Tengku Sembok, Tengku Mohd (2005) Improvement of Malay information retrieval using local stop words. In: International Advanced Technology Congress: Conference on Computer Integrated Systems, 6-8 Dec. 2005, Putrajaya, Malaysia. |
| title | Improvement of Malay information retrieval using local stop words |
| url | http://psasir.upm.edu.my/id/eprint/38975/ http://psasir.upm.edu.my/id/eprint/38975/1/38975.pdf |
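
The procedure summarized in the description (rank words by frequency, take the most frequent as stop words, measure how many terms remain, then compare precision and recall) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function names, the simple letter-only tokenizer, and the set-based precision/recall helper are all assumptions introduced here, and the record does not say how the preliminary list was merged or how average precision was computed.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep runs of ASCII letters (an assumed, simplified tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def top_frequent_words(documents, n=50):
    """Rank all words by decreasing frequency and return the n most frequent."""
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    return [word for word, _ in counts.most_common(n)]


def remaining_fraction(documents, stop_words):
    """Fraction of term occurrences left after dropping stop words."""
    stop = set(stop_words)
    tokens = [t for doc in documents for t in tokenize(doc)]
    if not tokens:
        return 0.0
    kept = [t for t in tokens if t not in stop]
    return len(kept) / len(tokens)


def precision_recall(retrieved, relevant):
    """Set-based precision and recall for one query (hypothetical helper)."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

With a real collection, `top_frequent_words(docs, 50)` would give the candidate list, which could then be combined with an existing stop word list before calling `remaining_fraction`. Hyphenated Malay reduplications are split by this tokenizer, so a production system would likely need a more careful one.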