The effectiveness of bottom up technique with probabilistic approach for a Malay parser

Parsing is a process of analyzing the input string in a sentence to define the syntax structures according to rules of grammar. This task is performed by a parser which will produce a parse tree as output. However, a problem occurs when the parsing process produces two or more parse trees in whic...


Bibliographic Details
Main Authors: Muhammad Azhar Fairuzz Hiloh, Mohd Juzaiddin Ab Aziz, Lailatul Qadri Zakaria
Format: Article
Language: English
Published: Penerbit Universiti Kebangsaan Malaysia 2018
Online Access: http://journalarticle.ukm.my/13775/
http://journalarticle.ukm.my/13775/1/25512-76284-1-PB.pdf
author Muhammad Azhar Fairuzz Hiloh,
Mohd Juzaiddin Ab Aziz,
Lailatul Qadri Zakaria,
building UKM Institutional Repository
collection Online Access
description Parsing is the process of analyzing an input sentence to determine its syntactic structure according to the rules of a grammar. This task is performed by a parser, which produces a parse tree as output. However, a problem arises when parsing produces two or more parse trees and the parser is unable to identify a single precise parse tree. This limitation is caused by ambiguity in sentence structure. Ambiguity occurs when a word can be classified under more than one syntactic category, and the category chosen affects the semantics of the sentence. Thus, the parser needs an approach for resolving ambiguity so that it can produce the most appropriate parse tree to represent a sentence. Like other languages, Malay, the national language of Malaysia, is not exempt from the ambiguity problem. However, because its grammar is a context-free grammar, the probabilistic context-free grammar approach can be used to help the parser determine a more accurate parse tree. This study focuses on the development of a statistical parser for the Malay language using a bottom-up technique. The training data, in the form of simple Malay sentences, were collected from various sources. Based on this training data, a statistical lexical corpus of Malay consisting of vocabulary, grammar rules and their probabilities was developed. Bottom-up parsing is implemented with the Cocke–Younger–Kasami (CYK) algorithm. The parser’s performance is evaluated on how effectively it overcomes ambiguity by suggesting a more precise parse tree. In conclusion, the Malay Language Parser can help users identify the appropriate parse tree and resolve ambiguity issues in the Malay language.
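The approach summarized in the description, bottom-up CYK parsing guided by rule probabilities, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy grammar, the probabilities, the example sentence "kereta baru tiba" (where "baru" may act as an adjective, "new", or an adverb, "just"), and the function name `cyk_best_parse` are hypothetical and are not taken from the article or its corpus. The grammar is written in Chomsky normal form, which CYK requires.

```python
from collections import defaultdict

# Hypothetical toy PCFG in Chomsky normal form. Rules and probabilities are
# invented for illustration; they do not come from the article's corpus.
LEXICAL = {  # word -> [(category, probability)]
    "kereta": [("N", 1.0), ("NP", 0.7)],
    "baru": [("Adj", 1.0), ("Adv", 1.0)],   # ambiguous word: two categories
    "tiba": [("V", 1.0), ("VP", 0.6)],
}
BINARY = [  # (parent, left child, right child, probability)
    ("S", "NP", "VP", 1.0),
    ("NP", "N", "Adj", 0.3),
    ("VP", "Adv", "V", 0.4),
]


def cyk_best_parse(words, start="S"):
    """Return (probability, tree) of the most probable parse, or None."""
    n = len(words)
    # best[(i, j)][cat] = (probability, tree) for the best way to derive
    # words[i:j] from category cat.
    best = defaultdict(dict)

    # Bottom level: fill in every category each single word can take.
    for i, word in enumerate(words):
        for cat, p in LEXICAL.get(word, []):
            best[(i, i + 1)][cat] = (p, (cat, word))

    # Build longer spans from shorter ones, keeping only the best score
    # per category in each cell.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point
                for parent, left, right, p in BINARY:
                    if left in best[(i, k)] and right in best[(k, j)]:
                        lp, ltree = best[(i, k)][left]
                        rp, rtree = best[(k, j)][right]
                        prob = p * lp * rp
                        if prob > best[(i, j)].get(parent, (0.0, None))[0]:
                            best[(i, j)][parent] = (prob, (parent, ltree, rtree))

    return best[(0, n)].get(start)


if __name__ == "__main__":
    print(cyk_best_parse(["kereta", "baru", "tiba"]))
```

Under these made-up probabilities the sentence has two parses: one groups "kereta baru" as a noun phrase (probability 0.3 × 0.6 = 0.18) and the other groups "baru tiba" as a verb phrase (0.7 × 0.4 = 0.28). The sketch returns the second, which is how a probabilistic parser of this kind would choose between competing trees.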
first_indexed 2025-11-15T00:17:08Z
format Article
id oai:generic.eprints.org:13775
institution Universiti Kebangsaan Malaysia
institution_category Local University
language English
last_indexed 2025-11-15T00:17:08Z
publishDate 2018
publisher Penerbit Universiti Kebangsaan Malaysia
recordtype eprints
repository_type Digital Repository
spelling oai:generic.eprints.org:13775 2019-12-09T23:15:28Z http://journalarticle.ukm.my/13775/ The effectiveness of bottom up technique with probabilistic approach for a Malay parser Muhammad Azhar Fairuzz Hiloh, Mohd Juzaiddin Ab Aziz, Lailatul Qadri Zakaria, Penerbit Universiti Kebangsaan Malaysia 2018-05 Article PeerReviewed application/pdf en http://journalarticle.ukm.my/13775/1/25512-76284-1-PB.pdf Muhammad Azhar Fairuzz Hiloh, and Mohd Juzaiddin Ab Aziz, and Lailatul Qadri Zakaria, (2018) The effectiveness of bottom up technique with probabilistic approach for a Malay parser. GEMA: Online Journal of Language Studies, 18 (2). pp. 124-133. ISSN 1675-8021 http://ejournal.ukm.my/gema/issue/view/1087
title The effectiveness of bottom up technique with probabilistic approach for a Malay parser
url http://journalarticle.ukm.my/13775/
http://journalarticle.ukm.my/13775/1/25512-76284-1-PB.pdf