Lexicon+TX: Rapid Construction of a Multilingual Lexicon with Under-Resourced Languages

Most efforts at automatically creating multilingual lexicons require input lexical resources with rich content (e.g. semantic networks, domain codes, semantic categories) or large corpora. Such material is often unavailable and difficult to construct for under-resourced languages. In some cases, pa...

Bibliographic Details
Main Authors: Lian, Tze Lim; Lay-Ki, Soon; Tek, Yong Lim; Enya, Kong Tang; Bali, Ranaivo-Malançon
Format: Article
Language: English
Published: SpringerLink 2013
Subjects: T Technology (General)
Online Access: http://ir.unimas.my/id/eprint/5208/
http://ir.unimas.my/id/eprint/5208/1/Lexicon%2BTX%20Rapid%20Construction%20of%20a%20Multilingual%20Lexicon%20%28abstract%29.pdf
_version_ 1848835609661014016
author Lian, Tze Lim
Lay-Ki, Soon
Tek, Yong Lim
Enya, Kong Tang
Bali, Ranaivo-Malançon
author_facet Lian, Tze Lim
Lay-Ki, Soon
Tek, Yong Lim
Enya, Kong Tang
Bali, Ranaivo-Malançon
author_sort Lian, Tze Lim
building UNIMAS Institutional Repository
collection Online Access
description Most efforts at automatically creating multilingual lexicons require input lexical resources with rich content (e.g. semantic networks, domain codes, semantic categories) or large corpora. Such material is often unavailable and difficult to construct for under-resourced languages. In some cases, particularly for some ethnic languages, even unannotated corpora are still in the process of collection. We show how multilingual lexicons with under-resourced languages can be constructed using simple bilingual translation lists, which are more readily available. The prototype multilingual lexicon developed comprises six member languages: English, Malay, Chinese, French, Thai and Iban, the last of which is an under-resourced language in Borneo. Quick evaluations showed that 91.2% of 500 random multilingual entries in the generated lexicon require minimal or no human correction.
first_indexed 2025-11-15T06:10:36Z
format Article
id unimas-5208
institution Universiti Malaysia Sarawak
institution_category Local University
language English
last_indexed 2025-11-15T06:10:36Z
publishDate 2013
publisher SpringerLink
recordtype eprints
repository_type Digital Repository
spelling unimas-5208 2015-03-23T06:38:01Z http://ir.unimas.my/id/eprint/5208/ Lexicon+TX: Rapid Construction of a Multilingual Lexicon with Under-Resourced Languages Lian, Tze Lim Lay-Ki, Soon Tek, Yong Lim Enya, Kong Tang Bali, Ranaivo-Malançon T Technology (General) Most efforts at automatically creating multilingual lexicons require input lexical resources with rich content (e.g. semantic networks, domain codes, semantic categories) or large corpora. Such material is often unavailable and difficult to construct for under-resourced languages. In some cases, particularly for some ethnic languages, even unannotated corpora are still in the process of collection. We show how multilingual lexicons with under-resourced languages can be constructed using simple bilingual translation lists, which are more readily available. The prototype multilingual lexicon developed comprises six member languages: English, Malay, Chinese, French, Thai and Iban, the last of which is an under-resourced language in Borneo. Quick evaluations showed that 91.2% of 500 random multilingual entries in the generated lexicon require minimal or no human correction. SpringerLink 2013 Article PeerReviewed text en http://ir.unimas.my/id/eprint/5208/1/Lexicon%2BTX%20Rapid%20Construction%20of%20a%20Multilingual%20Lexicon%20%28abstract%29.pdf Lian, Tze Lim and Lay-Ki, Soon and Tek, Yong Lim and Enya, Kong Tang and Bali, Ranaivo-Malançon (2013) Lexicon+TX: Rapid Construction of a Multilingual Lexicon with Under-Resourced Languages. Language Resources and Evaluation, 48 (3). ISSN 1574-0218 http://linkspringer.com/article/10.1007/S10579-013-9253-0
spellingShingle T Technology (General)
Lian, Tze Lim
Lay-Ki, Soon
Tek, Yong Lim
Enya, Kong Tang
Bali, Ranaivo-Malançon
Lexicon+TX: Rapid Construction of a Multilingual Lexicon with Under-Resourced Languages
title Lexicon+TX: Rapid Construction of a Multilingual Lexicon with Under-Resourced Languages
title_full Lexicon+TX: Rapid Construction of a Multilingual Lexicon with Under-Resourced Languages
title_fullStr Lexicon+TX: Rapid Construction of a Multilingual Lexicon with Under-Resourced Languages
title_full_unstemmed Lexicon+TX: Rapid Construction of a Multilingual Lexicon with Under-Resourced Languages
title_short Lexicon+TX: Rapid Construction of a Multilingual Lexicon with Under-Resourced Languages
title_sort lexicon+tx: rapid construction of a multilingual lexicon with under-resourced languages
topic T Technology (General)
url http://ir.unimas.my/id/eprint/5208/
http://ir.unimas.my/id/eprint/5208/1/Lexicon%2BTX%20Rapid%20Construction%20of%20a%20Multilingual%20Lexicon%20%28abstract%29.pdf