Context-dependent multilingual lexical lookup for under-resourced languages

Bibliographic Details
Main Authors: Lian, Tze Lim; Enya, Kong Tang; Lay-Ki, Soon; Tek, Yong Lim; Ranaivo-Malançon, Bali
Format: Proceeding
Language: English
Published: 2013
Online Access: http://ir.unimas.my/id/eprint/16527/
http://ir.unimas.my/id/eprint/16527/1/Context-dependent%20multilingual%20lexical%20lookup%20for%20under-resourced%20languages%20%28abstrak%29.pdf
Description
Summary: Current approaches for word sense disambiguation and translation selection typically require lexical resources or large bilingual corpora with rich information fields and annotations, which are often infeasible for under-resourced languages. We extract translation context knowledge from bilingual comparable corpora of a richer-resourced language pair, and inject it into a multilingual lexicon. The multilingual lexicon can then be used to perform context-dependent lexical lookup on texts of any language, including under-resourced ones. Evaluations on a prototype lookup tool, trained on an English-Malay bilingual Wikipedia corpus, show a precision score of 0.65 (baseline 0.55) and a mean reciprocal rank score of 0.81 (baseline 0.771). Given these encouraging early results, the context-dependent lexical lookup tool may be developed further into an intelligent reading aid that helps users grasp the gist of a text in a second or foreign language.
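
For readers unfamiliar with the two scores quoted in the summary, the following is a minimal sketch of how precision and mean reciprocal rank are commonly computed for a ranked lookup task. It is not the authors' evaluation code. It assumes that "precision" means the fraction of lookups whose top-ranked translation is correct, which the summary does not state, and the function names and the small English-Malay example data are hypothetical.

```python
from typing import Sequence


def reciprocal_rank(ranked: Sequence[str], gold: str) -> float:
    """Return 1/rank of the gold translation in the ranked list, or 0.0 if absent."""
    for position, candidate in enumerate(ranked, start=1):
        if candidate == gold:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(rankings: Sequence[Sequence[str]], golds: Sequence[str]) -> float:
    """Average the reciprocal rank over all lookups."""
    if len(rankings) != len(golds):
        raise ValueError("rankings and golds must have the same length")
    if not rankings:
        return 0.0
    return sum(reciprocal_rank(r, g) for r, g in zip(rankings, golds)) / len(rankings)


def precision_at_1(rankings: Sequence[Sequence[str]], golds: Sequence[str]) -> float:
    """Fraction of lookups whose top-ranked candidate is the gold translation
    (an assumed reading of 'precision'; the summary does not define it)."""
    if len(rankings) != len(golds):
        raise ValueError("rankings and golds must have the same length")
    if not rankings:
        return 0.0
    hits = sum(1 for r, g in zip(rankings, golds) if r and r[0] == g)
    return hits / len(rankings)


# Hypothetical ranked Malay candidates returned for three English words in context.
example_rankings = [
    ["bank", "tebing"],              # gold at rank 1
    ["tebing", "bank"],              # gold at rank 2
    ["pokok", "loji", "tanaman"],    # gold at rank 3
]
example_golds = ["bank", "bank", "tanaman"]

example_mrr = mean_reciprocal_rank(example_rankings, example_golds)
example_p1 = precision_at_1(example_rankings, example_golds)
```

Under this reading, mean reciprocal rank gives partial credit when the correct translation appears below the first position, which is consistent with the reported mean reciprocal rank (0.81) being higher than the reported precision (0.65).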