Semi-supervised G2P Bootstrapping and Its Application to ASR for a Very Under-resourced Language: Iban

This paper describes our experiments and results on using Malay, a locally dominant language in Malaysia, to bootstrap automatic speech recognition (ASR) for a very under-resourced language: Iban, which is also spoken in Malaysia, on the island of Borneo. Resources in Iban for building a speech recognition...

Bibliographic Details
Main Authors: Juan, Sarah Samson; Besacier, Laurent; Rossato, Solange
Format: Proceeding
Language: English
Published: 2014
Subjects: Q Science (General); QA75 Electronic computers. Computer science
Online Access: http://ir.unimas.my/id/eprint/8879/
http://ir.unimas.my/id/eprint/8879/1/sltu2014_sarah.pdf
building UNIMAS Institutional Repository
collection Online Access
description This paper describes our experiments and results on using Malay, a locally dominant language in Malaysia, to bootstrap automatic speech recognition (ASR) for a very under-resourced language: Iban, which is also spoken in Malaysia, on the island of Borneo. Resources for building an Iban speech recognition system were nonexistent, so we took advantage of a language from the same family that shares several similarities with Iban. First, to obtain a pronunciation dictionary, we proposed a bootstrapping strategy to develop an Iban pronunciation lexicon from a Malay one. A hybrid version, mixing Malay and Iban pronunciations, was also built and evaluated. We then experimented with three Iban ASR systems, each relying on one of the three pronunciation dictionaries: Malay, Iban, or hybrid.
id unimas-8879
institution Universiti Malaysia Sarawak
institution_category Local University
recordtype eprints
repository_type Digital Repository
citation Juan, Sarah Samson and Besacier, Laurent and Rossato, Solange (2014) Semi-supervised G2P Bootstrapping and Its Application to ASR for a Very Under-resourced Language: Iban. In: Proceedings of Workshop for Spoken Language Technology for Under-resourced Languages (SLTU), St. Petersburg, Russia. Published 2014-05. Peer reviewed. Record last modified 2015-10-16T01:20:30Z.
topic Q Science (General)
QA75 Electronic computers. Computer science