Semi-supervised G2P Bootstrapping and Its Application to ASR for a Very Under-resourced Language: Iban
This paper describes our experiments and results on using a locally dominant language in Malaysia (Malay) to bootstrap automatic speech recognition (ASR) for a very under-resourced language: Iban (also spoken in Malaysia, on the island of Borneo). Resources in Iban for building a speech recognition...
| Main Authors: | Juan, Sarah Samson; Besacier, Laurent; Rossato, Solange |
|---|---|
| Format: | Proceeding |
| Language: | English |
| Published: | 2014 |
| Subjects: | Q Science (General); QA75 Electronic computers. Computer science |
| Online Access: | http://ir.unimas.my/id/eprint/8879/ http://ir.unimas.my/id/eprint/8879/1/sltu2014_sarah.pdf |
| _version_ | 1848836463122186240 |
|---|---|
| author | Juan, Sarah Samson; Besacier, Laurent; Rossato, Solange |
| author_sort | Juan, Sarah Samson |
| building | UNIMAS Institutional Repository |
| collection | Online Access |
| description | This paper describes our experiments and results on using a locally dominant language in Malaysia (Malay) to bootstrap automatic speech recognition (ASR) for a very under-resourced language: Iban (also spoken in Malaysia, on the island of Borneo). Resources in Iban for building a speech recognition system were nonexistent. We therefore tried to take advantage of a language from the same family that shares several similarities with Iban. First, to deal with the pronunciation dictionary, we proposed a bootstrapping strategy to develop an Iban pronunciation lexicon from a Malay one. A hybrid version, a mix of Malay and Iban pronunciations, was also built and evaluated. Following this, we experimented with three Iban ASR systems, each relying on one of three pronunciation dictionaries: Malay, Iban, or hybrid. |
| first_indexed | 2025-11-15T06:24:10Z |
| format | Proceeding |
| id | unimas-8879 |
| institution | Universiti Malaysia Sarawak |
| institution_category | Local University |
| language | English |
| last_indexed | 2025-11-15T06:24:10Z |
| publishDate | 2014 |
| recordtype | eprints |
| repository_type | Digital Repository |
| spelling | unimas-8879 2015-10-16T01:20:30Z http://ir.unimas.my/id/eprint/8879/ 2014-05 Proceeding PeerReviewed text en http://ir.unimas.my/id/eprint/8879/1/sltu2014_sarah.pdf Juan, Sarah Samson and Besacier, Laurent and Rossato, Solange (2014) Semi-supervised G2P Bootstrapping and Its Application to ASR for a Very Under-resourced Language: Iban. In: Proceedings of Workshop for Spoken Language Technology for Under-resourced (SLTU), St Petersburg, Russia. |
| title | Semi-supervised G2P Bootstrapping and Its Application to ASR for a Very Under-resourced Language: Iban |
| topic | Q Science (General); QA75 Electronic computers. Computer science |
| url | http://ir.unimas.my/id/eprint/8879/ http://ir.unimas.my/id/eprint/8879/1/sltu2014_sarah.pdf |