Merging of native and non-native speech for low-resource accented ASR
This paper presents our recent study on low-resource automatic speech recognition (ASR) system with accented speech. We propose multi-accent Subspace Gaussian Mixture Models (SGMM) and accent-specific Deep Neural Networks (DNN) for improving non-native ASR performance. In the SGMM framework, we pres...
| Main Authors: | , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: |
Springer Verlag
2015
|
| Subjects: | |
| Online Access: | http://ir.unimas.my/id/eprint/12098/ http://ir.unimas.my/id/eprint/12098/1/No%2035%20%28abstrak%29.pdf |
| Summary: | This paper presents our recent study on low-resource automatic speech recognition (ASR) system with accented speech. We propose multi-accent Subspace Gaussian Mixture Models (SGMM) and accent-specific Deep Neural Networks (DNN) for improving non-native ASR performance. In the SGMM framework, we present an original language weighting strategy to merge the globally shared parameters of two models based on native and non-native speech espectively. In the DNN framework, a native deep neural net is fine-tuned to non-native speech. Over the non-native baseline, we achieved relative improvement of 15% for multi-accent SGMM and 34% for accent-specific DNN with speaker
adaptation. |
|---|