Comparison of Various Neural Network Language Models in Speech Recognition


Bibliographic Details
Main Authors: Zuo, L., Wan, X., Liu, Jian
Format: Conference Paper
Published: 2016
Online Access: http://hdl.handle.net/20.500.11937/71074
building Curtin Institutional Repository
collection Online Access
description © 2016 IEEE. In recent years, research on language modeling for speech recognition has increasingly focused on the application of neural networks. However, the performance of neural network language models strongly depends on their architecture. Three competing concepts have been developed: first, feed-forward neural networks, which represent an n-gram approach; second, recurrent neural networks, which can learn context dependencies spanning more than a fixed number of predecessor words; and third, long short-term memory (LSTM) neural networks, which can fully exploit long-range correlations in a telephone conversation corpus. In this paper, we compare count models with feed-forward, recurrent, and LSTM neural networks on conversational telephone speech recognition tasks. Furthermore, we put forward a language model estimation method that incorporates information from preceding sentences in the conversation history. We evaluate the models in terms of perplexity and word error rate, experimentally validating the strong correlation between the two quantities, which we find to hold regardless of the type of language model. The experimental results show that the LSTM neural network language model performs best in n-best list rescoring. Compared to first-pass decoding, the average word error rate drops by 4.3% relative when the ten best candidate results are rescored on conversational telephone speech recognition tasks.
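
The abstract describes rescoring the n-best hypotheses from first-pass decoding with an LSTM language model and reporting perplexity alongside word error rate. The sketch below is a minimal, hypothetical illustration of that rescoring step in Python/PyTorch; it is not the paper's implementation, and the vocabulary, model dimensions, acoustic scores, and interpolation weight are invented for illustration only.

# Hypothetical sketch: n-best rescoring with a small LSTM language model.
# Not the authors' code; vocabulary, sizes, scores, and weights are assumptions.
import math
import torch
import torch.nn as nn

class LSTMLanguageModel(nn.Module):
    def __init__(self, vocab_size, embed_dim=32, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.proj = nn.Linear(hidden_dim, vocab_size)

    def forward(self, ids):
        # ids: (batch, time) -> logits over the next word at each position
        out, _ = self.lstm(self.embed(ids))
        return self.proj(out)

def sentence_logprob(model, ids):
    # Sum of log P(w_t | w_<t); ids includes <s> ... </s>.
    with torch.no_grad():
        logits = model(ids[:, :-1])                 # predict tokens 1..T
        logp = torch.log_softmax(logits, dim=-1)
        target = ids[:, 1:]
        return logp.gather(-1, target.unsqueeze(-1)).sum().item()

# Toy vocabulary and n-best list; in practice the hypotheses and acoustic
# scores come from the recognizer's first pass.
vocab = {"<s>": 0, "</s>": 1, "hello": 2, "hollow": 3, "world": 4}
model = LSTMLanguageModel(len(vocab))

nbest = [
    (["<s>", "hello", "world", "</s>"], -120.0),    # (hypothesis, acoustic score)
    (["<s>", "hollow", "world", "</s>"], -118.5),
]

lm_weight = 10.0   # assumed interpolation weight between acoustic and LM scores
rescored = []
for words, am_score in nbest:
    ids = torch.tensor([[vocab[w] for w in words]])
    lm_score = sentence_logprob(model, ids)
    rescored.append((am_score + lm_weight * lm_score, " ".join(words)))

best_score, best_hyp = max(rescored)
print("best hypothesis after rescoring:", best_hyp)

# Perplexity of one hypothesis under the LM, the quantity the paper correlates
# with word error rate: PPL = exp(-logprob / number_of_predicted_words)
ids = torch.tensor([[vocab[w] for w in nbest[0][0]]])
ppl = math.exp(-sentence_logprob(model, ids) / (ids.size(1) - 1))
print("perplexity:", ppl)

In a real setup the language model would first be trained on the conversational telephone corpus, and the rescored n-best lists would come from the first-pass decoder described in the paper.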
format Conference Paper
id curtin-20.500.11937-71074
institution Curtin University Malaysia
institution_category Local University
publishDate 2016
recordtype eprints
repository_type Digital Repository
doi 10.1109/ICISCE.2016.195
access restricted