The Effectiveness of DTW-FF Coefficients and Pitch Feature in NN Speech Recognition

This paper presents a method for extracting speech features from the dynamic time warping (DTW) path, which was originally derived from linear predictive coding (LPC). For recognition, the extracted features serve as inputs to a back-propagation neural network. The new method p...

Bibliographic Details
Main Authors: Sudirman, Rubita, Salleh, Sh-Hussain, Salleh, Shaharuddin
Format: Conference or Workshop Item
Language: English
Published: 2006
Subjects: TK Electrical engineering. Electronics Nuclear engineering
Online Access: http://eprints.utm.my/1971/
http://eprints.utm.my/1971/1/rubita06_Effectiveness_of_DTWFF_coefficients.pdf
_version_ 1848890256818962432
author Sudirman, Rubita
Salleh, Sh-Hussain
Salleh, Shaharuddin
building UTeM Institutional Repository
collection Online Access
description This paper presents a method for extracting speech features from the dynamic time warping (DTW) path, which was originally derived from linear predictive coding (LPC). For recognition, the extracted features serve as inputs to a back-propagation neural network. The novelty of the method lies in how the features are extracted and how the coefficients are normalized against the template pattern according to the selected average number of frames over the collected samples. The method is motivated by a limitation of neural networks (NN): a fixed number of input nodes is needed for every input class, especially in applications with multiple inputs. The main objective of this research is therefore to find an alternative method that reduces the amount of computation and complexity in a neural network, in this case for speech recognition. One way to achieve this is to reduce the number of inputs to the network. This is done through the dynamic warping process, in which the local distance scores along the warping path are used instead of the global distance score. According to the literature review, past and most current research uses the global distance score or LPC coefficients as input to the neural network. LPC coefficients present the network with a large number of coefficients in each speech frame.
first_indexed 2025-11-15T20:39:11Z
format Conference or Workshop Item
id utm-1971
institution Universiti Teknologi Malaysia
institution_category Local University
language English
last_indexed 2025-11-15T20:39:11Z
publishDate 2006
recordtype eprints
repository_type Digital Repository
spelling utm-1971 2017-08-30T04:15:25Z http://eprints.utm.my/1971/ The Effectiveness of DTW-FF Coefficients and Pitch Feature in NN Speech Recognition Sudirman, Rubita Salleh, Sh-Hussain Salleh, Shaharuddin TK Electrical engineering. Electronics Nuclear engineering 2006 Conference or Workshop Item PeerReviewed application/pdf en http://eprints.utm.my/1971/1/rubita06_Effectiveness_of_DTWFF_coefficients.pdf Sudirman, Rubita and Salleh, Sh-Hussain and Salleh, Shaharuddin (2006) The Effectiveness of DTW-FF Coefficients and Pitch Feature in NN Speech Recognition.
In: Proceeding of the International Conference on Artificial Intelligence, Engineering and Technology, 22-24 November 2006, Kota Kinabalu, Sabah.
title The Effectiveness of DTW-FF Coefficients and Pitch Feature in NN Speech Recognition
topic TK Electrical engineering. Electronics Nuclear engineering
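
The abstract's central idea, feeding a neural network the local distance scores along the DTW warping path rather than the single global score, can be illustrated with a minimal sketch. This is not the authors' implementation: the Euclidean local distance, the three-way step pattern, and the per-template-frame averaging used to obtain a fixed-length vector are all assumptions made for illustration, and the function names are hypothetical.

```python
import math

def local_distance(a, b):
    """Euclidean distance between two feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw_path(template, sample):
    """Return (global_score, path); path is a list of (i, j) frame pairs."""
    n, m = len(template), len(sample)
    inf = float("inf")
    # cost[i][j]: minimal accumulated distance aligning template[:i+1] with sample[:j+1]
    cost = [[inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = local_distance(template[i], sample[j])
            if i == 0 and j == 0:
                cost[i][j] = d
            else:
                best = min(
                    cost[i - 1][j] if i > 0 else inf,
                    cost[i][j - 1] if j > 0 else inf,
                    cost[i - 1][j - 1] if i > 0 and j > 0 else inf,
                )
                cost[i][j] = d + best
    # Backtrack from the last frame pair to the first one
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((cost[i - 1][j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((cost[i - 1][j], i - 1, j))
        if j > 0:
            candidates.append((cost[i][j - 1], i, j - 1))
        _, i, j = min(candidates)
        path.append((i, j))
    path.reverse()
    return cost[n - 1][m - 1], path

def fixed_length_local_scores(template, sample):
    """One averaged local distance per template frame (assumed normalization)."""
    _, path = dtw_path(template, sample)
    sums = [0.0] * len(template)
    counts = [0] * len(template)
    for i, j in path:
        sums[i] += local_distance(template[i], sample[j])
        counts[i] += 1
    # Every template frame appears on the path at least once, so counts are nonzero
    return [s / c for s, c in zip(sums, counts)]
```

Whatever the length of the input utterance, `fixed_length_local_scores` returns as many values as the template has frames, which is what lets a network with a fixed number of input nodes accept utterances of varying duration.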