Spoken Arabic digits recognition using deep learning / AbdulAziz Saleh Mahfoudh Ba Wazir
The dissertation proposes an Arabic digit speech recognition model based on a recurrent neural network. The speech recognition model selects the best speech signal representation by extracting Mel-Frequency Cepstral Coefficients (MFCCs) after the signal has been processed for noise reduction and digits sep...
| Main Author: | Abdul Aziz Saleh, Mahfoudh Ba Wazir |
|---|---|
| Format: | Thesis |
| Published: | 2018 |
| Subjects: | TK Electrical engineering. Electronics Nuclear engineering |
| Online Access: | http://studentsrepo.um.edu.my/9521/ http://studentsrepo.um.edu.my/9521/1/AbdulAziz_Saleh_Mahfoudh_Ba_Wazir.jpg http://studentsrepo.um.edu.my/9521/11/abdulaziz.pdf |

| _version_ | 1848773942251094016 |
|---|---|
| author | Abdul Aziz Saleh, Mahfoudh Ba Wazir |
| building | UM Research Repository |
| collection | Online Access |
| description | The dissertation proposes an Arabic digit speech recognition model based on a recurrent neural network (RNN). The speech recognition model selects the best speech signal representation by extracting Mel-Frequency Cepstral Coefficients (MFCCs) after the signal has been processed for noise reduction and digit separation. The features extracted from each spoken digit are fed into a network with long short-term memory (LSTM) cells. LSTM cells can learn long-term temporal dependencies and mitigate the vanishing gradient problem associated with RNNs. A dataset of 1040 samples of spoken Arabic digits from different dialects is used in this study: 840 samples are used to train the network and the other 200 samples are used for testing. Model training is carried out on a GPU. The learning parameters of the LSTM model are tuned to improve accuracy, reaching 94% during model training. Test results for the best-tuned model show that the LSTM model is 69% accurate in recognizing spoken Arabic digit samples. The model's highest accuracy, 80%, is obtained when recognizing the digit zero. |
| first_indexed | 2025-11-14T13:50:25Z |
| format | Thesis |
| id | um-9521 |
| institution | University Malaya |
| institution_category | Local University |
| last_indexed | 2025-11-14T13:50:25Z |
| publishDate | 2018 |
| recordtype | eprints |
| repository_type | Digital Repository |
| spelling | um-9521; 2020-12-15T00:03:04Z; Spoken Arabic digits recognition using deep learning / AbdulAziz Saleh Mahfoudh Ba Wazir; Abdul Aziz Saleh, Mahfoudh Ba Wazir; TK Electrical engineering. Electronics Nuclear engineering; 2018-09; Thesis; NonPeerReviewed; application/pdf http://studentsrepo.um.edu.my/9521/1/AbdulAziz_Saleh_Mahfoudh_Ba_Wazir.jpg; application/pdf http://studentsrepo.um.edu.my/9521/11/abdulaziz.pdf; Abdul Aziz Saleh, Mahfoudh Ba Wazir (2018) Spoken Arabic digits recognition using deep learning / AbdulAziz Saleh Mahfoudh Ba Wazir. Masters thesis, University of Malaya. http://studentsrepo.um.edu.my/9521/ |
| title | Spoken Arabic digits recognition using deep learning / AbdulAziz Saleh Mahfoudh Ba Wazir |
| topic | TK Electrical engineering. Electronics Nuclear engineering |
| url | http://studentsrepo.um.edu.my/9521/ http://studentsrepo.um.edu.my/9521/1/AbdulAziz_Saleh_Mahfoudh_Ba_Wazir.jpg http://studentsrepo.um.edu.my/9521/11/abdulaziz.pdf |
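
The description says that MFCC features for each spoken digit are fed into a network with LSTM cells. The following is a minimal sketch of what one LSTM forward pass over a sequence of feature frames looks like, not the dissertation's implementation. Everything in it is an assumption made for illustration: the function names, the 13 coefficients per frame, the hidden size of 8, the random weights, and the random numbers standing in for real MFCC frames.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step; each gate acts on the concatenation [h_prev; x]."""
    z = h_prev + x  # list concatenation

    def gate(name, activation):
        W, b = params[name]
        return [activation(a + bias) for a, bias in zip(matvec(W, z), b)]

    i = gate("input", sigmoid)        # how much new information to write
    f = gate("forget", sigmoid)       # how much of the old cell state to keep
    o = gate("output", sigmoid)       # how much of the cell state to expose
    g = gate("candidate", math.tanh)  # proposed new cell content

    # The additive cell-state update is what eases the vanishing gradient.
    c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c_prev, i, g)]
    h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h, c


def init_params(n_in, n_hidden, seed=0):
    """Hypothetical small random weights; a real model would learn these."""
    rng = random.Random(seed)

    def layer():
        W = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden + n_in)]
             for _ in range(n_hidden)]
        return W, [0.0] * n_hidden

    return {name: layer() for name in ("input", "forget", "output", "candidate")}


def run_lstm(frames, params, n_hidden):
    """Feed a sequence of feature frames through the cell; return the last h."""
    h = [0.0] * n_hidden
    c = [0.0] * n_hidden
    for x in frames:
        h, c = lstm_step(x, h, c, params)
    return h


# Assumed sizes: 13 coefficients per frame, 8 hidden units, 20 frames.
N_MFCC, N_HIDDEN = 13, 8
params = init_params(N_MFCC, N_HIDDEN)
rng = random.Random(1)
frames = [[rng.uniform(-1.0, 1.0) for _ in range(N_MFCC)] for _ in range(20)]

final_h = run_lstm(frames, params, N_HIDDEN)
print(len(final_h))
```

In a digit recognizer of this kind, the final hidden state would typically be passed to a 10-way classification layer, one output per digit; that layer and the training loop are omitted here.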