Speech emotion recognition using feature fusion of TEO and MFCC on multilingual databases

Bibliographic Details
Main Authors: Ahmad Qadri, Syed Asif, Gunawan, Teddy Surya, Kartiwi, Mira, Mansor, Hasmah
Format: Book Chapter
Language: English
Published: Springer 2020
Subjects:
Online Access: http://irep.iium.edu.my/82534/
http://irep.iium.edu.my/82534/1/Paper_182.pdf
http://irep.iium.edu.my/82534/2/Acceptance%20Letter_DrTeddy_IIUM.pdf
Description
Summary: Emotion is considered one of the most critical elements of the speech signal, and the field of Speech Emotion Recognition (SER) came into existence to recognize it. SER has become an area of growing research interest in the last few years. A typical SER system extracts features such as pitch frequency, formant features, energy-related features, and spectral features from speech, then passes them to a classifier that predicts the emotion class. The critical issue for a successful SER system is emotional feature extraction, which can be addressed with different feature extraction techniques. In this paper, along with the Teager Energy Operator (TEO) and Mel Frequency Cepstral Coefficients (MFCC), a novel feature extraction method, a fusion of MFCC and TEO called Teager-MFCC (T-MFCC), is used for the recognition of energy-based emotions. We used three emotion corpora in German, English, and Hindi to develop the multilingual SER system. The energy-based emotions are classified with a Deep Neural Network (DNN). TEO is found to achieve a better recognition rate than MFCC and T-MFCC.
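
The summary names the Teager Energy Operator without giving its form. As a minimal sketch, assuming the commonly used discrete definition psi[n] = x[n]^2 - x[n-1] * x[n+1] (the record does not state which variant the chapter uses), it could be computed as follows. The function name `teager_energy` and the test signal are illustrative and are not taken from the chapter; the MFCC and T-MFCC stages are not shown because the record does not describe how they are combined.

```python
import math
from typing import List, Sequence


def teager_energy(x: Sequence[float]) -> List[float]:
    """Discrete Teager Energy Operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Assumed standard form; returns values for interior samples only
    (length len(x) - 2), since each output needs one neighbour per side.
    """
    if len(x) < 3:
        return []
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# Illustrative input (hypothetical, not from the chapter): a pure sinusoid.
amplitude, omega = 2.0, 0.3
signal = [amplitude * math.cos(omega * n) for n in range(50)]
energy = teager_energy(signal)
```

For a pure sinusoid A cos(omega n), this operator yields the constant A^2 sin^2(omega), so its output depends on both amplitude and frequency. That property is one common motivation for using it on energy-related speech features.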