Robust speaker gender identification using empirical mode decomposition-based cepstral features

Bibliographic Details
Main Authors: Alipoor, Ghasem; Samadi, Ehsan
Format: Article
Language: English
Published: Penerbit Universiti Kebangsaan Malaysia 2018
Online Access: http://journalarticle.ukm.my/17756/
http://journalarticle.ukm.my/17756/1/06.pdf
Description
Summary: Automatic speaker gender identification is a research field with numerous practical applications. However, the problem has not received the attention it deserves, particularly in the presence of environmental noise. In this paper, new and improved mel-frequency cepstral coefficient (MFCC) features based on empirical mode decomposition (EMD) are developed for robust speaker gender identification. In the proposed approach, EMD is employed as a filter bank that decomposes the speech signal into frequency bands. A second variant is also developed in which complete ensemble EMD (CEEMD) replaces EMD. A support vector machine (SVM) with a radial basis function (RBF) kernel is employed for classification. The performance of these methods is examined for gender identification in noise-free environments as well as in the presence of various Gaussian and non-Gaussian noises. Simulation results show that, in noiseless conditions, the improved EMD-based cepstral features achieve the same accuracy as the original MFCCs while using fewer features. In noisy environments, the proposed methods outperform conventional MFCC extraction.
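
The summary does not spell out how the cepstral features are computed from the EMD bands. The following is a minimal sketch of one plausible reading, assuming the usual MFCC recipe (log energy per filter-bank band, followed by a discrete cosine transform) with the mel filter bank replaced by a set of sub-band signals. The function names, the `eps` floor, and the toy input are hypothetical and are not taken from the paper; the EMD/CEEMD decomposition and the SVM classifier are deliberately left out, so `bands` stands in for intrinsic mode functions produced elsewhere.

```python
import math


def log_band_energies(bands, eps=1e-12):
    """Return the log energy of each sub-band signal (one value per band).

    `eps` is an assumed small floor that avoids log(0) for silent bands.
    """
    return [math.log(sum(x * x for x in band) + eps) for band in bands]


def dct_ii(values, n_coeffs):
    """Unnormalised DCT-II of `values`, keeping the first `n_coeffs` terms."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
        for k in range(n_coeffs)
    ]


def band_cepstral_features(bands, n_coeffs):
    """Cepstral-style features: DCT of the per-band log energies."""
    return dct_ii(log_band_energies(bands), n_coeffs)


# Toy stand-in for two sub-bands; in the setting described above these
# would be the components returned by EMD or CEEMD for one speech frame.
toy_bands = [[1.0, 1.0], [2.0, 0.0]]
features = band_cepstral_features(toy_bands, n_coeffs=2)
print(features)
```

Under this reading, the number of features is bounded by the number of bands, which is consistent with the summary's remark that the EMD-based variant uses fewer features than the original MFCCs. The resulting vectors would then be passed to a classifier such as the RBF-kernel SVM named in the summary.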