Speaker discriminability for visual speech modes

Does speech mode affect recognizing people from their visual speech? We examined 3D motion data from 4 talkers saying 10 sentences (twice each). Speech was produced in noise, in quiet, or whispered. Principal Component Analyses (PCAs) were conducted and speaker classification was determined by Linear Discriminant Analysis (LDA). The first five PCs for the rigid motion and the first 10 PCs each for the non-rigid motion and the combined motion were input to a series of LDAs covering all possible combinations of the retained PCs. The discriminant functions and classification coefficients were determined on the training data and used to predict the talker of the test data. Classification performance for both the in-noise and whispered speech modes was superior to the in-quiet mode. This superiority held even when only the first PC (jaw motion) was used, i.e., measures of jaw motion when speaking in noise or whispering hold promise for bimodal person recognition or verification.

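As a rough illustration of the pipeline summarized above (and not the authors' implementation), the following Python sketch reduces stand-in motion features with PCA and classifies talkers with LDA via scikit-learn. The synthetic data, the feature dimensionality, and the choice of 10 retained PCs are assumptions made only for this example.

    # Illustrative sketch only: PCA feature reduction followed by LDA talker
    # classification, using synthetic data in place of the paper's 3D motion data.
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    rng = np.random.default_rng(0)
    n_talkers, n_sentences, n_features = 4, 10, 60  # feature length is hypothetical

    # Each talker gets a fixed offset so the classes are separable.
    talker_offsets = rng.normal(scale=2.0, size=(n_talkers, n_features))

    def sample_block():
        # One repetition of every sentence by every talker (rows = utterances).
        return np.vstack([talker_offsets[t] + rng.normal(size=(n_sentences, n_features))
                          for t in range(n_talkers)])

    # Two repetitions per sentence: one block for training, one for testing.
    X_train, X_test = sample_block(), sample_block()
    y = np.repeat(np.arange(n_talkers), n_sentences)

    # Reduce the motion features to a small number of PCs (here 10, loosely
    # mirroring the 10 PCs retained for non-rigid/combined motion in the paper).
    pca = PCA(n_components=10).fit(X_train)
    Z_train, Z_test = pca.transform(X_train), pca.transform(X_test)

    # Fit the discriminant functions on the training data, then classify the test data.
    lda = LinearDiscriminantAnalysis().fit(Z_train, y)
    print("talker classification accuracy:", lda.score(Z_test, y))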

Bibliographic Details
Main Authors: Kim, J., Davis, C., Kroos, Christian, Hill, H.
Format: Conference Paper
Published: ISCA 2009
Subjects: Visual speech; speech modes; speaker recognition
Online Access:http://www.isca-speech.org/archive/archive_papers/interspeech_2009/papers/i09_2259.pdf
http://hdl.handle.net/20.500.11937/4521