Exploring the representation of caricatures, facial motion, and view-invariance in face space.


Bibliographic Details
Main Author: Elson, Ryan
Format: Thesis (University of Nottingham only)
Language:English
Published: 2024
Subjects:
Online Access:https://eprints.nottingham.ac.uk/77728/
_version_ 1848801022541037568
author Elson, Ryan
author_facet Elson, Ryan
author_sort Elson, Ryan
building Nottingham Research Data Repository
collection Online Access
description Faces present a vast array of information, from invariable features such as identity, to variable features such as expression, speech and pose. Humans have an incredible capability for recognising faces (familiar faces at least) and interpreting facial actions, even across changes in view. While there has been an explosion of research into developing artificial neural networks for many aspects of face processing, some of which seem to predict neural responses quite well, the current work focuses on face processing through simpler linear projection spaces. These linear projection spaces are formal instantiations of ‘face space’, built using principal component analysis (PCA). The concept of ‘face space’ (Valentine, 1991) has been a highly influential account of how faces might be represented in the brain. In particular, recent research supports the presence of a face space in the macaque brain in the form of a linear projection space, referred to as ‘axis coding’, in which individual faces can be coded as a linear sum of orthogonal features. Here, these linear projection spaces are used for two streams of investigation. Firstly, we assessed the neurovascular response to hyper-caricatured faces in an fMRI study. Based on the assumption that faces further from average should project more strongly onto components in the linear space, we hypothesised that they should elicit a stronger response. Contrary to our expectations, we found little evidence for this in the fusiform face area (FFA) and face-selective cortex more generally, although the response pattern did become more consistent for caricatured faces in the FFA. We then explored the response to these caricatured faces in cortex typically associated with object processing. Interestingly, both the average response magnitude and response pattern consistency increased for these stimuli as caricaturing increased.
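The axis-coding idea above can be illustrated with a minimal sketch: in a PCA face space, caricaturing amounts to scaling a face's projection away from the average, so its coefficients on the orthogonal components grow. All names and data here are illustrative stand-ins, not taken from the thesis itself.

```python
# Minimal sketch of a PCA 'face space' and hyper-caricaturing, assuming
# faces are vectorised (pixels or landmarks) and stacked as rows of X.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 200))           # 50 hypothetical faces, 200 features

mean_face = X.mean(axis=0)
Xc = X - mean_face                       # centre on the average face

# Rows of Vt are the orthogonal components (axes) of the face space.
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

def project(face, k=10):
    """Coefficients of a face on the first k orthogonal components."""
    return (face - mean_face) @ Vt[:k].T

def caricature(face, alpha=2.0, k=10):
    """Exaggerate a face by scaling its projection away from the average."""
    coeffs = project(face, k)
    return mean_face + alpha * (coeffs @ Vt[:k])

face = X[0]
norm_orig = np.linalg.norm(project(face))
norm_car = np.linalg.norm(project(caricature(face)))
# The caricatured face projects more strongly onto the components
# (here exactly alpha times as strongly), the assumption behind the
# stronger-response hypothesis described above.
```

With `alpha > 1` the projection norm grows by exactly `alpha`, which is the sense in which a caricatured face sits "further from average" along the same axes.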
At present it is unclear whether this response confers some functional benefit for processing caricatured faces, or whether it simply reflects similarities between the low- and mid-level properties of caricatured faces and those of certain objects. If the response is functional, then hyper-caricaturing could pave a route to improving face processing in individuals with prosopagnosia, provided technologies can be developed to automatically caricature faces in real time. The second line of work addressed these linear projection spaces in the context of achieving view-invariance, specifically in the domain of facial motion and expression. How humans create view-invariant representations remains of interest despite much research, yet little work has focused on creating view-invariant representations outside of identity recognition. Likewise, there has been much research into face space and view-invariance separately, yet there is little evidence for how different views may be represented within a face space framework, or how motion might also be incorporated. Automatic face analysis systems mostly deal with pose either by aligning to a canonical frontal view or by using separate view-specific models. There is inconclusive evidence that the brain possesses an internal 3D model for ‘frontalising’ faces; we therefore investigate how changes in view might be processed in a unified multi-view face space based on a few prototypical 2D views. We investigate the functionality and biological plausibility of five identity-specific face spaces, created using PCA, that allow different views to be reconstructed from single-view video inputs of actors speaking. The most promising of these models first builds a separate orthogonal space for each viewpoint.
These reconstructions are then collated and used to build a multi-view space, which can reconstruct motion well across all learned views. This provides initial insight into how a biologically plausible, view-invariant system for facial motion processing might be represented in the brain. Moreover, it has the capacity to improve view transformations in automatic lip-reading software.
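The projection, transformation, reconstruction cascade can be sketched as follows. This is a hedged toy version under stated assumptions: synthetic low-rank data stands in for the thesis's video frames, each viewpoint gets its own PCA space, and the relationship between neighbouring views' coefficients is learned by ordinary least squares (a plausible choice, not necessarily the thesis's exact method). All names are illustrative.

```python
# Toy cascade for cross-view reconstruction: separate PCA space per view,
# a learned linear map between the views' coefficients, then
# projection -> transformation -> reconstruction.
import numpy as np

rng = np.random.default_rng(1)
k = 5

# Synthetic stand-in for paired frames of the same facial motion seen
# from two neighbouring viewpoints A and B (shared low-dim motion).
latent = rng.normal(size=(100, k))        # underlying motion over 100 frames
basis_A = rng.normal(size=(k, 60))
basis_B = rng.normal(size=(k, 60))
frames_A = latent @ basis_A               # frames observed at view A
frames_B = latent @ basis_B               # the same frames at view B

def fit_space(X, k):
    """Separate orthogonal (PCA) space for one viewpoint."""
    mu = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, Vt[:k]

mu_A, comps_A = fit_space(frames_A, k)
mu_B, comps_B = fit_space(frames_B, k)

# Learn the relationship between the two views' components from
# paired training coefficients (ordinary least squares).
coef_A = (frames_A - mu_A) @ comps_A.T
coef_B = (frames_B - mu_B) @ comps_B.T
T, *_ = np.linalg.lstsq(coef_A, coef_B, rcond=None)

def view_A_to_B(frame_A):
    """Cascade: projection -> transformation -> reconstruction."""
    c_a = (frame_A - mu_A) @ comps_A.T    # project into view A's space
    c_b = c_a @ T                         # transform to view B's coefficients
    return mu_B + c_b @ comps_B           # reconstruct in view B's space

pred_B = view_A_to_B(frames_A[0])
```

Because the synthetic frames genuinely share a k-dimensional motion, the learned transform recovers view B's frame from view A's almost exactly; on real video the reconstructions would only be approximate.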
first_indexed 2025-11-14T21:00:51Z
format Thesis (University of Nottingham only)
id nottingham-77728
institution University of Nottingham Malaysia Campus
institution_category Local University
language English
last_indexed 2025-11-14T21:00:51Z
publishDate 2024
recordtype eprints
repository_type Digital Repository
spelling nottingham-777282024-07-23T04:40:18Z https://eprints.nottingham.ac.uk/77728/ Exploring the representation of caricatures, facial motion, and view-invariance in face space. Elson, Ryan 2024-07-23 Thesis (University of Nottingham only) NonPeerReviewed application/pdf en cc_by https://eprints.nottingham.ac.uk/77728/1/Elson_Ryan_10065840_final_submission.pdf Elson, Ryan (2024) Exploring the representation of caricatures, facial motion, and view-invariance in face space. PhD thesis, University of Nottingham. face perception space representations caricatures neuroimaging
spellingShingle face perception
space representations
caricatures
neuroimaging
Elson, Ryan
Exploring the representation of caricatures, facial motion, and view-invariance in face space.
title Exploring the representation of caricatures, facial motion, and view-invariance in face space.
title_full Exploring the representation of caricatures, facial motion, and view-invariance in face space.
title_fullStr Exploring the representation of caricatures, facial motion, and view-invariance in face space.
title_full_unstemmed Exploring the representation of caricatures, facial motion, and view-invariance in face space.
title_short Exploring the representation of caricatures, facial motion, and view-invariance in face space.
title_sort exploring the representation of caricatures, facial motion, and view-invariance in face space.
topic face perception
space representations
caricatures
neuroimaging
url https://eprints.nottingham.ac.uk/77728/