Convolutional aggregation of local evidence for large pose face alignment

Methods for unconstrained face alignment must satisfy two requirements: they must not rely on accurate initialisation/face detection and they should perform equally well for the whole spectrum of facial poses. To the best of our knowledge, there are no methods meeting these requirements to satisfact...

Bibliographic Details
Main Authors: Bulat, Adrian, Tzimiropoulos, Georgios
Format: Conference or Workshop Item
Published: 2016
Online Access: https://eprints.nottingham.ac.uk/37236/
building Nottingham Research Data Repository
collection Online Access
description Methods for unconstrained face alignment must satisfy two requirements: they must not rely on accurate initialisation/face detection, and they should perform equally well across the whole spectrum of facial poses. To the best of our knowledge, no existing method meets these requirements to a satisfactory extent. In this paper, we propose Convolutional Aggregation of Local Evidence (CALE), a Convolutional Neural Network (CNN) architecture specifically designed to address both of them. To remove the requirement for accurate face detection, our system first performs facial part detection, providing confidence scores for the location of each of the facial landmarks (local evidence). Next, these score maps, along with early CNN features, are aggregated by our system through joint regression in order to refine the landmarks’ locations. Besides playing the role of a graphical model, CNN regression is a key feature of our system, guiding the network to rely on context when predicting the location of occluded landmarks, which are typically encountered in very large poses. The whole system is trained end-to-end with intermediate supervision. When applied to AFLW-PIFA, the most challenging human face alignment test set to date, our method provides a gain of more than 50% in localisation accuracy compared to other recently published methods for large pose face alignment. Going beyond human faces, we also demonstrate that CALE is effective in dealing with the very large changes in shape and appearance typically encountered in animal faces.
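The data flow described in the abstract (per-landmark score maps from a detection stage, stacked with early features and passed to a regression stage, with both stages supervised) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the argmax decoding of landmark positions, and the use of mean squared error for both supervision terms are assumptions made only for illustration, and the networks themselves are left out.

```python
from typing import List, Tuple

Channel = List[List[float]]  # one H x W map (a score map or a feature map)


def stack_channels(score_maps: List[Channel],
                   early_features: List[Channel]) -> List[Channel]:
    """Concatenate score maps and early feature maps along the channel axis.

    This stands in for the input that the regression stage would receive.
    """
    height, width = len(score_maps[0]), len(score_maps[0][0])
    for channel in score_maps + early_features:
        if len(channel) != height or any(len(row) != width for row in channel):
            raise ValueError("all channels must share the same spatial size")
    return score_maps + early_features


def decode_landmarks(score_maps: List[Channel]) -> List[Tuple[int, int]]:
    """Return the (row, col) of the highest score in each map.

    Assumption: one map per landmark, decoded by a plain argmax.
    """
    landmarks = []
    for channel in score_maps:
        positions = ((r, c)
                     for r in range(len(channel))
                     for c in range(len(channel[0])))
        landmarks.append(max(positions, key=lambda rc: channel[rc[0]][rc[1]]))
    return landmarks


def mean_squared_error(pred: List[Channel], target: List[Channel]) -> float:
    """Average squared difference over every pixel of every channel."""
    diffs = [(p - t) ** 2
             for pred_ch, target_ch in zip(pred, target)
             for pred_row, target_row in zip(pred_ch, target_ch)
             for p, t in zip(pred_row, target_row)]
    return sum(diffs) / len(diffs)


def intermediate_supervision_loss(detected: List[Channel],
                                  refined: List[Channel],
                                  target: List[Channel]) -> float:
    """Supervise both stages: the detection output and the regression output.

    Assumption: both terms use MSE and are weighted equally.
    """
    return mean_squared_error(detected, target) + mean_squared_error(refined, target)
```

For example, two hypothetical 2 x 2 score maps stacked with one feature map give a three-channel regression input, and `decode_landmarks` returns one `(row, col)` pair per score map.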
id nottingham-37236
institution University of Nottingham Malaysia Campus
institution_category Local University
recordtype eprints
repository_type Digital Repository
citation Bulat, Adrian and Tzimiropoulos, Georgios (2016) Convolutional aggregation of local evidence for large pose face alignment. In: BMVC 2016, 19-22 September 2016, York, U.K. Conference or Workshop Item, peer reviewed, dated 2016-09-19.
http://bmvc2016.cs.york.ac.uk/