Human behaviour-based automatic depression analysis using hand-crafted statistics and deep learned spectral features

Depression is a serious mental disorder that affects millions of people all over the world. Traditional clinical diagnosis methods are subjective, complicated and need extensive participation of experts. Audio-visual automatic depression analysis systems predominantly base their predictions on very...


Bibliographic Details
Main Authors: Song, Siyang, Shen, Linlin, Valstar, Michel F.
Format: Conference or Workshop Item
Published: 2018
Online Access: https://eprints.nottingham.ac.uk/51476/
building Nottingham Research Data Repository
collection Online Access
description Depression is a serious mental disorder that affects millions of people worldwide. Traditional clinical diagnosis methods are subjective, complicated and require extensive expert involvement. Audio-visual automatic depression analysis systems predominantly base their predictions on very brief sequential segments, sometimes as short as a single frame. Such data contains much redundant information, causes a high computational load, and negatively affects detection accuracy. Final decision making at the sequence level is then based on the fusion of frame-level or segment-level predictions. However, this approach loses longer-term behavioural correlations, as the behaviours themselves are abstracted away by the frame-level predictions. We propose to use automatically detected human behaviour primitives, such as gaze directions and facial action units (AUs), as low-dimensional multi-channel time series data, from which two sequence descriptors are created. The first calculates sequence-level statistics of the behaviour primitives; the second casts the problem as a Convolutional Neural Network problem operating on a spectral representation of the multi-channel behaviour signals. Depression detection (binary classification) and severity estimation (regression) experiments conducted on the AVEC 2016 DAIC-WOZ database show that both methods achieved a significant improvement over the previous state of the art in depression severity estimation.
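The two sequence descriptors named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the record does not say which statistics or which spectral transform the paper uses, so the choice of mean, standard deviation, minimum and maximum, the use of a plain FFT amplitude spectrum, and the function names `sequence_statistics` and `amplitude_spectrum` are all assumptions made for illustration. The input is assumed to be an array of shape (channels, frames), one row per behaviour primitive.

```python
import numpy as np


def sequence_statistics(signals):
    """Summarise a (channels, frames) array as one fixed-length vector.

    Assumption: the statistics chosen here (mean, std, min, max) are
    illustrative only; the record does not list the ones used in the paper.
    """
    signals = np.asarray(signals, dtype=float)
    stats = [
        signals.mean(axis=1),
        signals.std(axis=1),
        signals.min(axis=1),
        signals.max(axis=1),
    ]
    # Result length is 4 * channels, independent of the number of frames.
    return np.concatenate(stats)


def amplitude_spectrum(signals, n_bins):
    """Return the first n_bins FFT amplitude bins of every channel.

    Assumption: a plain real FFT stands in for the paper's spectral
    representation. The output has shape (channels, n_bins) and could be
    fed to a CNN as a small multi-channel "image".
    """
    signals = np.asarray(signals, dtype=float)
    spectrum = np.abs(np.fft.rfft(signals, axis=1))
    if spectrum.shape[1] < n_bins:
        raise ValueError("sequence too short for the requested number of bins")
    return spectrum[:, :n_bins]


# Usage with synthetic data: 3 hypothetical behaviour channels, 200 frames.
rng = np.random.default_rng(0)
primitives = rng.normal(size=(3, 200))
stat_descriptor = sequence_statistics(primitives)          # shape (12,)
spectral_descriptor = amplitude_spectrum(primitives, 32)   # shape (3, 32)
```

Both descriptors have a size that does not depend on the number of frames, which is what lets a whole recording be described at the sequence level rather than frame by frame. Note that in this simplified version the frequency represented by a given bin still depends on the sequence length.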
id nottingham-51476
institution University of Nottingham Malaysia Campus
institution_category Local University
recordtype eprints
repository_type Digital Repository
spelling nottingham-51476 2020-05-04T19:36:57Z https://eprints.nottingham.ac.uk/51476/ 2018-05-17 Conference or Workshop Item PeerReviewed Song, Siyang, Shen, Linlin and Valstar, Michel F. (2018) Human behaviour-based automatic depression analysis using hand-crafted statistics and deep learned spectral features. In: 13th IEEE International Conference on Face and Gesture Recognition (FG 2018), 15-19 May, Xi'an, China. https://ieeexplore.ieee.org/document/8373825/