On the optimum speech segment length for depression detection

Depression is a worldwide problem which, according to the World Health Organization, is the largest contributor to global disability. According to a study, around 18336 Malaysians are suffering from depression. Therefore, an automated system that can detect depression from human speech is needed. Th...

Bibliographic Details
Main Authors: Alghifari, Muhammad Fahreza, Gunawan, Teddy Surya, Wan Nordin, Mimi Aminah, Kartiwi, Mira, Borhan, Lihanna
Format: Proceeding Paper
Language: English
Published: IEEE 2019
Subjects: T Technology (General)
Online Access: http://irep.iium.edu.my/80387/
http://irep.iium.edu.my/80387/1/80387%20On%20the%20Optimum%20Speech%20Segment%20Length.pdf
http://irep.iium.edu.my/80387/2/80387%20On%20the%20Optimum%20Speech%20Segment%20Length%20%20SCOPUS.pdf
_version_ 1848788948067811328
author Alghifari, Muhammad Fahreza
Gunawan, Teddy Surya
Wan Nordin, Mimi Aminah
Kartiwi, Mira
Borhan, Lihanna
author_facet Alghifari, Muhammad Fahreza
Gunawan, Teddy Surya
Wan Nordin, Mimi Aminah
Kartiwi, Mira
Borhan, Lihanna
author_sort Alghifari, Muhammad Fahreza
building IIUM Repository
collection Online Access
description Depression is a worldwide problem which, according to the World Health Organization, is the largest contributor to global disability. According to a study, around 18336 Malaysians are suffering from depression. Therefore, an automated system that can detect depression from human speech is needed. The main objective of this paper is to investigate the optimum speech segment length that provides fast and accurate depression detection. An artificial neural network was used as a classifier to detect depression using a speech feature, i.e. the averaged Mel-frequency cepstral coefficients (MFCC). The Distress Analysis Interview Corpus Wizard of Oz (DAIC-WOZ) was used to train and test the system, which was measured in terms of accuracy and processing time while varying the number of neurons used. The obtained results are further optimized by investigating the ideal segment length for depression detection. Results showed that our proposed system can recognize voiced depression in 3 levels of depression with an accuracy rate of up to 98.3% when given previous samples of the same speaker for training. Furthermore, the optimum speech segment length was found to be 7 seconds when tested over lengths between 1 and 20 seconds.
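The segment-length idea in the description can be illustrated with a minimal sketch. It is not the authors' implementation: the record does not say how the speech was segmented, so non-overlapping segments, the 16 kHz sample rate in the usage note, and the helper names `split_into_segments` and `average_feature_vectors` are all assumptions made for illustration. MFCC extraction itself is not shown; the second helper only averages per-frame feature vectors that are assumed to have been computed elsewhere.

```python
from typing import List, Sequence


def split_into_segments(
    samples: Sequence[float], sample_rate: int, segment_seconds: float
) -> List[Sequence[float]]:
    """Split a signal into consecutive, non-overlapping fixed-length segments.

    Assumption: a trailing partial segment is dropped. The source record
    does not state how the authors handled segment boundaries.
    """
    seg_len = int(round(segment_seconds * sample_rate))
    if seg_len <= 0:
        raise ValueError("segment length must be positive")
    last_start = len(samples) - seg_len
    return [samples[i:i + seg_len] for i in range(0, last_start + 1, seg_len)]


def average_feature_vectors(frames: Sequence[Sequence[float]]) -> List[float]:
    """Average per-frame feature vectors (e.g. MFCCs) into one vector.

    Each element of `frames` is one frame's coefficients; the result has
    one mean value per coefficient index.
    """
    if not frames:
        raise ValueError("no frames to average")
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]
```

For example, under the assumed 16 kHz sample rate, `split_into_segments(signal, 16000, 7)` would cut a 20-second signal into two full 7-second segments, and each segment's frame-level coefficients could then be passed to `average_feature_vectors` to obtain one feature vector per segment for a classifier.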
first_indexed 2025-11-14T17:48:56Z
format Proceeding Paper
id iium-80387
institution International Islamic University Malaysia
institution_category Local University
language English
last_indexed 2025-11-14T17:48:56Z
publishDate 2019
publisher IEEE
recordtype eprints
repository_type Digital Repository
spelling iium-80387 2020-07-09T06:39:52Z http://irep.iium.edu.my/80387/ On the optimum speech segment length for depression detection Alghifari, Muhammad Fahreza Gunawan, Teddy Surya Wan Nordin, Mimi Aminah Kartiwi, Mira Borhan, Lihanna T Technology (General) Depression is a worldwide problem which, according to the World Health Organization, is the largest contributor to global disability. According to a study, around 18336 Malaysians are suffering from depression. Therefore, an automated system that can detect depression from human speech is needed. The main objective of this paper is to investigate the optimum speech segment length that provides fast and accurate depression detection. An artificial neural network was used as a classifier to detect depression using a speech feature, i.e. the averaged Mel-frequency cepstral coefficients (MFCC). The Distress Analysis Interview Corpus Wizard of Oz (DAIC-WOZ) was used to train and test the system, which was measured in terms of accuracy and processing time while varying the number of neurons used. The obtained results are further optimized by investigating the ideal segment length for depression detection. Results showed that our proposed system can recognize voiced depression in 3 levels of depression with an accuracy rate of up to 98.3% when given previous samples of the same speaker for training. Furthermore, the optimum speech segment length was found to be 7 seconds when tested over lengths between 1 and 20 seconds. IEEE 2019 Proceeding Paper PeerReviewed application/pdf en http://irep.iium.edu.my/80387/1/80387%20On%20the%20Optimum%20Speech%20Segment%20Length.pdf application/pdf en http://irep.iium.edu.my/80387/2/80387%20On%20the%20Optimum%20Speech%20Segment%20Length%20%20SCOPUS.pdf Alghifari, Muhammad Fahreza and Gunawan, Teddy Surya and Wan Nordin, Mimi Aminah and Kartiwi, Mira and Borhan, Lihanna (2019) On the optimum speech segment length for depression detection. 
In: 2019 IEEE 6th International Conference on Smart Instrumentation, Measurement and Applications (ICSIMA 2019), 27 - 29 Aug 2019, Kuala Lumpur, Malaysia. https://ieeexplore.ieee.org/document/9057319 10.1109/ICSIMA47653.2019.9057319
spellingShingle T Technology (General)
Alghifari, Muhammad Fahreza
Gunawan, Teddy Surya
Wan Nordin, Mimi Aminah
Kartiwi, Mira
Borhan, Lihanna
On the optimum speech segment length for depression detection
title On the optimum speech segment length for depression detection
title_full On the optimum speech segment length for depression detection
title_fullStr On the optimum speech segment length for depression detection
title_full_unstemmed On the optimum speech segment length for depression detection
title_short On the optimum speech segment length for depression detection
title_sort on the optimum speech segment length for depression detection
topic T Technology (General)
url http://irep.iium.edu.my/80387/
http://irep.iium.edu.my/80387/1/80387%20On%20the%20Optimum%20Speech%20Segment%20Length.pdf
http://irep.iium.edu.my/80387/2/80387%20On%20the%20Optimum%20Speech%20Segment%20Length%20%20SCOPUS.pdf