On the optimum speech segment length for depression detection
Depression is a worldwide problem and, according to the World Health Organization, the largest contributor to global disability. According to a study, around 18336 Malaysians are suffering from depression. Therefore, an automated system that can detect depression from human speech is needed. Th...
| Main Authors: | Alghifari, Muhammad Fahreza; Gunawan, Teddy Surya; Wan Nordin, Mimi Aminah; Kartiwi, Mira; Borhan, Lihanna |
|---|---|
| Format: | Proceeding Paper |
| Language: | English |
| Published: | IEEE 2019 |
| Subjects: | T Technology (General) |
| Online Access: | http://irep.iium.edu.my/80387/ http://irep.iium.edu.my/80387/1/80387%20On%20the%20Optimum%20Speech%20Segment%20Length.pdf http://irep.iium.edu.my/80387/2/80387%20On%20the%20Optimum%20Speech%20Segment%20Length%20%20SCOPUS.pdf |
| _version_ | 1848788948067811328 |
|---|---|
| author | Alghifari, Muhammad Fahreza; Gunawan, Teddy Surya; Wan Nordin, Mimi Aminah; Kartiwi, Mira; Borhan, Lihanna |
| author_facet | Alghifari, Muhammad Fahreza; Gunawan, Teddy Surya; Wan Nordin, Mimi Aminah; Kartiwi, Mira; Borhan, Lihanna |
| author_sort | Alghifari, Muhammad Fahreza |
| building | IIUM Repository |
| collection | Online Access |
| description | Depression is a worldwide problem and, according to the World Health Organization, the largest contributor to global disability. According to a study, around 18336 Malaysians are suffering from depression. Therefore, an automated system that can detect depression from human speech is needed. The main objective of this paper is to investigate the optimum speech segment length that provides fast and accurate depression detection. An artificial neural network was used as a classifier to detect depression using a single speech feature, namely the averaged Mel-frequency cepstral coefficients (MFCC). The Distress Analysis Interview Corpus Wizard of Oz (DAIC-WOZ) was used to train and test the system, which was evaluated in terms of accuracy and processing time while varying the number of neurons. The results were then further optimized by investigating the ideal segment length for depression detection. Results showed that the proposed system can recognize depression from speech across 3 levels of depression with an accuracy of up to 98.3% when given previous samples of the same speaker for training. Furthermore, the optimum speech segment length was found to be 7 seconds when tested over lengths from 1 to 20 seconds. |
| first_indexed | 2025-11-14T17:48:56Z |
| format | Proceeding Paper |
| id | iium-80387 |
| institution | International Islamic University Malaysia |
| institution_category | Local University |
| language | English |
| last_indexed | 2025-11-14T17:48:56Z |
| publishDate | 2019 |
| publisher | IEEE |
| recordtype | eprints |
| repository_type | Digital Repository |
| spelling | iium-80387 2020-07-09T06:39:52Z http://irep.iium.edu.my/80387/ On the optimum speech segment length for depression detection. Alghifari, Muhammad Fahreza; Gunawan, Teddy Surya; Wan Nordin, Mimi Aminah; Kartiwi, Mira; Borhan, Lihanna. T Technology (General). IEEE 2019. Proceeding Paper, PeerReviewed. application/pdf en http://irep.iium.edu.my/80387/1/80387%20On%20the%20Optimum%20Speech%20Segment%20Length.pdf application/pdf en http://irep.iium.edu.my/80387/2/80387%20On%20the%20Optimum%20Speech%20Segment%20Length%20%20SCOPUS.pdf Alghifari, Muhammad Fahreza and Gunawan, Teddy Surya and Wan Nordin, Mimi Aminah and Kartiwi, Mira and Borhan, Lihanna (2019) On the optimum speech segment length for depression detection. In: 2019 IEEE 6th International Conference on Smart Instrumentation, Measurement and Applications (ICSIMA 2019), 27 - 29 Aug 2019, Kuala Lumpur, Malaysia. https://ieeexplore.ieee.org/document/9057319 doi:10.1109/ICSIMA47653.2019.9057319 |
| spellingShingle | T Technology (General); Alghifari, Muhammad Fahreza; Gunawan, Teddy Surya; Wan Nordin, Mimi Aminah; Kartiwi, Mira; Borhan, Lihanna; On the optimum speech segment length for depression detection |
| title | On the optimum speech segment length for depression detection |
| topic | T Technology (General) |
| url | http://irep.iium.edu.my/80387/ http://irep.iium.edu.my/80387/1/80387%20On%20the%20Optimum%20Speech%20Segment%20Length.pdf http://irep.iium.edu.my/80387/2/80387%20On%20the%20Optimum%20Speech%20Segment%20Length%20%20SCOPUS.pdf |
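
The description above mentions averaging MFCCs over speech segments of a chosen length (1 to 20 seconds). As a rough illustration only, and not the authors' implementation, the sketch below shows one way such segment-level averaging could work. It assumes frame-level feature vectors (for example MFCCs) have already been extracted by some other tool at a known frame rate. The function names, the non-overlapping segmentation, and the decision to drop a trailing partial segment are all assumptions made for this example.

```python
from statistics import fmean
from typing import List, Sequence

Frame = Sequence[float]  # one feature vector per analysis frame (e.g. MFCCs)


def split_into_segments(frames: Sequence[Frame],
                        frames_per_segment: int) -> List[Sequence[Frame]]:
    """Group consecutive frames into non-overlapping, equal-length segments.

    Assumption for this sketch: a trailing segment shorter than
    frames_per_segment is dropped rather than padded.
    """
    if frames_per_segment <= 0:
        raise ValueError("frames_per_segment must be positive")
    n_full = len(frames) // frames_per_segment
    return [frames[i * frames_per_segment:(i + 1) * frames_per_segment]
            for i in range(n_full)]


def average_features(segment: Sequence[Frame]) -> List[float]:
    """Average each coefficient across all frames of one segment."""
    return [fmean(column) for column in zip(*segment)]


def segment_level_features(frames: Sequence[Frame],
                           frame_rate_hz: float,
                           segment_seconds: float) -> List[List[float]]:
    """Return one averaged feature vector per fixed-length segment.

    frame_rate_hz is the number of feature frames per second
    (hypothetical parameter; it depends on the feature extractor).
    """
    frames_per_segment = int(round(frame_rate_hz * segment_seconds))
    return [average_features(segment)
            for segment in split_into_segments(frames, frames_per_segment)]
```

Calling `segment_level_features(frames, frame_rate_hz, segment_seconds)` for each candidate `segment_seconds` from 1 to 20 would yield the per-segment vectors that a classifier could then be trained and timed on, which is the kind of sweep the description refers to.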