Lookup NU author(s): Dr Shuanglin Li, Dr Mohsen Naqvi
This work is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0).
In affective computing, speech has been identified as a promising biomarker for assessing depression and attention deficit hyperactivity disorder (ADHD). These disorders manifest as speech abnormalities that span various frequency bands and vary over time. Most existing speech-based approaches rely on the magnitude spectrogram, which discards phase information, and do not consider how different frequency bands affect depression and ADHD detection. Motivated by these limitations, we propose a novel multi-scale complex feature refinement and dynamic convolution attention-aware network to enhance speech-based assessment of depression and ADHD. Our approach incorporates three key components: a multi-scale complex feature refinement (MSFR) module, a dynamic convolutional neural network (Dy-CNN) module, and a dual-attention feature enhancement (DAFE) module. The MSFR module uses depth-wise convolutional networks to process both magnitude and phase inputs, selectively emphasizing frequency bands associated with depression and ADHD. The Dy-CNN module employs an attention mechanism to automatically generate multiple convolution kernels that adapt to the input features and capture temporal dynamics linked to depression and ADHD. The DAFE module enhances feature representation and detection performance through channel shuffle attention (CSA) and spatial axial attention (SAA) mechanisms, which leverage both inter- and intra-channel relationships and examine the time-frequency characteristics of the feature map. Extensive experiments on four publicly available datasets, namely AVEC2013, AVEC2014, E-DAIC, and a self-collected authentic ADHD dataset, demonstrate that the proposed method outperforms previous approaches and generalizes better across language settings (English and German) for speech-based depression and ADHD assessment.
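The abstract's point about the magnitude spectrogram discarding phase can be illustrated with a minimal sketch. It is not the authors' MSFR pipeline: the function names, frame length, hop size, and the two-tone test signal are all assumptions chosen for illustration, and the transform is a naive standard-library DFT where a real system would use an FFT library. It shows only that a complex spectrogram splits into magnitude and phase, and that both are needed to recover it.

```python
import cmath
import math


def dft_frame(frame):
    """Naive one-sided DFT of a real-valued frame (O(n^2), illustration only)."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))
        for k in range(n // 2 + 1)
    ]


def stft(signal, frame_len=64, hop=32):
    """Hann-windowed short-time Fourier transform: a list of complex frames."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame_len)
              for i in range(frame_len)]
    spec = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        spec.append(dft_frame(frame))
    return spec


def magnitude_and_phase(spec):
    """Split a complex spectrogram into magnitude and phase (radians)."""
    mag = [[abs(c) for c in row] for row in spec]
    phase = [[cmath.phase(c) for c in row] for row in spec]
    return mag, phase


# Hypothetical test signal (not from the paper): two tones, one phase-shifted.
sample_rate = 8000
signal = [
    math.sin(2 * math.pi * 500 * n / sample_rate)
    + 0.5 * math.sin(2 * math.pi * 1500 * n / sample_rate + 0.3)
    for n in range(512)
]

spec = stft(signal)
mag, phase = magnitude_and_phase(spec)

# Magnitude and phase together reproduce the complex spectrogram;
# magnitude alone has lost the argument of every time-frequency bin.
rebuilt = [
    [m * cmath.exp(1j * p) for m, p in zip(mag_row, phase_row)]
    for mag_row, phase_row in zip(mag, phase)
]
max_err = max(
    abs(a - b)
    for rebuilt_row, spec_row in zip(rebuilt, spec)
    for a, b in zip(rebuilt_row, spec_row)
)
```

A model that takes both `mag` and `phase` as input channels therefore sees information that a magnitude-only front end cannot provide. How the published MSFR module weights individual frequency bands is not specified in this record.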
Author(s): Li S, Song S, Naqvi SM
Publication type: Article
Publication status: Published
Journal: IEEE Transactions on Affective Computing
Year: 2025
Pages: epub ahead of print
Online publication date: 02/09/2025
Acceptance date: 01/09/2025
Date deposited: 08/09/2025
ISSN (print): 1949-3045
Publisher: IEEE
URL: https://doi.org/10.1109/TAFFC.2025.3604562
DOI: 10.1109/TAFFC.2025.3604562
ePrints DOI: 10.57711/8sb9-vm60