Lookup NU author(s): Ting Zhu
Full text for this publication is not currently held within this repository. Alternative links are provided below where available.
© 2026 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
The speech of patients with dysarthria is often accompanied by symptoms such as involuntary pauses and reduced articulatory coherence, which are crucial for disease assessment. Existing pathological speech classification models have two limitations. First, they rely on feature sets derived from normal controls and neglect disfluency-related features, so they fail to capture sufficient pathological information. Second, their network architectures do not adequately address vanishing or exploding gradients during deep training, and they lack the capability to explore channel and spatial features in depth, which in turn reduces classification accuracy. To overcome these challenges, this paper proposes a dual-branch residual pathological speech classification network with speech fluency feature compensation. Pause and coherence features are extracted from the THE-POSSD dataset and integrated with MFCC and formant features to construct a comprehensive feature set. In the network architecture, wideband and narrowband spectrograms serve as dual inputs, and an adaptive feature extraction residual block with skip connections is employed to address gradient-related issues and extract deeper features. The dual-branch features are then fused by a complementary fusion module, which is weighted and optimized jointly with the multi-feature set to enhance recognition performance. Experimental results show that the proposed model achieves an accuracy of 96.21%, a 2.5% improvement over the baseline, while precision, recall, and F1 score increase by 4.94%, 4.99%, and 5.07%, respectively. These findings validate the model's effectiveness and robustness and support its use as a reliable tool for the clinical auxiliary diagnosis of speech disorders.
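The dual spectrogram inputs mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `hann` and `spectrogram`, the synthetic two-tone signal, the 8 kHz sampling rate, and the 5 ms and 25 ms window lengths are all assumptions chosen for illustration. The only point it shows is the standard trade-off: a short analysis window gives a wideband spectrogram (fine time resolution, coarse frequency bins), and a long window gives a narrowband one (coarse time resolution, fine frequency bins).

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def spectrogram(signal, fs, win_ms, hop_ms):
    """Magnitude spectrogram via a naive DFT (illustrative, not optimized).

    Returns (frames, bin_hz): one list of magnitudes per frame, and the
    spacing between frequency bins in Hz.
    """
    n = int(round(fs * win_ms / 1000))    # window length in samples
    hop = int(round(fs * hop_ms / 1000))  # frame shift in samples
    window = hann(n)
    frames = []
    for start in range(0, len(signal) - n + 1, hop):
        seg = [signal[start + i] * window[i] for i in range(n)]
        mags = []
        for k in range(n // 2 + 1):       # non-negative frequencies only
            acc = sum(seg[i] * cmath.exp(-2j * math.pi * k * i / n)
                      for i in range(n))
            mags.append(abs(acc))
        frames.append(mags)
    return frames, fs / n


# Hypothetical test signal: 0.1 s of a 200 Hz tone plus a weaker 400 Hz tone.
fs = 8000
signal = [math.sin(2 * math.pi * 200 * t / fs)
          + 0.5 * math.sin(2 * math.pi * 400 * t / fs)
          for t in range(800)]

# Assumed window lengths: 5 ms (wideband) and 25 ms (narrowband).
wide, wide_hz = spectrogram(signal, fs, win_ms=5, hop_ms=2.5)
narrow, narrow_hz = spectrogram(signal, fs, win_ms=25, hop_ms=2.5)

print(len(wide), len(wide[0]), wide_hz)        # frames, bins, Hz per bin
print(len(narrow), len(narrow[0]), narrow_hz)
```

With these assumed settings the wideband bins are 200 Hz apart, so the two tones fall in adjacent bins, whereas the narrowband bins are 40 Hz apart and separate them clearly. In practice an FFT-based routine from a signal-processing library would replace the naive DFT used here.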
Author(s): Duan S, Cheng Y, Qin Z, Zhu T, Li F, Liang Y, Liang H, Zhang W
Publication type: Article
Publication status: Published
Journal: Speech Communication
Year: 2026
Volume: 182
Print publication date: 01/07/2026
Online publication date: 20/05/2026
Acceptance date: 18/05/2026
ISSN (print): 0167-6393
ISSN (electronic): 1872-7182
Publisher: Elsevier B.V.
URL: https://doi.org/10.1016/j.specom.2026.103417
DOI: 10.1016/j.specom.2026.103417