Lookup NU author(s): Yang Xian, Dr Yang Sun, Dr Mohsen Naqvi
This is the authors' accepted manuscript of an article that has been published in its final definitive form by IEEE, 2021.
For re-use rights please refer to the publisher's terms and conditions.
Deep neural network-based methods dominate recent developments in single-channel speech enhancement. In this paper, we propose a multi-scale feature recalibration convolutional encoder-decoder with bidirectional gated recurrent unit (BGRU) architecture for end-to-end speech enhancement. More specifically, multi-scale recalibration 2-D convolutional layers are used to extract local and contextual features from the signal. In addition, a gating mechanism is used in the recalibration network to control the information flow among the layers, which enables the scaled features to be weighted in order to retain speech and suppress noise. A fully connected (FC) layer is then employed to compress the output of the multi-scale 2-D convolutional layer with a small number of neurons, thus capturing the global information and improving parameter efficiency. The BGRU layers employ forward and backward GRUs, which contain the reset, update, and output gates, to exploit the interdependency among the past, current, and future frames to improve predictions. The experimental results confirm that the proposed MCGN method outperforms several state-of-the-art methods.
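The bidirectional recurrence described in the abstract can be sketched as follows. This is a minimal pure-Python illustration that assumes the textbook GRU formulation with update and reset gates only; it is not the authors' implementation, and any additional gating in the published model is not represented. The helper names (init_params, gru_step, run_gru, bidirectional_gru), the dimensions, and the random weights are all hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def init_params(input_dim, hidden_dim, rng):
    """Small random weights for one GRU direction (illustrative values only)."""
    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    params = {}
    for gate in ("z", "r", "h"):  # update gate, reset gate, candidate state
        params["W" + gate] = mat(hidden_dim, input_dim)   # input weights
        params["U" + gate] = mat(hidden_dim, hidden_dim)  # recurrent weights
        params["b" + gate] = [0.0] * hidden_dim           # biases
    return params


def gru_step(x, h_prev, p):
    """One step of a standard GRU cell (assumed textbook formulation)."""
    def pre(gate, h):
        a = matvec(p["W" + gate], x)
        b = matvec(p["U" + gate], h)
        return [ai + bi + ci for ai, bi, ci in zip(a, b, p["b" + gate])]

    z = [sigmoid(v) for v in pre("z", h_prev)]       # update gate
    r = [sigmoid(v) for v in pre("r", h_prev)]       # reset gate
    gated = [ri * hi for ri, hi in zip(r, h_prev)]   # reset applied to old state
    cand = [math.tanh(v) for v in pre("h", gated)]   # candidate state
    # Blend the previous state and the candidate according to the update gate.
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h_prev, cand)]


def run_gru(frames, p, hidden_dim):
    """Run one GRU direction over a sequence, returning the state at every frame."""
    h = [0.0] * hidden_dim
    states = []
    for x in frames:
        h = gru_step(x, h, p)
        states.append(h)
    return states


def bidirectional_gru(frames, p_fwd, p_bwd, hidden_dim):
    """Concatenate forward states (past context) with backward states (future context)."""
    fwd = run_gru(frames, p_fwd, hidden_dim)
    bwd = run_gru(frames[::-1], p_bwd, hidden_dim)[::-1]  # re-align to frame order
    return [f + b for f, b in zip(fwd, bwd)]


rng = random.Random(0)
INPUT_DIM, HIDDEN_DIM, NUM_FRAMES = 6, 4, 5  # hypothetical sizes
frames = [[rng.uniform(-1, 1) for _ in range(INPUT_DIM)] for _ in range(NUM_FRAMES)]
p_fwd = init_params(INPUT_DIM, HIDDEN_DIM, rng)
p_bwd = init_params(INPUT_DIM, HIDDEN_DIM, rng)

out = bidirectional_gru(frames, p_fwd, p_bwd, HIDDEN_DIM)
print(len(out), len(out[0]))  # number of frames, 2 * HIDDEN_DIM features per frame
```

In this sketch, each output frame carries the forward state, which summarises the current and earlier frames, alongside the backward state, which summarises the current and later frames. That pairing is what lets a bidirectional layer draw on past, current, and future context when predicting each frame.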
Author(s): Xian Y, Sun Y, Wang W, Naqvi SM
Publication type: Article
Publication status: Published
Journal: IEEE Journal of Selected Topics in Signal Processing
Year: 2021
Volume: 15
Issue: 1
Pages: 143-155
Print publication date: 01/01/2021
Online publication date: 18/12/2020
Acceptance date: 11/12/2020
Date deposited: 15/12/2020
ISSN (print): 1932-4553
ISSN (electronic): 1941-0484
Publisher: IEEE
URL: https://doi.org/10.1109/JSTSP.2020.3045846
DOI: 10.1109/JSTSP.2020.3045846