Underdetermined source separation using time-frequency masks and an adaptive combined Gaussian-Student's t probabilistic model

Lookup NU author(s): Dr Yang Sun, Waqas Rafique, Professor Jonathon Chambers, Dr Mohsen Naqvi



© 2017 IEEE. Time-frequency (T-F) masking algorithms aim to separate multiple sound sources from binaural reverberant speech mixtures. The statistical modelling of binaural cues, i.e. the interaural phase difference (IPD) and the interaural level difference (ILD), is a significant aspect of such algorithms. In this paper, a combined Gaussian-Student's t mixture model is exploited for robust binaural speech separation. The weights of the distribution components are calculated adaptively from the energy of the speech mixtures. The expectation-maximization (EM) algorithm is applied to estimate the parameters of the distributions. To evaluate the proposed method, speech signals from the TIMIT database are convolved with real binaural room impulse responses (BRIRs) from two datasets. The signal-to-distortion ratio (SDR), an objective performance measure, confirms the improvement and robustness of the proposed method.
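
The binaural cues named in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function names `stft` and `interaural_cues`, the frame length, the hop size and the Hann window are all assumptions chosen for illustration, and the paper's own T-F analysis settings may differ. The sketch computes the ILD in decibels and the IPD in radians for every T-F point of a two-channel signal.

```python
import numpy as np


def stft(x, frame_len=512, hop=256):
    """Naive STFT (illustrative): Hann-windowed frames, one-sided FFT.

    Returns a complex array of shape (n_frames, frame_len // 2 + 1).
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop:i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def interaural_cues(left, right, eps=1e-12):
    """Return (ILD in dB, IPD in radians) for each time-frequency point."""
    L, R = stft(left), stft(right)
    # ILD: level ratio between the two ears; eps guards against log(0).
    ild_db = 20.0 * np.log10((np.abs(L) + eps) / (np.abs(R) + eps))
    # IPD: phase of the left channel relative to the right, in (-pi, pi].
    ipd = np.angle(L * np.conj(R))
    return ild_db, ipd


# Toy input (assumption): the right channel is the left one at half amplitude,
# so the ILD is about 6 dB and the IPD is zero at every T-F point.
rng = np.random.default_rng(0)
left = rng.standard_normal(16000)
right = 0.5 * left
ild_db, ipd = interaural_cues(left, right)
```

In a separation system of the kind the abstract describes, cues such as these would be the observations to which a probabilistic mixture model is fitted; the model itself and its EM updates are not reproduced here.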

Publication metadata

Author(s): Sun Y, Rafique W, Chambers JA, Naqvi SM

Publication type: Conference Proceedings (inc. Abstract)

Publication status: Published

Conference Name: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2017)

Year of Conference: 2017

Pages: 4187-4191

Online publication date: 19/06/2017

Acceptance date: 01/02/2017

ISSN: 2379-190X

Publisher: IEEE


DOI: 10.1109/ICASSP.2017.7952945

ISBN: 9781509041176