

Non-Negative Matrix Factorization-Convolution Neural Network (NMF-CNN) for Sound Event Detection

NU author(s): Teck CHAN, Professor Cheng Chin


Full text for this publication is not currently held within this repository. Alternative links are provided below where available.


The main scientific question of this year's DCASE challenge, Task 4 (Sound Event Detection in Domestic Environments), is which types of data (strongly labeled synthetic data, weakly labeled data, unlabeled in-domain data) are required to achieve the best-performing system. In this paper, we propose a deep learning model that integrates a Convolution Neural Network (CNN) with Non-Negative Matrix Factorization (NMF). The best-performing model achieves an event-based F1-score of 30.39% on the validation dataset, compared with 23.7% for the baseline system. The results show that synthetic data, although strongly labeled, cannot be used as the sole source of training data: doing so resulted in the worst performance. Using a combination of weakly and strongly labeled data achieved the highest F1-score, but the improvement was not significant, so including synthetic data in the training set may not be worthwhile. The results also suggest that the quality of the labels assigned to unlabeled in-domain data is essential: if the labeling is not done accurately, it can degrade accuracy rather than improve model performance.
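The abstract does not describe how NMF and the CNN are combined, so the following is only a generic illustration of NMF itself, not the authors' pipeline. It is a minimal sketch, assuming NumPy is available, that factorizes a small non-negative matrix (standing in for a magnitude spectrogram) with the standard multiplicative update rules for the Frobenius-norm objective. The function names `nmf` and `reconstruction_error`, the rank, and the toy data are all hypothetical choices made for this example.

```python
import numpy as np


def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate a non-negative matrix V (freq x frames) as W @ H.

    Uses multiplicative updates for the squared Frobenius error.
    W has shape (freq, rank); H has shape (rank, frames).
    This is a generic illustration, not the method from the paper.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    # Strictly positive random initialisation keeps the updates well defined.
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        # Element-wise updates; eps guards against division by zero.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def reconstruction_error(V, W, H):
    """Frobenius norm of the residual V - W @ H."""
    return float(np.linalg.norm(V - W @ H))


# Toy non-negative "spectrogram" built from 3 hidden components (assumed data).
rng = np.random.default_rng(1)
V = rng.random((16, 3)) @ rng.random((3, 40))

W, H = nmf(V, rank=3)
print("residual:", reconstruction_error(V, W, H))
```

In a sound event detection setting, the rows of `H` can be read as per-frame activations of the spectral templates in `W`, which is one reason NMF is often paired with frame-level models; whether the paper uses the activations in this way is not stated in the abstract.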

Publication metadata

Author(s): Chan TK, Chin CS, Li Y

Publication type: Conference Proceedings (inc. Abstract)

Publication status: Published

Conference Name: IEEE Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE 2019)

Year of Conference: 2019

Online publication date: 25/10/2019

Acceptance date: 29/07/2019

Publisher: IEEE