NU author(s): Dr Shidong Wang
Full text for this publication is not currently held within this repository. Alternative links are provided below where available.
© 2024 Elsevier B.V.

Unsupervised cross-domain road scene segmentation has attracted substantial interest because of its ability to perform segmentation on new, unlabeled domains, reducing the dependence on expensive manual annotations. This is achieved by leveraging networks trained on labeled source domains to classify images in unlabeled target domains. Conventional techniques usually use adversarial networks to align source and target inputs in one of the two domains. However, these approaches often fall short of effectively integrating information from both domains, because alignment in either space alone tends to bias feature learning. To overcome these limitations, enhance cross-domain interaction, and mitigate overfitting to the source domain, we introduce a novel framework, the Semantic-Aware Feature Enhancement Network (SAFENet), for unsupervised cross-domain road scene segmentation. SAFENet incorporates a Semantic-Aware Enhancement (SAE) module to amplify the importance of class information in segmentation tasks and uses the semantic space as a new domain to guide the alignment of the source and target domains. Additionally, we integrate Adaptive Instance Normalization with Momentum (AdaIN-M), which converts the source-domain image style to the target-domain image style, reducing the adverse effect of source-domain overfitting on target-domain segmentation performance. Moreover, SAFENet employs a Knowledge Transfer (KT) module to optimize the network architecture, improving computational efficiency at test time while retaining the robust inference capability developed during training. To further improve segmentation performance, we employ Curriculum Learning, a self-training mechanism that uses pseudo-labels derived from the target domain to iteratively refine the network. Comprehensive experiments on two benchmark settings, "Synthia→Cityscapes" and "GTA5→Cityscapes" (spanning three well-known datasets), demonstrate the superior performance of our method. In-depth analyses and ablation studies verify the efficacy of each module within the proposed method.
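To illustrate the style-transfer idea the abstract attributes to AdaIN-M, here is a minimal PyTorch sketch. It assumes the "momentum" term maintains exponential-moving-average estimates of target-domain channel statistics and applies standard AdaIN with them; the paper's exact formulation is not reproduced in this record, so treat this module and its parameters as hypothetical.

```python
import torch
import torch.nn as nn

class AdaINMomentum(nn.Module):
    """Hypothetical AdaIN-M sketch: re-styles source-domain features using
    running (EMA) estimates of target-domain per-channel statistics.

    Assumption: the momentum update over target statistics below is an
    illustration, not SAFENet's published formulation.
    """
    def __init__(self, num_channels: int, momentum: float = 0.9, eps: float = 1e-5):
        super().__init__()
        self.momentum = momentum
        self.eps = eps
        # Running target-domain statistics; buffers, not trained by backprop.
        self.register_buffer("running_mean", torch.zeros(num_channels))
        self.register_buffer("running_std", torch.ones(num_channels))

    @torch.no_grad()
    def update_target_stats(self, target_feats: torch.Tensor) -> None:
        # target_feats: (N, C, H, W) features from unlabeled target images.
        mean = target_feats.mean(dim=(0, 2, 3))
        std = target_feats.std(dim=(0, 2, 3))
        m = self.momentum
        self.running_mean.mul_(m).add_((1.0 - m) * mean)
        self.running_std.mul_(m).add_((1.0 - m) * std)

    def forward(self, source_feats: torch.Tensor) -> torch.Tensor:
        # Standard AdaIN: per-instance normalization of the source features,
        # then re-scale/shift with the running target-domain statistics.
        mu = source_feats.mean(dim=(2, 3), keepdim=True)
        sigma = source_feats.std(dim=(2, 3), keepdim=True) + self.eps
        normalized = (source_feats - mu) / sigma
        tgt_mean = self.running_mean.view(1, -1, 1, 1)
        tgt_std = self.running_std.view(1, -1, 1, 1)
        return normalized * tgt_std + tgt_mean
```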
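The curriculum self-training step can be sketched in the same hedged spirit: a common pseudo-labeling recipe keeps only high-confidence target-domain predictions and masks the rest out of the loss. SAFENet's actual selection rule and threshold schedule may differ; the function below is an assumption for illustration.

```python
import torch

def make_pseudo_labels(logits: torch.Tensor, threshold: float = 0.9,
                       ignore_index: int = 255) -> torch.Tensor:
    """Confidence-thresholded pseudo-labels for target-domain self-training.

    logits: (N, num_classes, H, W) segmentation scores on target images.
    Pixels whose max softmax probability falls below `threshold` receive
    `ignore_index` so a standard cross-entropy loss skips them.
    """
    probs = torch.softmax(logits, dim=1)
    conf, labels = probs.max(dim=1)           # (N, H, W) confidence and class
    labels[conf < threshold] = ignore_index   # drop low-confidence pixels
    return labels
```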
Author(s): Ren D, Li M, Wang S, Ren M, Zhang H
Publication type: Article
Publication status: Published
Journal: Image and Vision Computing
Year: 2024
Volume: 152
Print publication date: 01/12/2024
Online publication date: 04/11/2024
Acceptance date: 24/10/2024
ISSN (print): 0262-8856
ISSN (electronic): 1872-8138
Publisher: Elsevier Ltd
URL: https://doi.org/10.1016/j.imavis.2024.105318
DOI: 10.1016/j.imavis.2024.105318