Please use this identifier to cite or link to this item: http://bura.brunel.ac.uk/handle/2438/33270
Title: A Semi-Supervised Acoustic Scene Classification Network Based on Multi-Modal Information Fusion
Authors: Yang, J; Liu, H; Shi, L; Gan, L; Nishizaki, H; Leow, CS
Keywords: training;scene classification;urban areas;pipelines;network architecture;metadata;acoustics;spatiotemporal phenomena;reliability;iterative methods
Issue Date: 22-Oct-2025
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Citation: Yang, J. et al. (2025) 'A Semi-Supervised Acoustic Scene Classification Network Based on Multi-Modal Information Fusion', 2025 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), Singapore, 22–24 October, pp. 177–181. doi: 10.1109/apsipaasc65261.2025.11249027.
Abstract: This paper presents our semi-supervised acoustic scene classification (ASC) framework submitted to the APSIPA ASC 2025 Grand Challenge, which focuses on city- and time-aware ASC under limited labeled data. Our approach leverages a multi-modal network architecture that fuses audio mel-spectrograms with spatiotemporal metadata (city identity and timestamps) to capture dynamic acoustic scene variations across urban environments. The model employs a residual-based CNN with attention mechanisms for robust feature extraction, enhanced by multi-modal fusion. To address label scarcity, we adopt a staged semi-supervised pipeline: pre-training on the TAU Urban Acoustic Scenes 2020 and CochlScene datasets with SpecAugment and mixup augmentations, followed by iterative fine-tuning on the challenge data with pseudo-labeling to expand the training set, which improved performance. Experimental results demonstrate the efficacy of our city/time-aware design and semi-supervised strategies on our validation data.
Description: Code Availability: We provide the code and checkpoint at https://github.com/JunkangYang/ALPS-ASC.
URI: https://bura.brunel.ac.uk/handle/2438/33270
DOI: https://doi.org/10.1109/apsipaasc65261.2025.11249027
ISBN: 979-8-3315-7206-8; 979-8-3315-7207-5
ISSN: 2640-009X
Other Identifiers: ORCiD: Lu Gan https://orcid.org/0000-0003-1056-7660
Appears in Collections: Department of Electronic and Electrical Engineering Research Papers

Files in This Item:
File: FullText.pdf
Description: For the purpose of open access, the author has applied a Creative Commons Attribution (CC BY) licence to any Author Accepted Manuscript version arising.
Size: 522.21 kB
Format: Adobe PDF


This item is licensed under a Creative Commons License.