Please use this identifier to cite or link to this item:
http://bura.brunel.ac.uk/handle/2438/33270
| Title: | A Semi-Supervised Acoustic Scene Classification Network Based on Multi-Modal Information Fusion |
| Authors: | Yang, J; Liu, H; Shi, L; Gan, L; Nishizaki, H; Leow, CS |
| Keywords: | training;scene classification;urban areas;pipelines;network architecture;metadata;acoustics;spatiotemporal phenomena;reliability;iterative methods |
| Issue Date: | 22-Oct-2025 |
| Publisher: | Institute of Electrical and Electronics Engineers (IEEE) |
| Citation: | Yang, J. et al. (2025) 'A Semi-Supervised Acoustic Scene Classification Network Based on Multi-Modal Information Fusion', 2025 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), Singapore, 22–24 October, pp. 177–181. doi: 10.1109/apsipaasc65261.2025.11249027. |
| Abstract: | This paper presents our semi-supervised acoustic scene classification (ASC) framework submitted to the APSIPA ASC 2025 Grand Challenge, which focuses on city- and time-aware ASC under limited labeled data. Our approach leverages a multi-modal network architecture that fuses audio mel-spectrograms with spatiotemporal metadata (city identity and timestamps) to capture dynamic acoustic scene variations across urban environments. The model employs a residual-based CNN with attention mechanisms for robust feature extraction, enhanced by multi-modal fusion. To address label scarcity, we adopt a staged semi-supervised pipeline: pre-training on the TAU Urban Acoustic Scenes 2020 and CochlScene datasets with SpecAugment and mixup augmentations, followed by iterative fine-tuning on the challenge data, with pseudo-labeling used to expand the training set, which improves performance. Experimental results demonstrate the efficacy of our city/time-aware design and semi-supervised strategies on our validation data. |
| Description: | Code Availability: We provide the code and checkpoint at https://github.com/JunkangYang/ALPS-ASC. |
| URI: | https://bura.brunel.ac.uk/handle/2438/33270 |
| DOI: | https://doi.org/10.1109/apsipaasc65261.2025.11249027 |
| ISBN: | 979-8-3315-7206-8; 979-8-3315-7207-5 |
| ISSN: | 2640-009X |
| Other Identifiers: | ORCiD: Lu Gan https://orcid.org/0000-0003-1056-7660 |
| Appears in Collections: | Department of Electronic and Electrical Engineering Research Papers |
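
The pseudo-labeling step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (their code is at the repository linked under Description). It assumes a common variant in which an unlabeled clip is added to the training set only when the model's top class probability reaches a confidence threshold. The function name `select_pseudo_labels`, the threshold of 0.9, and the example probabilities are all hypothetical.

```python
from typing import List, Sequence, Tuple


def select_pseudo_labels(
    probs: Sequence[Sequence[float]],
    threshold: float = 0.9,  # assumed cutoff; the abstract does not state one
) -> List[Tuple[int, int]]:
    """Return (sample_index, predicted_class) pairs for confident predictions."""
    selected = []
    for idx, p in enumerate(probs):
        # Index of the most probable class for this clip.
        best_class = max(range(len(p)), key=lambda c: p[c])
        # Keep the clip only if the model is confident enough.
        if p[best_class] >= threshold:
            selected.append((idx, best_class))
    return selected


# Hypothetical softmax outputs for three unlabeled clips over three scene classes.
unlabeled_probs = [
    [0.95, 0.03, 0.02],
    [0.40, 0.35, 0.25],
    [0.05, 0.04, 0.91],
]
pseudo = select_pseudo_labels(unlabeled_probs, threshold=0.9)
```

In an iterative scheme of this kind, the selected clips and their predicted classes would be merged into the labeled set before the next fine-tuning round, while low-confidence clips stay unlabeled.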
Files in This Item:
| File | Description | Size | Format | |
|---|---|---|---|---|
| FullText.pdf | For the purpose of open access, the author has applied a Creative Commons Attribution (CC BY) licence to any Author Accepted Manuscript version arising. | 522.21 kB | Adobe PDF | View/Open |
This item is licensed under a Creative Commons License