Please use this identifier to cite or link to this item: http://bura.brunel.ac.uk/handle/2438/33270
Full metadata record
DC Field | Value | Language
dc.contributor.author | Yang, J | -
dc.contributor.author | Liu, H | -
dc.contributor.author | Shi, L | -
dc.contributor.author | Gan, L | -
dc.contributor.author | Nishizaki, H | -
dc.contributor.author | Leow, CS | -
dc.coverage.spatial | Singapore | -
dc.date.accessioned | 2026-05-13T08:46:03Z | -
dc.date.available | 2026-05-13T08:46:03Z | -
dc.date.issued | 2025-10-22 | -
dc.identifier | ORCiD: Lu Gan https://orcid.org/0000-0003-1056-7660 | -
dc.identifier.citation | Yang, J. et al. (2025) 'A Semi-Supervised Acoustic Scene Classification Network Based on Multi-Modal Information Fusion', 2025 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), Singapore, 22–24 October, pp. 177–181. doi: 10.1109/apsipaasc65261.2025.11249027. | en-US
dc.identifier.isbn | 979-8-3315-7206-8 | -
dc.identifier.isbn | 979-8-3315-7207-5 | -
dc.identifier.issn | 2640-009X | -
dc.identifier.uri | https://bura.brunel.ac.uk/handle/2438/33270 | -
dc.description | Code Availability: We provide the code and checkpoint at https://github.com/JunkangYang/ALPS-ASC. | en-US
dc.description.abstract | This paper presents our semi-supervised acoustic scene classification (ASC) framework submitted to the APSIPA ASC 2025 Grand Challenge, which focuses on city- and time-aware ASC under limited labeled data. Our approach leverages a multi-modal network architecture that fuses audio mel-spectrograms with spatiotemporal metadata (city identity and timestamps) to capture dynamic acoustic scene variations across urban environments. The model employs a residual-based CNN with attention mechanisms for robust feature extraction, enhanced by multi-modal fusion. To address label scarcity, we adopt a staged semi-supervised pipeline: pre-training on the TAU Urban Acoustic Scenes 2020 and CochlScene datasets with SpecAugment and mixup augmentations, followed by iterative fine-tuning on the challenge data with pseudo-labeling to expand the training set, which yields further performance gains. Experimental results on our validation data demonstrate the efficacy of our city/time-aware design and semi-supervised strategies. [An illustrative sketch of the fusion architecture and the pseudo-labeling step follows this metadata record.] | en-US
dc.format.extent | 177–181 | -
dc.format.medium | Print-Electronic | -
dc.language | English | en-US
dc.language.iso | eng | en-US
dc.publisher | Institute of Electrical and Electronics Engineers (IEEE) | en-US
dc.rights | Creative Commons Attribution 4.0 International | -
dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | -
dc.source | 2025 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) | -
dc.subject | training | en-US
dc.subject | scene classification | en-US
dc.subject | urban areas | en-US
dc.subject | pipelines | en-US
dc.subject | network architecture | en-US
dc.subject | metadata | en-US
dc.subject | acoustics | en-US
dc.subject | spatiotemporal phenomena | en-US
dc.subject | reliability | en-US
dc.subject | iterative methods | en-US
dc.title | A Semi-Supervised Acoustic Scene Classification Network Based on Multi-Modal Information Fusion | en-US
dc.type | Conference Paper | en-US
dc.date.dateAccepted | 2025-09-05 | -
dc.identifier.doi | https://doi.org/10.1109/apsipaasc65261.2025.11249027 | -
dc.relation.isPartOf | 2025 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) | -
pubs.finish-date | 2025-10-24 | -
pubs.publication-status | Published | -
pubs.start-date | 2025-10-22 | -
dc.identifier.eissn | 2640-0103 | -
dcterms.dateAccepted | 2025-09-05 | -
dc.rights.holder | The Author(s) | -
dc.rights.holder | https://creativecommons.org/licenses/by/4.0/legalcode.en | -
dc.contributor.orcid | Gan, Lu [0000-0003-1056-7660] | -
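
Illustrative sketch: the abstract above describes a residual CNN with attention that fuses mel-spectrogram features with city and time metadata. The PyTorch code below is a minimal hypothetical reconstruction for illustration only, not the authors' released implementation (see the GitHub link in dc.description); the names FusionASC, SEAttention, and ResidualBlock, all layer widths, the embedding size, and the use of squeeze-and-excitation-style channel attention with late concatenation of the metadata embeddings are assumptions.

import torch
import torch.nn as nn

class SEAttention(nn.Module):
    # Squeeze-and-excitation-style channel attention (assumed stand-in for
    # the paper's unspecified attention mechanism).
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):
        w = self.gate(x.mean(dim=(2, 3)))   # (B, C) per-channel weights
        return x * w[:, :, None, None]      # reweight the feature maps

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.BatchNorm2d(channels),
        )
        self.att = SEAttention(channels)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(x + self.att(self.body(x)))  # attention-gated residual path

class FusionASC(nn.Module):
    # Fuses a mel-spectrogram CNN branch with city/time embeddings by
    # concatenation at the penultimate layer (the simplest late-fusion
    # choice; the paper's actual fusion point may differ).
    def __init__(self, n_cities, n_time_bins, n_classes, emb_dim=16):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 64, 3, stride=2, padding=1), nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            ResidualBlock(64),
            ResidualBlock(64),
            nn.AdaptiveAvgPool2d(1),        # pool time/frequency to one 64-d vector
        )
        self.city_emb = nn.Embedding(n_cities, emb_dim)
        self.time_emb = nn.Embedding(n_time_bins, emb_dim)
        self.head = nn.Sequential(
            nn.Linear(64 + 2 * emb_dim, 128), nn.ReLU(inplace=True),
            nn.Linear(128, n_classes),
        )

    def forward(self, mel, city_id, time_id):
        # mel: (B, 1, n_mels, frames); city_id, time_id: (B,) integer codes
        audio = self.cnn(mel).flatten(1)
        meta = torch.cat([self.city_emb(city_id), self.time_emb(time_id)], dim=1)
        return self.head(torch.cat([audio, meta], dim=1))  # class logits

The pseudo-labeling stage can likewise be sketched as a confidence-thresholded pass over the unlabeled challenge data; the pseudo_label function, its signature, and the 0.9 threshold below are illustrative assumptions, not values taken from the paper.

import torch
import torch.nn.functional as F

@torch.no_grad()
def pseudo_label(model, unlabeled_loader, device, threshold=0.9):
    # Keep only clips whose top class probability clears the threshold; the
    # accepted (input, pseudo-label) tuples are merged into the labeled pool
    # before the next fine-tuning round.
    model.eval()
    accepted = []
    for mel, city_id, time_id in unlabeled_loader:
        logits = model(mel.to(device), city_id.to(device), time_id.to(device))
        conf, label = F.softmax(logits, dim=1).max(dim=1)
        for i in (conf >= threshold).nonzero(as_tuple=True)[0]:
            accepted.append((mel[i], city_id[i], time_id[i], label[i].item()))
    return accepted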
Appears in Collections: Department of Electronic and Electrical Engineering Research Papers

Files in This Item:
File | Description | Size | Format
FullText.pdf | For the purpose of open access, the author has applied a Creative Commons Attribution (CC BY) licence to any Author Accepted Manuscript version arising. | 522.21 kB | Adobe PDF


This item is licensed under a Creative Commons Attribution 4.0 International License.