Please use this identifier to cite or link to this item: http://bura.brunel.ac.uk/handle/2438/30156
Full metadata record
DC Field | Value | Language
--- | --- | ---
dc.contributor.author | Wan, L | -
dc.contributor.author | Liu, H | -
dc.contributor.author | Shi, L | -
dc.contributor.author | Zhou, Y | -
dc.contributor.author | Gan, L | -
dc.date.accessioned | 2024-11-17T17:00:19Z | -
dc.date.available | 2024-11-17T17:00:19Z | -
dc.date.issued | 2024-09-26 | -
dc.identifier | ORCiD: Hongqing Liu https://orcid.org/0000-0003-4839-1525 | -
dc.identifier | ORCiD: Liming Shi https://orcid.org/0000-0003-4129-0668 | -
dc.identifier | ORCiD: Yi Zhou https://orcid.org/0000-0001-7445-226X | -
dc.identifier | ORCiD: Lu Gan https://orcid.org/0000-0003-1056-7660 | -
dc.identifier.citation | Wan, L. (2024) 'Cross Domain Optimization for Speech Enhancement: Parallel or Cascade?', IEEE/ACM Transactions on Audio Speech and Language Processing, 32, pp. 4328 - 4341. doi: 10.1109/TASLP.2024.3468026. | en_US
dc.identifier.issn | 2329-9290 | -
dc.identifier.uri | https://bura.brunel.ac.uk/handle/2438/30156 | -
dc.description | We provide a demo page containing enhanced audio clips from different models at https://wanliangdaxia.github.io/ . | -
dc.description.abstract | This paper introduces five novel deep-learning architectures for speech enhancement. Existing methods typically use time-domain, time-frequency representations, or a hybrid approach. Recognizing the unique contributions of each domain to feature extraction and model design, this study investigates the integration of waveform and complex spectrogram models through cross-domain fusion to enhance speech feature learning and noise reduction, thereby improving speech quality. We examine both cascading and parallel configurations of waveform and complex spectrogram models to assess their effectiveness in speech enhancement. Additionally, we employ an orthogonal projection-based error decomposition technique and manage the inputs of individual sub-models to analyze factors affecting speech quality. The network is trained by optimizing three specific loss functions applied across all sub-models. Our experiments, using the DNS Challenge (ICASSP 2021) dataset, reveal that the proposed models surpass existing benchmarks in speech enhancement, offering superior speech quality and intelligibility. These results highlight the efficacy of our cross-domain fusion strategy. | en_US
dc.format.extent | 4328 - 4341 | -
dc.format.medium | Print-Electronic | -
dc.language | english | -
dc.language.iso | en_US | en_US
dc.publisher | Institute of Electrical and Electronics Engineers (IEEE) | en_US
dc.relation.uri | https://wanliangdaxia.github.io/ | -
dc.rights | Copyright © 2024 Institute of Electrical and Electronics Engineers (IEEE). Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. See: https://journals.ieeeauthorcenter.ieee.org/become-an-ieee-journal-author/publishing-ethics/guidelines-and-policies/post-publication-policies/ | -
dc.rights.uri | https://journals.ieeeauthorcenter.ieee.org/become-an-ieee-journal-author/publishing-ethics/guidelines-and-policies/post-publication-policies/ | -
dc.subject | speech enhancement | en_US
dc.subject | waveform | en_US
dc.subject | time-frequency | en_US
dc.subject | complex domain | en_US
dc.subject | cross-domain speech | en_US
dc.title | Cross Domain Optimization for Speech Enhancement: Parallel or Cascade? | en_US
dc.type | Article | en_US
dc.date.dateAccepted | 2024-09-16 | -
dc.identifier.doi | https://doi.org/10.1109/TASLP.2024.3468026 | -
dc.relation.isPartOf | IEEE/ACM Transactions on Audio Speech and Language Processing | -
pubs.publication-status | Published | -
pubs.volume | 32 | -
dc.identifier.eissn | 2329-9304 | -
dc.rights.holder | Institute of Electrical and Electronics Engineers (IEEE) | -
Appears in Collections: Dept of Electronic and Electrical Engineering Research Papers

Files in This Item:
File | Description | Size | Format
--- | --- | --- | ---
FullText.pdf | Copyright © 2024 Institute of Electrical and Electronics Engineers (IEEE). Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. See: https://journals.ieeeauthorcenter.ieee.org/become-an-ieee-journal-author/publishing-ethics/guidelines-and-policies/post-publication-policies/ | 14.42 MB | Adobe PDF

Items in BURA are protected by copyright, with all rights reserved, unless otherwise indicated.