Please use this identifier to cite or link to this item: http://bura.brunel.ac.uk/handle/2438/30156
Title: Cross Domain Optimization for Speech Enhancement: Parallel or Cascade?
Authors: Wan, L
Liu, H
Shi, L
Zhou, Y
Gan, L
Keywords: speech enhancement;waveform;time-frequency;complex domain;cross-domain speech
Issue Date: 26-Sep-2024
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Citation: Wan, L. et al. (2024) 'Cross Domain Optimization for Speech Enhancement: Parallel or Cascade?', IEEE/ACM Transactions on Audio, Speech, and Language Processing, 32, pp. 4328-4341. doi: 10.1109/TASLP.2024.3468026.
Abstract: This paper introduces five novel deep-learning architectures for speech enhancement. Existing methods typically use time-domain representations, time-frequency representations, or a hybrid of the two. Recognizing the unique contributions of each domain to feature extraction and model design, this study investigates the integration of waveform and complex spectrogram models through cross-domain fusion to enhance speech feature learning and noise reduction, thereby improving speech quality. We examine both cascading and parallel configurations of waveform and complex spectrogram models to assess their effectiveness in speech enhancement. Additionally, we employ an orthogonal projection-based error decomposition technique and manage the inputs of individual sub-models to analyze factors affecting speech quality. The network is trained by optimizing three specific loss functions applied across all sub-models. Our experiments, using the DNS Challenge (ICASSP 2021) dataset, reveal that the proposed models surpass existing benchmarks in speech enhancement, offering superior speech quality and intelligibility. These results highlight the efficacy of our cross-domain fusion strategy.
Description: We provide a demo page containing enhanced audio clips from different models at https://wanliangdaxia.github.io/
URI: https://bura.brunel.ac.uk/handle/2438/30156
DOI: https://doi.org/10.1109/TASLP.2024.3468026
ISSN: 2329-9290
Other Identifiers: ORCiD: Hongqing Liu https://orcid.org/0000-0003-4839-1525
ORCiD: Liming Shi https://orcid.org/0000-0003-4129-0668
ORCiD: Yi Zhou https://orcid.org/0000-0001-7445-226X
ORCiD: Lu Gan https://orcid.org/0000-0003-1056-7660
Appears in Collections: Dept of Electronic and Electrical Engineering Research Papers

Files in This Item:
File: FullText.pdf
Description: Copyright © 2024 Institute of Electrical and Electronics Engineers (IEEE). Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. See: https://journals.ieeeauthorcenter.ieee.org/become-an-ieee-journal-author/publishing-ethics/guidelines-and-policies/post-publication-policies/
Size: 14.42 MB
Format: Adobe PDF


Items in BURA are protected by copyright, with all rights reserved, unless otherwise indicated.