Please use this identifier to cite or link to this item:
http://bura.brunel.ac.uk/handle/2438/30156
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Wan, L | - |
dc.contributor.author | Liu, H | - |
dc.contributor.author | Shi, L | - |
dc.contributor.author | Zhou, Y | - |
dc.contributor.author | Gan, L | - |
dc.date.accessioned | 2024-11-17T17:00:19Z | - |
dc.date.available | 2024-11-17T17:00:19Z | - |
dc.date.issued | 2024-09-26 | - |
dc.identifier | ORCiD: Hongqing Liu https://orcid.org/0000-0003-4839-1525 | - |
dc.identifier | ORCiD: Liming Shi https://orcid.org/0000-0003-4129-0668 | - |
dc.identifier | ORCiD: Yi Zhou https://orcid.org/0000-0001-7445-226X | - |
dc.identifier | ORCiD: Lu Gan https://orcid.org/0000-0003-1056-7660 | - |
dc.identifier.citation | Wan, L. et al. (2024) 'Cross Domain Optimization for Speech Enhancement: Parallel or Cascade?', IEEE/ACM Transactions on Audio, Speech, and Language Processing, 32, pp. 4328 - 4341. doi: 10.1109/TASLP.2024.3468026. | en_US |
dc.identifier.issn | 2329-9290 | - |
dc.identifier.uri | https://bura.brunel.ac.uk/handle/2438/30156 | - |
dc.description | We provide a demo page containing enhanced audio clips from different models: https://wanliangdaxia.github.io/ | - |
dc.description.abstract | This paper introduces five novel deep-learning architectures for speech enhancement. Existing methods typically operate on time-domain representations, time-frequency representations, or a hybrid of the two. Recognizing the unique contributions of each domain to feature extraction and model design, this study investigates the integration of waveform and complex spectrogram models through cross-domain fusion to enhance speech feature learning and noise reduction, thereby improving speech quality. We examine both cascaded and parallel configurations of waveform and complex spectrogram models to assess their effectiveness in speech enhancement. Additionally, we employ an orthogonal projection-based error decomposition technique and control the inputs of individual sub-models to analyze the factors affecting speech quality. The network is trained by optimizing three specific loss functions applied across all sub-models. Our experiments on the DNS Challenge (ICASSP 2021) dataset show that the proposed models surpass existing benchmarks in speech enhancement, offering superior speech quality and intelligibility. These results highlight the efficacy of our cross-domain fusion strategy. | en_US |
dc.format.extent | 4328 - 4341 | - |
dc.format.medium | Print-Electronic | - |
dc.language | English | - |
dc.language.iso | en_US | en_US |
dc.publisher | Institute of Electrical and Electronics Engineers (IEEE) | en_US |
dc.relation.uri | https://wanliangdaxia.github.io/ | - |
dc.rights | Copyright © 2024 Institute of Electrical and Electronics Engineers (IEEE). Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. See: https://journals.ieeeauthorcenter.ieee.org/become-an-ieee-journal-author/publishing-ethics/guidelines-and-policies/post-publication-policies/ | - |
dc.rights.uri | https://journals.ieeeauthorcenter.ieee.org/become-an-ieee-journal-author/publishing-ethics/guidelines-and-policies/post-publication-policies/ | - |
dc.subject | speech enhancement | en_US |
dc.subject | waveform | en_US |
dc.subject | time-frequency | en_US |
dc.subject | complex domain | en_US |
dc.subject | cross-domain speech | en_US |
dc.title | Cross Domain Optimization for Speech Enhancement: Parallel or Cascade? | en_US |
dc.type | Article | en_US |
dc.date.dateAccepted | 2024-09-16 | - |
dc.identifier.doi | https://doi.org/10.1109/TASLP.2024.3468026 | - |
dc.relation.isPartOf | IEEE/ACM Transactions on Audio, Speech, and Language Processing | - |
pubs.publication-status | Published | - |
pubs.volume | 32 | - |
dc.identifier.eissn | 2329-9304 | - |
dc.rights.holder | Institute of Electrical and Electronics Engineers (IEEE) | - |
Appears in Collections: | Dept of Electronic and Electrical Engineering Research Papers |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
FullText.pdf | Copyright © 2024 Institute of Electrical and Electronics Engineers (IEEE). Personal use of this material is permitted; the full rights statement is given in the dc.rights field above. | 14.42 MB | Adobe PDF | View/Open |
Items in BURA are protected by copyright, with all rights reserved, unless otherwise indicated.