Please use this identifier to cite or link to this item:
http://bura.brunel.ac.uk/handle/2438/27972
Title: CiT-Net: Convolutional Neural Networks Hand in Hand with Vision Transformers for Medical Image Segmentation
Authors: Lei, T; Sun, R; Wang, X; Wang, Y; He, X; Nandi, A
Issue Date: 19-Aug-2023
Publisher: International Joint Conference on Artificial Intelligence (IJCAI)
Citation: Lei, T. et al. (2023) 'CiT-Net: Convolutional Neural Networks Hand in Hand with Vision Transformers for Medical Image Segmentation', Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI 2023, Macao, S.A.R., 19-25 August, pp. 1017-1025. Available at: https://www.ijcai.org/proceedings/2023/113.
Abstract: Hybrid architectures of convolutional neural networks (CNNs) and Transformers are very popular for medical image segmentation, but they suffer from two challenges. First, although a CNN branch can capture local image features using vanilla convolution, it cannot achieve adaptive feature learning. Second, although a Transformer branch can capture global features, it ignores channel and cross-dimensional self-attention, resulting in low segmentation accuracy on complex-content images. To address these challenges, we propose a novel hybrid architecture of convolutional neural networks hand in hand with vision Transformers (CiT-Net) for medical image segmentation. Our network has two advantages. First, we design a dynamic deformable convolution and apply it to the CNN branch, which overcomes the weak feature extraction caused by fixed-size convolution kernels and the rigid design of sharing kernel parameters among different inputs. Second, we design a shifted-window adaptive complementary attention module and a compact convolutional projection, and apply them to the Transformer branch to learn cross-dimensional long-term dependencies in medical images. Experimental results show that our CiT-Net provides better medical image segmentation results than popular SOTA methods. Besides, our CiT-Net requires fewer parameters and less computational cost, and does not rely on pre-training.
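
As a rough illustration of the dual-branch design described in the abstract, the sketch below pairs a convolutional branch (local features) with a self-attention branch (long-range dependencies) and fuses their outputs. This is a minimal PyTorch sketch, not the authors' method: a plain 3x3 convolution stands in for the paper's dynamic deformable convolution, standard multi-head self-attention stands in for the shifted-window adaptive complementary attention module and compact convolutional projection, and all names here (`DualBranchBlock`, `channels`, `num_heads`) are hypothetical. The actual implementation is available at https://github.com/SR0920/CiT-Net.

```python
# Minimal sketch of a dual-branch CNN/Transformer hybrid block in the
# spirit of CiT-Net. Illustrative only; see the authors' repository
# (https://github.com/SR0920/CiT-Net) for the real modules.
import torch
import torch.nn as nn

class DualBranchBlock(nn.Module):
    """One stage: a CNN branch for local features and a self-attention
    branch for global context, fused by elementwise addition."""
    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        # CNN branch: a plain 3x3 conv stands in for the paper's
        # dynamic deformable convolution.
        self.conv_branch = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        # Transformer branch: standard multi-head self-attention stands in
        # for the shifted-window adaptive complementary attention module.
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        local = self.conv_branch(x)
        # Flatten the spatial dims into a token sequence for attention.
        tokens = self.norm(x.flatten(2).transpose(1, 2))   # (B, H*W, C)
        global_feat, _ = self.attn(tokens, tokens, tokens)  # (B, H*W, C)
        global_feat = global_feat.transpose(1, 2).reshape(b, c, h, w)
        return local + global_feat                           # simple fusion

if __name__ == "__main__":
    block = DualBranchBlock(channels=32)
    out = block(torch.randn(1, 32, 16, 16))
    print(out.shape)  # torch.Size([1, 32, 16, 16])
```

Elementwise addition is used here purely for brevity; the paper's branches interact through richer, purpose-built attention and projection mechanisms.
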
Description: The code is publicly available at: https://github.com/SR0920/CiT-Net. The conference paper archived on this institutional repository is the second, revised version, arXiv:2306.03373v2 [eess.IV] (submitted Wed, 20 Dec 2023 02:42:13 UTC, 3,309 KB), https://arxiv.org/abs/2306.03373, made available under an arXiv non-exclusive license (https://arxiv.org/licenses/nonexclusive-distrib/1.0/license.html).
URI: https://bura.brunel.ac.uk/handle/2438/27972
DOI: https://doi.org/10.24963/ijcai.2023/113
ISBN: 978-1-956792-03-4
ISSN: 1045-0823
Other Identifiers: ORCID iD: Asoke Nandi, https://orcid.org/0000-0001-6248-2875; arXiv:2306.03373v2 [eess.IV]
Appears in Collections: Dept of Electronic and Electrical Engineering Research Papers
Files in This Item:
File | Description | Size | Format
---|---|---|---
FullText.pdf | Second, revised version of the conference paper (arXiv:2306.03373v2 [eess.IV], 20 Dec 2023) | 5.13 MB | Adobe PDF
Items in BURA are protected by copyright, with all rights reserved, unless otherwise indicated.