CiT-Net: Convolutional Neural Networks Hand in Hand with Vision Transformers for Medical Image Segmentation

Lei, T; Sun, R; Wang, X; Wang, Y; He, X; Nandi, A

Please use this identifier to cite or link to this item: http://bura.brunel.ac.uk/handle/2438/27972

Full metadata record

DC Field	Value	Language
dc.contributor.author	Lei, T	-
dc.contributor.author	Sun, R	-
dc.contributor.author	Wang, X	-
dc.contributor.author	Wang, Y	-
dc.contributor.author	He, X	-
dc.contributor.author	Nandi, A	-
dc.coverage.spatial	Macao, S.A.R.	-
dc.date.accessioned	2024-01-06T12:40:39Z	-
dc.date.available	2024-01-06T12:40:39Z	-
dc.date.issued	2023-08-19	-
dc.identifier	ORCiD: Asoke Nandi https://orcid.org/0000-0001-6248-2875	-
dc.identifier	arXiv:2306.03373v2 [eess.IV]	-
dc.identifier.citation	Lei, T. et al. (2023) 'CiT-Net: Convolutional Neural Networks Hand in Hand with Vision Transformers for Medical Image Segmentation', Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI 2023, Macao, S.A.R., 19-25 August, pp. 1017 - 1025. Available at: https://www.ijcai.org/proceedings/2023/113.	en_US
dc.identifier.isbn	978-1-956792-03-4	-
dc.identifier.issn	1045-0823	-
dc.identifier.uri	https://bura.brunel.ac.uk/handle/2438/27972	-
dc.description	The code is publicly available at: https://github.com/SR0920/CiT-Net .	en_US
dc.description	The conference paper archived on this institutional repository is the second, revised version made available at arXiv:2306.03373v2 [eess.IV], [v2] Wed, 20 Dec 2023 02:42:13 UTC (3,309 KB), https://arxiv.org/abs/2306.03373 under an arXiv non-exclusive license (https://arxiv.org/licenses/nonexclusive-distrib/1.0/license.html).	-
dc.description.abstract	The hybrid architecture of convolutional neural networks (CNNs) and Transformers is very popular for medical image segmentation. However, it suffers from two challenges. First, although a CNN branch can capture local image features using vanilla convolution, it cannot achieve adaptive feature learning. Second, although a Transformer branch can capture global features, it ignores channel and cross-dimensional self-attention, resulting in low segmentation accuracy on complex-content images. To address these challenges, we propose a novel hybrid architecture of convolutional neural networks hand in hand with vision Transformers (CiT-Net) for medical image segmentation. Our network has two advantages. First, we design a dynamic deformable convolution and apply it to the CNN branch, which overcomes the weak feature extraction ability caused by fixed-size convolution kernels and the rigid design of sharing kernel parameters among different inputs. Second, we design a shifted-window adaptive complementary attention module and a compact convolutional projection, and apply them to the Transformer branch to learn the cross-dimensional long-term dependency in medical images. Experimental results show that our CiT-Net provides better medical image segmentation results than popular SOTA methods. In addition, our CiT-Net requires fewer parameters and lower computational cost, and does not rely on pre-training.	en_US
dc.description.sponsorship	National Natural Science Foundation of China under Grants 62271296, 62201334 and 62201452, in part by the Natural Science Basic Research Program of Shaanxi under Grant 2021JC-47, and in part by the Key Research and Development Program of Shaanxi under Grants 2022GY-436 and 2021ZDLGY08-07.	en_US
dc.format.extent	1017 - 1025	-
dc.format.medium	Print-Electronic	-
dc.language	English	-
dc.language.iso	en_US	en_US
dc.publisher	International Joint Conference on Artificial Intelligence (IJCAI)	en_US
dc.relation.uri	https://github.com/SR0920/CiT-Net	-
dc.relation.uri	https://www.ijcai.org/proceedings/2023/	-
dc.relation.uri	https://arxiv.org/abs/2306.03373	-
dc.rights	The conference paper archived on this institutional repository is the second, revised version made available at arXiv:2306.03373v2 [eess.IV], [v2] Wed, 20 Dec 2023 02:42:13 UTC (3,309 KB), https://arxiv.org/abs/2306.03373 under an arXiv non-exclusive license (https://arxiv.org/licenses/nonexclusive-distrib/1.0/license.html).	-
dc.rights.uri	https://arxiv.org/licenses/nonexclusive-distrib/1.0/license.html	-
dc.source	32nd International Joint Conference on Artificial Intelligence (IJCAI 2023)	-
dc.title	CiT-Net: Convolutional Neural Networks Hand in Hand with Vision Transformers for Medical Image Segmentation	en_US
dc.type	Article	en_US
dc.date.dateAccepted	2023-04-19	-
dc.identifier.doi	https://doi.org/10.24963/ijcai.2023/113	-
dc.relation.isPartOf	IJCAI International Joint Conference on Artificial Intelligence	-
pubs.finish-date	2023-08-25	-
pubs.publication-status	Published	-
pubs.start-date	2023-08-19	-
pubs.volume	2023-August	-
dcterms.dateAccepted	2023-04-19	-
dc.rights.holder	The Authors	-
Appears in Collections:	Department of Electronic and Electrical Engineering Research Papers

Files in This Item:

File	Description	Size	Format
FullText.pdf	The conference paper archived on this institutional repository is the second, revised version made available at arXiv:2306.03373v2 [eess.IV], [v2] Wed, 20 Dec 2023 02:42:13 UTC (3,309 KB), https://arxiv.org/abs/2306.03373 under an arXiv non-exclusive license (https://arxiv.org/licenses/nonexclusive-distrib/1.0/license.html).	5.13 MB	Adobe PDF
