Please use this identifier to cite or link to this item: http://bura.brunel.ac.uk/handle/2438/31391
Title: Cross-Block Sparse Class Token Contrast for Weakly Supervised Semantic Segmentation
Authors: Cheng, K
Tang, J
Gu, H
Wan, H
Li, M
Keywords: weakly supervised; semantic segmentation; token contrast; dynamic sparse
Issue Date: 12-Aug-2024
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Citation: Cheng, K. et al. (2024) 'Cross-Block Sparse Class Token Contrast for Weakly Supervised Semantic Segmentation', IEEE Transactions on Circuits and Systems for Video Technology, 34 (12), pp. 13004 - 13015. doi: 10.1109/TCSVT.2024.3442310.
Abstract: Most existing Vision Transformer-based frameworks for weakly supervised semantic segmentation rely on class activation maps to generate pseudo masks. Although this approach mitigates the class-agnostic issue, it still suffers from misclassification and noise in the segmentation results. To overcome these limitations, we propose an attention-based framework named Cross-block Sparse Class Token Contrast (CB-SCTC), which incorporates a Dynamic Sparse Attention (DSA) module and a Cross-block Class Token Contrast (CB-CTC) scheme. Specifically, the CB-CTC scheme enforces diversity among the final class tokens by learning from the lower similarity of the class tokens in the relatively shallower blocks. Moreover, the DSA module post-processes the output of the softmax function in the attention mechanism to reduce noise. Extensive experiments show that the proposed framework is a valid alternative to class activation maps, achieving competitive mIoU scores on PASCAL VOC 2012 (val: 75.5%, test: 75.2%) and MS COCO 2014 (val: 46.9%). Our code is available at https://github.com/Jingfeng-Tang/CB-SCTC.
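The sparse-attention idea mentioned in the abstract — post-processing the softmax output of the attention mechanism to suppress noisy responses — can be illustrated with a minimal sketch. This assumes a simple per-query top-k sparsification with renormalization; the paper's actual dynamic sparsity criterion may differ, and the function names below are hypothetical, not taken from the released code.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def sparse_attention_weights(scores, keep_ratio=0.5):
    """Hypothetical sketch: keep only the largest attention weights
    per query after the softmax, zero the rest, and renormalize so
    each row still sums to one."""
    weights = softmax(scores, axis=-1)
    k = max(1, int(keep_ratio * weights.shape[-1]))
    # Per-row threshold at the k-th largest weight.
    thresh = np.sort(weights, axis=-1)[..., -k][..., None]
    sparse = np.where(weights >= thresh, weights, 0.0)
    return sparse / sparse.sum(axis=-1, keepdims=True)
```

Compared with plain softmax attention, the long tail of small weights — which tends to spread activation onto background tokens — is removed, while the remaining weights keep their relative magnitudes.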
URI: https://bura.brunel.ac.uk/handle/2438/31391
DOI: https://doi.org/10.1109/TCSVT.2024.3442310
ISSN: 1051-8215
Other Identifiers: ORCiD: Keyang Cheng https://orcid.org/0000-0001-5240-1605
ORCiD: Jingfeng Tang https://orcid.org/0009-0001-0291-4047
ORCiD: Mazhen Li https://orcid.org/0000-0002-0820-5487
Appears in Collections: Dept of Electronic and Electrical Engineering Research Papers

Files in This Item:
File: FullText.pdf (36.69 MB, Adobe PDF)
Description: Copyright © 2024 Institute of Electrical and Electronics Engineers (IEEE). Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. See: https://journals.ieeeauthorcenter.ieee.org/become-an-ieee-journal-author/publishing-ethics/guidelines-and-policies/post-publication-policies/


Items in BURA are protected by copyright, with all rights reserved, unless otherwise indicated.