Please use this identifier to cite or link to this item: http://bura.brunel.ac.uk/handle/2438/32290
| Title: | Visual transformer with depthwise separable convolution projections for video-based human action recognition |
| Authors: | Cao, Y.; Wang, F.; Zheng, Q. |
| Issue Date: | 1-Oct-2025 |
| Publisher: | EDP Sciences |
| Citation: | Cao, Y., Wang, F. and Zheng, Q. (2025) 'Visual transformer with depthwise separable convolution projections for video-based human action recognition', MATEC Web of Conferences, 413, 06003, pp. 1 - 5. doi: 10.1051/matecconf/202541306003. |
| Abstract: | Human action recognition is the task of using algorithms to recognize human actions in videos. Transformer-based algorithms have attracted growing attention in recent years. However, transformer networks often suffer from slow convergence and require large amounts of training data, because they cannot prioritize information from neighboring pixels. To address these issues, we propose a novel network architecture that combines a depthwise separable convolution layer with transformer modules. The proposed network has been evaluated on the medium-sized benchmark dataset UCF101, and the results demonstrate that the proposed model converges quickly during training and achieves competitive performance compared with a state-of-the-art (SOTA) pure transformer network, while using approximately 7.4 million fewer parameters. (An illustrative sketch of the convolutional projection idea follows the metadata table below.) |
| URI: | https://bura.brunel.ac.uk/handle/2438/32290 |
| DOI: | https://doi.org/10.1051/matecconf/202541306003 |
| ISSN: | 2274-7214 |
| Other Identifiers: | ORCiD: Fang Wang, https://orcid.org/0000-0003-1987-9150; Article number: 06003 |
| Appears in Collections: | Dept of Mechanical and Aerospace Engineering Research Papers |
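
The abstract above describes combining depthwise separable convolutions with transformer modules so that the attention projections carry a bias towards neighboring patches. The following is a minimal, self-contained sketch of that general idea, assuming a ViT-style attention block operating on a 2D grid of patch tokens; the class names, 3x3 kernel size, and hyperparameters are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch (not the paper's code): multi-head self-attention in which the
# Q/K/V projections are depthwise separable 2D convolutions over the patch grid,
# replacing the usual linear projections. Shapes and names are assumptions.
import torch
import torch.nn as nn


class DepthwiseSeparableConv(nn.Module):
    """3x3 depthwise convolution followed by a 1x1 pointwise convolution."""
    def __init__(self, channels):
        super().__init__()
        self.depthwise = nn.Conv2d(channels, channels, kernel_size=3,
                                   padding=1, groups=channels, bias=False)
        self.pointwise = nn.Conv2d(channels, channels, kernel_size=1, bias=True)

    def forward(self, x):                       # x: (B, C, H, W)
        return self.pointwise(self.depthwise(x))


class DWSepConvAttention(nn.Module):
    """Self-attention with depthwise separable convolutional Q/K/V projections."""
    def __init__(self, dim, num_heads=8):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.q_proj = DepthwiseSeparableConv(dim)
        self.k_proj = DepthwiseSeparableConv(dim)
        self.v_proj = DepthwiseSeparableConv(dim)
        self.out_proj = nn.Linear(dim, dim)

    def forward(self, x, grid_hw):              # x: (B, N, C) patch tokens
        B, N, C = x.shape
        H, W = grid_hw                           # patch-grid height/width, H * W == N
        feat = x.transpose(1, 2).reshape(B, C, H, W)
        # Convolutional projections inject locality from neighboring patches.
        q = self.q_proj(feat).flatten(2).transpose(1, 2)   # (B, N, C)
        k = self.k_proj(feat).flatten(2).transpose(1, 2)
        v = self.v_proj(feat).flatten(2).transpose(1, 2)
        # Split into heads: (B, heads, N, head_dim)
        q, k, v = (t.reshape(B, N, self.num_heads, self.head_dim).transpose(1, 2)
                   for t in (q, k, v))
        attn = (q @ k.transpose(-2, -1)) / self.head_dim ** 0.5
        out = attn.softmax(dim=-1) @ v                      # (B, heads, N, head_dim)
        out = out.transpose(1, 2).reshape(B, N, C)
        return self.out_proj(out)


# Example: a 14x14 grid of 768-dim patch tokens, as in a ViT-Base-style backbone.
tokens = torch.randn(2, 14 * 14, 768)
attn = DWSepConvAttention(dim=768, num_heads=8)
print(attn(tokens, grid_hw=(14, 14)).shape)     # torch.Size([2, 196, 768])
```

Because the depthwise and pointwise convolutions factorize a full convolution, such projections add a locality bias with far fewer parameters than dense layers of the same width, which is consistent with the parameter reduction reported in the abstract.
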
Files in This Item:
| File | Description | Size | Format | |
|---|---|---|---|---|
| FullText.pdf | Copyright © The Authors, published by EDP Sciences, 2025. Licence: Creative Commons. This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0 (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. | 323.91 kB | Adobe PDF | View/Open |
This item is licensed under a Creative Commons License