Please use this identifier to cite or link to this item:
http://bura.brunel.ac.uk/handle/2438/32701
Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Alkandary, K | - |
| dc.contributor.author | Yildiz, AS | - |
| dc.contributor.author | Meng, H | - |
| dc.date.accessioned | 2026-01-23T10:58:28Z | - |
| dc.date.available | 2026-01-23T10:58:28Z | - |
| dc.date.issued | 2026-01-07 | - |
| dc.identifier | ORCiD: Khadijah Alkandary https://orcid.org/0009-0000-0260-0817 | - |
| dc.identifier | ORCiD: Ahmet Serhat Yildiz https://orcid.org/0000-0002-2957-7394 | - |
| dc.identifier | ORCiD: Hongying Meng https://orcid.org/0000-0002-8836-1382 | - |
| dc.identifier | Article number: 265 | - |
| dc.identifier.citation | Alkandary, K., Yildiz, A.S. and Meng, H. (2026) 'Enhancing Multi Object Tracking with CLIP: A Comparative Study on DeepSORT and StrongSORT', Electronics, 15 (2), 265, pp. 1 - 18. doi: 10.3390/electronics15020265. | en_US |
| dc.identifier.uri | https://bura.brunel.ac.uk/handle/2438/32701 | - |
| dc.description | Data Availability Statement: The data presented in this study are openly available in https://motchallenge.net (accessed on 1 September 2025), and reference [14]. Wojke, N., Bewley, A. and Paulus, D. (2017) 'Simple Online and Realtime Tracking with a Deep Association Metric', Proceedings of the 2017 IEEE International Conference on Image Processing (ICIP), Beijing, China, 17-20 September, pp. 3645 - 3649. doi: 10.1109/ICIP.2017.8296962. | en_US |
| dc.description.abstract | Multi object tracking (MOT) is a crucial task in video analysis but is often hindered by frequent identity (ID) switches, particularly in crowded or occluded scenarios. This study explores the integration of the vision-language model CLIP into two tracking-by-detection frameworks, DeepSORT and StrongSORT, to enhance appearance-based re-identification. YOLOv8x is employed as the base detector due to its robust localization performance, while CLIP’s visual features replace the default appearance encoders, providing more discriminative and semantically rich embeddings. We evaluate the CLIP-enhanced DeepSORT and StrongSORT on sequences from two challenging real-world benchmarks: MOT15 and MOT16. Furthermore, we analyze the generalizability of YOLOv8x when trained on the MOT20 benchmark and applied to the chosen trackers on MOT15 and MOT16. Our findings show that both CLIP-enhanced trackers substantially reduce ID switches and improve ID-based tracking metrics, with CLIP StrongSORT achieving the most consistent gains. In addition, YOLOv8x demonstrates strong generalization capabilities for unseen datasets. These results highlight the effectiveness of incorporating vision-language models into MOT frameworks, particularly under visually challenging conditions. | en_US |
| dc.description.sponsorship | This research received no external funding. | en_US |
| dc.format.extent | 1 - 18 | - |
| dc.format.medium | Electronic | - |
| dc.language | English | - |
| dc.language.iso | en_US | en_US |
| dc.publisher | MDPI | en_US |
| dc.rights | Creative Commons Attribution 4.0 International | - |
| dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | - |
| dc.subject | YOLO | en_US |
| dc.subject | DeepSORT | en_US |
| dc.subject | StrongSORT | en_US |
| dc.subject | detection | en_US |
| dc.subject | tracking | en_US |
| dc.subject | autonomous driving | en_US |
| dc.subject | CLIP | en_US |
| dc.subject | vision-language models | en_US |
| dc.title | Enhancing Multi Object Tracking with CLIP: A Comparative Study on DeepSORT and StrongSORT | en_US |
| dc.type | Article | en_US |
| dc.date.dateAccepted | 2025-12-30 | - |
| dc.identifier.doi | https://doi.org/10.3390/electronics15020265 | - |
| dc.relation.isPartOf | Electronics | - |
| pubs.issue | 2 | - |
| pubs.publication-status | Published online | - |
| pubs.volume | 15 | - |
| dc.identifier.eissn | 2079-9292 | - |
| dc.rights.license | https://creativecommons.org/licenses/by/4.0/legalcode.en | - |
| dcterms.dateAccepted | 2025-12-30 | - |
| dc.rights.holder | The authors | - |
| dc.contributor.orcid | Alkandary, Khadijah [0009-0000-0260-0817] | - |
| dc.contributor.orcid | Yildiz, Ahmet Serhat [0000-0002-2957-7394] | - |
| dc.contributor.orcid | Meng, Hongying [0000-0002-8836-1382] | - |
| Appears in Collections: | Dept of Electronic and Electrical Engineering Research Papers | |
Files in This Item:
| File | Description | Size | Format | |
|---|---|---|---|---|
| FullText.pdf | Copyright © 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). | 4.56 MB | Adobe PDF | |
This item is licensed under a Creative Commons License