Please use this identifier to cite or link to this item:
http://bura.brunel.ac.uk/handle/2438/30907
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Shetty, N | - |
dc.contributor.author | Li, Y | - |
dc.date.accessioned | 2025-03-14T17:53:58Z | - |
dc.date.available | 2025-03-14T17:53:58Z | - |
dc.date.issued | 2024-11-28 | - |
dc.identifier | ORCiD: Yongmin Li https://orcid.org/0000-0003-1668-2440 | - |
dc.identifier | Article no. 444 | - |
dc.identifier.citation | Shetty, N. and Li, Y. (2024) 'Detailed Image Captioning and Hashtag Generation', Future Internet, 16 (12), 444, pp. 1 - 18. doi: 10.3390/fi16120444. | en_US |
dc.identifier.uri | https://bura.brunel.ac.uk/handle/2438/30907 | - |
dc.description | Data Availability Statement: The original contributions presented in the study are included in the article, further inquiries can be directed to the corresponding author. | en_US |
dc.description.abstract | This article presents CapFlow, an integrated approach to detailed image captioning and hashtag generation. Based on a thorough performance evaluation, the image captioning model utilizes a fine-tuned vision-language model with Low-Rank Adaptation (LoRA), while the hashtag generation employs the keyword extraction method. We evaluated the state-of-the-art image captioning models using both traditional metrics (BLEU, METEOR, ROUGE-L, and CIDEr) and the specialized CAPTURE metric for detailed captions. The hashtag generation models were assessed using precision, recall, and F1-score. The proposed method demonstrates competitive results against larger models while maintaining efficiency suitable for real-time applications. The image captioning model outperforms the base Florence-2 model and favorably compares with larger models. The KeyBERT implementation for hashtag generation surpasses other keyword extraction methods in both accuracy and speed. This work contributes to the field of AI-assisted content analysis and generation, offering insights into the practical implementation of advanced vision-language models for detailed image understanding and relevant tag generation. | en_US |
dc.description.sponsorship | This research received no external funding. | en_US |
dc.format.extent | 1 - 18 | - |
dc.format.medium | Electronic | - |
dc.language | English | - |
dc.language.iso | en_US | en_US |
dc.publisher | MDPI | en_US |
dc.rights | Attribution 4.0 International | - |
dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | - |
dc.subject | image captioning | en_US |
dc.subject | hashtag generation | en_US |
dc.subject | vision-language models | en_US |
dc.subject | AI-assisted content analysis | en_US |
dc.title | Detailed Image Captioning and Hashtag Generation | en_US |
dc.type | Article | en_US |
dc.identifier.doi | https://doi.org/10.3390/fi16120444 | - |
dc.relation.isPartOf | Future Internet | - |
pubs.issue | 12 | - |
pubs.publication-status | Published | - |
pubs.volume | 16 | - |
dc.identifier.eissn | 1999-5903 | - |
dc.rights.license | https://creativecommons.org/licenses/by/4.0/legalcode.en | - |
dcterms.dateAccepted | 2024-11-26 | - |
dc.rights.holder | The authors | - |
Appears in Collections: | Dept of Computer Science Research Papers |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
FullText.pdf | Copyright © 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). | 21.44 MB | Adobe PDF |
This item is licensed under a Creative Commons License