Detailed Image Captioning and Hashtag Generation

Shetty, N; Li, Y

Please use this identifier to cite or link to this item: http://bura.brunel.ac.uk/handle/2438/30907

Full metadata record

DC Field	Value	Language
dc.contributor.author	Shetty, N	-
dc.contributor.author	Li, Y	-
dc.date.accessioned	2025-03-14T17:53:58Z	-
dc.date.available	2025-03-14T17:53:58Z	-
dc.date.issued	2024-11-28	-
dc.identifier	ORCiD: Yongmin Li https://orcid.org/0000-0003-1668-2440	-
dc.identifier	Article no. 444	-
dc.identifier.citation	Shetty, N. and Li, Y. (2024) 'Detailed Image Captioning and Hashtag Generation', Future Internet, 16 (12), 444, pp. 1 - 18. doi: 10.3390/fi16120444.	en_US
dc.identifier.uri	https://bura.brunel.ac.uk/handle/2438/30907	-
dc.description	Data Availability Statement: The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.	en_US
dc.description.abstract	This article presents CapFlow, an integrated approach to detailed image captioning and hashtag generation. Based on a thorough performance evaluation, the image captioning model utilizes a vision-language model fine-tuned with Low-Rank Adaptation (LoRA), while the hashtag generation employs a keyword extraction method. We evaluated state-of-the-art image captioning models using both traditional metrics (BLEU, METEOR, ROUGE-L, and CIDEr) and the specialized CAPTURE metric for detailed captions. The hashtag generation models were assessed using precision, recall, and F1-score. The proposed method demonstrates competitive results against larger models while maintaining efficiency suitable for real-time applications. The image captioning model outperforms the base Florence-2 model and compares favorably with larger models. The KeyBERT implementation for hashtag generation surpasses other keyword extraction methods in both accuracy and speed. This work contributes to the field of AI-assisted content analysis and generation, offering insights into the practical implementation of advanced vision-language models for detailed image understanding and relevant tag generation.	en_US
dc.description.sponsorship	This research received no external funding.	en_US
dc.format.extent	1 - 18	-
dc.format.medium	Electronic	-
dc.language	English	-
dc.language.iso	en_US	en_US
dc.publisher	MDPI	en_US
dc.rights	Attribution 4.0 International	-
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/	-
dc.subject	image captioning	en_US
dc.subject	hashtag generation	en_US
dc.subject	vision-language models	en_US
dc.subject	AI-assisted content analysis	en_US
dc.title	Detailed Image Captioning and Hashtag Generation	en_US
dc.type	Article	en_US
dc.identifier.doi	https://doi.org/10.3390/fi16120444	-
dc.relation.isPartOf	Future Internet	-
pubs.issue	12	-
pubs.publication-status	Published	-
pubs.volume	16	-
dc.identifier.eissn	1999-5903	-
dc.rights.license	https://creativecommons.org/licenses/by/4.0/legalcode.en	-
dcterms.dateAccepted	2024-11-26	-
dc.rights.holder	The authors	-
Appears in Collections:	Department of Computer Science Research Papers
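
The abstract states that hashtag generation was assessed with precision, recall, and F1-score. As a rough illustration only, the sketch below computes these three scores for one image by comparing a predicted hashtag list against a reference list. The function names `normalise` and `hashtag_prf`, the example tags, and the choice of exact matching after lowercasing and stripping the leading `#` are all assumptions made for this example; the record does not describe the matching protocol actually used in the article.

```python
def normalise(tags):
    """Lowercase tags and strip whitespace and leading '#'; drop empties and duplicates.

    This normalisation is an assumption for illustration, not the article's protocol.
    """
    cleaned = (t.strip().lstrip("#").lower() for t in tags)
    return {t for t in cleaned if t}


def hashtag_prf(predicted, reference):
    """Return (precision, recall, f1) for two hashtag lists using exact set matching."""
    pred, ref = normalise(predicted), normalise(reference)
    true_positives = len(pred & ref)  # tags present in both sets
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1


# Hypothetical example: two of three predicted tags appear among four reference tags.
precision, recall, f1 = hashtag_prf(
    ["#Beach", "#sunset", "#dog"],
    ["#sunset", "#beach", "#ocean", "#waves"],
)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

In practice such per-image scores would typically be averaged over a test set; whether the article used per-image or pooled averaging is not stated in this record.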

Files in This Item:

File	Description	Size	Format
FullText.pdf	Copyright © 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).	21.44 MB	Adobe PDF

This item is licensed under a Creative Commons License