A review of faithfulness metrics for hallucination assessment in Large Language Models

Malin, B; Kalganova, T; Boulgouris, N

Please use this identifier to cite or link to this item: http://bura.brunel.ac.uk/handle/2438/31364

Full metadata record

DC Field	Value	Language
dc.contributor.author	Malin, B	-
dc.contributor.author	Kalganova, T	-
dc.contributor.author	Boulgouris, N	-
dc.date.accessioned	2025-05-31T16:46:50Z	-
dc.date.available	2025-05-31T16:46:50Z	-
dc.date.issued	2025-06-12	-
dc.identifier	ORCiD: Tatiana Kalganova https://orcid.org/0000-0003-4859-7152	-
dc.identifier	ORCiD: Nikolaos Boulgouris https://orcid.org/0000-0002-5382-6856	-
dc.identifier.citation	Malin, B., Kalganova, T. and Boulgouris, N. (2025) 'A review of faithfulness metrics for hallucination assessment in Large Language Models', IEEE Journal of Selected Topics in Signal Processing, 0 (Early Access, Special Issue), pp. 1 - 13. doi: 10.1109/JSTSP.2025.3579203.	en_US
dc.identifier.issn	1932-4553	-
dc.identifier.uri	https://bura.brunel.ac.uk/handle/2438/31364	-
dc.description	This article has been accepted for publication in IEEE Journal of Selected Topics in Signal Processing. This is the author's version which has not been fully edited and content may change prior to final publication. Citation information: DOI 10.1109/JSTSP.2025.3579203.	en_US
dc.description.abstract	This review examines the means with which faithfulness has been evaluated across open-ended summarization, question answering and machine translation tasks. We find that the use of Large Language Models (LLMs) as a faithfulness evaluator is commonly the metric that is most highly correlated with human judgement. The means with which other studies have mitigated hallucinations is discussed, with both retrieval augmented generation (RAG) and prompting framework approaches having been linked with superior faithfulness, whilst other recommendations for mitigation are provided. Research into faithfulness is integral to the continued widespread use of LLMs, as unfaithful responses can pose major risks to many areas whereby LLMs would otherwise be suitable. Furthermore, evaluating open-ended generation provides a more comprehensive measure of LLM performance than commonly used multiple-choice benchmarking, which can help in advancing the trust that can be placed within LLMs.	en_US
dc.description.sponsorship	10.13039/501100000780-European Commission. This work has been funded by the European Union.	en_US
dc.format.extent	1 - 13	-
dc.format.medium	Print-Electronic	-
dc.language.iso	en_US	en_US
dc.publisher	Institute of Electrical and Electronics Engineers (IEEE)	en_US
dc.rights	Copyright © 2025 Institute of Electrical and Electronics Engineers (IEEE). Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works (https://journals.ieeeauthorcenter.ieee.org/become-an-ieee-journal-author/publishing-ethics/guidelines-and-policies/post-publication-policies/).	-
dc.rights.uri	https://journals.ieeeauthorcenter.ieee.org/become-an-ieee-journal-author/publishing-ethics/guidelines-and-policies/post-publication-policies/	-
dc.subject	evaluation	en_US
dc.subject	fact extraction	en_US
dc.subject	faithfulness	en_US
dc.subject	hallucination	en_US
dc.subject	LLM	en_US
dc.subject	machine translation	en_US
dc.subject	question answering	en_US
dc.subject	RAG	en_US
dc.subject	summarization	en_US
dc.title	A review of faithfulness metrics for hallucination assessment in Large Language Models	en_US
dc.type	Article	en_US
dc.date.dateAccepted	2025-05-30	-
dc.identifier.doi	https://doi.org/10.1109/JSTSP.2025.3579203	-
dc.relation.isPartOf	IEEE Journal of Selected Topics in Signal Processing	-
pubs.issue	SPECIAL ISSUE	-
pubs.issue	00	-
pubs.publication-status	Published online	-
pubs.volume	0	-
dc.identifier.eissn	1941-0484	-
dcterms.dateAccepted	2025-05-30	-
dc.rights.holder	Institute of Electrical and Electronics Engineers (IEEE)	-
Appears in Collections:	Dept of Electronic and Electrical Engineering Research Papers

Files in This Item:

File	Description	Size	Format
FullText.pdf	Copyright © 2025 Institute of Electrical and Electronics Engineers (IEEE). Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works (https://journals.ieeeauthorcenter.ieee.org/become-an-ieee-journal-author/publishing-ethics/guidelines-and-policies/post-publication-policies/).	1.22 MB	Adobe PDF
