Please use this identifier to cite or link to this item: http://bura.brunel.ac.uk/handle/2438/32918
Full metadata record
DC Field | Value | Language
dc.contributor.author | Jiang, F | -
dc.contributor.author | Zhu, W | -
dc.contributor.author | Dong, L | -
dc.contributor.author | Wang, K | -
dc.contributor.author | Yang, K | -
dc.contributor.author | Pan, C | -
dc.contributor.author | Dobre, OA | -
dc.date.accessioned | 2026-03-02T10:11:14Z | -
dc.date.available | 2026-03-02T10:11:14Z | -
dc.date.issued | 2026-01-29 | -
dc.identifier | ORCiD: Kezhi Wang https://orcid.org/0000-0001-8602-0800 | -
dc.identifier.citation | Jiang, F. et al. (2026) 'CommGPT: A Graph and Retrieval-Augmented Multimodal Communication Foundation Model', IEEE Communications Magazine, 0 (early access), pp. 1–7. doi: 10.1109/mcom.001.2500111. | en-US
dc.identifier.issn | 0163-6804 | -
dc.identifier.uri | https://bura.brunel.ac.uk/handle/2438/32918 | -
dc.description | The preprint version of the magazine article is archived in this institutional repository. It is also available at arXiv:2502.18763v1 [cs.IT] (https://arxiv.org/abs/2502.18763 -- [v1] Wed, 26 Feb 2025 02:44:21 UTC (1,089 KB)). It has not been certified by peer review. | en-US
dc.description.abstract | Large Language Models (LLMs) exhibit advanced cognitive and decision-making capabilities, positioning them as a pivotal technology for 6G networks. However, applying LLMs to the communication domain faces three major challenges: 1) inadequate communication data; 2) restricted input modalities; and 3) difficulty in knowledge retrieval. To overcome these issues, we propose CommGPT, a multimodal foundation model designed specifically for communications. First, we create high-quality pretraining and fine-tuning datasets tailored to communication, enabling the LLM to be further pretrained and fine-tuned on communication concepts and knowledge. Then, we design a multimodal encoder to understand and process information from various input modalities. Next, we construct a Graph and Retrieval-Augmented Generation (GRG) framework, efficiently coupling a Knowledge Graph (KG) with Retrieval-Augmented Generation (RAG) for multi-scale learning. Finally, we demonstrate the feasibility and effectiveness of CommGPT through experimental validation. | en-US
dc.format.extent | 1–7 | -
dc.format.medium | Print-Electronic | -
dc.language.iso | en-US | en-US
dc.publisher | Institute of Electrical and Electronics Engineers (IEEE) | en-US
dc.rights | arXiv.org - Non-exclusive license to distribute - The URI https://arxiv.org/licenses/nonexclusive-distrib/1.0/ is used to record the fact that the submitter granted the following license to arXiv.org on submission of an article: • I grant arXiv.org a perpetual, non-exclusive license to distribute this article. • I certify that I have the right to grant this license. • I understand that submissions cannot be completely removed once accepted. • I understand that arXiv.org reserves the right to reclassify or reject any submission. | -
dc.rights.uri | https://arxiv.org/licenses/nonexclusive-distrib/1.0/ | -
dc.title | CommGPT: A Graph and Retrieval-Augmented Multimodal Communication Foundation Model | en-US
dc.type | Article | en-US
dc.identifier.doi | https://doi.org/10.1109/mcom.001.2500111 | -
dc.relation.isPartOf | IEEE Communications Magazine | -
pubs.publication-status | Published | -
dc.identifier.eissn | 1558-1896 | -
dc.rights.holder | The Author(s) | -
dc.contributor.orcid | Wang, Kezhi [0000-0001-8602-0800] | -
Appears in Collections: Department of Computer Science Research Papers

Files in This Item:
File | Description | Size | Format
Preprint.pdf | arXiv.org - Non-exclusive license to distribute (https://arxiv.org/licenses/nonexclusive-distrib/1.0/). | 1.21 MB | Adobe PDF


Items in BURA are protected by copyright, with all rights reserved, unless otherwise indicated.