Please use this identifier to cite or link to this item:
http://bura.brunel.ac.uk/handle/2438/27173
Title: | Large AI Model Empowered Multimodal Semantic Communications |
Authors: | Jiang, F; Peng, Y; Dong, L; Wang, K; Yang, K; Pan, C; You, X |
Keywords: | semantic communication;large AI models;LLM;MLM;knowledgebase;artificial intelligence (cs.AI);computation and language (cs.CL);machine learning (cs.LG) |
Issue Date: | 9-Sep-2024 |
Publisher: | Institute of Electrical and Electronics Engineers (IEEE) |
Citation: | Jiang, F. et al. (2024) 'Large AI Model Empowered Multimodal Semantic Communications', IEEE Communications Magazine, 63 (1), pp. 76 - 82. doi: 10.1109/MCOM.001.2300575. |
Abstract: | Multimodal signals, including text, audio, image, and video, can be integrated into semantic communication (SC) systems to provide an immersive experience with low latency and high quality at the semantic level. However, multimodal SC faces several challenges, including data heterogeneity, semantic ambiguity, and signal distortion during transmission. Recent advancements in large AI models, particularly in the multimodal language model (MLM) and large language model (LLM), offer potential solutions for addressing these issues. To this end, we propose a large AI model-based multimodal SC (LAM-MSC) framework, where we first present the MLM-based multimodal alignment (MMA) that utilizes the MLM to enable the transformation between multimodal and unimodal data while preserving semantic consistency. Then, a personalized LLM-based knowledge base (LKB) is proposed, which allows users to perform personalized semantic extraction or recovery through the LLM. This effectively addresses semantic ambiguity. Next, we apply conditional generative adversarial network-based channel estimation (CGE) to estimate the wireless channel state information. This approach effectively mitigates the impact of fading channels in SC. Finally, we conduct simulations that demonstrate the superior performance of the LAM-MSC framework. |
URI: | https://bura.brunel.ac.uk/handle/2438/27173 |
DOI: | https://doi.org/10.1109/MCOM.001.2300575 |
ISSN: | 0163-6804 |
Other Identifiers: | ORCiD: Kezhi Wang https://orcid.org/0000-0001-8602-0800; arXiv:2309.01249v2 [cs.AI] |
Appears in Collections: | Dept of Computer Science Research Papers |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
FullText.pdf | The article version on this institutional repository is available at arXiv:2309.01249v2 [cs.AI], https://arxiv.org/abs/2309.01249. Comments: Accepted by IEEE CM. [v2] Sun, 4 Aug 2024 12:34:29 UTC (1,779 KB). Copyright © 2024 The Author(s). arXiv.org perpetual, non-exclusive license 1.0 (https://arxiv.org/licenses/nonexclusive-distrib/1.0/). This license gives limited rights to arXiv to distribute the article, and also limits re-use of any type from other entities or individuals. | 1.84 MB | Adobe PDF | View/Open |
Items in BURA are protected by copyright, with all rights reserved, unless otherwise indicated.