Please use this identifier to cite or link to this item:
http://bura.brunel.ac.uk/handle/2438/27173
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Jiang, F | - |
dc.contributor.author | Peng, Y | - |
dc.contributor.author | Dong, L | - |
dc.contributor.author | Wang, K | - |
dc.contributor.author | Yang, K | - |
dc.contributor.author | Pan, C | - |
dc.contributor.author | You, X | - |
dc.date.accessioned | 2023-09-13T09:34:18Z | - |
dc.date.available | 2023-09-13T09:34:18Z | - |
dc.date.issued | 2023-09-03 | - |
dc.identifier | ORCID iD: Kezhi Wang https://orcid.org/0000-0001-8602-0800 | - |
dc.identifier | arXiv:2309.01249v1 [cs.AI] | - |
dc.identifier.citation | Jiang, F. et al. (2023) 'Large AI Model Empowered Multimodal Semantic Communications', arXiv:2309.01249v1 [cs.AI], pp. 1 - 8. doi: 10.48550/arXiv.2309.01249. | en_US |
dc.identifier.uri | https://bura.brunel.ac.uk/handle/2438/27173 | - |
dc.description | The file on this repository is an arXiv preprint. It has not been certified by peer review. It may be submitted to a journal for publication and replaced by the authors' accepted manuscript in due course. | - |
dc.description.abstract | Multimodal signals, including text, audio, image and video, can be integrated into Semantic Communication (SC) to provide an immersive experience with low latency and high quality at the semantic level. However, multimodal SC faces several challenges, including data heterogeneity, semantic ambiguity, and signal fading. Recent advancements in large AI models, particularly in Multimodal Language Model (MLM) and Large Language Model (LLM), offer potential solutions for these issues. To this end, we propose a Large AI Model-based Multimodal SC (LAM-MSC) framework, in which we first present the MLM-based Multimodal Alignment (MMA) that utilizes the MLM to enable the transformation between multimodal and unimodal data while preserving semantic consistency. Then, a personalized LLM-based Knowledge Base (LKB) is proposed, which allows users to perform personalized semantic extraction or recovery through the LLM. This effectively addresses the semantic ambiguity. Next, we apply the Conditional Generative adversarial networks-based channel Estimation (CGE) to obtain Channel State Information (CSI). This approach effectively mitigates the impact of fading channels in SC. Finally, we conduct simulations that demonstrate the superior performance of the LAM-MSC framework. | en_US |
dc.format.extent | 1 - 8 | - |
dc.format.medium | Electronic | - |
dc.language.iso | en_US | en_US |
dc.publisher | Cornell University | en_US |
dc.relation.uri | https://arxiv.org/abs/2309.01249v1 | - |
dc.subject | semantic communication | en_US |
dc.subject | large AI models | en_US |
dc.subject | LLM | en_US |
dc.subject | MLM | en_US |
dc.subject | knowledgebase | - |
dc.title | Large AI Model Empowered Multimodal Semantic Communications | en_US |
dc.type | Article | en_US |
dc.identifier.doi | https://doi.org/10.48550/arXiv.2309.01249 | - |
pubs.notes | To be submitted for journal publication | - |
dc.identifier.eissn | 2331-8422 | - |
Appears in Collections: | Dept of Computer Science Research Papers |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Preprint.pdf | The file on this repository is an arXiv preprint. It has not been certified by peer review. It may be submitted to a journal for publication and replaced by the authors' accepted manuscript in due course. | 4.09 MB | Adobe PDF | View/Open |
Items in BURA are protected by copyright, with all rights reserved, unless otherwise indicated.