Conference Paper: Addressing Out-of-Distribution Challenges in Image Semantic Communication Systems with Multi-modal Large Language Models
| Title | Addressing Out-of-Distribution Challenges in Image Semantic Communication Systems with Multi-modal Large Language Models |
|---|---|
| Authors | Zhang, Feifan; Du, Yuyang; Chen, Kexin; Shao, Yulin; Liew, Soung Chang |
| Keywords | generative AIs; multi-modal foundation model; out-of-distribution problem; Semantic communication |
| Issue Date | 2024 |
| Citation | Proceedings of the International Symposium on Modeling and Optimization in Mobile Ad Hoc and Wireless Networks Wiopt, 2024, p. 7-14 |
| Abstract | Semantic communication is a promising technology for next-generation wireless networks. However, the out-of-distribution (OOD) problem, where a pre-trained machine learning (ML) model is applied to unseen tasks that are outside the distribution of its training data, may compromise the integrity of semantic compression. This paper explores the use of multi-modal large language models (MLLMs) to address the OOD issue in image semantic communication. We propose a novel 'Plan A - Plan B' framework that leverages the broad knowledge and strong generalization ability of an MLLM to assist a conventional ML model when the latter encounters an OOD input in the semantic encoding process. The novel framework integrates the anti-OOD ability of MLLMs with the domain expertise of ML models in tasks they have been trained for, thus enhancing the accuracy in the semantic encoding process. Further, at the receiver side of the communication system, we put forth a 'generate-criticize' framework that allows one MLLM to challenge the image generated by another MLLM, which then revises the generated image in the next iteration. The joint effort of the two MLLMs significantly enhances the reliability of image reconstruction. |
| Persistent Identifier | http://hdl.handle.net/10722/362952 |
| ISSN | 2690-3334 |

| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Zhang, Feifan | - |
| dc.contributor.author | Du, Yuyang | - |
| dc.contributor.author | Chen, Kexin | - |
| dc.contributor.author | Shao, Yulin | - |
| dc.contributor.author | Liew, Soung Chang | - |
| dc.date.accessioned | 2025-10-10T07:43:38Z | - |
| dc.date.available | 2025-10-10T07:43:38Z | - |
| dc.date.issued | 2024 | - |
| dc.identifier.citation | Proceedings of the International Symposium on Modeling and Optimization in Mobile Ad Hoc and Wireless Networks Wiopt, 2024, p. 7-14 | - |
| dc.identifier.issn | 2690-3334 | - |
| dc.identifier.uri | http://hdl.handle.net/10722/362952 | - |
| dc.description.abstract | Semantic communication is a promising technology for next-generation wireless networks. However, the out-of-distribution (OOD) problem, where a pre-trained machine learning (ML) model is applied to unseen tasks that are outside the distribution of its training data, may compromise the integrity of semantic compression. This paper explores the use of multi-modal large language models (MLLMs) to address the OOD issue in image semantic communication. We propose a novel 'Plan A - Plan B' framework that leverages the broad knowledge and strong generalization ability of an MLLM to assist a conventional ML model when the latter encounters an OOD input in the semantic encoding process. The novel framework integrates the anti-OOD ability of MLLMs with the domain expertise of ML models in tasks they have been trained for, thus enhancing the accuracy in the semantic encoding process. Further, at the receiver side of the communication system, we put forth a 'generate-criticize' framework that allows one MLLM to challenge the image generated by another MLLM, which then revises the generated image in the next iteration. The joint effort of the two MLLMs significantly enhances the reliability of image reconstruction. | - |
| dc.language | eng | - |
| dc.relation.ispartof | Proceedings of the International Symposium on Modeling and Optimization in Mobile Ad Hoc and Wireless Networks Wiopt | - |
| dc.subject | generative AIs | - |
| dc.subject | multi-modal foundation model | - |
| dc.subject | out-of-distribution problem | - |
| dc.subject | Semantic communication | - |
| dc.title | Addressing Out-of-Distribution Challenges in Image Semantic Communication Systems with Multi-modal Large Language Models | - |
| dc.type | Conference_Paper | - |
| dc.description.nature | link_to_subscribed_fulltext | - |
| dc.identifier.scopus | eid_2-s2.0-85215525063 | - |
| dc.identifier.spage | 7 | - |
| dc.identifier.epage | 14 | - |
| dc.identifier.eissn | 2690-3342 | - |

