Conference Paper: Addressing Out-of-Distribution Challenges in Image Semantic Communication Systems with Multi-modal Large Language Models

Title: Addressing Out-of-Distribution Challenges in Image Semantic Communication Systems with Multi-modal Large Language Models
Authors: Zhang, Feifan; Du, Yuyang; Chen, Kexin; Shao, Yulin; Liew, Soung Chang
Keywords: generative AIs; multi-modal foundation model; out-of-distribution problem; Semantic communication
Issue Date: 2024
Citation: Proceedings of the International Symposium on Modeling and Optimization in Mobile Ad Hoc and Wireless Networks (WiOpt), 2024, p. 7-14
Abstract: Semantic communication is a promising technology for next-generation wireless networks. However, the out-of-distribution (OOD) problem, where a pre-trained machine learning (ML) model is applied to unseen tasks that are outside the distribution of its training data, may compromise the integrity of semantic compression. This paper explores the use of multi-modal large language models (MLLMs) to address the OOD issue in image semantic communication. We propose a novel 'Plan A - Plan B' framework that leverages the broad knowledge and strong generalization ability of an MLLM to assist a conventional ML model when the latter encounters an OOD input in the semantic encoding process. The novel framework integrates the anti-OOD ability of MLLMs with the domain expertise of ML models in tasks they have been trained for, thus enhancing the accuracy in the semantic encoding process. Further, at the receiver side of the communication system, we put forth a 'generate-criticize' framework that allows one MLLM to challenge the image generated by another MLLM, which then revises the generated image in the next iteration. The joint effort of the two MLLMs significantly enhances the reliability of image reconstruction.
Persistent Identifier: http://hdl.handle.net/10722/362952
ISSN: 2690-3334

 

DC Field | Value | Language
dc.contributor.author | Zhang, Feifan | -
dc.contributor.author | Du, Yuyang | -
dc.contributor.author | Chen, Kexin | -
dc.contributor.author | Shao, Yulin | -
dc.contributor.author | Liew, Soung Chang | -
dc.date.accessioned | 2025-10-10T07:43:38Z | -
dc.date.available | 2025-10-10T07:43:38Z | -
dc.date.issued | 2024 | -
dc.identifier.citation | Proceedings of the International Symposium on Modeling and Optimization in Mobile Ad Hoc and Wireless Networks (WiOpt), 2024, p. 7-14 | -
dc.identifier.issn | 2690-3334 | -
dc.identifier.uri | http://hdl.handle.net/10722/362952 | -
dc.description.abstract | Semantic communication is a promising technology for next-generation wireless networks. However, the out-of-distribution (OOD) problem, where a pre-trained machine learning (ML) model is applied to unseen tasks that are outside the distribution of its training data, may compromise the integrity of semantic compression. This paper explores the use of multi-modal large language models (MLLMs) to address the OOD issue in image semantic communication. We propose a novel 'Plan A - Plan B' framework that leverages the broad knowledge and strong generalization ability of an MLLM to assist a conventional ML model when the latter encounters an OOD input in the semantic encoding process. The novel framework integrates the anti-OOD ability of MLLMs with the domain expertise of ML models in tasks they have been trained for, thus enhancing the accuracy in the semantic encoding process. Further, at the receiver side of the communication system, we put forth a 'generate-criticize' framework that allows one MLLM to challenge the image generated by another MLLM, which then revises the generated image in the next iteration. The joint effort of the two MLLMs significantly enhances the reliability of image reconstruction. | -
dc.language | eng | -
dc.relation.ispartof | Proceedings of the International Symposium on Modeling and Optimization in Mobile Ad Hoc and Wireless Networks (WiOpt) | -
dc.subject | generative AIs | -
dc.subject | multi-modal foundation model | -
dc.subject | out-of-distribution problem | -
dc.subject | Semantic communication | -
dc.title | Addressing Out-of-Distribution Challenges in Image Semantic Communication Systems with Multi-modal Large Language Models | -
dc.type | Conference_Paper | -
dc.description.nature | link_to_subscribed_fulltext | -
dc.identifier.scopus | eid_2-s2.0-85215525063 | -
dc.identifier.spage | 7 | -
dc.identifier.epage | 14 | -
dc.identifier.eissn | 2690-3342 | -
