Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1109/TCOMM.2024.3471992
- Scopus: eid_2-s2.0-105003047688

Article: Semantic-Topology Preserving Quantization of Word Embeddings for Human-to-Machine Communications
| Title | Semantic-Topology Preserving Quantization of Word Embeddings for Human-to-Machine Communications |
|---|---|
| Authors | Lin, Zhenyi; Yang, Lin; Gong, Yi; Huang, Kaibin |
| Keywords | human-robot interaction; Semantics; vector quantization (VQ) |
| Issue Date | 1-Jan-2025 |
| Publisher | Institute of Electrical and Electronics Engineers |
| Citation | IEEE Transactions on Communications, 2025, v. 73, n. 4, p. 2401-2415 |
| Abstract | The vision of 6G mobile networks aims to connect intelligent machines to humans to provide the latter with cooperation, care, and assistance. The mainstream approach for human-to-machine (H2M) semantic communication is to map words into (word) embedding vectors, which are clustered according to their semantic similarity to facilitate machines’ interpretation of human languages. The computation-intensive tasks of text-to-embedding mapping are usually delegated to an edge server that senses human commands, maps them into embedding vectors, and then transmits the vectors to a machine over a wireless link. In this work, we propose a quantization framework customized for embedding vectors, called semantic-topology preserving VQ (SemTop-VQ), to overcome the communication bottleneck due to the vectors’ high dimensionality. While traditional VQ focuses on minimizing the distortion of individual vectors, SemTop-VQ aims to minimize the distortion of the topology of the embedding matrix, which refers to the vectors’ relative positions that represent semantics. To this end, we adopt a topology-distortion metric, termed pointwise-inner-product (PIP) loss, and a hierarchical VQ architecture targeting high-dimensional VQ. In this architecture, an embedding vector is decomposed into blocks; the norm and shape (normalized vector) are quantized separately using a scalar quantizer and a Grassmannian quantizer, respectively. The main feature of SemTop-VQ lies in deriving from the PIP loss a set of so-called semantic-importance indicators, which reflect the level of influence of individual blocks’ quantization errors on the topology distortion. The indicators are then applied to optimize quantization-bit allocation for the decomposed vector blocks under the criterion of PIP-loss minimization. In practice, the usage probabilities of embedding vectors for a specific machine task are highly skewed and the task is time-varying. We exploit this fact to further develop SemTop-VQ to feature task adaptation that can attain a higher communication efficiency. The task-adaptive VQ is realized via the use of a frequently used (quantization) codebook that is much smaller in size than the original codebook and is continuously updated via estimation of the embedding-usage distribution. Our experiments using real embedding datasets, namely Word2Vec and GloVe, demonstrate the effectiveness of SemTop-VQ as a goal-oriented technique for efficient H2M communications. |
| Persistent Identifier | http://hdl.handle.net/10722/362127 |
| ISSN | 0090-6778 (2023 Impact Factor: 7.2; 2020 SCImago Journal Rankings: 1.468) |
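
The abstract contrasts per-vector distortion with distortion of the embedding topology, measured by the PIP loss. The sketch below illustrates that contrast under one common definition of the PIP loss: the Frobenius norm of the difference between the two matrices of pairwise inner products. The normalization used in the article may differ, and the function names and toy values are assumptions made for illustration only.

```python
import math

def pip_matrix(E):
    """Return E E^T: every pairwise inner product between the rows of E."""
    return [[sum(a * b for a, b in zip(u, v)) for v in E] for u in E]

def pip_loss(E, E_hat):
    """Frobenius norm of the difference between the two PIP matrices (assumed definition)."""
    P, Q = pip_matrix(E), pip_matrix(E_hat)
    return math.sqrt(sum((p - q) ** 2
                         for row_p, row_q in zip(P, Q)
                         for p, q in zip(row_p, row_q)))

def mean_squared_error(E, E_hat):
    """Conventional per-vector distortion, for comparison."""
    return sum((a - b) ** 2
               for u, v in zip(E, E_hat)
               for a, b in zip(u, v)) / len(E)

# Toy embedding matrix: three "words" in two dimensions (made-up values).
E = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# A 90-degree rotation moves every vector but keeps all relative positions.
rotated = [[-y, x] for x, y in E]

# Changing a single vector alters its inner products with the others.
distorted = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]

print("rotated:   MSE =", mean_squared_error(E, rotated), " PIP loss =", pip_loss(E, rotated))
print("distorted: MSE =", mean_squared_error(E, distorted), " PIP loss =", pip_loss(E, distorted))
```

In this toy case the rotation has a large per-vector error but zero PIP loss, while the single-vector change has a smaller per-vector error but a nonzero PIP loss, which is the distinction the abstract draws between traditional VQ and topology-preserving VQ.
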

| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Lin, Zhenyi | - |
| dc.contributor.author | Yang, Lin | - |
| dc.contributor.author | Gong, Yi | - |
| dc.contributor.author | Huang, Kaibin | - |
| dc.date.accessioned | 2025-09-19T00:32:26Z | - |
| dc.date.available | 2025-09-19T00:32:26Z | - |
| dc.date.issued | 2025-01-01 | - |
| dc.identifier.citation | IEEE Transactions on Communications, 2025, v. 73, n. 4, p. 2401-2415 | - |
| dc.identifier.issn | 0090-6778 | - |
| dc.identifier.uri | http://hdl.handle.net/10722/362127 | - |
| dc.description.abstract | The vision of 6G mobile networks aims to connect intelligent machines to humans to provide the latter with cooperation, care, and assistance. The mainstream approach for human-to-machine (H2M) semantic communication is to map words into (word) embedding vectors, which are clustered according to their semantic similarity to facilitate machines’ interpretation of human languages. The computation-intensive tasks of text-to-embedding mapping are usually delegated to an edge server that senses human commands, maps them into embedding vectors, and then transmits the vectors to a machine over a wireless link. In this work, we propose a quantization framework customized for embedding vectors, called semantic-topology preserving VQ (SemTop-VQ), to overcome the communication bottleneck due to the vectors’ high dimensionality. While traditional VQ focuses on minimizing the distortion of individual vectors, SemTop-VQ aims to minimize the distortion of the topology of the embedding matrix, which refers to the vectors’ relative positions that represent semantics. To this end, we adopt a topology-distortion metric, termed pointwise-inner-product (PIP) loss, and a hierarchical VQ architecture targeting high-dimensional VQ. In this architecture, an embedding vector is decomposed into blocks; the norm and shape (normalized vector) are quantized separately using a scalar quantizer and a Grassmannian quantizer, respectively. The main feature of SemTop-VQ lies in deriving from the PIP loss a set of so-called semantic-importance indicators, which reflect the level of influence of individual blocks’ quantization errors on the topology distortion. The indicators are then applied to optimize quantization-bit allocation for the decomposed vector blocks under the criterion of PIP-loss minimization. In practice, the usage probabilities of embedding vectors for a specific machine task are highly skewed and the task is time-varying. We exploit this fact to further develop SemTop-VQ to feature task adaptation that can attain a higher communication efficiency. The task-adaptive VQ is realized via the use of a frequently used (quantization) codebook that is much smaller in size than the original codebook and is continuously updated via estimation of the embedding-usage distribution. Our experiments using real embedding datasets, namely Word2Vec and GloVe, demonstrate the effectiveness of SemTop-VQ as a goal-oriented technique for efficient H2M communications. | - |
| dc.language | eng | - |
| dc.publisher | Institute of Electrical and Electronics Engineers | - |
| dc.relation.ispartof | IEEE Transactions on Communications | - |
| dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
| dc.subject | human-robot interaction | - |
| dc.subject | Semantics | - |
| dc.subject | vector quantization (VQ) | - |
| dc.title | Semantic-Topology Preserving Quantization of Word Embeddings for Human-to-Machine Communications | - |
| dc.type | Article | - |
| dc.identifier.doi | 10.1109/TCOMM.2024.3471992 | - |
| dc.identifier.scopus | eid_2-s2.0-105003047688 | - |
| dc.identifier.volume | 73 | - |
| dc.identifier.issue | 4 | - |
| dc.identifier.spage | 2401 | - |
| dc.identifier.epage | 2415 | - |
| dc.identifier.eissn | 1558-0857 | - |
| dc.identifier.issnl | 0090-6778 | - |
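
The abstract describes decomposing an embedding vector into blocks and quantizing norm and shape separately. The sketch below shows that general norm-shape idea only. It assumes the split is applied per block, uses a uniform scalar quantizer for the norm, and uses a small codebook of evenly spaced 2-D unit vectors as a stand-in for a Grassmannian codebook. All function names, parameters, and values are hypothetical, and the article's bit allocation based on semantic-importance indicators is not modeled.

```python
import math

def split_blocks(vec, block_len):
    """Cut a vector into consecutive blocks of length block_len."""
    return [vec[i:i + block_len] for i in range(0, len(vec), block_len)]

def quantize_norm(norm, max_norm, bits):
    """Uniform scalar quantizer on [0, max_norm] with 2**bits levels (bits >= 1)."""
    levels = 2 ** bits
    step = max_norm / (levels - 1)
    index = min(levels - 1, max(0, round(norm / step)))
    return index * step

def quantize_shape(shape, codebook):
    """Return the unit-norm codeword with the largest inner product with the shape."""
    return max(codebook, key=lambda c: sum(a * b for a, b in zip(c, shape)))

def quantize_vector(vec, block_len, codebook, max_norm, norm_bits):
    """Quantize each block as (quantized norm) * (quantized shape)."""
    out = []
    for block in split_blocks(vec, block_len):
        norm = math.sqrt(sum(x * x for x in block))
        if norm == 0.0:
            out.extend([0.0] * len(block))  # an all-zero block has no shape
            continue
        shape = [x / norm for x in block]
        q_norm = quantize_norm(norm, max_norm, norm_bits)
        q_shape = quantize_shape(shape, codebook)
        out.extend(q_norm * s for s in q_shape)
    return out

# Stand-in shape codebook: 8 unit vectors at 45-degree spacing (3 bits per block).
codebook = [[math.cos(2 * math.pi * k / 8), math.sin(2 * math.pi * k / 8)]
            for k in range(8)]

vec = [3.0, 0.1, -0.2, -2.0]  # made-up 4-dimensional "embedding"
quantized = quantize_vector(vec, block_len=2, codebook=codebook,
                            max_norm=4.0, norm_bits=2)
print(quantized)
```

Here every block costs 2 bits for the norm and 3 bits for the shape. A scheme like the one in the abstract would instead assign different bit budgets to different blocks according to their influence on the PIP loss.
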
