File Download
There are no files associated with this item.
Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1145/3459637.3482272
- Scopus: eid_2-s2.0-85119211996
- WOS: WOS:001054156200019
Supplementary
Conference Paper: LiteGT: Efficient and Lightweight Graph Transformers
Title | LiteGT: Efficient and Lightweight Graph Transformers |
---|---|
Authors | Chen, C; Tao, C; Wong, N |
Issue Date | 2021 |
Publisher | Association for Computing Machinery. |
Citation | Proceedings of the 30th ACM International Conference on Information and Knowledge Management (CIKM2021), Online Meeting, Gold Coast, Queensland, Australia, 1-5 November 2021, p. 161-170 |
Abstract | Transformers have shown great potential for modeling long-term dependencies for natural language processing and computer vision. However, little study has applied transformers to graphs, which is challenging due to the poor scalability of the attention mechanism and the under-exploration of graph inductive bias. To bridge this gap, we propose a Lite Graph Transformer (LiteGT) that learns on arbitrary graphs efficiently. First, a node sampling strategy is proposed to sparsify the considered nodes in self-attention with only $\mathcal{O}(N\log N)$ time. Second, we devise two kernelization approaches to form two-branch attention blocks, which not only leverage graph-specific topology information, but also reduce computation further to $\mathcal{O}(\frac{1}{2}N\log N)$. Third, the nodes are updated with different attention schemes during training, thus largely mitigating over-smoothing problems when the model layers deepen. Extensive experiments demonstrate that LiteGT achieves competitive performance on both \textit{node classification} and \textit{link prediction} on datasets with millions of nodes. Specifically, the \textit{Jaccard + Sampling + Dim. reducing} setting reduces more than $100\times$ computation and halves the model size without performance degradation. |
Description | Full Papers |
Persistent Identifier | http://hdl.handle.net/10722/301981 |
ISBN | 9781450384469 |
ISI Accession Number ID | WOS:001054156200019 |
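The abstract's first contribution, sparsifying self-attention by sampling nodes so the cost drops to $\mathcal{O}(N\log N)$, can be illustrated with a toy sketch in which each query node attends to only about $\log_2 N$ randomly sampled nodes instead of all $N$. This is an illustrative example under simplified assumptions (uniform random sampling, single head, keys = queries = values), not the sampling strategy or kernelized attention actually proposed in LiteGT.

```python
import math
import numpy as np

def sampled_attention(X, rng=None):
    """Toy self-attention where each of the N nodes attends to only
    ~log2(N) sampled nodes, so the number of score computations is
    O(N log N) rather than the O(N^2) of dense attention.
    X: (N, d) node-feature matrix used as queries, keys, and values."""
    rng = rng if rng is not None else np.random.default_rng(0)
    N, d = X.shape
    k = max(1, math.ceil(math.log2(N)))               # ~log N keys per query
    out = np.empty_like(X)
    for i in range(N):
        idx = rng.choice(N, size=k, replace=False)    # sampled key set
        scores = X[i] @ X[idx].T / math.sqrt(d)       # scaled dot-product
        w = np.exp(scores - scores.max())             # stable softmax
        w /= w.sum()
        out[i] = w @ X[idx]                           # convex combination of values
    return out
```

Because each output row is a convex combination of sampled node features, the update stays within the range of the input features while touching only a logarithmic number of neighbors per node.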
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Chen, C | - |
dc.contributor.author | Tao, C | - |
dc.contributor.author | Wong, N | - |
dc.date.accessioned | 2021-08-21T03:29:49Z | - |
dc.date.available | 2021-08-21T03:29:49Z | - |
dc.date.issued | 2021 | - |
dc.identifier.citation | Proceedings of the 30th ACM International Conference on Information and Knowledge Management (CIKM2021), Online Meeting, Gold Coast, Queensland, Australia, 1-5 November 2021, p. 161-170 | - |
dc.identifier.isbn | 9781450384469 | - |
dc.identifier.uri | http://hdl.handle.net/10722/301981 | - |
dc.description | Full Papers | - |
dc.description.abstract | Transformers have shown great potential for modeling long-term dependencies for natural language processing and computer vision. However, little study has applied transformers to graphs, which is challenging due to the poor scalability of the attention mechanism and the under-exploration of graph inductive bias. To bridge this gap, we propose a Lite Graph Transformer (LiteGT) that learns on arbitrary graphs efficiently. First, a node sampling strategy is proposed to sparsify the considered nodes in self-attention with only $\mathcal{O}(N\log N)$ time. Second, we devise two kernelization approaches to form two-branch attention blocks, which not only leverage graph-specific topology information, but also reduce computation further to $\mathcal{O}(\frac{1}{2}N\log N)$. Third, the nodes are updated with different attention schemes during training, thus largely mitigating over-smoothing problems when the model layers deepen. Extensive experiments demonstrate that LiteGT achieves competitive performance on both \textit{node classification} and \textit{link prediction} on datasets with millions of nodes. Specifically, the \textit{Jaccard + Sampling + Dim. reducing} setting reduces more than $100\times$ computation and halves the model size without performance degradation. | - |
dc.language | eng | - |
dc.publisher | Association for Computing Machinery. | - |
dc.relation.ispartof | The 30th ACM International Conference on Information and Knowledge Management (CIKM2021) Proceedings | - |
dc.rights | The 30th ACM International Conference on Information and Knowledge Management (CIKM2021) Proceedings. Copyright © Association for Computing Machinery. | - |
dc.title | LiteGT: Efficient and Lightweight Graph Transformers | - |
dc.type | Conference_Paper | - |
dc.identifier.email | Wong, N: nwong@eee.hku.hk | - |
dc.identifier.authority | Wong, N=rp00190 | - |
dc.identifier.doi | 10.1145/3459637.3482272 | - |
dc.identifier.scopus | eid_2-s2.0-85119211996 | - |
dc.identifier.hkuros | 324505 | - |
dc.identifier.spage | 161 | - |
dc.identifier.epage | 170 | - |
dc.identifier.isi | WOS:001054156200019 | - |
dc.publisher.place | New York, NY | - |