
Conference Paper: Parallel Sequence Modeling via Generalized Spatial Propagation Network

Title: Parallel Sequence Modeling via Generalized Spatial Propagation Network
Authors: Wang, Hongjun; Byeon, Wonmin; Xu, Jiarui; Gu, Jinwei; Cheung, Ka Chun; Wang, Xiaolong; Han, Kai; Kautz, Jan; Liu, Sifei
Issue Date: 1-Jun-2025
Abstract

We present the Generalized Spatial Propagation Network (GSPN), a new attention mechanism optimized for vision tasks that inherently captures 2D spatial structures. Existing attention models, including transformers, linear attention, and state-space models like Mamba, process multi-dimensional data as 1D sequences, compromising spatial coherence and efficiency. GSPN overcomes these limitations by directly operating on spatially coherent image data and forming dense pairwise connections through a line-scan approach. Central to GSPN is the Stability-Context Condition, which ensures stable, context-aware propagation across 2D sequences and reduces the effective sequence length to √N for a square map with N elements, significantly enhancing computational efficiency. With learnable, input-dependent weights and no reliance on positional embeddings, GSPN achieves superior spatial fidelity and state-of-the-art performance in vision tasks, including ImageNet classification, class-guided image generation, and text-to-image generation. Notably, GSPN accelerates SD-XL with softmax-attention by over 84× when generating 16K images.
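The line-scan idea in the abstract can be illustrated with a minimal sketch: an H×W map is processed row by row, each row updated in one parallel step from the previous row through three neighbor weights, so a square map with N pixels needs only about √N sequential steps. The function name, the three-neighbor connection, and the per-pixel weight normalization below are illustrative assumptions standing in for the paper's learned weights and Stability-Context Condition, not the authors' implementation.

```python
import numpy as np

def line_scan_propagate(x, w_left, w_center, w_right):
    """Top-to-bottom line-scan propagation over a 2D map (illustrative sketch).

    Each row i is computed in one vectorized step from row i-1 via three
    neighbor weights, so an H x W map takes H (~sqrt(N)) sequential steps.
    Weights are normalized per pixel so their absolute values sum to at
    most 1, a simple stand-in for a stability condition on propagation.
    """
    H, W = x.shape
    h = np.zeros_like(x)
    h[0] = x[0]
    for i in range(1, H):              # ~sqrt(N) sequential steps
        prev = h[i - 1]
        left = np.roll(prev, 1)        # upper-left neighbor
        right = np.roll(prev, -1)      # upper-right neighbor
        # normalize so |wl| + |wc| + |wr| <= 1 at every pixel (stability)
        denom = (np.abs(w_left[i]) + np.abs(w_center[i])
                 + np.abs(w_right[i]) + 1e-6)
        wl = w_left[i] / denom
        wc = w_center[i] / denom
        wr = w_right[i] / denom
        # blend the current input with the propagated previous row
        h[i] = x[i] + wl * left + wc * prev + wr * right
    return h
```

In the paper the three weights are input-dependent (predicted from the image itself) and the scan is run in four directions and merged, which is what yields dense pairwise connections across the whole map.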


Persistent Identifier: http://hdl.handle.net/10722/362398

 

DC Field: Value
dc.contributor.author: Wang, Hongjun
dc.contributor.author: Byeon, Wonmin
dc.contributor.author: Xu, Jiarui
dc.contributor.author: Gu, Jinwei
dc.contributor.author: Cheung, Ka Chun
dc.contributor.author: Wang, Xiaolong
dc.contributor.author: Han, Kai
dc.contributor.author: Kautz, Jan
dc.contributor.author: Liu, Sifei
dc.date.accessioned: 2025-09-23T00:31:15Z
dc.date.available: 2025-09-23T00:31:15Z
dc.date.issued: 2025-06-01
dc.identifier.uri: http://hdl.handle.net/10722/362398
dc.description.abstract: <p>We present the Generalized Spatial Propagation Network (GSPN), a new attention mechanism optimized for vision tasks that inherently captures 2D spatial structures. Existing attention models, including transformers, linear attention, and state-space models like Mamba, process multi-dimensional data as 1D sequences, compromising spatial coherence and efficiency. GSPN overcomes these limitations by directly operating on spatially coherent image data and forming dense pairwise connections through a line-scan approach. Central to GSPN is the Stability-Context Condition, which ensures stable, context-aware propagation across 2D sequences and reduces the effective sequence length to √N for a square map with N elements, significantly enhancing computational efficiency. With learnable, input-dependent weights and no reliance on positional embeddings, GSPN achieves superior spatial fidelity and state-of-the-art performance in vision tasks, including ImageNet classification, class-guided image generation, and text-to-image generation. Notably, GSPN accelerates SD-XL with softmax-attention by over 84× when generating 16K images.</p>
dc.language: eng
dc.relation.ispartof: Computer Vision and Pattern Recognition (CVPR) 2025 (11/06/2025-15/06/2025, Nashville)
dc.title: Parallel Sequence Modeling via Generalized Spatial Propagation Network
dc.type: Conference_Paper
dc.description.nature: preprint
