Conference Paper: Parallel Sequence Modeling via Generalized Spatial Propagation Network
| Title | Parallel Sequence Modeling via Generalized Spatial Propagation Network |
|---|---|
| Authors | Wang, Hongjun; Byeon, Wonmin; Xu, Jiarui; Gu, Jinwei; Cheung, Ka Chun; Wang, Xiaolong; Han, Kai; Kautz, Jan; Liu, Sifei |
| Issue Date | 1-Jun-2025 |
| Abstract | We present the Generalized Spatial Propagation Network (GSPN), a new attention mechanism optimized for vision tasks that inherently captures 2D spatial structures. Existing attention models, including transformers, linear attention, and state-space models like Mamba, process multi-dimensional data as 1D sequences, compromising spatial coherence and efficiency. GSPN overcomes these limitations by directly operating on spatially coherent image data and forming dense pairwise connections through a line-scan approach. Central to GSPN is the Stability-Context Condition, which ensures stable, context-aware propagation across 2D sequences and reduces the effective sequence length to √N for a square map with N elements, significantly enhancing computational efficiency. With learnable, input-dependent weights and no reliance on positional embeddings, GSPN achieves superior spatial fidelity and state-of-the-art performance in vision tasks, including ImageNet classification, class-guided image generation, and text-to-image generation. Notably, GSPN accelerates SD-XL with softmax attention by over 84× when generating 16K images. |
| Persistent Identifier | http://hdl.handle.net/10722/362398 |
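
The abstract's claim that line-scan propagation reduces the effective sequence length to √N can be illustrated with a toy recurrence: a square H × W map is swept column by column, each element mixing a few neighbours from the previous column, so only W = √N sequential steps are needed. The NumPy sketch below is a hedged illustration of this idea, not the paper's implementation; the 3-neighbour connectivity, the per-element weight normalization standing in for the Stability-Context Condition, and the name `gspn_line_scan` are assumptions made here for demonstration.

```python
import numpy as np

def gspn_line_scan(x, w, lam):
    """Toy column-by-column line-scan propagation over an H x W map.

    x   : (H, W) input feature map
    w   : (H, W, 3) weights to the 3 neighbours in the previous column
          (up-left, left, down-left) -- an assumed connectivity pattern
    lam : (H, W) input gates

    The per-element normalization below (absolute weights summing to at
    most 1) is one plausible stand-in for the Stability-Context Condition,
    not the paper's exact formulation.
    """
    H, W = x.shape
    h = np.zeros((H, W))
    for t in range(W):                       # sqrt(N) sequential steps
        if t == 0:
            h[:, 0] = lam[:, 0] * x[:, 0]
            continue
        prev = h[:, t - 1]
        # Gather the 3 neighbours from the previous column (edge-padded).
        up   = np.concatenate(([prev[0]], prev[:-1]))
        down = np.concatenate((prev[1:], [prev[-1]]))
        neigh = np.stack([up, prev, down], axis=-1)          # (H, 3)
        wt = w[:, t, :]
        wt = wt / np.maximum(np.abs(wt).sum(-1, keepdims=True), 1.0)
        h[:, t] = (wt * neigh).sum(-1) + lam[:, t] * x[:, t]
    return h

# Toy usage on an 8 x 8 map.
rng = np.random.default_rng(0)
out = gspn_line_scan(rng.standard_normal((8, 8)),
                     rng.standard_normal((8, 8, 3)),
                     rng.random((8, 8)))
print(out.shape)  # (8, 8)
```

Under this reading, an N-element square map needs only √N sequential steps, and each step is a parallel per-row update, which is the efficiency argument the abstract makes.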
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Wang, Hongjun | - |
| dc.contributor.author | Byeon, Wonmin | - |
| dc.contributor.author | Xu, Jiarui | - |
| dc.contributor.author | Gu, Jinwei | - |
| dc.contributor.author | Cheung, Ka Chun | - |
| dc.contributor.author | Wang, Xiaolong | - |
| dc.contributor.author | Han, Kai | - |
| dc.contributor.author | Kautz, Jan | - |
| dc.contributor.author | Liu, Sifei | - |
| dc.date.accessioned | 2025-09-23T00:31:15Z | - |
| dc.date.available | 2025-09-23T00:31:15Z | - |
| dc.date.issued | 2025-06-01 | - |
| dc.identifier.uri | http://hdl.handle.net/10722/362398 | - |
| dc.description.abstract | We present the Generalized Spatial Propagation Network (GSPN), a new attention mechanism optimized for vision tasks that inherently captures 2D spatial structures. Existing attention models, including transformers, linear attention, and state-space models like Mamba, process multi-dimensional data as 1D sequences, compromising spatial coherence and efficiency. GSPN overcomes these limitations by directly operating on spatially coherent image data and forming dense pairwise connections through a line-scan approach. Central to GSPN is the Stability-Context Condition, which ensures stable, context-aware propagation across 2D sequences and reduces the effective sequence length to √N for a square map with N elements, significantly enhancing computational efficiency. With learnable, input-dependent weights and no reliance on positional embeddings, GSPN achieves superior spatial fidelity and state-of-the-art performance in vision tasks, including ImageNet classification, class-guided image generation, and text-to-image generation. Notably, GSPN accelerates SD-XL with softmax attention by over 84× when generating 16K images. | - |
| dc.language | eng | - |
| dc.relation.ispartof | Computer Vision and Pattern Recognition (CVPR) 2025 (11/06/2025-15/06/2025, Nashville) | - |
| dc.title | Parallel Sequence Modeling via Generalized Spatial Propagation Network | - |
| dc.type | Conference_Paper | - |
| dc.description.nature | preprint | - |
