Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1016/j.eswa.2024.124466
- Scopus: eid_2-s2.0-85195818018
Article: STVANet: A spatio-temporal visual attention framework with large kernel attention mechanism for citywide traffic dynamics prediction
Title | STVANet: A spatio-temporal visual attention framework with large kernel attention mechanism for citywide traffic dynamics prediction |
---|---|
Authors | Yang, Hongtai; Jiang, Junbo; Zhao, Zhan; Pan, Renbin; Tao, Siyu |
Keywords | 2D ConvNets; Deep learning; Large kernel attention; Spatio-temporal data; Squeeze-and-Excitation mechanism; Traffic information |
Issue Date | 15-Nov-2024 |
Publisher | Elsevier |
Citation | Expert Systems with Applications, 2024, v. 254 |
Abstract | Enhancing the efficiency and safety of the Intelligent Transportation System requires effective modeling and prediction of citywide traffic dynamics. Most studies employ convolutional neural networks (CNNs) with a 3D convolutional structure or spatio-temporal models with self-attention mechanisms to capture the spatio-temporal information of traffic distribution. Although 3D CNNs excel at capturing local contextual information, they are computationally complex due to the large number of parameters and cannot capture long-range dependence. By contrast, although self-attention mechanisms originally designed to address challenges in natural language processing can capture long-range dependence, their application to 2D image structures requires breaking down the inherent 2D context into a 1D sequence, increasing the computational complexity and neglecting the adaptability between local contextual information and channels. Accordingly, we propose a spatio-temporal visual attention neural network (STVANet), a novel spatio-temporal visual attention 2D CNN, which integrates a unique visual attention module with a large kernel attention (LKA) mechanism, a squeeze-and-excitation (SE) mechanism and a feedforward component to capture long-range dependence and channel information in urban traffic data while preserving the 2D image structure. LKA-based spatio-temporal attention networks extract spatial and temporal features from weekly, daily, and recent hourly periods, and aggregate them with weighted consideration of external features to make predictions. Evaluation of real-world datasets demonstrates STVANet’s superiority over baseline models, showcasing its potential in citywide traffic prediction. |
Persistent Identifier | http://hdl.handle.net/10722/344383 |
ISSN | 0957-4174 (2023 Impact Factor: 7.5; 2023 SCImago Journal Rankings: 1.875) |
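
The abstract names a squeeze-and-excitation (SE) mechanism as the component that captures channel information. The sketch below illustrates the general SE idea only (global average pooling per channel, a small two-layer gate, then channel-wise rescaling). It is not the authors' implementation: the function names `se_gates` and `squeeze_excite`, the bias-free layers, and the toy weights are all assumptions made for illustration.

```python
import math


def se_gates(x, w1, w2):
    """Compute one gate in (0, 1) per channel.

    x  : list of C channels, each an H x W nested list of floats
    w1 : hidden x C weight matrix (reduction layer; hypothetical, bias-free)
    w2 : C x hidden weight matrix (expansion layer; hypothetical, bias-free)
    """
    # Squeeze: global average pooling collapses each H x W map to one number.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]
    # Excitation, step 1: fully connected layer followed by ReLU.
    h = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    # Excitation, step 2: fully connected layer followed by sigmoid.
    return [1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, h))))
            for row in w2]


def squeeze_excite(x, w1, w2):
    """Rescale every channel of x by its gate; the 2D layout is unchanged."""
    gates = se_gates(x, w1, w2)
    return [[[g * v for v in row] for row in ch] for g, ch in zip(gates, x)]


if __name__ == "__main__":
    # Toy input: 2 channels on a 2 x 2 grid (illustrative values only).
    grid = [[[1.0, 2.0], [3.0, 4.0]],
            [[0.0, 0.0], [0.0, 0.0]]]
    w_reduce = [[1.0, 1.0]]      # 2 channels -> 1 hidden unit
    w_expand = [[1.0], [-1.0]]   # 1 hidden unit -> 2 channel gates
    print(se_gates(grid, w_reduce, w_expand))
    print(squeeze_excite(grid, w_reduce, w_expand))
```

Because the gates multiply whole channels, the grid keeps its height and width, which is consistent with the abstract's emphasis on preserving the 2D image structure. A real model would learn `w1` and `w2` and would typically include bias terms.
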
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Yang, Hongtai | - |
dc.contributor.author | Jiang, Junbo | - |
dc.contributor.author | Zhao, Zhan | - |
dc.contributor.author | Pan, Renbin | - |
dc.contributor.author | Tao, Siyu | - |
dc.date.accessioned | 2024-07-24T13:51:09Z | - |
dc.date.available | 2024-07-24T13:51:09Z | - |
dc.date.issued | 2024-11-15 | - |
dc.identifier.citation | Expert Systems with Applications, 2024, v. 254 | - |
dc.identifier.issn | 0957-4174 | - |
dc.identifier.uri | http://hdl.handle.net/10722/344383 | - |
dc.description.abstract | Enhancing the efficiency and safety of the Intelligent Transportation System requires effective modeling and prediction of citywide traffic dynamics. Most studies employ convolutional neural networks (CNNs) with a 3D convolutional structure or spatio-temporal models with self-attention mechanisms to capture the spatio-temporal information of traffic distribution. Although 3D CNNs excel at capturing local contextual information, they are computationally complex due to the large number of parameters and cannot capture long-range dependence. By contrast, although self-attention mechanisms originally designed to address challenges in natural language processing can capture long-range dependence, their application to 2D image structures requires breaking down the inherent 2D context into a 1D sequence, increasing the computational complexity and neglecting the adaptability between local contextual information and channels. Accordingly, we propose a spatio-temporal visual attention neural network (STVANet), a novel spatio-temporal visual attention 2D CNN, which integrates a unique visual attention module with a large kernel attention (LKA) mechanism, a squeeze-and-excitation (SE) mechanism and a feedforward component to capture long-range dependence and channel information in urban traffic data while preserving the 2D image structure. LKA-based spatio-temporal attention networks extract spatial and temporal features from weekly, daily, and recent hourly periods, and aggregate them with weighted consideration of external features to make predictions. Evaluation of real-world datasets demonstrates STVANet’s superiority over baseline models, showcasing its potential in citywide traffic prediction. | - |
dc.language | eng | - |
dc.publisher | Elsevier | - |
dc.relation.ispartof | Expert Systems with Applications | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject | 2D ConvNets | - |
dc.subject | Deep learning | - |
dc.subject | Large kernel attention | - |
dc.subject | Spatio-temporal data | - |
dc.subject | Squeeze-and-Excitation mechanism | - |
dc.subject | Traffic information | - |
dc.title | STVANet: A spatio-temporal visual attention framework with large kernel attention mechanism for citywide traffic dynamics prediction | - |
dc.type | Article | - |
dc.identifier.doi | 10.1016/j.eswa.2024.124466 | - |
dc.identifier.scopus | eid_2-s2.0-85195818018 | - |
dc.identifier.volume | 254 | - |
dc.identifier.eissn | 1873-6793 | - |
dc.identifier.issnl | 0957-4174 | - |