Article: Causal Graph Discovery for Urban Bus Operation Delays: A Case Study in Stockholm

Title: Causal Graph Discovery for Urban Bus Operation Delays: A Case Study in Stockholm
Authors: Zhang, Qi; Ma, Zhenliang; Ling, Yancheng; Qin, Zhenlin; Zhang, Pengfei; Zhao, Zhan
Keywords: big data; data and data science; data mining; GTFS; operations; public transportation; transformative trends in transit data
Issue Date: 31-Jan-2025
Publisher: SAGE Publications
Citation: Transportation Research Record: Journal of the Transportation Research Board, 2025
Abstract: Bus delays significantly affect urban public transportation by reducing operational efficiency and incurring high costs. Understanding the causes of these delays is essential for developing targeted mitigation strategies. While traditional research focuses on correlation-based analysis, it often fails to uncover the underlying causal mechanisms. This study examines various causal graph discovery algorithms combined with structural equation models (SEMs) to infer the causal relationships among factors that affect bus delays. These algorithms generate causal graphs for bus delays, revealing the interrelations and impacts of various operational factors. SEM is used to quantify the causal effects. This study evaluates the performance of these algorithms from the perspectives of both the statistical data fitting and the causal relationships generated. A case study is conducted using General Transit Feed Specification (GTFS) data from frequent bus routes in Stockholm, Sweden. The validation results demonstrate the effectiveness of data-driven causal discovery models in identifying causal links, particularly when combined with domain knowledge. The empirical analysis shows the complexity of factors contributing to bus delays, emphasizing the necessity of integrating causality into bus delay analysis. For example, a high correlation between origin delay and bus arrival delay (coefficient = 0.63) does not indicate direct causation, and a strong causation between dwell time and arrival delay does not imply a higher correlation (coefficient = 0.12). Comparing variable importance with linear regression (LR) reveals notable differences; origin delay, which is often overlooked by previous studies, is significant in the causal graph model (standardized coefficient = 0.601) but ranks much lower in LR (standardized coefficient = 0.003). These insights underscore the importance of automated, data-driven causal discovery in enhancing decision-making processes and improving the efficiency and reliability of transit services.
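The abstract's contrast between a variable's weight in a causal graph and its weight in a plain linear regression can be illustrated with a small synthetic sketch. Everything in it is an assumption made for illustration: the three-variable chain (origin delay, then delay at a preceding stop, then arrival delay), the coefficients 0.9, 0.3, and 0.5, and all function and variable names. It does not use the study's data, does not reproduce its causal graph, and performs no causal discovery; the graph is simply assumed to be known.

```python
import random
import statistics


def simulate(n=5000, seed=0):
    """Draw synthetic delays from a hypothetical chain: origin -> upstream -> arrival."""
    rng = random.Random(seed)
    origin, upstream, arrival = [], [], []
    for _ in range(n):
        o = rng.gauss(0.0, 1.0)            # delay at the origin terminal
        u = 0.9 * o + rng.gauss(0.0, 0.3)  # delay at a preceding stop (mediator)
        a = 1.0 * u + rng.gauss(0.0, 0.5)  # arrival delay; no direct origin term
        origin.append(o)
        upstream.append(u)
        arrival.append(a)
    return origin, upstream, arrival


def standardize(xs):
    """Rescale to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return [(x - mean) / sd for x in xs]


def dot(xs, ys):
    return sum(x * y for x, y in zip(xs, ys))


def compare_effects(origin, upstream, arrival):
    """Contrast LR coefficients with the total effect implied by the assumed chain."""
    zo, zu, za = (standardize(v) for v in (origin, upstream, arrival))
    n = len(zo)

    # Two-predictor least squares on standardized data (normal equations, 2x2 solve).
    s11, s22, s12 = dot(zo, zo), dot(zu, zu), dot(zo, zu)
    s1y, s2y = dot(zo, za), dot(zu, za)
    det = s11 * s22 - s12 ** 2
    lr_origin = (s1y * s22 - s2y * s12) / det    # origin coefficient, upstream held fixed
    lr_upstream = (s2y * s11 - s1y * s12) / det  # upstream coefficient, origin held fixed

    # Path coefficient origin -> upstream, then total effect = direct + indirect path.
    path_origin_upstream = s12 / s11
    total_origin = lr_origin + path_origin_upstream * lr_upstream

    return {
        "corr_origin_arrival": s1y / n,
        "lr_direct_origin": lr_origin,
        "lr_direct_upstream": lr_upstream,
        "total_effect_origin": total_origin,
    }


if __name__ == "__main__":
    for name, value in compare_effects(*simulate()).items():
        print(f"{name}: {value:.3f}")
```

In this toy setting the regression coefficient on origin delay comes out close to zero because the mediator absorbs its influence, while the total effect along the assumed path stays large and matches the raw correlation. That is the same qualitative pattern the abstract reports, although the mechanism in the actual study may differ.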
Persistent Identifier: http://hdl.handle.net/10722/354618
ISSN: 0361-1981
2023 Impact Factor: 1.6
2023 SCImago Journal Rankings: 0.543

 

DC Field | Value | Language
dc.contributor.author | Zhang, Qi | -
dc.contributor.author | Ma, Zhenliang | -
dc.contributor.author | Ling, Yancheng | -
dc.contributor.author | Qin, Zhenlin | -
dc.contributor.author | Zhang, Pengfei | -
dc.contributor.author | Zhao, Zhan | -
dc.date.accessioned | 2025-02-24T00:40:18Z | -
dc.date.available | 2025-02-24T00:40:18Z | -
dc.date.issued | 2025-01-31 | -
dc.identifier.citation | Transportation Research Record: Journal of the Transportation Research Board, 2025 | -
dc.identifier.issn | 0361-1981 | -
dc.identifier.uri | http://hdl.handle.net/10722/354618 | -
dc.language | eng | -
dc.publisher | SAGE Publications | -
dc.relation.ispartof | Transportation Research Record: Journal of the Transportation Research Board | -
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | -
dc.subject | big data | -
dc.subject | data and data science | -
dc.subject | data mining | -
dc.subject | GTFS | -
dc.subject | operations | -
dc.subject | public transportation | -
dc.subject | transformative trends in transit data | -
dc.title | Causal Graph Discovery for Urban Bus Operation Delays: A Case Study in Stockholm | -
dc.type | Article | -
dc.identifier.doi | 10.1177/03611981241306754 | -
dc.identifier.scopus | eid_2-s2.0-85216764386 | -
dc.identifier.eissn | 2169-4052 | -
dc.identifier.issnl | 0361-1981 | -
