Article: Improving Robustness to Out-of-Distribution States in Imitation Learning via Deep Koopman-Boosted Diffusion Policy

Title: Improving Robustness to Out-of-Distribution States in Imitation Learning via Deep Koopman-Boosted Diffusion Policy
Authors: Huang, Dianye; Navab, Nassir; Jiang, Zhongliang
Keywords: Action Chunking Aggregation; Deep Koopman Operator; Diffusion Policy; Imitation Learning
Issue Date: 1-Jan-2025
Publisher: Institute of Electrical and Electronics Engineers
Citation: IEEE Transactions on Robotics, 2025
Abstract

Integrating generative models with action chunking has shown significant promise in imitation learning for robotic manipulation. However, the existing diffusion-based paradigm often struggles to capture strong temporal dependencies across multiple steps, particularly when incorporating proprioceptive input. This limitation can lead to task failures, where the policy overfits to proprioceptive cues at the expense of capturing the visually derived features of the task. To overcome this challenge, we propose the Deep Koopman-boosted Dual-branch Diffusion Policy (D3P) algorithm. D3P introduces a dual-branch architecture to decouple the roles of different sensory modality combinations. The visual branch encodes the visual observations to indicate task progression, while the fused branch integrates both visual and proprioceptive inputs for precise manipulation. Within this architecture, when the robot fails to accomplish intermediate goals, such as grasping a drawer handle, the policy can dynamically switch to execute action chunks generated by the visual branch, allowing recovery to previously observed states and facilitating retrial of the task. To further enhance visual representation learning, we incorporate a Deep Koopman Operator module that captures structured temporal dynamics from visual inputs. During inference, we use the test-time loss of the generative model as a confidence signal to guide the aggregation of the temporally overlapping predicted action chunks, thereby enhancing the reliability of policy execution. In simulation experiments across six RLBench tabletop tasks, D3P outperforms the state-of-the-art diffusion policy by an average of 14.6%. On three real-world robotic manipulation tasks, it achieves a 15.0% improvement.
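The aggregation step described in the abstract, where the generative model's test-time loss serves as a confidence signal when blending temporally overlapping predicted action chunks, can be illustrated with a short sketch. The snippet below is not the authors' released implementation; the function name aggregate_action_chunks, the exponential loss-to-confidence mapping, and the chunk layout are illustrative assumptions.

```python
# Hypothetical sketch (not the paper's code): confidence-weighted blending of
# temporally overlapping action chunks, where each chunk's test-time loss from
# the generative model is mapped to a confidence weight (lower loss -> higher weight).
import numpy as np

def aggregate_action_chunks(chunks, losses, horizon, chunk_len, temperature=1.0):
    """Blend overlapping action chunks into one action sequence.

    chunks:   list of (start_step, actions) pairs; actions has shape (chunk_len, act_dim)
    losses:   per-chunk test-time losses, shape (num_chunks,)
    horizon:  total number of environment steps to cover
    """
    act_dim = chunks[0][1].shape[1]
    weighted_sum = np.zeros((horizon, act_dim))
    weight_total = np.zeros((horizon, 1))

    # Map losses to confidences; an exponential is one simple, assumed choice.
    confidences = np.exp(-np.asarray(losses) / temperature)

    for (start, actions), conf in zip(chunks, confidences):
        end = min(start + chunk_len, horizon)
        weighted_sum[start:end] += conf * actions[: end - start]
        weight_total[start:end] += conf

    # Steps covered by at least one chunk get the confidence-weighted average.
    covered = weight_total[:, 0] > 0
    weighted_sum[covered] /= weight_total[covered]
    return weighted_sum

# Toy usage: two 8-step chunks of 7-dimensional actions overlapping by 4 steps.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    chunk_a = (0, rng.normal(size=(8, 7)))
    chunk_b = (4, rng.normal(size=(8, 7)))
    actions = aggregate_action_chunks([chunk_a, chunk_b], losses=[0.2, 0.5],
                                      horizon=12, chunk_len=8)
    print(actions.shape)  # (12, 7)
```

In this toy version, steps not covered by any chunk remain zero; a real controller would instead hold the previous command or re-query the policy.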


Persistent Identifier: http://hdl.handle.net/10722/367051
ISSN: 1552-3098
2023 Impact Factor: 9.4
2023 SCImago Journal Rankings: 3.669

 

DC Field: Value
dc.contributor.author: Huang, Dianye
dc.contributor.author: Navab, Nassir
dc.contributor.author: Jiang, Zhongliang
dc.date.accessioned: 2025-12-02T00:35:26Z
dc.date.available: 2025-12-02T00:35:26Z
dc.date.issued: 2025-01-01
dc.identifier.citation: IEEE Transactions on Robotics, 2025
dc.identifier.issn: 1552-3098
dc.identifier.uri: http://hdl.handle.net/10722/367051
dc.language: eng
dc.publisher: Institute of Electrical and Electronics Engineers
dc.relation.ispartof: IEEE Transactions on Robotics
dc.rights: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
dc.subject: Action Chunking Aggregation
dc.subject: Deep Koopman Operator
dc.subject: Diffusion Policy
dc.subject: Imitation Learning
dc.title: Improving Robustness to Out-of-Distribution States in Imitation Learning via Deep Koopman-Boosted Diffusion Policy
dc.type: Article
dc.identifier.doi: 10.1109/TRO.2025.3629819
dc.identifier.scopus: eid_2-s2.0-105021125644
dc.identifier.eissn: 1941-0468
dc.identifier.issnl: 1552-3098
