Conference Paper: Don’t touch what matters: Task-aware Lipschitz data augmentation for visual reinforcement learning
Title | Don’t touch what matters: Task-aware Lipschitz data augmentation for visual reinforcement learning |
---|---|
Authors | Yuan, Z; Ma, G; Mu, Y; Xia, B; Yuan, B; Wang, X; Luo, P; Xu, H |
Keywords | Deep reinforcement learning; Reinforcement learning; Learning in robotics |
Issue Date | 2022 |
Publisher | International Joint Conferences on Artificial Intelligence. |
Citation | The 31st International Joint Conference on Artificial Intelligence (IJCAI), Vienna, Austria, July 23-29, 2022. In Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, Vienna, 23-29 July 2022 |
Abstract | One of the key challenges in visual Reinforcement Learning (RL) is to learn policies that can generalize to unseen environments. Recently, data augmentation techniques aiming at enhancing data diversity have demonstrated proven performance in improving the generalization ability of learned policies. However, due to the sensitivity of RL training, naively applying data augmentation, which transforms each pixel in a task-agnostic manner, may suffer from instability and damage the sample efficiency, thus further exacerbating the generalization performance. At the heart of this phenomenon is the diverged action distribution and high-variance value estimation in the face of augmented images. To alleviate this issue, we propose Task-aware Lipschitz Data Augmentation (TLDA) for visual RL, which explicitly identifies the task-correlated pixels with large Lipschitz constants, and only augments the task-irrelevant pixels for stability. We verify the effectiveness of our approach on DeepMind Control suite, CARLA and DeepMind Manipulation tasks. The extensive empirical results show that TLDA improves both sample efficiency and generalization; it outperforms previous state-of-the-art methods across 3 different visual control benchmarks. |
Description | Sponsored by International Joint Conferences on Artificial Intelligence (IJCAI); Oral |
Persistent Identifier | http://hdl.handle.net/10722/315554 |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Yuan, Z | - |
dc.contributor.author | Ma, G | - |
dc.contributor.author | Mu, Y | - |
dc.contributor.author | Xia, B | - |
dc.contributor.author | Yuan, B | - |
dc.contributor.author | Wang, X | - |
dc.contributor.author | Luo, P | - |
dc.contributor.author | Xu, H | - |
dc.date.accessioned | 2022-08-19T09:00:03Z | - |
dc.date.available | 2022-08-19T09:00:03Z | - |
dc.date.issued | 2022 | - |
dc.identifier.citation | The 31st International Joint Conference on Artificial Intelligence (IJCAI), Vienna, Austria, July 23-29, 2022. In Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, Vienna, 23-29 July 2022 | - |
dc.identifier.uri | http://hdl.handle.net/10722/315554 | - |
dc.description | Sponsored by International Joint Conferences on Artificial Intelligence (IJCAI); Oral | - |
dc.description.abstract | One of the key challenges in visual Reinforcement Learning (RL) is to learn policies that can generalize to unseen environments. Recently, data augmentation techniques aiming at enhancing data diversity have demonstrated proven performance in improving the generalization ability of learned policies. However, due to the sensitivity of RL training, naively applying data augmentation, which transforms each pixel in a task-agnostic manner, may suffer from instability and damage the sample efficiency, thus further exacerbating the generalization performance. At the heart of this phenomenon is the diverged action distribution and high-variance value estimation in the face of augmented images. To alleviate this issue, we propose Task-aware Lipschitz Data Augmentation (TLDA) for visual RL, which explicitly identifies the task-correlated pixels with large Lipschitz constants, and only augments the task-irrelevant pixels for stability. We verify the effectiveness of our approach on DeepMind Control suite, CARLA and DeepMind Manipulation tasks. The extensive empirical results show that TLDA improves both sample efficiency and generalization; it outperforms previous state-of-the-art methods across 3 different visual control benchmarks. | - |
dc.language | eng | - |
dc.publisher | International Joint Conferences on Artificial Intelligence. | - |
dc.relation.ispartof | Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, Vienna, 23-29 July 2022 | - |
dc.subject | Deep reinforcement learning | - |
dc.subject | Reinforcement learning | - |
dc.subject | Learning in robotics | - |
dc.title | Don’t touch what matters: Task-aware Lipschitz data augmentation for visual reinforcement learning | - |
dc.type | Conference_Paper | - |
dc.identifier.email | Luo, P: pluo@hku.hk | - |
dc.identifier.authority | Luo, P=rp02575 | - |
dc.identifier.doi | 10.24963/ijcai.2022/514 | - |
dc.identifier.hkuros | 335586 | - |
dc.publisher.place | Austria | - |