Distributed multi-robot collision avoidance via deep reinforcement learning for navigation in complex scenarios

Fan, T; Long, P; Liu, W; Pan, J

Publisher Website: 10.1177/0278364920916531
Scopus: eid_2-s2.0-85085708485
WOS: WOS:000537190100001
Citations:
- Scopus: 0
- Web of Science: 0
Appears in Collections:
- Computer Science: Journal/Magazine Articles

Title	Distributed multi-robot collision avoidance via deep reinforcement learning for navigation in complex scenarios
Authors	Fan, T; Long, P; Liu, W; Pan, J
Keywords	Distributed collision avoidance; multi-robot systems; multi-scenario multi-stage reinforcement learning; hybrid control
Issue Date	2020
Publisher	Sage Publications Ltd. The Journal's web site is located at http://ijr.sagepub.com
Citation	International Journal of Robotics Research, 2020, v. 39 n. 7, p. 856-892. DOI: http://dx.doi.org/10.1177/0278364920916531
Abstract	Developing a safe and efficient collision-avoidance policy for multiple robots is challenging in the decentralized scenarios where each robot generates its paths with limited observation of other robots’ states and intentions. Prior distributed multi-robot collision-avoidance systems often require frequent inter-robot communication or agent-level features to plan a local collision-free action, which is not robust and computationally prohibitive. In addition, the performance of these methods is not comparable with their centralized counterparts in practice. In this article, we present a decentralized sensor-level collision-avoidance policy for multi-robot systems, which shows promising results in practical applications. In particular, our policy directly maps raw sensor measurements to an agent’s steering commands in terms of the movement velocity. As a first step toward reducing the performance gap between decentralized and centralized methods, we present a multi-scenario multi-stage training framework to learn an optimal policy. The policy is trained over a large number of robots in rich, complex environments simultaneously using a policy-gradient-based reinforcement-learning algorithm. The learning algorithm is also integrated into a hybrid control framework to further improve the policy’s robustness and effectiveness. We validate the learned sensor-level collision-avoidance policy in a variety of simulated and real-world scenarios with thorough performance evaluations for large-scale multi-robot systems. The generalization of the learned policy is verified in a set of unseen scenarios including the navigation of a group of heterogeneous robots and a large-scale scenario with 100 robots. Although the policy is trained using simulation data only, we have successfully deployed it on physical robots with shapes and dynamics characteristics that are different from the simulated agents, in order to demonstrate the controller’s robustness against the simulation-to-real modeling error. Finally, we show that the collision-avoidance policy learned from multi-robot navigation tasks provides an excellent solution for safe and effective autonomous navigation for a single robot working in a dense real human crowd. Our learned policy enables a robot to make effective progress in a crowd without getting stuck. More importantly, the policy has been successfully deployed on different types of physical robot platforms without tedious parameter tuning. Videos are available at https://sites.google.com/view/hybridmrca.
Persistent Identifier	http://hdl.handle.net/10722/285104
ISSN	0278-3649 (2021 Impact Factor: 6.887; 2020 SCImago Journal Rankings: 1.786)
ISI Accession Number ID	WOS:000537190100001

DC Field	Value	Language
dc.contributor.author	Fan, T	-
dc.contributor.author	Long, P	-
dc.contributor.author	Liu, W	-
dc.contributor.author	Pan, J	-
dc.date.accessioned	2020-08-07T09:06:48Z	-
dc.date.available	2020-08-07T09:06:48Z	-
dc.date.issued	2020	-
dc.identifier.citation	International Journal of Robotics Research, 2020, v. 39 n. 7, p. 856-892	-
dc.identifier.issn	0278-3649	-
dc.identifier.uri	http://hdl.handle.net/10722/285104	-
dc.description.abstract	Developing a safe and efficient collision-avoidance policy for multiple robots is challenging in the decentralized scenarios where each robot generates its paths with limited observation of other robots’ states and intentions. Prior distributed multi-robot collision-avoidance systems often require frequent inter-robot communication or agent-level features to plan a local collision-free action, which is not robust and computationally prohibitive. In addition, the performance of these methods is not comparable with their centralized counterparts in practice. In this article, we present a decentralized sensor-level collision-avoidance policy for multi-robot systems, which shows promising results in practical applications. In particular, our policy directly maps raw sensor measurements to an agent’s steering commands in terms of the movement velocity. As a first step toward reducing the performance gap between decentralized and centralized methods, we present a multi-scenario multi-stage training framework to learn an optimal policy. The policy is trained over a large number of robots in rich, complex environments simultaneously using a policy-gradient-based reinforcement-learning algorithm. The learning algorithm is also integrated into a hybrid control framework to further improve the policy’s robustness and effectiveness. We validate the learned sensor-level collision-avoidance policy in a variety of simulated and real-world scenarios with thorough performance evaluations for large-scale multi-robot systems. The generalization of the learned policy is verified in a set of unseen scenarios including the navigation of a group of heterogeneous robots and a large-scale scenario with 100 robots. Although the policy is trained using simulation data only, we have successfully deployed it on physical robots with shapes and dynamics characteristics that are different from the simulated agents, in order to demonstrate the controller’s robustness against the simulation-to-real modeling error. Finally, we show that the collision-avoidance policy learned from multi-robot navigation tasks provides an excellent solution for safe and effective autonomous navigation for a single robot working in a dense real human crowd. Our learned policy enables a robot to make effective progress in a crowd without getting stuck. More importantly, the policy has been successfully deployed on different types of physical robot platforms without tedious parameter tuning. Videos are available at https://sites.google.com/view/hybridmrca.	-
dc.language	eng	-
dc.publisher	Sage Publications Ltd. The Journal's web site is located at http://ijr.sagepub.com	-
dc.relation.ispartof	International Journal of Robotics Research	-
dc.rights	International Journal of Robotics Research. Copyright © Sage Publications Ltd.	-
dc.subject	Distributed collision avoidance	-
dc.subject	multi-robot systems	-
dc.subject	multi-scenario multi-stage reinforcement learning	-
dc.subject	hybrid control	-
dc.title	Distributed multi-robot collision avoidance via deep reinforcement learning for navigation in complex scenarios	-
dc.type	Article	-
dc.identifier.email	Pan, J: jpan@cs.hku.hk	-
dc.identifier.authority	Pan, J=rp01984	-
dc.description.nature	link_to_subscribed_fulltext	-
dc.identifier.doi	10.1177/0278364920916531	-
dc.identifier.scopus	eid_2-s2.0-85085708485	-
dc.identifier.hkuros	312106	-
dc.identifier.volume	39	-
dc.identifier.issue	7	-
dc.identifier.spage	856	-
dc.identifier.epage	892	-
dc.identifier.isi	WOS:000537190100001	-
dc.publisher.place	United Kingdom	-
dc.identifier.issnl	0278-3649	-
