Links for fulltext (may require subscription):
- Publisher Website (DOI): 10.1016/j.robot.2022.104341
- Scopus: eid_2-s2.0-85145648620
- WOS: WOS:000950855200001
Article: Disturbance-aware reinforcement learning for rejecting excessive disturbances
Title | Disturbance-aware reinforcement learning for rejecting excessive disturbances |
---|---|
Authors | Lu, Wenjie; Hu, Manman |
Keywords | Disturbance observer; Disturbance rejection; Reinforcement learning |
Issue Date | 1-Mar-2023 |
Publisher | Elsevier |
Citation | Robotics and Autonomous Systems, 2023, v. 161 |
Abstract | This paper presents a disturbance-aware Reinforcement Learning (RL) approach for stabilizing a free-floating platform under excessive external disturbances. In particular, we consider the scenarios where disturbances frequently exceed actuator limits and largely affect the dynamics characterizing the disturbed platform. This stabilization problem is better described by a set of Unknown Partially Observable Markovian Decision Processes (POMDPs), as opposed to a single-POMDP formulation, making online disturbance awareness necessary. This paper proposes a new Disturbance-Observer network (DO-net) that mimics prediction procedures through an auxiliary Gated Recurrent Unit (GRU), for the purpose of estimating and encoding the disturbance states and the disturbance transition functions, respectively. Then the controller subnetwork is trained with joint optimization of the observer subnetwork in an RL manner for mutual robustness and runtime efficiency. Numerical simulations on position regulation tasks have demonstrated that the DO-net outperforms the DOB-net and reduces the gap with an ideal performance estimate, the latter of which is obtained by a commercial solver given precise disturbance knowledge. |
Persistent Identifier | http://hdl.handle.net/10722/338592 |
ISSN | 0921-8890 (2023 Impact Factor: 4.3; 2023 SCImago Journal Rankings: 1.303) |
ISI Accession Number | WOS:000950855200001 |
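The abstract describes an observer subnetwork built around a GRU that encodes the recent disturbance history into a latent state, which a jointly trained controller subnetwork then consumes. As a rough, self-contained illustration of that idea (this is not the paper's DO-net; the class names, dimensions, and the linear controller head below are all hypothetical), a GRU-style observer update feeding a bounded control output can be sketched as:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUObserver:
    """Encodes a history of disturbance-related measurements into a
    hidden state h, a stand-in for the latent disturbance estimate."""

    def __init__(self, n_in, n_hid):
        s = 1.0 / np.sqrt(n_hid)
        # One weight matrix per gate, each acting on the stacked [x; h].
        self.Wz = rng.uniform(-s, s, (n_hid, n_in + n_hid))
        self.Wr = rng.uniform(-s, s, (n_hid, n_in + n_hid))
        self.Wh = rng.uniform(-s, s, (n_hid, n_in + n_hid))

    def step(self, x, h):
        xh = np.concatenate([x, h])
        z = sigmoid(self.Wz @ xh)                          # update gate
        r = sigmoid(self.Wr @ xh)                          # reset gate
        h_cand = np.tanh(self.Wh @ np.concatenate([x, r * h]))
        return (1.0 - z) * h + z * h_cand                  # blended new state

def controller(state, h, K):
    """Hypothetical controller head: action from platform state plus the
    observer code; tanh keeps it bounded, like saturated actuators."""
    return np.tanh(K @ np.concatenate([state, h]))

obs = GRUObserver(n_in=3, n_hid=8)
K = rng.uniform(-0.3, 0.3, (2, 3 + 8))   # 2 actuator channels (illustrative)

h = np.zeros(8)
state = np.array([0.5, -0.2, 0.1])
for _ in range(20):                      # feed a synthetic disturbance history
    meas = rng.normal(0.0, 1.0, 3)
    h = obs.step(meas, h)
action = controller(state, h, K)
print(action.shape)  # → (2,)
```

In the paper's scheme both subnetworks are optimized jointly by the RL objective, so the observer learns whatever encoding of the disturbance history is most useful to the controller, rather than being trained against ground-truth disturbance labels.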
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Lu, Wenjie | - |
dc.contributor.author | Hu, Manman | - |
dc.date.accessioned | 2024-03-11T10:30:03Z | - |
dc.date.available | 2024-03-11T10:30:03Z | - |
dc.date.issued | 2023-03-01 | - |
dc.identifier.citation | Robotics and Autonomous Systems, 2023, v. 161 | - |
dc.identifier.issn | 0921-8890 | - |
dc.identifier.uri | http://hdl.handle.net/10722/338592 | - |
dc.description.abstract | This paper presents a disturbance-aware Reinforcement Learning (RL) approach for stabilizing a free-floating platform under excessive external disturbances. In particular, we consider the scenarios where disturbances frequently exceed actuator limits and largely affect the dynamics characterizing the disturbed platform. This stabilization problem is better described by a set of Unknown Partially Observable Markovian Decision Processes (POMDPs), as opposed to a single-POMDP formulation, making online disturbance awareness necessary. This paper proposes a new Disturbance-Observer network (DO-net) that mimics prediction procedures through an auxiliary Gated Recurrent Unit (GRU), for the purpose of estimating and encoding the disturbance states and the disturbance transition functions, respectively. Then the controller subnetwork is trained with joint optimization of the observer subnetwork in an RL manner for mutual robustness and runtime efficiency. Numerical simulations on position regulation tasks have demonstrated that the DO-net outperforms the DOB-net and reduces the gap with an ideal performance estimate, the latter of which is obtained by a commercial solver given precise disturbance knowledge. | -
dc.language | eng | - |
dc.publisher | Elsevier | - |
dc.relation.ispartof | Robotics and Autonomous Systems | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject | Disturbance observer | - |
dc.subject | Disturbance rejection | - |
dc.subject | Reinforcement learning | - |
dc.title | Disturbance-aware reinforcement learning for rejecting excessive disturbances | - |
dc.type | Article | - |
dc.identifier.doi | 10.1016/j.robot.2022.104341 | - |
dc.identifier.scopus | eid_2-s2.0-85145648620 | - |
dc.identifier.volume | 161 | - |
dc.identifier.isi | WOS:000950855200001 | - |
dc.identifier.issnl | 0921-8890 | - |