File Download: There are no files associated with this item.

Article: Disturbance-aware reinforcement learning for rejecting excessive disturbances

Title: Disturbance-aware reinforcement learning for rejecting excessive disturbances
Authors: Lu, Wenjie; Hu, Manman
Keywords: Disturbance observer; Disturbance rejection; Reinforcement learning
Issue Date: 1-Mar-2023
Publisher: Elsevier
Citation: Robotics and Autonomous Systems, 2023, v. 161
Abstract

This paper presents a disturbance-aware Reinforcement Learning (RL) approach for stabilizing a free-floating platform under excessive external disturbances. In particular, we consider scenarios where disturbances frequently exceed actuator limits and largely alter the dynamics of the disturbed platform. This stabilization problem is better described by a set of unknown Partially Observable Markov Decision Processes (POMDPs), as opposed to a single-POMDP formulation, making online disturbance awareness necessary. This paper proposes a new Disturbance-Observer network (DO-net) that mimics prediction procedures through an auxiliary Gated Recurrent Unit (GRU) in order to estimate the disturbance states and encode the disturbance transition functions. The controller subnetwork is then trained jointly with the observer subnetwork in an RL manner for mutual robustness and runtime efficiency. Numerical simulations on position regulation tasks demonstrate that the DO-net outperforms the DOB-net and narrows the gap to an ideal performance estimate, the latter obtained by a commercial solver given precise disturbance knowledge.
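To make the architecture described in the abstract more concrete, the sketch below shows one plausible way to wire a GRU-based observer subnetwork to a controller subnetwork in PyTorch. It is an illustrative assumption only: the module names (DONetSketch, dist_head), the layer sizes, and the exact interface between observer and controller are not specified in this record and are chosen for clarity, not taken from the paper.

# Minimal sketch of a disturbance-aware observer/controller network,
# loosely following the DO-net description in the abstract (GRU-based
# disturbance observer + controller subnetwork trained jointly via RL).
# All names, dimensions, and the exact wiring are assumptions made for
# illustration; they are not taken from the paper.
import torch
import torch.nn as nn


class DONetSketch(nn.Module):
    def __init__(self, obs_dim=12, act_dim=4, dist_dim=6, hidden_dim=64):
        super().__init__()
        # Observer subnetwork: a GRU that encodes the recent history of
        # observations/actions and predicts the current disturbance state.
        self.observer = nn.GRU(obs_dim + act_dim, hidden_dim, batch_first=True)
        self.dist_head = nn.Linear(hidden_dim, dist_dim)
        # Controller subnetwork: maps the current observation plus the
        # encoded disturbance information to a bounded action.
        self.controller = nn.Sequential(
            nn.Linear(obs_dim + hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, act_dim),
            nn.Tanh(),  # actions rescaled to actuator limits elsewhere
        )

    def forward(self, obs_act_history, obs_now, h0=None):
        # obs_act_history: (batch, T, obs_dim + act_dim) past transitions
        # obs_now:         (batch, obs_dim) current observation
        encoded, h_n = self.observer(obs_act_history, h0)
        latent = encoded[:, -1, :]            # latent disturbance encoding
        dist_est = self.dist_head(latent)     # auxiliary disturbance estimate
        action = self.controller(torch.cat([obs_now, latent], dim=-1))
        return action, dist_est, h_n


# Usage sketch: run a forward pass on dummy data.
net = DONetSketch()
history = torch.zeros(1, 10, 12 + 4)
obs = torch.zeros(1, 12)
action, dist_est, _ = net(history, obs)

In such a setup, the auxiliary disturbance-prediction head would be supervised with an estimation loss while the controller is optimized with an RL objective, mirroring the joint training of observer and controller subnetworks that the abstract describes.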


Persistent Identifier: http://hdl.handle.net/10722/338592
ISSN: 0921-8890
2023 Impact Factor: 4.3
2023 SCImago Journal Rankings: 1.303
ISI Accession Number: WOS:000950855200001

 

DC Field | Value | Language
dc.contributor.author | Lu, Wenjie | -
dc.contributor.author | Hu, Manman | -
dc.date.accessioned | 2024-03-11T10:30:03Z | -
dc.date.available | 2024-03-11T10:30:03Z | -
dc.date.issued | 2023-03-01 | -
dc.identifier.citation | Robotics and Autonomous Systems, 2023, v. 161 | -
dc.identifier.issn | 0921-8890 | -
dc.identifier.uri | http://hdl.handle.net/10722/338592 | -
dc.description.abstract | This paper presents a disturbance-aware Reinforcement Learning (RL) approach for stabilizing a free-floating platform under excessive external disturbances. In particular, we consider scenarios where disturbances frequently exceed actuator limits and largely alter the dynamics of the disturbed platform. This stabilization problem is better described by a set of unknown Partially Observable Markov Decision Processes (POMDPs), as opposed to a single-POMDP formulation, making online disturbance awareness necessary. This paper proposes a new Disturbance-Observer network (DO-net) that mimics prediction procedures through an auxiliary Gated Recurrent Unit (GRU) in order to estimate the disturbance states and encode the disturbance transition functions. The controller subnetwork is then trained jointly with the observer subnetwork in an RL manner for mutual robustness and runtime efficiency. Numerical simulations on position regulation tasks demonstrate that the DO-net outperforms the DOB-net and narrows the gap to an ideal performance estimate, the latter obtained by a commercial solver given precise disturbance knowledge. | -
dc.language | eng | -
dc.publisher | Elsevier | -
dc.relation.ispartof | Robotics and Autonomous Systems | -
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | -
dc.subject | Disturbance observer | -
dc.subject | Disturbance rejection | -
dc.subject | Reinforcement learning | -
dc.title | Disturbance-aware reinforcement learning for rejecting excessive disturbances | -
dc.type | Article | -
dc.identifier.doi | 10.1016/j.robot.2022.104341 | -
dc.identifier.scopus | eid_2-s2.0-85145648620 | -
dc.identifier.volume | 161 | -
dc.identifier.isi | WOS:000950855200001 | -
dc.identifier.issnl | 0921-8890 | -
