Links for fulltext (may require subscription):
- Publisher Website (DOI): 10.1016/j.robot.2022.104341
- Scopus: eid_2-s2.0-85145648620
- WOS: WOS:000950855200001
Article: Disturbance-aware reinforcement learning for rejecting excessive disturbances
Title | Disturbance-aware reinforcement learning for rejecting excessive disturbances |
---|---|
Authors | Lu, Wenjie; Hu, Manman |
Keywords | Disturbance observer; Disturbance rejection; Reinforcement learning |
Issue Date | 1-Mar-2023 |
Publisher | Elsevier |
Citation | Robotics and Autonomous Systems, 2023, v. 161 |
Abstract | This paper presents a disturbance-aware Reinforcement Learning (RL) approach for stabilizing a free-floating platform under excessive external disturbances. In particular, we consider the scenarios where disturbances frequently exceed actuator limits and largely affect the dynamics characterizing the disturbed platform. This stabilization problem is better described by a set of Unknown Partially Observable Markovian Decision Processes (POMDPs), as opposed to a single-POMDP formulation, making online disturbance awareness necessary. This paper proposes a new Disturbance-Observer network (DO-net) that mimics prediction procedures through an auxiliary Gated Recurrent Unit (GRU), for the purpose of estimating and encoding the disturbance states and the disturbance transition functions, respectively. Then the controller subnetwork is trained with joint optimization of the observer subnetwork in an RL manner for mutual robustness and runtime efficiency. Numerical simulations on position regulation tasks have demonstrated that the DO-net outperforms the DOB-net and reduces the gap with an ideal performance estimate, the latter of which is obtained by a commercial solver given precise disturbance knowledge. |
Persistent Identifier | http://hdl.handle.net/10722/338592 |
ISSN | 0921-8890 (2023 Impact Factor: 4.3; 2023 SCImago Journal Rankings: 1.303) |
ISI Accession Number | WOS:000950855200001 |
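The abstract describes an observer subnetwork built around a GRU that encodes the recent disturbance history into a latent state, which a jointly trained controller subnetwork then consumes. As a rough, self-contained illustration of that idea (this is not the paper's DO-net; the class names, dimensions, and the linear controller head below are all hypothetical), a GRU-style observer update feeding a bounded control output can be sketched as:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUObserver:
    """Encodes a history of disturbance-related measurements into a
    hidden state h, a stand-in for the latent disturbance estimate."""

    def __init__(self, n_in, n_hid):
        s = 1.0 / np.sqrt(n_hid)
        # One weight matrix per gate, each acting on the stacked [x; h].
        self.Wz = rng.uniform(-s, s, (n_hid, n_in + n_hid))
        self.Wr = rng.uniform(-s, s, (n_hid, n_in + n_hid))
        self.Wh = rng.uniform(-s, s, (n_hid, n_in + n_hid))

    def step(self, x, h):
        xh = np.concatenate([x, h])
        z = sigmoid(self.Wz @ xh)                          # update gate
        r = sigmoid(self.Wr @ xh)                          # reset gate
        h_cand = np.tanh(self.Wh @ np.concatenate([x, r * h]))
        return (1.0 - z) * h + z * h_cand                  # blended new state

def controller(state, h, K):
    """Hypothetical controller head: action from platform state plus the
    observer code; tanh keeps it bounded, like saturated actuators."""
    return np.tanh(K @ np.concatenate([state, h]))

obs = GRUObserver(n_in=3, n_hid=8)
K = rng.uniform(-0.3, 0.3, (2, 3 + 8))   # 2 actuator channels (illustrative)

h = np.zeros(8)
state = np.array([0.5, -0.2, 0.1])
for _ in range(20):                      # feed a synthetic disturbance history
    meas = rng.normal(0.0, 1.0, 3)
    h = obs.step(meas, h)
action = controller(state, h, K)
print(action.shape)  # → (2,)
```

In the paper's scheme both subnetworks are optimized jointly by the RL objective, so the observer learns whatever encoding of the disturbance history is most useful to the controller, rather than being trained against ground-truth disturbance labels.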
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Lu, Wenjie | - |
dc.contributor.author | Hu, Manman | - |
dc.date.accessioned | 2024-03-11T10:30:03Z | - |
dc.date.available | 2024-03-11T10:30:03Z | - |
dc.date.issued | 2023-03-01 | - |
dc.identifier.citation | Robotics and Autonomous Systems, 2023, v. 161 | - |
dc.identifier.issn | 0921-8890 | - |
dc.identifier.uri | http://hdl.handle.net/10722/338592 | - |
dc.description.abstract | This paper presents a disturbance-aware Reinforcement Learning (RL) approach for stabilizing a free-floating platform under excessive external disturbances. In particular, we consider the scenarios where disturbances frequently exceed actuator limits and largely affect the dynamics characterizing the disturbed platform. This stabilization problem is better described by a set of Unknown Partially Observable Markovian Decision Processes (POMDPs), as opposed to a single-POMDP formulation, making online disturbance awareness necessary. This paper proposes a new Disturbance-Observer network (DO-net) that mimics prediction procedures through an auxiliary Gated Recurrent Unit (GRU), for the purpose of estimating and encoding the disturbance states and the disturbance transition functions, respectively. Then the controller subnetwork is trained with joint optimization of the observer subnetwork in an RL manner for mutual robustness and runtime efficiency. Numerical simulations on position regulation tasks have demonstrated that the DO-net outperforms the DOB-net and reduces the gap with an ideal performance estimate, the latter of which is obtained by a commercial solver given precise disturbance knowledge. | -
dc.language | eng | - |
dc.publisher | Elsevier | - |
dc.relation.ispartof | Robotics and Autonomous Systems | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject | Disturbance observer | - |
dc.subject | Disturbance rejection | - |
dc.subject | Reinforcement learning | - |
dc.title | Disturbance-aware reinforcement learning for rejecting excessive disturbances | - |
dc.type | Article | - |
dc.identifier.doi | 10.1016/j.robot.2022.104341 | - |
dc.identifier.scopus | eid_2-s2.0-85145648620 | - |
dc.identifier.volume | 161 | - |
dc.identifier.isi | WOS:000950855200001 | - |
dc.identifier.issnl | 0921-8890 | - |