Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1109/TAC.2023.3328827
- Scopus: eid_2-s2.0-85181807035
Article: Regret Analysis of Learning-Based MPC With Partially-Unknown Cost Function
Title | Regret Analysis of Learning-Based MPC With Partially-Unknown Cost Function |
---|---|
Authors | Dogan, I; Shen, ZJM; Aswani, A |
Keywords | Adaptation models; Control systems; Cost function; Costs; HVAC; Learning-Based control; Linear systems; model predictive control; non-myopic exploitation; restless bandits; Ventilation |
Issue Date | 1-Jul-2023 |
Publisher | Institute of Electrical and Electronics Engineers |
Citation | IEEE Transactions on Automatic Control, 2023, p. 1-8 |
Abstract | The exploration/exploitation trade-off is an inherent challenge in data-driven adaptive control. Though this trade-off has been studied for multi-armed bandits (MABs) and reinforcement learning for linear systems, it is less well-studied for learning-based control of nonlinear systems. A significant theoretical challenge in the nonlinear setting is that there is no explicit characterization of an optimal controller for a given set of cost and system parameters. We propose the use of a finite-horizon oracle controller with full knowledge of parameters as a reasonable surrogate for the optimal controller. This allows us to develop policies in the context of learning-based MPC and MABs and conduct a control-theoretic analysis using techniques from MPC and optimization theory to show these policies achieve low regret with respect to this finite-horizon oracle. Our simulations exhibit the low regret of our policy on a heating, ventilation, and air-conditioning model with a partially-unknown cost function. |
Persistent Identifier | http://hdl.handle.net/10722/336546 |
ISSN | 0018-9286 (2023 Impact Factor: 6.2; 2023 SCImago Journal Rankings: 4.501) |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Dogan, I | - |
dc.contributor.author | Shen, ZJM | - |
dc.contributor.author | Aswani, A | - |
dc.date.accessioned | 2024-02-16T03:57:37Z | - |
dc.date.available | 2024-02-16T03:57:37Z | - |
dc.date.issued | 2023-07-01 | - |
dc.identifier.citation | IEEE Transactions on Automatic Control, 2023, p. 1-8 | - |
dc.identifier.issn | 0018-9286 | - |
dc.identifier.uri | http://hdl.handle.net/10722/336546 | - |
dc.description.abstract | The exploration/exploitation trade-off is an inherent challenge in data-driven adaptive control. Though this trade-off has been studied for multi-armed bandits (MABs) and reinforcement learning for linear systems, it is less well-studied for learning-based control of nonlinear systems. A significant theoretical challenge in the nonlinear setting is that there is no explicit characterization of an optimal controller for a given set of cost and system parameters. We propose the use of a finite-horizon oracle controller with full knowledge of parameters as a reasonable surrogate for the optimal controller. This allows us to develop policies in the context of learning-based MPC and MABs and conduct a control-theoretic analysis using techniques from MPC and optimization theory to show these policies achieve low regret with respect to this finite-horizon oracle. Our simulations exhibit the low regret of our policy on a heating, ventilation, and air-conditioning model with a partially-unknown cost function. | - |
dc.language | eng | - |
dc.publisher | Institute of Electrical and Electronics Engineers | - |
dc.relation.ispartof | IEEE Transactions on Automatic Control | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject | Adaptation models | - |
dc.subject | Control systems | - |
dc.subject | Cost function | - |
dc.subject | Costs | - |
dc.subject | HVAC | - |
dc.subject | Learning-Based control | - |
dc.subject | Linear systems | - |
dc.subject | model predictive control | - |
dc.subject | non-myopic exploitation | - |
dc.subject | restless bandits | - |
dc.subject | Ventilation | - |
dc.title | Regret Analysis of Learning-Based MPC With Partially-Unknown Cost Function | - |
dc.type | Article | - |
dc.identifier.doi | 10.1109/TAC.2023.3328827 | - |
dc.identifier.scopus | eid_2-s2.0-85181807035 | - |
dc.identifier.spage | 1 | - |
dc.identifier.epage | 8 | - |
dc.identifier.eissn | 1558-2523 | - |
dc.identifier.issnl | 0018-9286 | - |