Regret Analysis of Learning-Based MPC With Partially-Unknown Cost Function

Dogan, I; Shen, ZJM; Aswani, A

Links for fulltext

(May Require Subscription)

Publisher Website: 10.1109/TAC.2023.3328827
Scopus: eid_2-s2.0-85181807035
WOS: WOS:001262903700001

Citations:
- Scopus: 0
- Web of Science: 0
Appears in Collections:
- Industrial & Manufacturing Systems Engineering: Journal/Magazine Articles
- President's Office: Journal/Magazine Articles

Article: Regret Analysis of Learning-Based MPC With Partially-Unknown Cost Function

Title	Regret Analysis of Learning-Based MPC With Partially-Unknown Cost Function
Authors	Dogan, I; Shen, ZJM; Aswani, A
Keywords	Adaptation models; Control systems; Cost function; Costs; HVAC; Learning-Based control; Linear systems; model predictive control; non-myopic exploitation; restless bandits; Ventilation
Issue Date	1-Jul-2023
Publisher	Institute of Electrical and Electronics Engineers
Citation	IEEE Transactions on Automatic Control, 2023, p. 1-8. DOI: http://dx.doi.org/10.1109/TAC.2023.3328827
Abstract	The exploration/exploitation trade-off is an inherent challenge in data-driven adaptive control. Though this trade-off has been studied for multi-armed bandits (MABs) and reinforcement learning for linear systems, it is less well-studied for learning-based control of nonlinear systems. A significant theoretical challenge in the nonlinear setting is that there is no explicit characterization of an optimal controller for a given set of cost and system parameters. We propose the use of a finite-horizon oracle controller with full knowledge of parameters as a reasonable surrogate for the optimal controller. This allows us to develop policies in the context of learning-based MPC and MABs and conduct a control-theoretic analysis using techniques from MPC and optimization theory to show these policies achieve low regret with respect to this finite-horizon oracle. Our simulations exhibit the low regret of our policy on a heating, ventilation, and air-conditioning model with a partially-unknown cost function.
Persistent Identifier	http://hdl.handle.net/10722/336546
ISSN	0018-9286
2023 Impact Factor	6.2
2023 SCImago Journal Rankings	4.501
ISI Accession Number ID	WOS:001262903700001

DC Field	Value	Language
dc.contributor.author	Dogan, I	-
dc.contributor.author	Shen, ZJM	-
dc.contributor.author	Aswani, A	-
dc.date.accessioned	2024-02-16T03:57:37Z	-
dc.date.available	2024-02-16T03:57:37Z	-
dc.date.issued	2023-07-01	-
dc.identifier.citation	IEEE Transactions on Automatic Control, 2023, p. 1-8	-
dc.identifier.issn	0018-9286	-
dc.identifier.uri	http://hdl.handle.net/10722/336546	-
dc.description.abstract	The exploration/exploitation trade-off is an inherent challenge in data-driven adaptive control. Though this trade-off has been studied for multi-armed bandits (MABs) and reinforcement learning for linear systems, it is less well-studied for learning-based control of nonlinear systems. A significant theoretical challenge in the nonlinear setting is that there is no explicit characterization of an optimal controller for a given set of cost and system parameters. We propose the use of a finite-horizon oracle controller with full knowledge of parameters as a reasonable surrogate for the optimal controller. This allows us to develop policies in the context of learning-based MPC and MABs and conduct a control-theoretic analysis using techniques from MPC and optimization theory to show these policies achieve low regret with respect to this finite-horizon oracle. Our simulations exhibit the low regret of our policy on a heating, ventilation, and air-conditioning model with a partially-unknown cost function.	-
dc.language	eng	-
dc.publisher	Institute of Electrical and Electronics Engineers	-
dc.relation.ispartof	IEEE Transactions on Automatic Control	-
dc.rights	This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.	-
dc.subject	Adaptation models	-
dc.subject	Control systems	-
dc.subject	Cost function	-
dc.subject	Costs	-
dc.subject	HVAC	-
dc.subject	Learning-Based control	-
dc.subject	Linear systems	-
dc.subject	model predictive control	-
dc.subject	non-myopic exploitation	-
dc.subject	restless bandits	-
dc.subject	Ventilation	-
dc.title	Regret Analysis of Learning-Based MPC With Partially-Unknown Cost Function	-
dc.type	Article	-
dc.identifier.doi	10.1109/TAC.2023.3328827	-
dc.identifier.scopus	eid_2-s2.0-85181807035	-
dc.identifier.spage	1	-
dc.identifier.epage	8	-
dc.identifier.eissn	1558-2523	-
dc.identifier.isi	WOS:001262903700001	-
dc.identifier.issnl	0018-9286	-
