Links for fulltext (may require subscription):
- Publisher (DOI): 10.1609/aaai.v34i02.5574
- Scopus: eid_2-s2.0-85099875908
- Web of Science: WOS:000667722802012
Conference Paper: Model and reinforcement learning for Markov games with risk preferences
Title | Model and reinforcement learning for Markov games with risk preferences |
---|---|
Authors | Huang, Wenjie; Hai, Pham Viet; Haskell, William B. |
Issue Date | 2020 |
Citation | The Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20), New York, NY, 7-12 February 2020. In Proceedings of the AAAI Conference on Artificial Intelligence, 2020, v. 34 n. 2, p. 2022-2029 |
Abstract | We motivate and propose a new model for non-cooperative Markov games that considers the interactions of risk-aware players. This model characterizes the time-consistent dynamic “risk” arising both from stochastic state transitions (inherent to the game) and from randomized mixed strategies (due to all other players). An appropriate risk-aware equilibrium concept is proposed, and the existence of such equilibria in stationary strategies is demonstrated by an application of Kakutani's fixed-point theorem. We further propose a simulation-based Q-learning-type algorithm for risk-aware equilibrium computation. This algorithm works with a special form of minimax risk measures that can naturally be written as saddle-point stochastic optimization problems and covers many widely investigated risk measures. Finally, the almost-sure convergence of this simulation-based algorithm to an equilibrium is demonstrated under mild conditions. Our numerical experiments on a two-player queueing game validate the properties of our model and algorithm, and demonstrate their worth and applicability in real-life competitive decision-making. |
Persistent Identifier | http://hdl.handle.net/10722/308899 |
ISI Accession Number ID | WOS:000667722802012 |
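To give a concrete flavor of the Q-learning-type approach the abstract describes, the sketch below runs tabular Q-learning on a hypothetical two-state, two-action cost-minimization MDP, replacing the plain expectation in the bootstrap target with a mean-plus-semideviation risk adjustment. This is an illustrative assumption only: the toy game, all parameter names, and the specific risk measure are ours, not the paper's, and the paper's actual algorithm treats multi-player games and general minimax risk measures via a saddle-point formulation.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 2, 2
gamma, alpha, kappa = 0.9, 0.1, 0.5   # discount, step size, risk weight

# Hypothetical cost and transition tables (costs are minimized).
cost = np.array([[1.0, 0.2], [0.5, 1.5]])            # cost[s, a]
P = np.array([[[0.7, 0.3], [0.1, 0.9]],              # P[s, a, s']
              [[0.5, 0.5], [0.8, 0.2]]])

Q = np.zeros((n_states, n_actions))
s = 0
for _ in range(20000):
    # Epsilon-greedy action selection over estimated costs.
    a = rng.integers(n_actions) if rng.random() < 0.1 else int(Q[s].argmin())
    s_next = rng.choice(n_states, p=P[s, a])
    # Risk-adjusted target: mean plus upper semideviation of the next-state
    # value, one simple member of the convex risk-measure family (chosen here
    # for illustration, not the paper's minimax formulation).
    v_next = Q.min(axis=1)                       # value of each next state
    mean_v = P[s, a] @ v_next
    semidev = P[s, a] @ np.maximum(v_next - mean_v, 0.0)
    target = cost[s, a] + gamma * (mean_v + kappa * semidev)
    Q[s, a] += alpha * (target - Q[s, a])
    s = s_next

print(np.round(Q, 2))
```

With `kappa = 0`, the update reduces to ordinary risk-neutral Q-learning; increasing `kappa` penalizes actions whose next-state values are more dispersed above their mean, which is the qualitative effect of risk aversion the model captures.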
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Huang, Wenjie | - |
dc.contributor.author | Hai, Pham Viet | - |
dc.contributor.author | Haskell, William B. | - |
dc.date.accessioned | 2021-12-08T07:50:22Z | - |
dc.date.available | 2021-12-08T07:50:22Z | - |
dc.date.issued | 2020 | - |
dc.identifier.citation | The Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20), New York, NY, 7-12 February 2020. In Proceedings of the AAAI Conference on Artificial Intelligence, 2020, v. 34 n. 2, p. 2022-2029 | - |
dc.identifier.uri | http://hdl.handle.net/10722/308899 | - |
dc.description.abstract | We motivate and propose a new model for non-cooperative Markov games that considers the interactions of risk-aware players. This model characterizes the time-consistent dynamic “risk” arising both from stochastic state transitions (inherent to the game) and from randomized mixed strategies (due to all other players). An appropriate risk-aware equilibrium concept is proposed, and the existence of such equilibria in stationary strategies is demonstrated by an application of Kakutani's fixed-point theorem. We further propose a simulation-based Q-learning-type algorithm for risk-aware equilibrium computation. This algorithm works with a special form of minimax risk measures that can naturally be written as saddle-point stochastic optimization problems and covers many widely investigated risk measures. Finally, the almost-sure convergence of this simulation-based algorithm to an equilibrium is demonstrated under mild conditions. Our numerical experiments on a two-player queueing game validate the properties of our model and algorithm, and demonstrate their worth and applicability in real-life competitive decision-making. | - |
dc.language | eng | - |
dc.relation.ispartof | Proceedings of the AAAI Conference on Artificial Intelligence | - |
dc.title | Model and reinforcement learning for Markov games with risk preferences | - |
dc.type | Conference_Paper | - |
dc.description.nature | link_to_OA_fulltext | - |
dc.identifier.doi | 10.1609/aaai.v34i02.5574 | - |
dc.identifier.scopus | eid_2-s2.0-85099875908 | - |
dc.identifier.volume | 34 | - |
dc.identifier.issue | 2 | - |
dc.identifier.spage | 2022 | - |
dc.identifier.epage | 2029 | - |
dc.identifier.isi | WOS:000667722802012 | - |