Conference Paper: State-Aware Value Function Approximation with Attention Mechanism for Restless Multi-armed Bandits

Title: State-Aware Value Function Approximation with Attention Mechanism for Restless Multi-armed Bandits
Authors: Wu, S; Zhao, J; Tian, G; Wang, J
Keywords: Agent-based and Multi-agent Systems: Multi-agent Planning
Agent-based and Multi-agent Systems: Resource Allocation
Planning and Scheduling: Planning and Scheduling
Planning and Scheduling: Markov Decision Processes
Issue Date: 2021
Publisher: International Joint Conference on Artificial Intelligence. The proceedings web site is located at https://www.ijcai.org/past_proceedings
Citation: Proceedings of the 30th International Joint Conference on Artificial Intelligence (IJCAI-21), Virtual Meeting, Montreal, Canada, 19-27 August 2021, pp. 458-464
Abstract: The restless multi-armed bandit (RMAB) problem is a generalization of the multi-armed bandit with non-stationary rewards. Its optimal solution is intractable due to exponentially large state and action spaces with respect to the number of arms. Existing approximation approaches, e.g., Whittle's index policy, have difficulty in capturing either temporal or spatial factors such as impacts from other arms. We propose considering both factors using the attention mechanism, which has achieved great success in deep learning. Our state-aware value function approximation solution comprises an attention-based value function approximator and a Bellman equation solver. The attention-based coordination module captures both spatial and temporal factors for arm coordination. The Bellman equation solver utilizes the decoupling structure of RMABs to acquire solutions with significantly reduced computation overheads. In particular, the time complexity of our approximation is linear in the number of arms. Finally, we illustrate the effectiveness and investigate the properties of our proposed method with numerical experiments.
Description: Main Track: Agent-based and Multi-agent Systems
Persistent Identifier: http://hdl.handle.net/10722/299754
ISSN: 1045-0823
2020 SCImago Journal Rankings: 0.649
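The abstract above names two ingredients: an attention module that lets each arm's value estimate reflect the states of the other arms, and a Bellman equation solver that exploits the decoupled per-arm transition structure so the cost grows linearly in the number of arms. The sketch below is only an illustration of those two ideas, not the authors' implementation; every name in it (attention_features, decoupled_bellman, P, R, gamma, the embedding dimension D) and all shapes and hyperparameters are assumptions made for the example, with random stand-in MDP parameters in place of a real RMAB instance.

# Illustrative sketch only (NumPy); NOT the paper's architecture.
# (1) A toy single-head scaled dot-product self-attention over per-arm
#     state embeddings, with identity query/key/value projections for brevity.
# (2) Per-arm Bellman backups that use the decoupled transition structure,
#     so one sweep costs O(N * A * S^2), i.e. linear in the number of arms N.
import numpy as np

rng = np.random.default_rng(0)
N, S, A, D, gamma = 5, 4, 2, 8, 0.95   # arms, states per arm, actions, embed dim, discount

# Random stand-ins for the per-arm MDP parameters of an RMAB instance.
P = rng.dirichlet(np.ones(S), size=(N, A, S))   # P[i, a, s, s'] transition kernels
R = rng.random((N, A, S))                       # R[i, a, s] rewards

def attention_features(states):
    """Mix per-arm state embeddings with self-attention so each arm's
    feature can reflect the other arms' current states."""
    E = rng.standard_normal((S, D)) / np.sqrt(D)    # stand-in for a learned embedding table
    X = E[states]                                   # (N, D): one embedding per arm
    scores = X @ X.T / np.sqrt(D)                   # (N, N) pairwise attention scores
    W = np.exp(scores - scores.max(axis=1, keepdims=True))
    W /= W.sum(axis=1, keepdims=True)               # row-wise softmax over arms
    return W @ X                                    # (N, D) coordinated per-arm features

def decoupled_bellman(V=None, iters=200):
    """Per-arm value iteration: each arm is backed up against its own
    kernel only, so total work per sweep is linear in N."""
    V = np.zeros((N, S)) if V is None else V
    for _ in range(iters):
        Q = R + gamma * np.einsum('iast,it->ias', P, V)   # Q[i, a, s]
        V = Q.max(axis=1)                                 # greedy per-arm backup
    return V

states = rng.integers(0, S, size=N)     # current state of each arm
feats = attention_features(states)      # context-aware per-arm features
V = decoupled_bellman()                 # per-arm value estimates
print(feats.shape, V.shape)             # (5, 8) (5, 4)

In the full RMAB the greedy per-arm backup would also have to respect the global budget on how many arms may be activated per step; the sketch omits that coupling so the linear-in-N cost of the decoupled backups stays visible.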

 

DC Field | Value | Language
dc.contributor.author | Wu, S | -
dc.contributor.author | Zhao, J | -
dc.contributor.author | Tian, G | -
dc.contributor.author | Wang, J | -
dc.date.accessioned | 2021-05-26T03:28:35Z | -
dc.date.available | 2021-05-26T03:28:35Z | -
dc.date.issued | 2021 | -
dc.identifier.citation | Proceedings of the 30th International Joint Conference on Artificial Intelligence (IJCAI-21), Virtual Meeting, Montreal, Canada, 19-27 August 2021, pp. 458-464 | -
dc.identifier.issn | 1045-0823 | -
dc.identifier.uri | http://hdl.handle.net/10722/299754 | -
dc.description | Main Track: Agent-based and Multi-agent Systems | -
dc.description.abstract | The restless multi-armed bandit (RMAB) problem is a generalization of the multi-armed bandit with non-stationary rewards. Its optimal solution is intractable due to exponentially large state and action spaces with respect to the number of arms. Existing approximation approaches, e.g., Whittle's index policy, have difficulty in capturing either temporal or spatial factors such as impacts from other arms. We propose considering both factors using the attention mechanism, which has achieved great success in deep learning. Our state-aware value function approximation solution comprises an attention-based value function approximator and a Bellman equation solver. The attention-based coordination module captures both spatial and temporal factors for arm coordination. The Bellman equation solver utilizes the decoupling structure of RMABs to acquire solutions with significantly reduced computation overheads. In particular, the time complexity of our approximation is linear in the number of arms. Finally, we illustrate the effectiveness and investigate the properties of our proposed method with numerical experiments. | -
dc.language | eng | -
dc.publisher | International Joint Conference on Artificial Intelligence. The proceedings web site is located at https://www.ijcai.org/past_proceedings | -
dc.relation.ispartof | The 30th International Joint Conference on Artificial Intelligence (IJCAI) | -
dc.subject | Agent-based and Multi-agent Systems: Multi-agent Planning | -
dc.subject | Agent-based and Multi-agent Systems: Resource Allocation | -
dc.subject | Planning and Scheduling: Planning and Scheduling | -
dc.subject | Planning and Scheduling: Markov Decision Processes | -
dc.title | State-Aware Value Function Approximation with Attention Mechanism for Restless Multi-armed Bandits | -
dc.type | Conference_Paper | -
dc.identifier.doi | 10.24963/ijcai.2021/64 | -
dc.identifier.hkuros | 322587 | -
dc.identifier.spage | 458 | -
dc.identifier.epage | 464 | -
dc.publisher.place | United States | -
dc.identifier.eisbn | 9780999241196 | -
