Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1109/TWC.2019.2935201
- Scopus: eid_2-s2.0-85079784199
- WOS: WOS:000522027400001
Article: Multi-Agent Reinforcement Learning-Based Resource Allocation for UAV Networks
| Title | Multi-Agent Reinforcement Learning-Based Resource Allocation for UAV Networks |
|---|---|
| Authors | Cui, Jingjing; Liu, Yuanwei; Nallanathan, Arumugam |
| Keywords | Dynamic resource allocation; multi-agent reinforcement learning (MARL); stochastic games; UAV communications |
| Issue Date | 2020 |
| Citation | IEEE Transactions on Wireless Communications, 2020, v. 19, n. 2, p. 729-743 |
| Abstract | Unmanned aerial vehicles (UAVs) are capable of serving as aerial base stations (BSs) for providing both cost-effective and on-demand wireless communications. This article investigates dynamic resource allocation in multiple-UAV-enabled communication networks with the goal of maximizing long-term rewards. More particularly, each UAV communicates with a ground user by automatically selecting its communicating user, power level, and subchannel without any information exchange among UAVs. To model the dynamics and uncertainty in environments, we formulate the long-term resource allocation problem as a stochastic game for maximizing the expected rewards, where each UAV becomes a learning agent and each resource allocation solution corresponds to an action taken by the UAVs. Afterwards, we develop a multi-agent reinforcement learning (MARL) framework in which each agent learns its best strategy from its local observations. More specifically, we propose an agent-independent method, in which all agents run a decision algorithm independently but share a common structure based on Q-learning. Finally, simulation results reveal that: 1) appropriate parameters for exploitation and exploration are capable of enhancing the performance of the proposed MARL-based resource allocation algorithm; 2) the proposed MARL algorithm provides acceptable performance compared to the case with complete information exchange among UAVs. By doing so, it strikes a good tradeoff between performance gains and information exchange overheads. |
| Persistent Identifier | http://hdl.handle.net/10722/349404 |
| ISSN | 1536-1276 (2023 Impact Factor: 8.9; 2023 SCImago Journal Rankings: 5.371) |
| ISI Accession Number ID | WOS:000522027400001 |
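The agent-independent method described in the abstract, where every UAV runs the same Q-learning decision structure independently with epsilon-greedy exploration over (user, power level, subchannel) actions, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the state space, reward function, and parameter values below are placeholder assumptions.

```python
import random
from collections import defaultdict

class IndependentQAgent:
    """One UAV modeled as an independent Q-learner (illustrative sketch only)."""

    def __init__(self, actions, alpha=0.5, gamma=0.9, epsilon=0.1):
        self.actions = actions          # (user, power_level, subchannel) triples
        self.alpha = alpha              # learning rate
        self.gamma = gamma              # discount factor for long-term rewards
        self.epsilon = epsilon          # exploration probability
        self.q = defaultdict(float)     # Q[(state, action)] -> estimated value

    def act(self, state):
        # Epsilon-greedy: trade off exploration against exploitation.
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Standard Q-learning update, driven only by local observations;
        # no information is exchanged with other agents.
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])

# Toy run: 2 users x 2 power levels x 1 subchannel, a single state, and a
# placeholder reward that favors one hypothetical allocation. Exploration is
# disabled here so the demonstration is deterministic.
actions = [(u, p, 0) for u in range(2) for p in range(2)]
agent = IndependentQAgent(actions, epsilon=0.0)
for _ in range(50):
    for a in actions:
        reward = 1.0 if a == (1, 1, 0) else 0.0  # placeholder reward signal
        agent.update(0, a, reward, 0)
best = max(actions, key=lambda a: agent.q[(0, a)])
```

With the placeholder reward, repeated updates drive the Q-value of the rewarded allocation above the alternatives, so the greedy policy settles on it; in the paper's setting the reward would instead come from the UAV's achieved link quality, and epsilon would stay positive to keep exploring.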
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Cui, Jingjing | - |
| dc.contributor.author | Liu, Yuanwei | - |
| dc.contributor.author | Nallanathan, Arumugam | - |
| dc.date.accessioned | 2024-10-17T06:58:18Z | - |
| dc.date.available | 2024-10-17T06:58:18Z | - |
| dc.date.issued | 2020 | - |
| dc.identifier.citation | IEEE Transactions on Wireless Communications, 2020, v. 19, n. 2, p. 729-743 | - |
| dc.identifier.issn | 1536-1276 | - |
| dc.identifier.uri | http://hdl.handle.net/10722/349404 | - |
| dc.description.abstract | Unmanned aerial vehicles (UAVs) are capable of serving as aerial base stations (BSs) for providing both cost-effective and on-demand wireless communications. This article investigates dynamic resource allocation in multiple-UAV-enabled communication networks with the goal of maximizing long-term rewards. More particularly, each UAV communicates with a ground user by automatically selecting its communicating user, power level, and subchannel without any information exchange among UAVs. To model the dynamics and uncertainty in environments, we formulate the long-term resource allocation problem as a stochastic game for maximizing the expected rewards, where each UAV becomes a learning agent and each resource allocation solution corresponds to an action taken by the UAVs. Afterwards, we develop a multi-agent reinforcement learning (MARL) framework in which each agent learns its best strategy from its local observations. More specifically, we propose an agent-independent method, in which all agents run a decision algorithm independently but share a common structure based on Q-learning. Finally, simulation results reveal that: 1) appropriate parameters for exploitation and exploration are capable of enhancing the performance of the proposed MARL-based resource allocation algorithm; 2) the proposed MARL algorithm provides acceptable performance compared to the case with complete information exchange among UAVs. By doing so, it strikes a good tradeoff between performance gains and information exchange overheads. | - |
| dc.language | eng | - |
| dc.relation.ispartof | IEEE Transactions on Wireless Communications | - |
| dc.subject | Dynamic resource allocation | - |
| dc.subject | multi-agent reinforcement learning (MARL) | - |
| dc.subject | stochastic games | - |
| dc.subject | UAV communications | - |
| dc.title | Multi-Agent Reinforcement Learning-Based Resource Allocation for UAV Networks | - |
| dc.type | Article | - |
| dc.description.nature | link_to_subscribed_fulltext | - |
| dc.identifier.doi | 10.1109/TWC.2019.2935201 | - |
| dc.identifier.scopus | eid_2-s2.0-85079784199 | - |
| dc.identifier.volume | 19 | - |
| dc.identifier.issue | 2 | - |
| dc.identifier.spage | 729 | - |
| dc.identifier.epage | 743 | - |
| dc.identifier.eissn | 1558-2248 | - |
| dc.identifier.isi | WOS:000522027400001 | - |
