Links for fulltext (may require subscription):
- Publisher Website: 10.1109/TWC.2021.3104633
- Scopus: eid_2-s2.0-85113298251
Article: Multi-Agent Reinforcement Learning in NOMA-Aided UAV Networks for Cellular Offloading
Title | Multi-Agent Reinforcement Learning in NOMA-Aided UAV Networks for Cellular Offloading |
---|---|
Authors | Zhong, Ruikang; Liu, Xiao; Liu, Yuanwei; Chen, Yue |
Keywords | Deep Q-network; non-orthogonal multiple access; reinforcement learning; unmanned aerial vehicle |
Issue Date | 2022 |
Citation | IEEE Transactions on Wireless Communications, 2022, v. 21, n. 3, p. 1498-1512 |
Abstract | A novel framework is proposed for cellular offloading with the aid of multiple unmanned aerial vehicles (UAVs), while the non-orthogonal multiple access (NOMA) technique is employed at each UAV to further improve the spectrum efficiency of the wireless network. The optimization problem of joint three-dimensional (3D) trajectory design and power allocation is formulated for maximizing the throughput. Since ground mobile users are assumed to roam continuously, the UAVs need to be re-deployed in a timely manner based on the movement of users. In an effort to solve this pertinent dynamic problem, a K-means based clustering algorithm is first adopted for periodically partitioning users. Afterward, a mutual deep Q-network (MDQN) algorithm is proposed to jointly determine the optimal 3D trajectory and power allocation of UAVs. In contrast to the conventional deep Q-network (DQN) algorithm, the MDQN algorithm enables the experiences of multiple agents to be input into a shared neural network, shortening the training time with the assistance of state abstraction. Numerical results demonstrate that: 1) the proposed MDQN algorithm is capable of converging under minor constraints and has a faster convergence rate than the conventional DQN algorithm in the multi-agent case; 2) the achievable sum rate of the NOMA-enhanced UAV network is 23% higher than in the orthogonal multiple access (OMA) case; 3) by designing the optimal 3D trajectory of UAVs with the MDQN algorithm, the sum rate of the network enjoys 142% and 56% gains over the circular trajectory and the 2D trajectory, respectively. |
Persistent Identifier | http://hdl.handle.net/10722/349600 |
ISSN | 1536-1276 (2023 Impact Factor: 8.9; 2023 SCImago Journal Rankings: 5.371) |
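The K-means step described in the abstract (periodically re-partitioning roaming users so that each UAV serves one cluster) can be sketched as follows. This is a minimal illustration using plain Lloyd's algorithm with a deterministic farthest-point initialisation; the function name and initialisation choice are assumptions for the sketch, not the paper's exact variant.

```python
import numpy as np

def kmeans_partition(user_pos, k, iters=20):
    """Partition 2D user positions into k clusters (one per UAV).

    Plain Lloyd's algorithm; returns (centroids, labels). In the paper's
    setting this would be re-run each period as users roam.
    """
    # Greedy farthest-point initialisation (deterministic).
    centroids = [user_pos[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(user_pos - c, axis=1) for c in centroids], axis=0)
        centroids.append(user_pos[d.argmax()])
    centroids = np.array(centroids)
    for _ in range(iters):
        # Assign each user to its nearest centroid.
        dists = np.linalg.norm(user_pos[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its assigned users.
        for j in range(k):
            members = user_pos[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return centroids, labels

# Two well-separated groups of users, served by 2 UAVs.
rng = np.random.default_rng(1)
users = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(10, 1, (20, 2))])
cents, labs = kmeans_partition(users, k=2)
```

Re-running `kmeans_partition` on updated positions each period gives the UAVs fresh cluster assignments as users move, which is the role the abstract assigns to this step before the MDQN trajectory/power optimisation.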
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Zhong, Ruikang | - |
dc.contributor.author | Liu, Xiao | - |
dc.contributor.author | Liu, Yuanwei | - |
dc.contributor.author | Chen, Yue | - |
dc.date.accessioned | 2024-10-17T06:59:37Z | - |
dc.date.available | 2024-10-17T06:59:37Z | - |
dc.date.issued | 2022 | - |
dc.identifier.citation | IEEE Transactions on Wireless Communications, 2022, v. 21, n. 3, p. 1498-1512 | - |
dc.identifier.issn | 1536-1276 | - |
dc.identifier.uri | http://hdl.handle.net/10722/349600 | - |
dc.description.abstract | A novel framework is proposed for cellular offloading with the aid of multiple unmanned aerial vehicles (UAVs), while the non-orthogonal multiple access (NOMA) technique is employed at each UAV to further improve the spectrum efficiency of the wireless network. The optimization problem of joint three-dimensional (3D) trajectory design and power allocation is formulated for maximizing the throughput. Since ground mobile users are assumed to roam continuously, the UAVs need to be re-deployed in a timely manner based on the movement of users. In an effort to solve this pertinent dynamic problem, a K-means based clustering algorithm is first adopted for periodically partitioning users. Afterward, a mutual deep Q-network (MDQN) algorithm is proposed to jointly determine the optimal 3D trajectory and power allocation of UAVs. In contrast to the conventional deep Q-network (DQN) algorithm, the MDQN algorithm enables the experiences of multiple agents to be input into a shared neural network, shortening the training time with the assistance of state abstraction. Numerical results demonstrate that: 1) the proposed MDQN algorithm is capable of converging under minor constraints and has a faster convergence rate than the conventional DQN algorithm in the multi-agent case; 2) the achievable sum rate of the NOMA-enhanced UAV network is 23% higher than in the orthogonal multiple access (OMA) case; 3) by designing the optimal 3D trajectory of UAVs with the MDQN algorithm, the sum rate of the network enjoys 142% and 56% gains over the circular trajectory and the 2D trajectory, respectively. | -
dc.language | eng | - |
dc.relation.ispartof | IEEE Transactions on Wireless Communications | - |
dc.subject | Deep Q-network | - |
dc.subject | non-orthogonal multiple access | - |
dc.subject | reinforcement learning | - |
dc.subject | unmanned aerial vehicle | - |
dc.title | Multi-Agent Reinforcement Learning in NOMA-Aided UAV Networks for Cellular Offloading | - |
dc.type | Article | - |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.doi | 10.1109/TWC.2021.3104633 | - |
dc.identifier.scopus | eid_2-s2.0-85113298251 | - |
dc.identifier.volume | 21 | - |
dc.identifier.issue | 3 | - |
dc.identifier.spage | 1498 | - |
dc.identifier.epage | 1512 | - |
dc.identifier.eissn | 1558-2248 | - |