Links for fulltext (may require subscription):
- Publisher Website: 10.1109/TWC.2021.3104633
- Scopus: eid_2-s2.0-85113298251
Article: Multi-Agent Reinforcement Learning in NOMA-Aided UAV Networks for Cellular Offloading
Title | Multi-Agent Reinforcement Learning in NOMA-Aided UAV Networks for Cellular Offloading |
---|---|
Authors | Zhong, Ruikang; Liu, Xiao; Liu, Yuanwei; Chen, Yue |
Keywords | Deep Q-network; non-orthogonal multiple access; reinforcement learning; unmanned aerial vehicle |
Issue Date | 2022 |
Citation | IEEE Transactions on Wireless Communications, 2022, v. 21, n. 3, p. 1498-1512 |
Abstract | A novel framework is proposed for cellular offloading with the aid of multiple unmanned aerial vehicles (UAVs), while the non-orthogonal multiple access (NOMA) technique is employed at each UAV to further improve the spectrum efficiency of the wireless network. The optimization problem of joint three-dimensional (3D) trajectory design and power allocation is formulated for maximizing the throughput. Since ground mobile users are assumed to roam continuously, the UAVs need to be re-deployed in a timely manner based on the movement of users. In an effort to solve this pertinent dynamic problem, a K-means based clustering algorithm is first adopted for periodically partitioning users. Afterward, a mutual deep Q-network (MDQN) algorithm is proposed to jointly determine the optimal 3D trajectory and power allocation of UAVs. In contrast to the conventional deep Q-network (DQN) algorithm, the MDQN algorithm enables the experiences of multiple agents to be input into a shared neural network, shortening the training time with the assistance of state abstraction. Numerical results demonstrate that: 1) the proposed MDQN algorithm is capable of converging under minor constraints and has a faster convergence rate than the conventional DQN algorithm in the multi-agent case; 2) the achievable sum rate of the NOMA-enhanced UAV network is 23% higher than in the orthogonal multiple access (OMA) case; 3) by designing the optimal 3D trajectory of UAVs with the MDQN algorithm, the sum rate of the network enjoys 142% and 56% gains over the circular trajectory and the 2D trajectory, respectively. |
Persistent Identifier | http://hdl.handle.net/10722/349600 |
ISSN | 1536-1276 (2023 Impact Factor: 8.9; 2023 SCImago Journal Rankings: 5.371) |
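The K-means step described in the abstract (periodically re-partitioning roaming users so that each UAV serves one cluster) can be sketched as follows. This is a minimal illustration using plain Lloyd's algorithm with a deterministic farthest-point initialisation; the function name and initialisation choice are assumptions for the sketch, not the paper's exact variant.

```python
import numpy as np

def kmeans_partition(user_pos, k, iters=20):
    """Partition 2D user positions into k clusters (one per UAV).

    Plain Lloyd's algorithm; returns (centroids, labels). In the paper's
    setting this would be re-run each period as users roam.
    """
    # Greedy farthest-point initialisation (deterministic).
    centroids = [user_pos[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(user_pos - c, axis=1) for c in centroids], axis=0)
        centroids.append(user_pos[d.argmax()])
    centroids = np.array(centroids)
    for _ in range(iters):
        # Assign each user to its nearest centroid.
        dists = np.linalg.norm(user_pos[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its assigned users.
        for j in range(k):
            members = user_pos[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return centroids, labels

# Two well-separated groups of users, served by 2 UAVs.
rng = np.random.default_rng(1)
users = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(10, 1, (20, 2))])
cents, labs = kmeans_partition(users, k=2)
```

Re-running `kmeans_partition` on updated positions each period gives the UAVs fresh cluster assignments as users move, which is the role the abstract assigns to this step before the MDQN trajectory/power optimisation.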
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Zhong, Ruikang | - |
dc.contributor.author | Liu, Xiao | - |
dc.contributor.author | Liu, Yuanwei | - |
dc.contributor.author | Chen, Yue | - |
dc.date.accessioned | 2024-10-17T06:59:37Z | - |
dc.date.available | 2024-10-17T06:59:37Z | - |
dc.date.issued | 2022 | - |
dc.identifier.citation | IEEE Transactions on Wireless Communications, 2022, v. 21, n. 3, p. 1498-1512 | - |
dc.identifier.issn | 1536-1276 | - |
dc.identifier.uri | http://hdl.handle.net/10722/349600 | - |
dc.description.abstract | A novel framework is proposed for cellular offloading with the aid of multiple unmanned aerial vehicles (UAVs), while the non-orthogonal multiple access (NOMA) technique is employed at each UAV to further improve the spectrum efficiency of the wireless network. The optimization problem of joint three-dimensional (3D) trajectory design and power allocation is formulated for maximizing the throughput. Since ground mobile users are assumed to roam continuously, the UAVs need to be re-deployed in a timely manner based on the movement of users. In an effort to solve this pertinent dynamic problem, a K-means based clustering algorithm is first adopted for periodically partitioning users. Afterward, a mutual deep Q-network (MDQN) algorithm is proposed to jointly determine the optimal 3D trajectory and power allocation of UAVs. In contrast to the conventional deep Q-network (DQN) algorithm, the MDQN algorithm enables the experiences of multiple agents to be input into a shared neural network, shortening the training time with the assistance of state abstraction. Numerical results demonstrate that: 1) the proposed MDQN algorithm is capable of converging under minor constraints and has a faster convergence rate than the conventional DQN algorithm in the multi-agent case; 2) the achievable sum rate of the NOMA-enhanced UAV network is 23% higher than in the orthogonal multiple access (OMA) case; 3) by designing the optimal 3D trajectory of UAVs with the MDQN algorithm, the sum rate of the network enjoys 142% and 56% gains over the circular trajectory and the 2D trajectory, respectively. | -
dc.language | eng | - |
dc.relation.ispartof | IEEE Transactions on Wireless Communications | - |
dc.subject | Deep Q-network | - |
dc.subject | non-orthogonal multiple access | - |
dc.subject | reinforcement learning | - |
dc.subject | unmanned aerial vehicle | - |
dc.title | Multi-Agent Reinforcement Learning in NOMA-Aided UAV Networks for Cellular Offloading | - |
dc.type | Article | - |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.doi | 10.1109/TWC.2021.3104633 | - |
dc.identifier.scopus | eid_2-s2.0-85113298251 | - |
dc.identifier.volume | 21 | - |
dc.identifier.issue | 3 | - |
dc.identifier.spage | 1498 | - |
dc.identifier.epage | 1512 | - |
dc.identifier.eissn | 1558-2248 | - |