A Reliable Reinforcement Learning for Resource Allocation in Uplink NOMA-URLLC Networks

Ahsan, Waleed; Yi, Wenqiang; Liu, Yuanwei; Nallanathan, Arumugam

Links for fulltext

(May Require Subscription)

Publisher Website: 10.1109/TWC.2022.3144618
Scopus: eid_2-s2.0-85124229416
WOS: WOS:000841840300023

Citations:
- Scopus: 0
- Web of Science: 0
Appears in Collections:
- Electrical & Electronic Engineering: Journal/Magazine Articles

Title	A Reliable Reinforcement Learning for Resource Allocation in Uplink NOMA-URLLC Networks
Authors	Ahsan, Waleed; Yi, Wenqiang; Liu, Yuanwei; Nallanathan, Arumugam
Keywords	Deep SARSA-λ learning; non-orthogonal multiple access; power allocation; ultra-reliable low-latency communication; user clustering
Issue Date	2022
Citation	IEEE Transactions on Wireless Communications, 2022, v. 21, n. 8, p. 5989-6002. DOI: http://dx.doi.org/10.1109/TWC.2022.3144618
Abstract	In this paper, we propose a deep state-action-reward-state-action (SARSA) λ learning approach for optimising the uplink resource allocation in non-orthogonal multiple access (NOMA) aided ultra-reliable low-latency communication (URLLC). To reduce the mean decoding error probability in time-varying network environments, this work designs a reliable learning algorithm for providing a long-term resource allocation, where the reward feedback is based on the instantaneous network performance. With the aid of the proposed algorithm, this paper addresses three main challenges of reliable resource sharing in NOMA-URLLC networks: 1) user clustering; 2) an instantaneous feedback system; and 3) optimal resource allocation. All of these designs interact with the considered communication environment. Lastly, we compare the performance of the proposed algorithm with conventional Q-learning and SARSA Q-learning algorithms. The simulation outcomes show that: 1) compared with the traditional Q-learning algorithms, the proposed solution is able to converge within 200 episodes while providing a long-term mean error as low as 10⁻²; 2) NOMA-assisted URLLC outperforms traditional OMA systems in terms of decoding error probabilities; and 3) the proposed feedback system is efficient for the long-term learning process.
Persistent Identifier	http://hdl.handle.net/10722/349689
ISSN	1536-1276 (2023 Impact Factor: 8.9; 2023 SCImago Journal Rankings: 5.371)
ISI Accession Number ID	WOS:000841840300023
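
The deep SARSA-λ approach named in the abstract builds on the SARSA(λ) update with eligibility traces. As a minimal sketch only, assuming a tabular Q-table in place of the paper's deep network and a hypothetical four-state chain environment (`chain_step`) with a cost of -1 per step in place of the NOMA-URLLC network and its reward, the update can be illustrated as follows. The function names, hyperparameter values, and toy environment are illustrative assumptions and are not taken from the paper.

```python
import random
from collections import defaultdict


def chain_step(state, action):
    """Hypothetical toy environment: states 0..3, action 1 moves right, 0 moves left."""
    goal = 3
    next_state = min(state + 1, goal) if action == 1 else max(state - 1, 0)
    # Every step costs -1, so shorter paths to the goal have higher return.
    return next_state, -1.0, next_state == goal


def sarsa_lambda(env_step, n_actions, start_state, episodes=200, max_steps=1000,
                 alpha=0.1, gamma=0.9, lam=0.8, epsilon=0.1, seed=0):
    """Tabular SARSA(lambda) with accumulating eligibility traces."""
    rng = random.Random(seed)
    q = defaultdict(float)  # q[(state, action)] -> estimated return

    def choose(state):
        # Epsilon-greedy selection over the current estimates.
        if rng.random() < epsilon:
            return rng.randrange(n_actions)
        return max(range(n_actions), key=lambda a: q[(state, a)])

    for _episode in range(episodes):
        traces = defaultdict(float)  # eligibility traces are reset every episode
        state, action = start_state, choose(start_state)
        for _step in range(max_steps):
            next_state, reward, done = env_step(state, action)
            next_action = choose(next_state)  # on-policy: the action actually taken next
            target = reward if done else reward + gamma * q[(next_state, next_action)]
            delta = target - q[(state, action)]  # temporal-difference error
            traces[(state, action)] += 1.0
            for key in list(traces):
                # Recently visited pairs share the TD error, weighted by their trace.
                q[key] += alpha * delta * traces[key]
                traces[key] *= gamma * lam
            if done:
                break
            state, action = next_state, next_action
    return q
```

Calling `sarsa_lambda(chain_step, n_actions=2, start_state=0)` returns the learned table; the greedy action in a state is the one with the largest `q[(state, action)]`. The trace decay `gamma * lam` controls how far back each temporal-difference error is propagated: `lam=0` reduces to one-step SARSA.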

DC Field	Value	Language
dc.contributor.author	Ahsan, Waleed	-
dc.contributor.author	Yi, Wenqiang	-
dc.contributor.author	Liu, Yuanwei	-
dc.contributor.author	Nallanathan, Arumugam	-
dc.date.accessioned	2024-10-17T07:00:09Z	-
dc.date.available	2024-10-17T07:00:09Z	-
dc.date.issued	2022	-
dc.identifier.citation	IEEE Transactions on Wireless Communications, 2022, v. 21, n. 8, p. 5989-6002	-
dc.identifier.issn	1536-1276	-
dc.identifier.uri	http://hdl.handle.net/10722/349689	-
dc.description.abstract	In this paper, we propose a deep state-action-reward-state-action (SARSA) λ learning approach for optimising the uplink resource allocation in non-orthogonal multiple access (NOMA) aided ultra-reliable low-latency communication (URLLC). To reduce the mean decoding error probability in time-varying network environments, this work designs a reliable learning algorithm for providing a long-term resource allocation, where the reward feedback is based on the instantaneous network performance. With the aid of the proposed algorithm, this paper addresses three main challenges of reliable resource sharing in NOMA-URLLC networks: 1) user clustering; 2) an instantaneous feedback system; and 3) optimal resource allocation. All of these designs interact with the considered communication environment. Lastly, we compare the performance of the proposed algorithm with conventional Q-learning and SARSA Q-learning algorithms. The simulation outcomes show that: 1) compared with the traditional Q-learning algorithms, the proposed solution is able to converge within 200 episodes while providing a long-term mean error as low as 10⁻²; 2) NOMA-assisted URLLC outperforms traditional OMA systems in terms of decoding error probabilities; and 3) the proposed feedback system is efficient for the long-term learning process.	-
dc.language	eng	-
dc.relation.ispartof	IEEE Transactions on Wireless Communications	-
dc.subject	Deep SARSA-λ learning	-
dc.subject	non-orthogonal multiple access	-
dc.subject	power allocation	-
dc.subject	ultra-reliable low-latency communication	-
dc.subject	user clustering	-
dc.title	A Reliable Reinforcement Learning for Resource Allocation in Uplink NOMA-URLLC Networks	-
dc.type	Article	-
dc.description.nature	link_to_subscribed_fulltext	-
dc.identifier.doi	10.1109/TWC.2022.3144618	-
dc.identifier.scopus	eid_2-s2.0-85124229416	-
dc.identifier.volume	21	-
dc.identifier.issue	8	-
dc.identifier.spage	5989	-
dc.identifier.epage	6002	-
dc.identifier.eissn	1558-2248	-
dc.identifier.isi	WOS:000841840300023	-
