Adaptive Video Streaming for Massive MIMO Networks via Approximate MDP and Reinforcement Learning

LAN, Q; Lv, B; Wang, R; Huang, K; Gong, Y


Publisher Website: 10.1109/TWC.2020.2995944
Scopus: eid_2-s2.0-85091151663
WOS: WOS:000568683900006
Appears in Collections:
- Electrical & Electronic Engineering: Journal/Magazine Articles

Title	Adaptive Video Streaming for Massive MIMO Networks via Approximate MDP and Reinforcement Learning
Authors	LAN, Q; Lv, B; Wang, R; Huang, K; Gong, Y
Keywords	Streaming media; Quality of experience; Wireless communication; Bit rate; Massive MIMO
Issue Date	2020
Publisher	Institute of Electrical and Electronics Engineers. The Journal's web site is located at http://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=7693
Citation	IEEE Transactions on Wireless Communications, 2020, v. 19 n. 9, p. 5716-5731. DOI: http://dx.doi.org/10.1109/TWC.2020.2995944
Abstract	The scheduling of downlink video streaming in a massive multiple-input multiple-output (MIMO) network is considered in this paper, where active users arrive randomly to request video contents of a finite playback duration via their service base stations (BSs). Each video content consisting of a sequence of segments can be transmitted to the requesting users with variable video bitrates. We formulate the joint control of transmitted segment number, frame allocation and segment bitrate in all the super frames (each comprising multiple frames) as an infinite-horizon Markov decision process (MDP). The maximization objective is a discounted measurement of the average Quality of Experience (QoE). Since there is no efficient method for scheduling design with random user arrivals and departures in the existing literature, a novel approximate MDP method is proposed to obtain a low-complexity scheduling policy, where a lower bound on its performance is derived. Specifically, we first introduce a baseline policy and derive its asymptotic value function. One-step policy iteration is then applied to improve this value function, yielding the mentioned low-complexity policy. Finally, we propose a novel and efficient reinforcement learning (RL) algorithm to evaluate the value function when the prior knowledge on user arrival intensity is absent.
Persistent Identifier	http://hdl.handle.net/10722/295883
ISSN	1536-1276 (2021 Impact Factor: 8.346; 2020 SCImago Journal Rankings: 2.010)
ISI Accession Number ID	WOS:000568683900006
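
The abstract describes evaluating a baseline policy and then applying one-step policy iteration to obtain an improved policy. The sketch below illustrates that general idea on a small, randomly generated MDP. It is not the paper's method: the toy states, actions, rewards, and the helper names random_mdp, evaluate_policy, and improve_once are all assumptions made for illustration, and the baseline value is computed numerically rather than from the asymptotic expression derived in the paper.

```python
import random


def random_mdp(n_states, n_actions, seed=0):
    """Build a hypothetical toy MDP: P[s][a] is a distribution over next states."""
    rng = random.Random(seed)
    P, R = [], []
    for _ in range(n_states):
        rows = []
        for _ in range(n_actions):
            weights = [rng.random() for _ in range(n_states)]
            total = sum(weights)
            rows.append([w / total for w in weights])
        P.append(rows)
        R.append([rng.random() for _ in range(n_actions)])
    return P, R


def evaluate_policy(P, R, policy, gamma, sweeps=2000):
    """Approximate the discounted value of a fixed policy by repeated Bellman backups."""
    n = len(R)
    v = [0.0] * n
    for _ in range(sweeps):
        v = [
            R[s][policy[s]]
            + gamma * sum(p * v[t] for t, p in enumerate(P[s][policy[s]]))
            for s in range(n)
        ]
    return v


def improve_once(P, R, v, gamma):
    """One-step policy improvement: act greedily with respect to the given values."""
    n, m = len(R), len(R[0])

    def q(s, a):
        return R[s][a] + gamma * sum(p * v[t] for t, p in enumerate(P[s][a]))

    return [max(range(m), key=lambda a: q(s, a)) for s in range(n)]


GAMMA = 0.9  # discount factor (assumed value)
P, R = random_mdp(n_states=5, n_actions=3)
baseline = [0] * 5  # baseline policy: always take action 0
v_base = evaluate_policy(P, R, baseline, GAMMA)
improved = improve_once(P, R, v_base, GAMMA)
v_new = evaluate_policy(P, R, improved, GAMMA)
```

By the standard policy improvement result, the improved policy's value is at least the baseline's in every state, which is the sense in which a baseline value function can give a lower bound on the performance of the improved policy.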

DC Field	Value	Language
dc.contributor.author	LAN, Q	-
dc.contributor.author	Lv, B	-
dc.contributor.author	Wang, R	-
dc.contributor.author	Huang, K	-
dc.contributor.author	Gong, Y	-
dc.date.accessioned	2021-02-08T08:15:23Z	-
dc.date.available	2021-02-08T08:15:23Z	-
dc.date.issued	2020	-
dc.identifier.citation	IEEE Transactions on Wireless Communications, 2020, v. 19 n. 9, p. 5716-5731	-
dc.identifier.issn	1536-1276	-
dc.identifier.uri	http://hdl.handle.net/10722/295883	-
dc.language	eng	-
dc.publisher	Institute of Electrical and Electronics Engineers. The Journal's web site is located at http://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=7693	-
dc.relation.ispartof	IEEE Transactions on Wireless Communications	-
dc.rights	IEEE Transactions on Wireless Communications. Copyright © Institute of Electrical and Electronics Engineers.	-
dc.rights	©20xx IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.	-
dc.subject	Streaming media	-
dc.subject	Quality of experience	-
dc.subject	Wireless communication	-
dc.subject	Bit rate	-
dc.subject	Massive MIMO	-
dc.title	Adaptive Video Streaming for Massive MIMO Networks via Approximate MDP and Reinforcement Learning	-
dc.type	Article	-
dc.identifier.email	Huang, K: huangkb@eee.hku.hk	-
dc.identifier.authority	Huang, K=rp01875	-
dc.description.nature	link_to_subscribed_fulltext	-
dc.identifier.doi	10.1109/TWC.2020.2995944	-
dc.identifier.scopus	eid_2-s2.0-85091151663	-
dc.identifier.hkuros	321256	-
dc.identifier.volume	19	-
dc.identifier.issue	9	-
dc.identifier.spage	5716	-
dc.identifier.epage	5731	-
dc.identifier.isi	WOS:000568683900006	-
dc.publisher.place	United States	-
