Article: Adaptive Video Streaming for Massive MIMO Networks via Approximate MDP and Reinforcement Learning
Title | Adaptive Video Streaming for Massive MIMO Networks via Approximate MDP and Reinforcement Learning |
---|---|
Authors | Lan, Q; Lv, B; Wang, R; Huang, K; Gong, Y |
Keywords | Streaming media; Quality of experience; Wireless communication; Bit rate; Massive MIMO |
Issue Date | 2020 |
Publisher | Institute of Electrical and Electronics Engineers. The Journal's web site is located at http://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=7693 |
Citation | IEEE Transactions on Wireless Communications, 2020, v. 19 n. 9, p. 5716-5731 |
Abstract | This paper considers the scheduling of downlink video streaming in a massive multiple-input multiple-output (MIMO) network, where active users arrive randomly and request videos of finite playback duration via their serving base stations (BSs). Each video consists of a sequence of segments that can be transmitted to the requesting user at variable bitrates. We formulate the joint control of the number of transmitted segments, frame allocation, and segment bitrate across all super frames (each comprising multiple frames) as an infinite-horizon Markov decision process (MDP). The objective is to maximize a discounted measure of the average Quality of Experience (QoE). Since the existing literature offers no efficient method for scheduling design with random user arrivals and departures, we propose a novel approximate MDP method that yields a low-complexity scheduling policy and derive a lower bound on its performance. Specifically, we first introduce a baseline policy and derive its asymptotic value function. One-step policy iteration is then applied to improve on this value function, yielding the low-complexity policy. Finally, we propose a novel and efficient reinforcement learning (RL) algorithm to evaluate the value function when prior knowledge of the user arrival intensity is unavailable. |
Persistent Identifier | http://hdl.handle.net/10722/295883 |
ISSN | 1536-1276 (2023 Impact Factor: 8.9; 2023 SCImago Journal Rankings: 5.371) |
ISI Accession Number ID | WOS:000568683900006 |
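
The abstract's "baseline policy, then one-step policy iteration" idea can be illustrated on a toy problem. The sketch below is not the paper's model or algorithm: the buffer states, bitrate actions, rewards, delivery probabilities, and all function names are hypothetical, and the baseline value function is computed numerically rather than derived analytically as the abstract describes. It only shows that acting greedily once with respect to a baseline policy's value function gives a policy that is at least as good in every state.

```python
# Toy illustration of one-step policy improvement over a baseline policy.
# Every constant and name here is an assumption made for the example.

GAMMA = 0.9                    # discount factor (assumed)
MAX_BUF = 3                    # playback buffer holds 0..3 segments (assumed)
STATES = range(MAX_BUF + 1)
ACTIONS = (0, 1)               # 0 = low bitrate, 1 = high bitrate
QUALITY = {0: 1.0, 1: 2.0}     # per-step QoE reward while playing (assumed)
P_DELIVER = {0: 0.9, 1: 0.5}   # chance a segment arrives in time (assumed)
STALL_PENALTY = 3.0            # QoE penalty when the buffer is empty (assumed)


def reward(s, a):
    """QoE for one step: quality if something is buffered, else a stall penalty."""
    return QUALITY[a] if s > 0 else -STALL_PENALTY


def transitions(s, a):
    """Return (probability, next_state) pairs: the buffer grows on delivery, drains otherwise."""
    p = P_DELIVER[a]
    return [(p, min(s + 1, MAX_BUF)), (1.0 - p, max(s - 1, 0))]


def q_value(s, a, v):
    """One-step lookahead: immediate reward plus discounted expected value."""
    return reward(s, a) + GAMMA * sum(p * v[s2] for p, s2 in transitions(s, a))


def evaluate(policy, sweeps=500):
    """Iterative policy evaluation via repeated Bellman expectation backups."""
    v = [0.0 for _ in STATES]
    for _ in range(sweeps):
        v = [q_value(s, policy[s], v) for s in STATES]
    return v


def improve_once(v):
    """Single policy-improvement step: act greedily with respect to v."""
    return [max(ACTIONS, key=lambda a: q_value(s, a, v)) for s in STATES]


baseline = [0 for _ in STATES]     # baseline: always stream at the low bitrate
v_base = evaluate(baseline)        # value function of the baseline policy
improved = improve_once(v_base)    # one-step policy iteration
v_improved = evaluate(improved)    # value function of the improved policy

print("improved policy:", improved)
print("baseline values:", [round(x, 3) for x in v_base])
print("improved values:", [round(x, 3) for x in v_improved])
```

By the policy improvement theorem, the improved values are never below the baseline values, which is the sense in which the baseline's value function serves as a performance lower bound in this toy setting.
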
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Lan, Q | - |
dc.contributor.author | Lv, B | - |
dc.contributor.author | Wang, R | - |
dc.contributor.author | Huang, K | - |
dc.contributor.author | Gong, Y | - |
dc.date.accessioned | 2021-02-08T08:15:23Z | - |
dc.date.available | 2021-02-08T08:15:23Z | - |
dc.date.issued | 2020 | - |
dc.identifier.citation | IEEE Transactions on Wireless Communications, 2020, v. 19 n. 9, p. 5716-5731 | - |
dc.identifier.issn | 1536-1276 | - |
dc.identifier.uri | http://hdl.handle.net/10722/295883 | - |
dc.language | eng | - |
dc.publisher | Institute of Electrical and Electronics Engineers. The Journal's web site is located at http://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=7693 | - |
dc.relation.ispartof | IEEE Transactions on Wireless Communications | - |
dc.rights | IEEE Transactions on Wireless Communications. Copyright © Institute of Electrical and Electronics Engineers. | - |
dc.rights | ©20xx IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. | - |
dc.subject | Streaming media | - |
dc.subject | Quality of experience | - |
dc.subject | Wireless communication | - |
dc.subject | Bit rate | - |
dc.subject | Massive MIMO | - |
dc.title | Adaptive Video Streaming for Massive MIMO Networks via Approximate MDP and Reinforcement Learning | - |
dc.type | Article | - |
dc.identifier.email | Huang, K: huangkb@eee.hku.hk | - |
dc.identifier.authority | Huang, K=rp01875 | - |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.doi | 10.1109/TWC.2020.2995944 | - |
dc.identifier.scopus | eid_2-s2.0-85091151663 | - |
dc.identifier.hkuros | 321256 | - |
dc.identifier.volume | 19 | - |
dc.identifier.issue | 9 | - |
dc.identifier.spage | 5716 | - |
dc.identifier.epage | 5731 | - |
dc.identifier.isi | WOS:000568683900006 | - |
dc.publisher.place | United States | - |