Article: Learning Automata Based Q-Learning for Content Placement in Cooperative Caching

Title: Learning Automata Based Q-Learning for Content Placement in Cooperative Caching
Authors: Yang, Zhong; Liu, Yuanwei; Chen, Yue; Jiao, Lei
Keywords: content popularity prediction; learning automata based Q-learning; quality of experience (QoE); user mobility prediction; wireless cooperative caching
Issue Date: 2020
Citation: IEEE Transactions on Communications, 2020, v. 68, n. 6, p. 3667-3680
Abstract: An optimization problem of content placement in cooperative caching is formulated, with the aim of maximizing the sum mean opinion score (MOS) of mobile users. First, as user mobility and content popularity have significant impacts on the user experience, a recurrent neural network (RNN) is invoked for user mobility prediction and content popularity prediction. In particular, practical data collected from a GPS-tracker app on smartphones is used to test the accuracy of the user mobility prediction. Then, based on the predicted mobile users' positions and the predicted content popularity, a learning automata based Q-learning (LAQL) algorithm for cooperative caching is proposed, in which learning automata (LA) are invoked within Q-learning to obtain optimal action selection in a random and stationary environment. It is proven that the LA-based action selection scheme enables every state to select the optimal action with arbitrarily high probability, provided that Q-learning eventually converges to the optimal Q-value. In the LAQL algorithm, a central processor acts as the intelligent agent, which iteratively allocates content to base stations (BSs) according to the reward or penalty fed back by the BSs and users. To characterize the performance of the proposed LAQL algorithm, the sum MOS of the users is used to define the reward function. Extensive simulation results reveal that: 1) the prediction error of the RNN-based algorithm decreases as the number of iterations and nodes increases; 2) the proposed LAQL achieves significant performance improvement over the traditional Q-learning algorithm; and 3) the cooperative caching scheme outperforms non-cooperative caching and random caching by 3% and 4%, respectively.
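To make the LAQL idea above concrete, here is a minimal sketch in Python: a standard Q-learning temporal-difference update paired with one linear reward-inaction (L_RI) learning automaton per state, which replaces epsilon-greedy action selection. The toy environment, the reward shaping, and the parameter values (alpha, gamma, lam) are hypothetical stand-ins for illustration and are not taken from the paper.

import numpy as np

rng = np.random.default_rng(0)

n_states, n_actions = 4, 3          # toy problem size (hypothetical)
alpha, gamma, lam = 0.1, 0.9, 0.05  # learning rate, discount, LA step size

Q = np.zeros((n_states, n_actions))
# One action-probability vector (a learning automaton) per state,
# initialized uniformly; this replaces epsilon-greedy exploration.
P = np.full((n_states, n_actions), 1.0 / n_actions)

def select_action(s):
    # Sample an action from state s's automaton.
    return rng.choice(n_actions, p=P[s])

def update(s, a, r, s_next):
    # Standard Q-learning temporal-difference update.
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
    # L_RI update (an assumed reinforcement rule): reinforce the chosen
    # action only when it is currently greedy for this state ("reward");
    # otherwise leave P unchanged ("inaction"). Each row stays
    # normalized, since (1 - lam) * 1 + lam = 1.
    if a == Q[s].argmax():
        P[s] *= 1.0 - lam
        P[s, a] += lam

# Toy random environment standing in for the caching MDP (hypothetical):
# action a is "good" in state s exactly when a == s % n_actions.
for _ in range(5000):
    s = int(rng.integers(n_states))
    a = select_action(s)
    s_next = int(rng.integers(n_states))
    r = rng.random() + (1.0 if a == s % n_actions else 0.0)
    update(s, a, r, s_next)

print(P.round(2))  # each state's probabilities concentrate on its best action

In the paper's setting, a state would encode the current cache placement across BSs, an action a candidate content-to-BS assignment, and the reward the predicted sum MOS; the automaton gradually concentrates each state's probability mass on the best-performing action, mirroring the abstract's claim that every state selects the optimal action with arbitrarily high probability.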
Persistent Identifier: http://hdl.handle.net/10722/349437
ISSN: 0090-6778
2023 Impact Factor: 7.2
2020 SCImago Journal Rankings: 1.468

 

DC Field: Value
dc.contributor.author: Yang, Zhong
dc.contributor.author: Liu, Yuanwei
dc.contributor.author: Chen, Yue
dc.contributor.author: Jiao, Lei
dc.date.accessioned: 2024-10-17T06:58:31Z
dc.date.available: 2024-10-17T06:58:31Z
dc.date.issued: 2020
dc.identifier.citation: IEEE Transactions on Communications, 2020, v. 68, n. 6, p. 3667-3680
dc.identifier.issn: 0090-6778
dc.identifier.uri: http://hdl.handle.net/10722/349437
dc.description.abstract: An optimization problem of content placement in cooperative caching is formulated, with the aim of maximizing the sum mean opinion score (MOS) of mobile users. First, as user mobility and content popularity have significant impacts on the user experience, a recurrent neural network (RNN) is invoked for user mobility prediction and content popularity prediction. In particular, practical data collected from a GPS-tracker app on smartphones is used to test the accuracy of the user mobility prediction. Then, based on the predicted mobile users' positions and the predicted content popularity, a learning automata based Q-learning (LAQL) algorithm for cooperative caching is proposed, in which learning automata (LA) are invoked within Q-learning to obtain optimal action selection in a random and stationary environment. It is proven that the LA-based action selection scheme enables every state to select the optimal action with arbitrarily high probability, provided that Q-learning eventually converges to the optimal Q-value. In the LAQL algorithm, a central processor acts as the intelligent agent, which iteratively allocates content to base stations (BSs) according to the reward or penalty fed back by the BSs and users. To characterize the performance of the proposed LAQL algorithm, the sum MOS of the users is used to define the reward function. Extensive simulation results reveal that: 1) the prediction error of the RNN-based algorithm decreases as the number of iterations and nodes increases; 2) the proposed LAQL achieves significant performance improvement over the traditional Q-learning algorithm; and 3) the cooperative caching scheme outperforms non-cooperative caching and random caching by 3% and 4%, respectively.
dc.language: eng
dc.relation.ispartof: IEEE Transactions on Communications
dc.subject: content popularity prediction
dc.subject: Learning automata based Q-learning
dc.subject: quality of experience (QoE)
dc.subject: user mobility prediction
dc.subject: wireless cooperative caching
dc.title: Learning Automata Based Q-Learning for Content Placement in Cooperative Caching
dc.type: Article
dc.description.nature: link_to_subscribed_fulltext
dc.identifier.doi: 10.1109/TCOMM.2020.2982136
dc.identifier.scopus: eid_2-s2.0-85086893911
dc.identifier.volume: 68
dc.identifier.issue: 6
dc.identifier.spage: 3667
dc.identifier.epage: 3680
dc.identifier.eissn: 1558-0857
