Learning by reusing previous advice: a memory-based teacher–student framework

Zhu, C; Cai, Y; Hu, S; Leung, H; Chiu, KWD

Links for fulltext

(May Require Subscription)

Publisher Website: 10.1007/s10458-022-09595-1
WOS: WOS:000905425700001

Supplementary

Citations:
- Web of Science: 0
Appears in Collections:
- Faculty of Education: Journal/Magazine Articles

Article: Learning by reusing previous advice: a memory-based teacher–student framework

Title	Learning by reusing previous advice: a memory-based teacher–student framework
Authors	Zhu, C; Cai, Y; Hu, S; Leung, H; Chiu, KWD
Issue Date	2022
Publisher	Springer. The Journal's web site is located at http://springerlink.metapress.com/openurl.asp?genre=journal&issn=1387-2532
Citation	Autonomous Agents and Multi-Agent Systems, 2022, v. 37 n. 1, p. 14. DOI: http://dx.doi.org/10.1007/s10458-022-09595-1
Abstract	Reinforcement Learning (RL) has been widely used to solve sequential decision-making problems. However, it often suffers from slow learning speed in complex scenarios. Teacher–student frameworks address this issue by enabling agents to ask for and give advice so that a student agent can leverage the knowledge of a teacher agent to facilitate its learning. In this paper, we consider the effect of reusing previous advice, and propose a novel memory-based teacher–student framework such that student agents can memorize and reuse the previous advice from teacher agents. In particular, we propose two methods to decide whether previous advice should be reused: Q-Change per Step that reuses the advice if it leads to an increase in Q-values, and Decay Reusing Probability that reuses the advice with a decaying probability. The experiments on diverse RL tasks (Mario, Predator–Prey and Half Field Offense) confirm that our proposed framework significantly outperforms the existing frameworks in which previous advice is not reused.
Persistent Identifier	http://hdl.handle.net/10722/324859
ISI Accession Number ID	WOS:000905425700001
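
Illustrative note: the abstract names two rules for deciding whether a student should reuse remembered advice. The Python sketch below is one possible reading of those rules, not the authors' implementation. The names `AdviceMemory`, `reuse_by_q_change`, and `reuse_by_decay`, the multiplicative decay schedule, the initial probability of 1.0, and the strict "greater than" comparison are all assumptions made for illustration; the record itself gives no such details.

```python
import random
from dataclasses import dataclass, field


@dataclass
class AdviceMemory:
    """Hypothetical per-state store of the last action a teacher advised."""
    advice: dict = field(default_factory=dict)      # state -> advised action
    reuse_prob: dict = field(default_factory=dict)  # state -> reuse probability

    def remember(self, state, action, initial_prob=1.0):
        # Store fresh advice and reset its reuse probability (assumed reset).
        self.advice[state] = action
        self.reuse_prob[state] = initial_prob


def reuse_by_q_change(q_before, q_after):
    """Q-Change-per-Step-style rule (assumed form): reuse the advice only
    if following it increased the student's Q-value estimate."""
    return q_after > q_before


def reuse_by_decay(memory, state, decay=0.9, rng=random):
    """Decay-Reusing-Probability-style rule (assumed form): reuse with the
    stored probability, then shrink that probability multiplicatively."""
    if state not in memory.advice:
        return False  # nothing remembered for this state
    p = memory.reuse_prob[state]
    reuse = rng.random() < p
    memory.reuse_prob[state] = p * decay
    return reuse
```

In this sketch a student would call `remember` whenever a teacher gives advice, then consult one of the two rules the next time it visits the same state, falling back to its own policy when the rule returns `False`.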

DC Field	Value	Language
dc.contributor.author	Zhu, C	-
dc.contributor.author	Cai, Y	-
dc.contributor.author	Hu, S	-
dc.contributor.author	Leung, H	-
dc.contributor.author	Chiu, KWD	-
dc.date.accessioned	2023-02-20T01:39:19Z	-
dc.date.available	2023-02-20T01:39:19Z	-
dc.date.issued	2022	-
dc.identifier.citation	Autonomous Agents and Multi-Agent Systems, 2022, v. 37 n. 1, p. 14	-
dc.identifier.uri	http://hdl.handle.net/10722/324859	-
dc.description.abstract	Reinforcement Learning (RL) has been widely used to solve sequential decision-making problems. However, it often suffers from slow learning speed in complex scenarios. Teacher–student frameworks address this issue by enabling agents to ask for and give advice so that a student agent can leverage the knowledge of a teacher agent to facilitate its learning. In this paper, we consider the effect of reusing previous advice, and propose a novel memory-based teacher–student framework such that student agents can memorize and reuse the previous advice from teacher agents. In particular, we propose two methods to decide whether previous advice should be reused: Q-Change per Step that reuses the advice if it leads to an increase in Q-values, and Decay Reusing Probability that reuses the advice with a decaying probability. The experiments on diverse RL tasks (Mario, Predator–Prey and Half Field Offense) confirm that our proposed framework significantly outperforms the existing frameworks in which previous advice is not reused.	-
dc.language	eng	-
dc.publisher	Springer. The Journal's web site is located at http://springerlink.metapress.com/openurl.asp?genre=journal&issn=1387-2532	-
dc.relation.ispartof	Autonomous Agents and Multi-Agent Systems	-
dc.rights	This version of the article has been accepted for publication, after peer review (when applicable) and is subject to Springer Nature’s AM terms of use, but is not the Version of Record and does not reflect post-acceptance improvements, or any corrections. The Version of Record is available online at: https://doi.org/[insert DOI]	-
dc.title	Learning by reusing previous advice: a memory-based teacher–student framework	-
dc.type	Article	-
dc.identifier.email	Chiu, KWD: dchiu88@hku.hk	-
dc.identifier.doi	10.1007/s10458-022-09595-1	-
dc.identifier.hkuros	343753	-
dc.identifier.volume	37	-
dc.identifier.issue	1	-
dc.identifier.spage	14	-
dc.identifier.epage	14	-
dc.identifier.isi	WOS:000905425700001	-
