Stable and Effective Trainable Greedy Decoding for Sequence to Sequence Learning

Chen, Y; Cho, K; Bowman, SR; Li, VOK

Appears in Collections:
- Electrical & Electronic Engineering: Conference papers

Conference Paper: Stable and Effective Trainable Greedy Decoding for Sequence to Sequence Learning

Title	Stable and Effective Trainable Greedy Decoding for Sequence to Sequence Learning
Authors	Chen, Y; Cho, K; Bowman, SR; Li, VOK
Keywords	NLP; NMT; Seq2Seq; Beam search
Issue Date	2018
Citation	Sixth International Conference on Learning Representations (ICLR) Workshop, Vancouver, Canada, 30 April - 3 May 2018
Abstract	We introduce a fast, general method to manipulate the behavior of the decoder in a sequence to sequence neural network model. We propose a small neural network actor that observes and manipulates the hidden state of a previously-trained decoder. We evaluate our model on the task of neural machine translation. In this task, we use beam search to decode sentences from the plain decoder for each training set input, rank them by BLEU score, and train the actor to encourage the decoder to generate the highest-BLEU output in a single greedy decoding operation without beam search. Experiments on several datasets and models show that our method yields substantial improvements in both translation quality and translation speed over its base system, with no additional data.
Persistent Identifier	http://hdl.handle.net/10722/261954
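
The training recipe summarized in the abstract (decode candidates with beam search, rank them by BLEU, then train an actor that adjusts the decoder's hidden state) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: `simple_bleu1` is a simplified stand-in for BLEU (clipped unigram precision with a brevity penalty), `pick_pseudo_target` is a hypothetical helper that selects the training target, and `apply_actor` assumes the actor's output is added to the hidden state, which the abstract does not specify.

```python
import math
from collections import Counter


def simple_bleu1(candidate, reference):
    """Clipped unigram precision times a brevity penalty.

    A simplified stand-in for BLEU (assumption); real BLEU also uses
    higher-order n-grams.
    """
    if not candidate:
        return 0.0
    cand_counts = Counter(candidate)
    ref_counts = Counter(reference)
    # Count each candidate token at most as often as it occurs in the reference.
    overlap = sum(min(n, ref_counts[tok]) for tok, n in cand_counts.items())
    precision = overlap / len(candidate)
    # Penalize candidates that are shorter than the reference.
    if len(candidate) >= len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(reference) / len(candidate))
    return precision * brevity_penalty


def pick_pseudo_target(beam_candidates, reference):
    """Return the beam candidate with the highest score against the reference."""
    return max(beam_candidates, key=lambda c: simple_bleu1(c, reference))


def apply_actor(hidden, actor):
    """Return the hidden state after an additive adjustment from the actor.

    The additive form is an assumption; `actor` is any callable that maps
    a hidden state to a same-length list of adjustments.
    """
    delta = actor(hidden)
    return [h + d for h, d in zip(hidden, delta)]


reference = "the cat sat on the mat".split()
beam_candidates = [
    "the cat sat".split(),
    "the cat sat on the mat".split(),
    "a dog".split(),
]
pseudo_target = pick_pseudo_target(beam_candidates, reference)
adjusted = apply_actor([1.0, 2.0], lambda h: [0.5, -0.5])
```

In this sketch the selected candidate would serve as the target that the actor is trained to reach with a single greedy pass; the training loop itself and the decoder are omitted.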

DC Field	Value	Language
dc.contributor.author	Chen, Y	-
dc.contributor.author	Cho, K	-
dc.contributor.author	Bowman, SR	-
dc.contributor.author	Li, VOK	-
dc.date.accessioned	2018-09-28T04:50:54Z	-
dc.date.available	2018-09-28T04:50:54Z	-
dc.date.issued	2018	-
dc.identifier.citation	Sixth International Conference on Learning Representations (ICLR) Workshop, Vancouver, Canada, 30 April - 3 May 2018	-
dc.identifier.uri	http://hdl.handle.net/10722/261954	-
dc.description.abstract	We introduce a fast, general method to manipulate the behavior of the decoder in a sequence to sequence neural network model. We propose a small neural network actor that observes and manipulates the hidden state of a previously-trained decoder. We evaluate our model on the task of neural machine translation. In this task, we use beam search to decode sentences from the plain decoder for each training set input, rank them by BLEU score, and train the actor to encourage the decoder to generate the highest-BLEU output in a single greedy decoding operation without beam search. Experiments on several datasets and models show that our method yields substantial improvements in both translation quality and translation speed over its base system, with no additional data.	-
dc.language	eng	-
dc.relation.ispartof	International Conference on Learning Representations (ICLR) Workshop	-
dc.subject	NLP	-
dc.subject	NMT	-
dc.subject	Seq2Seq	-
dc.subject	Beam search	-
dc.title	Stable and Effective Trainable Greedy Decoding for Sequence to Sequence Learning	-
dc.type	Conference_Paper	-
dc.identifier.email	Li, VOK: vli@eee.hku.hk	-
dc.identifier.authority	Li, VOK=rp00150	-
dc.identifier.hkuros	292172	-
dc.publisher.place	Vancouver, Canada	-
