Conference Paper: Stable and Effective Trainable Greedy Decoding for Sequence to Sequence Learning
Title | Stable and Effective Trainable Greedy Decoding for Sequence to Sequence Learning |
---|---|
Authors | Chen, Y; Cho, K; Bowman, SR; Li, VOK |
Keywords | NLP; NMT; Seq2Seq; Beam search |
Issue Date | 2018 |
Citation | Sixth International Conference on Learning Representations (ICLR) Workshop, Vancouver, Canada, 30 April - 3 May 2018 |
Abstract | We introduce a fast, general method to manipulate the behavior of the decoder in a sequence to sequence neural network model. We propose a small neural network actor that observes and manipulates the hidden state of a previously-trained decoder. We evaluate our model on the task of neural machine translation. In this task, we use beam search to decode sentences from the plain decoder for each training set input, rank them by BLEU score, and train the actor to encourage the decoder to generate the highest-BLEU output in a single greedy decoding operation without beam search. Experiments on several datasets and models show that our method yields substantial improvements in both translation quality and translation speed over its base system, with no additional data. |
Persistent Identifier | http://hdl.handle.net/10722/261954 |
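
The target-selection step described in the abstract (decode several candidates with beam search, rank them by BLEU, keep the best one as the training target for the actor) can be illustrated with a minimal sketch. This is not the authors' code: the function names `simple_bleu` and `pick_pseudo_target` are hypothetical, the scorer is a simplified add-one-smoothed sentence-level stand-in for BLEU rather than an official implementation, and the beam candidates are assumed to be already available as token lists.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU stand-in with add-one smoothing.

    Assumption: this only approximates BLEU for illustration; a real
    experiment would use an established scorer.
    """
    if not candidate:
        return 0.0
    log_precision = 0.0
    for n in range(1, max_n + 1):
        cand = ngram_counts(candidate, n)
        ref = ngram_counts(reference, n)
        # Clipped overlap: each candidate n-gram counts at most as often
        # as it appears in the reference.
        overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
        total = sum(cand.values())
        log_precision += math.log((overlap + 1) / (total + 1)) / max_n
    # Brevity penalty for candidates shorter than the reference.
    brevity = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return brevity * math.exp(log_precision)


def pick_pseudo_target(beam_candidates, reference):
    """Return the beam candidate with the highest score against the reference."""
    return max(beam_candidates, key=lambda cand: simple_bleu(cand, reference))


if __name__ == "__main__":
    reference = "the cat sat on the mat".split()
    beams = [
        "a cat sat on a mat".split(),
        "the cat sat on the mat".split(),
        "cat mat".split(),
    ]
    print(" ".join(pick_pseudo_target(beams, reference)))
```

In the setting the abstract describes, the selected sequence would then serve as the target that the actor is trained to make the frozen decoder produce under plain greedy decoding; that training loop is outside the scope of this sketch.
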
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Chen, Y | - |
dc.contributor.author | Cho, K | - |
dc.contributor.author | Bowman, SR | - |
dc.contributor.author | Li, VOK | - |
dc.date.accessioned | 2018-09-28T04:50:54Z | - |
dc.date.available | 2018-09-28T04:50:54Z | - |
dc.date.issued | 2018 | - |
dc.identifier.citation | Sixth International Conference on Learning Representations (ICLR) Workshop, Vancouver, Canada, 30 April - 3 May 2018 | - |
dc.identifier.uri | http://hdl.handle.net/10722/261954 | - |
dc.language | eng | - |
dc.relation.ispartof | International Conference on Learning Representations (ICLR) Workshop | - |
dc.subject | NLP | - |
dc.subject | NMT | - |
dc.subject | Seq2Seq | - |
dc.subject | Beam search | - |
dc.title | Stable and Effective Trainable Greedy Decoding for Sequence to Sequence Learning | - |
dc.type | Conference_Paper | - |
dc.identifier.email | Li, VOK: vli@eee.hku.hk | - |
dc.identifier.authority | Li, VOK=rp00150 | - |
dc.identifier.hkuros | 292172 | - |
dc.publisher.place | Vancouver, Canada | - |