
postgraduate thesis: Knowledge transfer for improved neural machine translation

Title: Knowledge transfer for improved neural machine translation
Authors: Chen, Yun [陈云]
Advisor(s): Li, VOK
Issue Date: 2018
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation: Chen, Y. [陈云]. (2018). Knowledge transfer for improved neural machine translation. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract: Being able to communicate seamlessly across human languages has long been associated with the general success of artificial intelligence. Despite great progress in the field of neural machine translation (NMT) since its invention in 2014 (Sutskever et al., 2014; Bahdanau et al., 2015), translation quality and speed have not yet satisfied users, especially for resource-scarce language pairs and on-device real-time applications (Koehn and Knowles, 2017). Standard NMT builds a translation model from source to target with maximum likelihood estimation (MLE) on parallel source-target corpora, and then uses approximate decoding algorithms such as greedy decoding or beam search to translate source sentences at inference time. The training and decoding procedures are independent, with no interaction with other NMT models. This thesis proposes new training and decoding strategies that interact with other NMT models through knowledge transfer to improve neural machine translation, covering the following topics:

• Improving zero-resource neural machine translation (ZNMT): We propose two pivot-based approaches to tackle this problem: a) a teacher-student framework that transfers knowledge from a high-resource model (teacher) to a zero-resource model (student) by training the student under the teacher's supervision; b) a multi-agent communication game in which the zero-resource model learns by playing the game with the high-resource model.

• Improving decoding efficiency for NMT: We propose a novel training strategy for an actor-augmented decoder that optimizes greedy decoding by transferring the knowledge of a trained translation model.

• Improving the training strategy for NMT: We propose Born Again Networks (BANs) for training NMT. In a manner reminiscent of Minsky's Sequence of Teaching Selves (Minsky, 1991), we train a sequence of models of identical capacity, each with improved performance, by transferring knowledge from the previous model.

We compare our approaches with state-of-the-art NMT systems on different datasets, such as IWSLT, Europarl and WMT, and language pairs, such as German-English, Finnish-English, French-English, and Spanish-English. Extensive experiments demonstrate that all of our proposed methods achieve their individual goals by effectively transferring knowledge from auxiliary models or tasks to the target NMT model.
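The teacher-student transfer described in the abstract is, at its core, a form of knowledge distillation: the student is trained toward the teacher's soft output distribution over target words rather than only toward one-hot references. A minimal NumPy sketch of such a word-level distillation loss is shown below; the function names, the `temperature` parameter, and the plain-NumPy setting are illustrative assumptions, not the thesis's actual implementation:

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the vocabulary (last) axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=1.0):
    """Word-level distillation: cross-entropy of the student's
    predictive distribution against the teacher's soft targets,
    averaged over time steps. Shapes: (num_steps, vocab_size)."""
    p_teacher = softmax(teacher_logits / temperature)
    log_p_student = np.log(softmax(student_logits / temperature) + 1e-12)
    return -(p_teacher * log_p_student).sum(axis=-1).mean()
```

Because cross-entropy is minimized exactly when the two distributions coincide, driving this loss down pulls the student's per-step predictions toward the teacher's, which is what allows a zero-resource student to benefit from a high-resource teacher.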
Degree: Doctor of Philosophy
Subject: Machine translating
Dept/Program: Electrical and Electronic Engineering
Persistent Identifier: http://hdl.handle.net/10722/265389


DC Field | Value | Language
dc.contributor.advisor | Li, VOK | -
dc.contributor.author | Chen, Yun | -
dc.contributor.author | 陈云 | -
dc.date.accessioned | 2018-11-29T06:22:32Z | -
dc.date.available | 2018-11-29T06:22:32Z | -
dc.date.issued | 2018 | -
dc.identifier.citation | Chen, Y. [陈云]. (2018). Knowledge transfer for improved neural machine translation. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | -
dc.identifier.uri | http://hdl.handle.net/10722/265389 | -
dc.language | eng | -
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | -
dc.relation.ispartof | HKU Theses Online (HKUTO) | -
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | -
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | -
dc.subject.lcsh | Machine translating | -
dc.title | Knowledge transfer for improved neural machine translation | -
dc.type | PG_Thesis | -
dc.description.thesisname | Doctor of Philosophy | -
dc.description.thesislevel | Doctoral | -
dc.description.thesisdiscipline | Electrical and Electronic Engineering | -
dc.description.nature | published_or_final_version | -
dc.date.hkucongregation | 2018 | -
dc.identifier.mmsid | 991044058178103414 | -
