
Postgraduate thesis: Neural machine translation with a unified framework of transferable models

Title: Neural machine translation with a unified framework of transferable models
Authors: Wang, Yong (王永)
Advisors: Li, VOK; Huang, K
Issue Date: 2020
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation: Wang, Y. [王永]. (2020). Neural machine translation with a unified framework of transferable models. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract: Teaching machines to communicate seamlessly across human languages is a longstanding challenge in artificial intelligence. Neural machine translation (NMT), which employs the encoder-decoder framework, has received increasing interest and achieved remarkable success in recent years. Unlike conventional pipelined statistical machine translation (SMT) with many separate components, NMT is conceptually simple and empirically powerful owing to its end-to-end nature and strong flexibility during training and inference. This endows NMT with a particularly desirable ability to construct translation models with a unified framework in practical scenarios, including multi-lingual, multi-domain, and extremely low-resource translation. In this thesis, we focus on constructing NMT systems with an effective unified framework, which has the benefits of 1) sharing knowledge across languages and domains effectively, and 2) practical deployment with fewer parameters in production. We present empirically effective and customized solutions for improving NMT systems with a unified framework. Extensive experiments on a variety of datasets demonstrate the effectiveness and universality of the proposed approaches. Contributions of this thesis include: 1) quantitatively analyzing the issues of zero-shot translation and successfully closing the performance gap between zero-shot translation and pivot-based translation; 2) proposing to explicitly transform domain knowledge for a multi-domain NMT model and achieving state-of-the-art performance in multi-domain NMT research; 3) proposing the large-margin principle for the meta-learning algorithm and successfully pioneering the application of meta-learning to extremely low-resource translation in multi-lingual NMT; 4) analyzing, from two aspects, the underlying causes of why an NMT system with a unified framework enables effective knowledge sharing and transfer, namely the model capacity of neural networks and the existence of redundant parameters in NMT systems. We also discuss several promising research directions: 1) tackling the problems that arise in real-world scenarios for low-resource language pairs; 2) incorporating the prevailing pre-training strategy into NMT systems; 3) closing the gap between parallel decoding and auto-regressive decoding; 4) removing the inductive bias of the decoder. We believe that these research directions hold great potential for future intelligent technologies.
Degree: Doctor of Philosophy
Subject: Neural networks (Computer science)
Dept/Program: Electrical and Electronic Engineering
Persistent Identifier: http://hdl.handle.net/10722/286006
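
The "unified framework" in the abstract refers to serving many translation directions (and domains) with a single shared encoder-decoder model, which is also what makes zero-shot translation possible. As an illustration only (this is not code from the thesis), the Python sketch below shows the widely used target-language-token approach to multilingual NMT: tagging each source sentence with the desired output language lets one model be trained on all observed directions and then be queried for zero-shot directions that never appeared as a training pair. The function names and toy data are hypothetical.

    # Illustrative sketch only (not code from the thesis): a single shared
    # encoder-decoder serves every translation direction. Each source sentence
    # is prefixed with a token naming the desired target language, so one model
    # learns all observed directions and can be asked for unseen (zero-shot) ones.
    from typing import Dict, Iterable, List, Tuple

    def tag_source(sentence: str, target_lang: str) -> str:
        """Prepend a target-language token, e.g. '<2fr> good morning'."""
        return f"<2{target_lang}> {sentence}"

    def build_training_set(
        corpora: Dict[Tuple[str, str], Iterable[Tuple[str, str]]]
    ) -> List[Tuple[str, str]]:
        """Merge parallel corpora from several language pairs into one tagged
        corpus on which a single encoder-decoder model is trained."""
        merged = []
        for (src_lang, tgt_lang), pairs in corpora.items():
            for src_sentence, tgt_sentence in pairs:
                merged.append((tag_source(src_sentence, tgt_lang), tgt_sentence))
        return merged

    if __name__ == "__main__":
        # Toy corpora: only English-French and English-German pairs are observed.
        corpora = {
            ("en", "fr"): [("good morning", "bonjour")],
            ("en", "de"): [("good morning", "guten Morgen")],
        }
        print(build_training_set(corpora))

        # Zero-shot request: French-German was never seen as a training pair,
        # yet the unified model can be queried for it directly instead of
        # pivoting through English.
        print(tag_source("bonjour", "de"))  # -> '<2de> bonjour'

The thesis's first contribution concerns exactly this setting: quantifying why such zero-shot directions underperform pivot-based translation (e.g., translating French to English and then English to German) and closing that performance gap.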

 

DC Field: Value
dc.contributor.advisor: Li, VOK
dc.contributor.advisor: Huang, K
dc.contributor.author: Wang, Yong
dc.contributor.author: 王永
dc.date.accessioned: 2020-08-25T08:43:53Z
dc.date.available: 2020-08-25T08:43:53Z
dc.date.issued: 2020
dc.identifier.citation: Wang, Y. [王永]. (2020). Neural machine translation with a unified framework of transferable models. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
dc.identifier.uri: http://hdl.handle.net/10722/286006
dc.language: eng
dc.publisher: The University of Hong Kong (Pokfulam, Hong Kong)
dc.relation.ispartof: HKU Theses Online (HKUTO)
dc.rights: The author retains all proprietary rights (such as patent rights) and the right to use in future works.
dc.rights: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
dc.subject.lcsh: Neural networks (Computer science)
dc.title: Neural machine translation with a unified framework of transferable models
dc.type: PG_Thesis
dc.description.thesisname: Doctor of Philosophy
dc.description.thesislevel: Doctoral
dc.description.thesisdiscipline: Electrical and Electronic Engineering
dc.description.nature: published_or_final_version
dc.date.hkucongregation: 2020
dc.identifier.mmsid: 991044264455403414
