Field | Value |
---|---|
Title | Representation learning for natural language processing |
Authors | Zhou, Chunting (周春婷) |
Issue Date | 2016 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Zhou, C. [周春婷]. (2016). Representation learning for natural language processing. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | Many applications in natural language processing (NLP) depend on extracting compact and informative features from unstructured text. Distributed representations encode data as continuous vectors by mapping semantics to points in high-dimensional spaces. As an effective and efficient way of learning such representations, neural-network-based models have achieved great success in various NLP applications. For different NLP scenarios, different types of neural networks can be carefully assembled to mimic human behavioral patterns of understanding the real world or to exploit semantics based on linguistic composition. This thesis focuses on representation learning at different granularities; different neural network models are designed to embed features of words, sentences and articles. The learned representations are applied in three selected natural language applications: sentiment classification, text classification and reading comprehension. The first part of this work learns word representations with a shallow neural network in which global topical knowledge is incorporated into local contexts when a word interacts with its context words. The learned word vectors are evaluated on three tasks: word analogical reasoning, word similarity and text classification. Experiments show that the word embeddings are enhanced in terms of preserving semantic information together with category knowledge. The second part of this work proposes a model called "C-LSTM" that learns sentence representations by combining a convolutional neural network (CNN) and a long short-term memory (LSTM) recurrent neural network. The input to the LSTM network is the sequence of n-gram features extracted by the CNN, so at each time step the input integrates a wide range of context information. Experiments on sentiment classification and question-type classification tasks demonstrate the remarkable performance of C-LSTM compared with state-of-the-art models. The last part of this work proposes a context-aware attention neural network model for reading comprehension. Two separate convolutional neural networks are employed to encode the articles, and the max-pooling result of one CNN is used to attentively read the output of the other. The attended representation of the article, together with the query representation, is used to predict the answer. Extensive experimental analyses have verified its effectiveness in extracting informative n-grams that match the query or are helpful for answer inference. |
Degree | Master of Philosophy |
Subject | Natural language processing (Computer science) |
Dept/Program | Computer Science |
Persistent Identifier | http://hdl.handle.net/10722/236583 |
HKU Library Item ID | b5807307 |
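The C-LSTM architecture summarized in the abstract — a CNN extracting n-gram features whose sequence is then fed to an LSTM, with the final hidden state serving as the sentence representation — can be sketched numerically as follows. This is a minimal illustration under assumed dimensions and random weights, not the thesis implementation.

```python
import numpy as np

def ngram_features(x, w):
    # x: (seq_len, embed_dim) embedded sentence; w: (k, embed_dim, n_filters).
    # Each output step is the filter response to one k-gram window.
    k = w.shape[0]
    return np.array([np.tensordot(x[t:t + k], w, axes=([0, 1], [0, 1]))
                     for t in range(x.shape[0] - k + 1)])  # (windows, n_filters)

def lstm_final_state(seq, Wx, Wh, b, hidden):
    # Run a standard LSTM over the feature sequence; return the final
    # hidden state as the sentence representation.
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    h, c = np.zeros(hidden), np.zeros(hidden)
    for x_t in seq:
        z = Wx @ x_t + Wh @ h + b           # stacked gate pre-activations
        i, f, o, g = np.split(z, 4)         # input, forget, output, candidate
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h

rng = np.random.default_rng(0)
seq_len, embed_dim, n_filters, hidden = 10, 8, 6, 5   # illustrative sizes
x = rng.normal(size=(seq_len, embed_dim))             # embedded sentence
w = 0.1 * rng.normal(size=(3, embed_dim, n_filters))  # trigram filters
feats = np.maximum(ngram_features(x, w), 0.0)         # ReLU n-gram features
Wx = 0.1 * rng.normal(size=(4 * hidden, n_filters))
Wh = 0.1 * rng.normal(size=(4 * hidden, hidden))
b = np.zeros(4 * hidden)
rep = lstm_final_state(feats, Wx, Wh, b, hidden)      # sentence representation
print(rep.shape)  # (5,)
```

Because each CNN window already spans several tokens, every LSTM step consumes a phrase-level feature rather than a single word — the "wide range of context information" the abstract refers to.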
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Zhou, Chunting | - |
dc.contributor.author | 周春婷 | - |
dc.date.accessioned | 2016-11-28T23:28:12Z | - |
dc.date.available | 2016-11-28T23:28:12Z | - |
dc.date.issued | 2016 | - |
dc.identifier.citation | Zhou, C. [周春婷]. (2016). Representation learning for natural language processing. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/236583 | - |
dc.description.abstract | Many applications in natural language processing (NLP) depend on extracting compact and informative features from unstructured text. Distributed representations encode data as continuous vectors by mapping semantics to points in high-dimensional spaces. As an effective and efficient way of learning such representations, neural-network-based models have achieved great success in various NLP applications. For different NLP scenarios, different types of neural networks can be carefully assembled to mimic human behavioral patterns of understanding the real world or to exploit semantics based on linguistic composition. This thesis focuses on representation learning at different granularities; different neural network models are designed to embed features of words, sentences and articles. The learned representations are applied in three selected natural language applications: sentiment classification, text classification and reading comprehension. The first part of this work learns word representations with a shallow neural network in which global topical knowledge is incorporated into local contexts when a word interacts with its context words. The learned word vectors are evaluated on three tasks: word analogical reasoning, word similarity and text classification. Experiments show that the word embeddings are enhanced in terms of preserving semantic information together with category knowledge. The second part of this work proposes a model called "C-LSTM" that learns sentence representations by combining a convolutional neural network (CNN) and a long short-term memory (LSTM) recurrent neural network. The input to the LSTM network is the sequence of n-gram features extracted by the CNN, so at each time step the input integrates a wide range of context information. Experiments on sentiment classification and question-type classification tasks demonstrate the remarkable performance of C-LSTM compared with state-of-the-art models. The last part of this work proposes a context-aware attention neural network model for reading comprehension. Two separate convolutional neural networks are employed to encode the articles, and the max-pooling result of one CNN is used to attentively read the output of the other. The attended representation of the article, together with the query representation, is used to predict the answer. Extensive experimental analyses have verified its effectiveness in extracting informative n-grams that match the query or are helpful for answer inference. | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.subject.lcsh | Natural language processing (Computer science) | - |
dc.title | Representation learning for natural language processing | - |
dc.type | PG_Thesis | - |
dc.identifier.hkul | b5807307 | - |
dc.description.thesisname | Master of Philosophy | - |
dc.description.thesislevel | Master | - |
dc.description.thesisdiscipline | Computer Science | - |
dc.description.nature | published_or_final_version | - |
dc.identifier.doi | 10.5353/th_b5807307 | - |
dc.identifier.mmsid | 991020915849703414 | - |
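The reading-comprehension model in the abstract uses two CNNs over the article, with the max-pooled summary of one attentively reading the per-window output of the other. The mechanism can be sketched as below — a toy numerical illustration with assumed dimensions and random weights; the thesis model additionally combines the attended article representation with a separate query representation, which is omitted here.

```python
import numpy as np

def ngram_features(x, w):
    # x: (seq_len, embed_dim) embedded article; w: (k, embed_dim, n_filters).
    k = w.shape[0]
    return np.array([np.tensordot(x[t:t + k], w, axes=([0, 1], [0, 1]))
                     for t in range(x.shape[0] - k + 1)])  # (windows, n_filters)

rng = np.random.default_rng(1)
article = rng.normal(size=(20, 8))                 # embedded article tokens
w1 = 0.1 * rng.normal(size=(3, 8, 6))              # filters for the "reader" CNN
w2 = 0.1 * rng.normal(size=(3, 8, 6))              # filters for the "content" CNN
f1 = np.maximum(ngram_features(article, w1), 0.0)  # (18, 6)
f2 = np.maximum(ngram_features(article, w2), 0.0)  # (18, 6)

summary = f1.max(axis=0)                  # max pooling over windows -> (6,)
scores = f2 @ summary                     # one attention score per n-gram window
weights = np.exp(scores - scores.max())
weights /= weights.sum()                  # softmax attention weights
attended = weights @ f2                   # attended article representation, (6,)
print(attended.shape)  # (6,)
```

The softmax weights highlight which n-gram windows of the second CNN's output best match the pooled summary — the "informative n-grams" the abstract credits the model with extracting.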