Field | Value |
---|---|
Title | Representation learning for natural language processing |
Authors | Zhou, Chunting (周春婷) |
Issue Date | 2016 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Zhou, C. [周春婷]. (2016). Representation learning for natural language processing. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | Many applications in natural language processing (NLP) depend on extracting compact and informative features from unstructured text. Distributed representations encode data as continuous vectors by mapping semantics to points in high-dimensional spaces. As an effective and efficient way of learning such representations, neural-network-based models have achieved great success in various NLP applications. For different NLP scenarios, different types of neural networks can be carefully assembled to mimic human behavioral patterns of understanding the real world or to exploit semantics based on linguistic composition. This thesis focuses on representation learning at different granularities; different neural network models are designed to embed features of words, sentences and articles. The learned representations are applied in three selected natural language applications: sentiment classification, text classification and reading comprehension. The first part of this work learns word representations with a shallow neural network in which global topical knowledge is incorporated into local contexts when a word interacts with its context words. The learned word vectors are evaluated on three tasks: word analogical reasoning, word similarity and text classification. Experiments show that the word embeddings are enhanced in terms of preserving semantic information together with category knowledge. The second part of this work proposes a model called "C-LSTM" that learns sentence representations by combining a convolutional neural network (CNN) and a long short-term memory (LSTM) recurrent neural network. The input to the LSTM network is the sequence of n-gram features extracted by the CNN, so at each time step the input integrates a wide range of context information. Experiments on sentiment classification and question-type classification tasks demonstrate the remarkable performance of C-LSTM compared with state-of-the-art models. The last part of this work proposes a context-aware attention neural network model for reading comprehension. Two separate convolutional neural networks are employed to encode the articles, and the max-pooling result of one CNN is used to attentively read the output of the other. The attended representation of the article, together with the query representation, is used to predict the answer. Extensive experimental analyses have verified its effectiveness in extracting informative n-grams that match the query or are helpful for answer inference. |
Degree | Master of Philosophy |
Subject | Natural language processing (Computer science) |
Dept/Program | Computer Science |
Persistent Identifier | http://hdl.handle.net/10722/236583 |
HKU Library Item ID | b5807307 |
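The C-LSTM architecture summarized in the abstract — a CNN extracting n-gram features whose sequence is then fed to an LSTM, with the final hidden state serving as the sentence representation — can be sketched numerically as follows. This is a minimal illustration under assumed dimensions and random weights, not the thesis implementation.

```python
import numpy as np

def ngram_features(x, w):
    # x: (seq_len, embed_dim) embedded sentence; w: (k, embed_dim, n_filters).
    # Each output step is the filter response to one k-gram window.
    k = w.shape[0]
    return np.array([np.tensordot(x[t:t + k], w, axes=([0, 1], [0, 1]))
                     for t in range(x.shape[0] - k + 1)])  # (windows, n_filters)

def lstm_final_state(seq, Wx, Wh, b, hidden):
    # Run a standard LSTM over the feature sequence; return the final
    # hidden state as the sentence representation.
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    h, c = np.zeros(hidden), np.zeros(hidden)
    for x_t in seq:
        z = Wx @ x_t + Wh @ h + b           # stacked gate pre-activations
        i, f, o, g = np.split(z, 4)         # input, forget, output, candidate
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h

rng = np.random.default_rng(0)
seq_len, embed_dim, n_filters, hidden = 10, 8, 6, 5   # illustrative sizes
x = rng.normal(size=(seq_len, embed_dim))             # embedded sentence
w = 0.1 * rng.normal(size=(3, embed_dim, n_filters))  # trigram filters
feats = np.maximum(ngram_features(x, w), 0.0)         # ReLU n-gram features
Wx = 0.1 * rng.normal(size=(4 * hidden, n_filters))
Wh = 0.1 * rng.normal(size=(4 * hidden, hidden))
b = np.zeros(4 * hidden)
rep = lstm_final_state(feats, Wx, Wh, b, hidden)      # sentence representation
print(rep.shape)  # (5,)
```

Because each CNN window already spans several tokens, every LSTM step consumes a phrase-level feature rather than a single word — the "wide range of context information" the abstract refers to.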
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Zhou, Chunting | - |
dc.contributor.author | 周春婷 | - |
dc.date.accessioned | 2016-11-28T23:28:12Z | - |
dc.date.available | 2016-11-28T23:28:12Z | - |
dc.date.issued | 2016 | - |
dc.identifier.citation | Zhou, C. [周春婷]. (2016). Representation learning for natural language processing. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/236583 | - |
dc.description.abstract | Many applications in natural language processing (NLP) depend on extracting compact and informative features from unstructured text. Distributed representations encode data as continuous vectors by mapping semantics to points in high-dimensional spaces. As an effective and efficient way of learning such representations, neural-network-based models have achieved great success in various NLP applications. For different NLP scenarios, different types of neural networks can be carefully assembled to mimic human behavioral patterns of understanding the real world or to exploit semantics based on linguistic composition. This thesis focuses on representation learning at different granularities; different neural network models are designed to embed features of words, sentences and articles. The learned representations are applied in three selected natural language applications: sentiment classification, text classification and reading comprehension. The first part of this work learns word representations with a shallow neural network in which global topical knowledge is incorporated into local contexts when a word interacts with its context words. The learned word vectors are evaluated on three tasks: word analogical reasoning, word similarity and text classification. Experiments show that the word embeddings are enhanced in terms of preserving semantic information together with category knowledge. The second part of this work proposes a model called "C-LSTM" that learns sentence representations by combining a convolutional neural network (CNN) and a long short-term memory (LSTM) recurrent neural network. The input to the LSTM network is the sequence of n-gram features extracted by the CNN, so at each time step the input integrates a wide range of context information. Experiments on sentiment classification and question-type classification tasks demonstrate the remarkable performance of C-LSTM compared with state-of-the-art models. The last part of this work proposes a context-aware attention neural network model for reading comprehension. Two separate convolutional neural networks are employed to encode the articles, and the max-pooling result of one CNN is used to attentively read the output of the other. The attended representation of the article, together with the query representation, is used to predict the answer. Extensive experimental analyses have verified its effectiveness in extracting informative n-grams that match the query or are helpful for answer inference. | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.subject.lcsh | Natural language processing (Computer science) | - |
dc.title | Representation learning for natural language processing | - |
dc.type | PG_Thesis | - |
dc.identifier.hkul | b5807307 | - |
dc.description.thesisname | Master of Philosophy | - |
dc.description.thesislevel | Master | - |
dc.description.thesisdiscipline | Computer Science | - |
dc.description.nature | published_or_final_version | - |
dc.identifier.doi | 10.5353/th_b5807307 | - |
dc.identifier.mmsid | 991020915849703414 | - |
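The reading-comprehension model in the abstract uses two CNNs over the article, with the max-pooled summary of one attentively reading the per-window output of the other. The mechanism can be sketched as below — a toy numerical illustration with assumed dimensions and random weights; the thesis model additionally combines the attended article representation with a separate query representation, which is omitted here.

```python
import numpy as np

def ngram_features(x, w):
    # x: (seq_len, embed_dim) embedded article; w: (k, embed_dim, n_filters).
    k = w.shape[0]
    return np.array([np.tensordot(x[t:t + k], w, axes=([0, 1], [0, 1]))
                     for t in range(x.shape[0] - k + 1)])  # (windows, n_filters)

rng = np.random.default_rng(1)
article = rng.normal(size=(20, 8))                 # embedded article tokens
w1 = 0.1 * rng.normal(size=(3, 8, 6))              # filters for the "reader" CNN
w2 = 0.1 * rng.normal(size=(3, 8, 6))              # filters for the "content" CNN
f1 = np.maximum(ngram_features(article, w1), 0.0)  # (18, 6)
f2 = np.maximum(ngram_features(article, w2), 0.0)  # (18, 6)

summary = f1.max(axis=0)                  # max pooling over windows -> (6,)
scores = f2 @ summary                     # one attention score per n-gram window
weights = np.exp(scores - scores.max())
weights /= weights.sum()                  # softmax attention weights
attended = weights @ f2                   # attended article representation, (6,)
print(attended.shape)  # (6,)
```

The softmax weights highlight which n-gram windows of the second CNN's output best match the pooled summary — the "informative n-grams" the abstract credits the model with extracting.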