Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1145/3397271.3401248
- Scopus: eid_2-s2.0-85090150507
- WOS: WOS:000722377700227
Conference Paper: Chinese Document Classification with Bi-directional Convolutional Language Model
Title | Chinese Document Classification with Bi-directional Convolutional Language Model |
---|---|
Authors | Liu, B; Yin, G |
Issue Date | 2020 |
Publisher | Association for Computing Machinery. |
Citation | Proceedings of the 43rd International ACM SIGIR conference on research and development in Information Retrieval, Virtual Event, Xi'an, China, 25-30 July 2020, p. 1785-1788 |
Abstract | Once a typeface is fixed, each character of a Chinese text can be converted to a glyph pixel matrix. We propose to conduct text classification on such glyph features using bi-directional convolution. Although the pixel embedding can be applied to any language, it is especially convenient for Chinese scripts because Chinese characters are square in shape. We extract both the forward and backward n-gram features of the text via bi-directional convolutional operations and then concatenate them. A subsequent 1-dimensional max-over-time pooling is applied to the bi-directional feature maps, and three fully connected layers then perform the classification. The proposed model has a light-weight architecture that contains only a single-layer convolutional neural network. Experiments on several Chinese text classification datasets show that the proposed model trains quickly and outperforms traditional methods. |
Persistent Identifier | http://hdl.handle.net/10722/294826 |
ISBN | 9781450380164 |
ISI Accession Number ID | WOS:000722377700227 |
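
The pipeline outlined in the abstract (glyph pixel vectors, forward and backward n-gram convolution, max-over-time pooling, three fully connected layers) can be illustrated with a rough pure-Python sketch. This is not the authors' implementation. Everything specific here is an assumption made for illustration: the function names, the 8x8 glyph size, the filter count, the layer widths, the ReLU activations, the random weights and random stand-in glyphs, and in particular the reading of "bi-directional convolution" as one filter bank applied to the character sequence and a second one applied to the reversed sequence.

```python
import random

def ngram_conv(seq, filters, bias, n):
    """Slide a window of n characters over seq. Each window is the
    concatenation of n flattened glyph vectors. Returns one ReLU-activated
    feature map (a list over time steps) per filter. Needs len(seq) >= n."""
    maps = []
    for w, b in zip(filters, bias):
        fmap = []
        for t in range(len(seq) - n + 1):
            window = [p for glyph in seq[t:t + n] for p in glyph]
            fmap.append(max(0.0, sum(wi * xi for wi, xi in zip(w, window)) + b))
        maps.append(fmap)
    return maps

def dense(x, weights, bias, relu=True):
    """Fully connected layer: one output per weight row."""
    out = [sum(wi * xi for wi, xi in zip(row, x)) + b
           for row, b in zip(weights, bias)]
    return [max(0.0, v) for v in out] if relu else out

def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def classify(glyphs, params, n):
    """glyphs: list of flattened glyph pixel vectors, one per character."""
    fwd = ngram_conv(glyphs, params["fwd_w"], params["fwd_b"], n)
    # Assumed interpretation: the backward pass sees the reversed text.
    bwd = ngram_conv(glyphs[::-1], params["bwd_w"], params["bwd_b"], n)
    # Concatenate both sets of feature maps, then max-over-time pooling.
    pooled = [max(fmap) for fmap in fwd + bwd]
    h = dense(pooled, *params["fc1"])
    h = dense(h, *params["fc2"])
    return dense(h, *params["fc3"], relu=False)  # class logits

# Hypothetical sizes, chosen only to keep the demo small.
PIXELS, N, N_FILTERS, N_CLASSES = 8 * 8, 3, 4, 3
rng = random.Random(0)
params = {
    "fwd_w": rand_matrix(rng, N_FILTERS, N * PIXELS), "fwd_b": [0.0] * N_FILTERS,
    "bwd_w": rand_matrix(rng, N_FILTERS, N * PIXELS), "bwd_b": [0.0] * N_FILTERS,
    "fc1": (rand_matrix(rng, 16, 2 * N_FILTERS), [0.0] * 16),
    "fc2": (rand_matrix(rng, 8, 16), [0.0] * 8),
    "fc3": (rand_matrix(rng, N_CLASSES, 8), [0.0] * N_CLASSES),
}
# Random binary "glyphs" stand in for rendered characters (10-character text).
glyphs = [[float(rng.random() < 0.5) for _ in range(PIXELS)] for _ in range(10)]
logits = classify(glyphs, params, N)
predicted = max(range(len(logits)), key=logits.__getitem__)
print(len(logits), predicted)
```

In practice the glyph vectors would come from rendering each character in the chosen typeface, and the weights would be learned by training. The sketch only shows how data flows through a single convolutional layer in each direction before pooling and classification.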
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Liu, B | - |
dc.contributor.author | Yin, G | - |
dc.date.accessioned | 2020-12-21T11:49:07Z | - |
dc.date.available | 2020-12-21T11:49:07Z | - |
dc.date.issued | 2020 | - |
dc.identifier.citation | Proceedings of the 43rd International ACM SIGIR conference on research and development in Information Retrieval, Virtual Event, Xi'an, China, 25-30 July 2020, p. 1785-1788 | - |
dc.identifier.isbn | 9781450380164 | - |
dc.identifier.uri | http://hdl.handle.net/10722/294826 | - |
dc.language | eng | - |
dc.publisher | Association for Computing Machinery. | - |
dc.relation.ispartof | Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval | - |
dc.rights | Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval. Copyright © Association for Computing Machinery. | - |
dc.title | Chinese Document Classification with Bi-directional Convolutional Language Model | - |
dc.type | Conference_Paper | - |
dc.identifier.email | Yin, G: gyin@hku.hk | - |
dc.identifier.authority | Yin, G=rp00831 | - |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.doi | 10.1145/3397271.3401248 | - |
dc.identifier.scopus | eid_2-s2.0-85090150507 | - |
dc.identifier.hkuros | 320600 | - |
dc.identifier.spage | 1785 | - |
dc.identifier.epage | 1788 | - |
dc.identifier.isi | WOS:000722377700227 | - |
dc.publisher.place | New York, NY | - |