- Publisher Website: 10.1109/SADFE51007.2020.00009
- Scopus: eid_2-s2.0-85092153485
- Appears in Collections:
Conference Paper: Time and Location Topic Model for analyzing Lihkg forum data
Title | Time and Location Topic Model for analyzing Lihkg forum data |
---|---|
Authors | Shen, A; Chow, KP |
Keywords | data mining information retrieval learning (artificial intelligence) multilayer perceptrons natural language processing |
Issue Date | 2020 |
Publisher | IEEE, Computer Society. The Journal's web site is located at https://ieeexplore.ieee.org/xpl/conhome/1001100/all-proceedings |
Citation | Proceedings of 2020 13th International Conference on Systematic Approaches to Digital Forensic Engineering (SADFE), Virtual Conference, New York, NY, USA, 15 May 2020, p. 32-37 |
Abstract | Open Source Intelligence (OSINT) is one option available to law enforcement today for collecting information to monitor illegal activities and allocate police resources effectively. However, massive amounts of public information cannot be analyzed by humans alone, so automatic pre-processing must be performed in advance. In traditional text analysis, common word segmentation tools do not meet the needs of specialized fields or handle special words (such as proper nouns, dialects, acronyms, and metaphors). In the context of the Chinese language, we consider the problem of automatically determining the time and location of major public gatherings and demonstrations using publicly available information. As an experimental scenario, we use posts from the Lihkg online forum from August 1st to October 10th, 2019 as a corpus, and propose a topic vectorization method based on character embedding and Chinese word segmentation, using an MLP (multi-layer perceptron) neural network as a location topic model. The results show that the method and the model can correctly identify the time and the location of discussed activities by learning from the existing location corpus. |
Persistent Identifier | http://hdl.handle.net/10722/289853 |
ISBN | 9781728188447 |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Shen, A | - |
dc.contributor.author | Chow, KP | - |
dc.date.accessioned | 2020-10-22T08:18:24Z | - |
dc.date.available | 2020-10-22T08:18:24Z | - |
dc.date.issued | 2020 | - |
dc.identifier.citation | Proceedings of 2020 13th International Conference on Systematic Approaches to Digital Forensic Engineering (SADFE), Virtual Conference, New York, NY, USA, 15 May 2020, p. 32-37 | - |
dc.identifier.isbn | 9781728188447 | - |
dc.identifier.uri | http://hdl.handle.net/10722/289853 | - |
dc.description.abstract | Open Source Intelligence (OSINT) is one option available to law enforcement today for collecting information to monitor illegal activities and allocate police resources effectively. However, massive amounts of public information cannot be analyzed by humans alone, so automatic pre-processing must be performed in advance. In traditional text analysis, common word segmentation tools do not meet the needs of specialized fields or handle special words (such as proper nouns, dialects, acronyms, and metaphors). In the context of the Chinese language, we consider the problem of automatically determining the time and location of major public gatherings and demonstrations using publicly available information. As an experimental scenario, we use posts from the Lihkg online forum from August 1st to October 10th, 2019 as a corpus, and propose a topic vectorization method based on character embedding and Chinese word segmentation, using an MLP (multi-layer perceptron) neural network as a location topic model. The results show that the method and the model can correctly identify the time and the location of discussed activities by learning from the existing location corpus. | - |
dc.language | eng | - |
dc.publisher | IEEE, Computer Society. The Journal's web site is located at https://ieeexplore.ieee.org/xpl/conhome/1001100/all-proceedings | - |
dc.relation.ispartof | 2020 13th International Conference on Systematic Approaches to Digital Forensic Engineering (SADFE) | - |
dc.rights | International Conference on Systematic Approaches to Digital Forensic Engineering (SADFE). Copyright © IEEE, Computer Society. | - |
dc.rights | ©2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. | - |
dc.subject | data mining | - |
dc.subject | information retrieval | - |
dc.subject | learning (artificial intelligence) | - |
dc.subject | multilayer perceptrons | - |
dc.subject | natural language processing | - |
dc.title | Time and Location Topic Model for analyzing Lihkg forum data | - |
dc.type | Conference_Paper | - |
dc.identifier.email | Chow, KP: chow@cs.hku.hk | - |
dc.identifier.authority | Chow, KP=rp00111 | - |
dc.description.nature | postprint | - |
dc.identifier.doi | 10.1109/SADFE51007.2020.00009 | - |
dc.identifier.scopus | eid_2-s2.0-85092153485 | - |
dc.identifier.hkuros | 317163 | - |
dc.identifier.spage | 32 | - |
dc.identifier.epage | 37 | - |
dc.publisher.place | United States | - |