Conference Paper: Detecting comments showing risk for suicide in YouTube

Title: Detecting comments showing risk for suicide in YouTube
Authors: Gao, J; Cheng, Q; Yu, PLH
Keywords: Cantonese; Sentiment analysis; Social media; Suicide; Text mining
Issue Date: 2018
Publisher: Springer.
Citation: FTC 2018: Proceedings of the Future Technologies Conference (FTC) 2018, Vancouver, Canada, 13-14 November 2018, v. 1, p. 385-400
Abstract: Natural language processing (NLP) for Cantonese, which is written as a mixture of Traditional Chinese, characters borrowed to represent spoken terms, and English, is largely underdeveloped. Applying NLP to detect social media posts showing suicide risk, a rare event in the general population, is even more challenging. This paper tries different text mining methods to classify whether Cantonese comments on YouTube indicate suicide risk. Using word vector features, classification algorithms such as SVM, AdaBoost, Random Forest, and LSTM are employed to detect each comment's risk level. To address the class imbalance in the data, both re-sampling and focal loss methods are used. With improvements at both the data and algorithm levels, the LSTM algorithm achieves more satisfactory test classification results (84.3% and 84.5% g-mean, respectively). The study demonstrates the potential of automatically detecting suicide risk in Cantonese social media posts.
Persistent Identifier: http://hdl.handle.net/10722/275989
ISBN: 978-3-030-02685-1
ISSN: 2194-5357
ISI Accession Number ID: WOS:000505677000030
Series/Report no.: Advances in Intelligent Systems and Computing (AISC) ; v. 880

 

DC Field | Value | Language
dc.contributor.author | Gao, J | -
dc.contributor.author | Cheng, Q | -
dc.contributor.author | Yu, PLH | -
dc.date.accessioned | 2019-09-10T02:53:42Z | -
dc.date.available | 2019-09-10T02:53:42Z | -
dc.date.issued | 2018 | -
dc.identifier.citation | FTC 2018: Proceedings of the Future Technologies Conference (FTC) 2018, Vancouver, Canada, 13-14 November 2018, v. 1, p. 385-400 | -
dc.identifier.isbn | 978-3-030-02685-1 | -
dc.identifier.issn | 2194-5357 | -
dc.identifier.uri | http://hdl.handle.net/10722/275989 | -
dc.language | eng | -
dc.publisher | Springer. | -
dc.relation.ispartof | Proceedings of the Future Technologies Conference (FTC) 2018 | -
dc.relation.ispartofseries | Advances in Intelligent Systems and Computing (AISC) ; v. 880 | -
dc.subject | Cantonese | -
dc.subject | Sentiment analysis | -
dc.subject | Social media | -
dc.subject | Suicide | -
dc.subject | Text mining | -
dc.title | Detecting comments showing risk for suicide in YouTube | -
dc.type | Conference_Paper | -
dc.identifier.email | Cheng, Q: chengqj@connect.hku.hk | -
dc.identifier.email | Yu, PLH: plhyu@hku.hk | -
dc.identifier.authority | Cheng, Q=rp02018 | -
dc.identifier.authority | Yu, PLH=rp00835 | -
dc.identifier.doi | 10.1007/978-3-030-02686-8_30 | -
dc.identifier.scopus | eid_2-s2.0-85055902360 | -
dc.identifier.hkuros | 303886 | -
dc.identifier.volume | 1 | -
dc.identifier.spage | 385 | -
dc.identifier.epage | 400 | -
dc.identifier.eissn | 2194-5365 | -
dc.identifier.isi | WOS:000505677000030 | -
dc.publisher.place | Cham, Switzerland | -
dc.identifier.issnl | 2194-5365 | -
