Conference Paper: Extracting categorical topics from tweets using topic model

Title: Extracting categorical topics from tweets using topic model
Authors: Zheng, Lei; Han, Kai
Keywords: Gibbs Sampling; Topic Model; Twitter
Issue Date: 2013
Publisher: Springer
Citation: 9th Asia Information Retrieval Societies Conference (AIRS 2013), Singapore, 9-11 December 2013. In Banchs, RE, Silvestri, F, Liu, T, et al. (Eds.), Information Retrieval Technology: 9th Asia Information Retrieval Societies Conference, AIRS 2013, Singapore, December 9-11, 2013. Proceedings, p. 86-96. Berlin: Springer, 2013
Abstract: Over the past few years, microblogging websites such as Twitter have grown increasingly popular. Unlike traditional media, tweets are structured data and contain many noisy words. Topic modeling algorithms for traditional media have been well studied, but our understanding of Twitter remains limited, and few algorithms are specifically designed to mine Twitter data according to its own characteristics. Previous studies usually employ only one type of topic to analyze hot topics in the Twitter community and are strongly affected by the large number of noisy words in tweets. We have observed that users in the Twitter community tend to discuss two types of topics: one focuses mainly on their personal lives, the other on hot issues in society. These two types of topics usually yield different distributions. In this paper, we introduce the Categorical Topic Model. This model incorporates the features of Twitter data to divide topics into two semantic types and introduces a word distribution for background words to filter out noisy words. Our model is able to discover different types of topics efficiently, indicate which topics a user is interested in, and find hot issues in the Twitter community. Using Gibbs sampling, we compare our model with Latent Dirichlet Allocation and the Author Topic Model on the TREC2011 data set, and we also discuss examples of the discovered public and personal topics. © 2013 Springer-Verlag.
Persistent Identifier: http://hdl.handle.net/10722/311383
ISBN: 9783642450679
ISSN: 0302-9743
2023 SCImago Journal Rankings: 0.606
Series/Report no.: Lecture Notes in Computer Science ; 8281

 

DC Field | Value | Language
dc.contributor.author | Zheng, Lei | -
dc.contributor.author | Han, Kai | -
dc.date.accessioned | 2022-03-22T11:53:48Z | -
dc.date.available | 2022-03-22T11:53:48Z | -
dc.date.issued | 2013 | -
dc.identifier.citation | 9th Asia Information Retrieval Societies Conference (AIRS 2013), Singapore, 9-11 December 2013. In Banchs, RE, Silvestri, F, Liu, T, et al. (Eds.), Information Retrieval Technology: 9th Asia Information Retrieval Societies Conference, AIRS 2013, Singapore, December 9-11, 2013. Proceedings, p. 86-96. Berlin: Springer, 2013 | -
dc.identifier.isbn | 9783642450679 | -
dc.identifier.issn | 0302-9743 | -
dc.identifier.uri | http://hdl.handle.net/10722/311383 | -
dc.description.abstract | Over the past few years, microblogging websites such as Twitter have grown increasingly popular. Unlike traditional media, tweets are structured data and contain many noisy words. Topic modeling algorithms for traditional media have been well studied, but our understanding of Twitter remains limited, and few algorithms are specifically designed to mine Twitter data according to its own characteristics. Previous studies usually employ only one type of topic to analyze hot topics in the Twitter community and are strongly affected by the large number of noisy words in tweets. We have observed that users in the Twitter community tend to discuss two types of topics: one focuses mainly on their personal lives, the other on hot issues in society. These two types of topics usually yield different distributions. In this paper, we introduce the Categorical Topic Model. This model incorporates the features of Twitter data to divide topics into two semantic types and introduces a word distribution for background words to filter out noisy words. Our model is able to discover different types of topics efficiently, indicate which topics a user is interested in, and find hot issues in the Twitter community. Using Gibbs sampling, we compare our model with Latent Dirichlet Allocation and the Author Topic Model on the TREC2011 data set, and we also discuss examples of the discovered public and personal topics. © 2013 Springer-Verlag. | -
dc.language | eng | -
dc.publisher | Springer. | -
dc.relation.ispartof | Information Retrieval Technology: 9th Asia Information Retrieval Societies Conference, AIRS 2013, Singapore, December 9-11, 2013. Proceedings | -
dc.relation.ispartofseries | Lecture Notes in Computer Science ; 8281 | -
dc.subject | Gibbs Sampling | -
dc.subject | Topic Model | -
dc.subject | Twitter | -
dc.title | Extracting categorical topics from tweets using topic model | -
dc.type | Conference_Paper | -
dc.description.nature | link_to_subscribed_fulltext | -
dc.identifier.doi | 10.1007/978-3-642-45068-6_8 | -
dc.identifier.scopus | eid_2-s2.0-84893247675 | -
dc.identifier.spage | 86 | -
dc.identifier.epage | 96 | -
dc.identifier.eissn | 1611-3349 | -
dc.publisher.place | Berlin | -
