Conference Paper: Subspace clustering of text documents with feature weighting k-means algorithm

Title: Subspace clustering of text documents with feature weighting k-means algorithm
Authors: Jing, Liping; Ng, Michael K.; Xu, Jun; Huang, Joshua Zhexue
Keywords: Feature Weighting; Text Mining; Subspace Clustering; Cluster Interpretation; High Dimensional Data
Issue Date: 2005
Publisher: Springer.
Citation: 9th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2005), Hanoi, Vietnam, 18-20 May 2005. In Advances in Knowledge Discovery and Data Mining: 9th Pacific-Asia Conference, PAKDD 2005, Hanoi, Vietnam, May 18-20, 2005: Proceedings, 2005, p. 802-812
Abstract: This paper presents a new method to solve the problem of clustering large and complex text data. The method is based on a new subspace clustering algorithm that automatically calculates the feature weights in the k-means clustering process. In clustering sparse text data, the feature weights are used to discover clusters from subspaces of the document vector space and to identify key words that represent the semantics of the clusters. We present a modification of the published algorithm to solve the sparsity problem that occurs in text clustering. Experimental results on real-world text data show that the new method outperformed the standard k-means and Bisection-KMeans algorithms while maintaining the efficiency of the k-means clustering process. © Springer-Verlag Berlin Heidelberg 2005.
Persistent Identifier: http://hdl.handle.net/10722/276781
ISBN: 9783540260769
ISSN: 0302-9743
2023 SCImago Journal Rankings: 0.606
Series/Report no.: Lecture Notes in Computer Science ; 3518
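
The abstract describes a k-means variant that computes feature weights during clustering and uses them to find clusters in subspaces of the document vector space. This record does not give the paper's objective function or update rules, so the following is only a minimal sketch of the general idea, under stated assumptions: each cluster keeps its own weight per feature, a weight is set inversely to that feature's within-cluster dispersion, and a small constant `sigma` is added so that zero-dispersion features (common in sparse text) do not cause a division by zero. The function name `weighted_kmeans`, the parameters `beta` and `sigma`, and the inverse-dispersion rule are all illustrative choices, not taken from the paper.

```python
def weighted_kmeans(X, centers, beta=2.0, sigma=0.01, n_iter=20):
    """Toy k-means with one weight per (cluster, feature) pair.

    Illustrative assumptions: beta > 1 controls how sharply weights
    concentrate; sigma > 0 guards against zero dispersion.
    """
    k, m = len(centers), len(X[0])
    centers = [list(c) for c in centers]
    weights = [[1.0 / m] * m for _ in range(k)]  # start with uniform weights
    labels = [-1] * len(X)

    def cost(x, l):
        # Weighted squared distance of point x to the center of cluster l.
        return sum(weights[l][i] ** beta * ((x[i] - centers[l][i]) ** 2 + sigma)
                   for i in range(m))

    for _ in range(n_iter):
        # Assignment step: each point goes to its cheapest cluster.
        new_labels = [min(range(k), key=lambda l: cost(x, l)) for x in X]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for l in range(k):
            members = [x for x, lab in zip(X, labels) if lab == l]
            if not members:
                continue  # keep the old center and weights for an empty cluster
            # Center update: per-feature mean of the cluster's members.
            centers[l] = [sum(x[i] for x in members) / len(members)
                          for i in range(m)]
            # Weight update: low within-cluster dispersion gives a high weight.
            disp = [sum((x[i] - centers[l][i]) ** 2 for x in members) + sigma
                    for i in range(m)]
            inv = [d ** (-1.0 / (beta - 1.0)) for d in disp]
            total = sum(inv)
            weights[l] = [v / total for v in inv]  # each row sums to 1
    return labels, centers, weights


# Toy data: feature 0 separates the two groups, feature 1 is noise.
X = [[0.0, 0.0], [0.1, 5.0], [0.0, 9.0], [0.1, 3.0],
     [10.0, 1.0], [10.1, 8.0], [10.0, 4.0], [10.1, 6.0]]
labels, centers, weights = weighted_kmeans(X, centers=[X[0], X[4]])
print(labels)
print(weights)
```

In this sketch the features that end up with the largest weights in a cluster play the role the abstract assigns to key words: they indicate which dimensions define that cluster. The initial centers are passed in explicitly because, as with ordinary k-means, the result depends on initialization.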

 

DC Field | Value | Language
dc.contributor.author | Jing, Liping | -
dc.contributor.author | Ng, Michael K. | -
dc.contributor.author | Xu, Jun | -
dc.contributor.author | Huang, Joshua Zhexue | -
dc.date.accessioned | 2019-09-18T08:34:38Z | -
dc.date.available | 2019-09-18T08:34:38Z | -
dc.date.issued | 2005 | -
dc.identifier.citation | 9th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2005), Hanoi, Vietnam, 18-20 May 2005. In Advances in Knowledge Discovery and Data Mining: 9th Pacific-Asia Conference, PAKDD 2005, Hanoi, Vietnam, May 18-20, 2005: Proceedings, 2005, p. 802-812 | -
dc.identifier.isbn | 9783540260769 | -
dc.identifier.issn | 0302-9743 | -
dc.identifier.uri | http://hdl.handle.net/10722/276781 | -
dc.description.abstract | This paper presents a new method to solve the problem of clustering large and complex text data. The method is based on a new subspace clustering algorithm that automatically calculates the feature weights in the k-means clustering process. In clustering sparse text data, the feature weights are used to discover clusters from subspaces of the document vector space and to identify key words that represent the semantics of the clusters. We present a modification of the published algorithm to solve the sparsity problem that occurs in text clustering. Experimental results on real-world text data show that the new method outperformed the standard k-means and Bisection-KMeans algorithms while maintaining the efficiency of the k-means clustering process. © Springer-Verlag Berlin Heidelberg 2005. | -
dc.language | eng | -
dc.publisher | Springer. | -
dc.relation.ispartof | Advances in Knowledge Discovery and Data Mining: 9th Pacific-Asia Conference, PAKDD 2005, Hanoi, Vietnam, May 18-20, 2005: Proceedings | -
dc.relation.ispartofseries | Lecture Notes in Computer Science ; 3518 | -
dc.subject | Feature Weighting | -
dc.subject | Text Mining | -
dc.subject | Subspace Clustering | -
dc.subject | Cluster Interpretation | -
dc.subject | High Dimensional Data | -
dc.title | Subspace clustering of text documents with feature weighting k-means algorithm | -
dc.type | Conference_Paper | -
dc.description.nature | link_to_subscribed_fulltext | -
dc.identifier.doi | 10.1007/11430919_94 | -
dc.identifier.scopus | eid_2-s2.0-26944481948 | -
dc.identifier.spage | 802 | -
dc.identifier.epage | 812 | -
dc.identifier.eissn | 1611-3349 | -
dc.publisher.place | Berlin | -
dc.identifier.issnl | 0302-9743 | -
