Article: HARP: A practical projected clustering algorithm

Title: HARP: A practical projected clustering algorithm
Authors: Yip, KY; Cheung, DW; Ng, MK
Keywords: Bioinformatics; Clustering; Data mining; Mining methods and algorithms
Issue Date: 2004
Publisher: IEEE. The Journal's web site is located at http://www.computer.org/tkde
Citation: IEEE Transactions on Knowledge and Data Engineering, 2004, v. 16 n. 11, p. 1387-1397
Abstract: In high-dimensional data, clusters can exist in subspaces that hide themselves from traditional clustering methods. A number of algorithms have been proposed to identify such projected clusters, but most of them rely on some user parameters to guide the clustering process. The clustering accuracy can be seriously degraded if incorrect values are used. Unfortunately, in real situations, it is rarely possible for users to supply the parameter values accurately, which causes practical difficulties in applying these algorithms to real data. In this paper, we analyze the major challenges of projected clustering and suggest why these algorithms need to depend heavily on user parameters. Based on the analysis, we propose a new algorithm that exploits the clustering status to adjust the internal thresholds dynamically without the assistance of user parameters. According to the results of extensive experiments on real and synthetic data, the new method has excellent accuracy and usability. It outperformed the other algorithms even when correct parameter values were artificially supplied to them. The encouraging results suggest that projected clustering can be a practical tool for various kinds of real applications.
Persistent Identifier: http://hdl.handle.net/10722/43624
ISSN: 1041-4347
2021 Impact Factor: 9.235
2020 SCImago Journal Rankings: 1.360
ISI Accession Number ID: WOS:000223977300006

 

DC Field | Value | Language
dc.contributor.author | Yip, KY | en_HK
dc.contributor.author | Cheung, DW | en_HK
dc.contributor.author | Ng, MK | en_HK
dc.date.accessioned | 2007-03-23T04:50:43Z | -
dc.date.available | 2007-03-23T04:50:43Z | -
dc.date.issued | 2004 | en_HK
dc.identifier.citation | IEEE Transactions on Knowledge and Data Engineering, 2004, v. 16 n. 11, p. 1387-1397 | en_HK
dc.identifier.issn | 1041-4347 | en_HK
dc.identifier.uri | http://hdl.handle.net/10722/43624 | -
dc.description.abstract | In high-dimensional data, clusters can exist in subspaces that hide themselves from traditional clustering methods. A number of algorithms have been proposed to identify such projected clusters, but most of them rely on some user parameters to guide the clustering process. The clustering accuracy can be seriously degraded if incorrect values are used. Unfortunately, in real situations, it is rarely possible for users to supply the parameter values accurately, which causes practical difficulties in applying these algorithms to real data. In this paper, we analyze the major challenges of projected clustering and suggest why these algorithms need to depend heavily on user parameters. Based on the analysis, we propose a new algorithm that exploits the clustering status to adjust the internal thresholds dynamically without the assistance of user parameters. According to the results of extensive experiments on real and synthetic data, the new method has excellent accuracy and usability. It outperformed the other algorithms even when correct parameter values were artificially supplied to them. The encouraging results suggest that projected clustering can be a practical tool for various kinds of real applications. | en_HK
dc.format.extent | 573425 bytes | -
dc.format.extent | 26624 bytes | -
dc.format.mimetype | application/pdf | -
dc.format.mimetype | application/msword | -
dc.language | eng | en_HK
dc.publisher | IEEE. The Journal's web site is located at http://www.computer.org/tkde | en_HK
dc.relation.ispartof | IEEE Transactions on Knowledge and Data Engineering | en_HK
dc.rights | ©2004 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. | -
dc.subject | Bioinformatics | en_HK
dc.subject | Clustering | en_HK
dc.subject | Data mining | en_HK
dc.subject | Mining methods and algorithms | en_HK
dc.title | HARP: A practical projected clustering algorithm | en_HK
dc.type | Article | en_HK
dc.identifier.email | Cheung, DW:dcheung@cs.hku.hk | en_HK
dc.identifier.authority | Cheung, DW=rp00101 | en_HK
dc.description.nature | published_or_final_version | en_HK
dc.identifier.doi | 10.1109/TKDE.2004.74 | en_HK
dc.identifier.scopus | eid_2-s2.0-13844297591 | en_HK
dc.identifier.hkuros | 103205 | -
dc.relation.references | http://www.scopus.com/mlt/select.url?eid=2-s2.0-13844297591&selection=ref&src=s&origin=recordpage | en_HK
dc.identifier.volume | 16 | en_HK
dc.identifier.issue | 11 | en_HK
dc.identifier.spage | 1387 | en_HK
dc.identifier.epage | 1397 | en_HK
dc.identifier.isi | WOS:000223977300006 | -
dc.publisher.place | United States | en_HK
dc.identifier.scopusauthorid | Yip, KY=7101909946 | en_HK
dc.identifier.scopusauthorid | Cheung, DW=34567902600 | en_HK
dc.identifier.scopusauthorid | Ng, MK=7202076432 | en_HK
dc.identifier.citeulike | 6337870 | -
dc.identifier.issnl | 1041-4347 | -
