File Download

There are no files associated with this item.

  Links for fulltext
     (May Require Subscription)
Supplementary

Article: Clustering under perturbation resilience

TitleClustering under perturbation resilience
Authors
KeywordsClustering
K-median clustering
Min-sum clustering
Perturbation resilience
Issue Date2016
Citation
SIAM Journal on Computing, 2016, v. 45, n. 1, p. 102-155 How to Cite?
AbstractMotivated by the fact that distances between data points in many real-world clustering instances are often based on heuristic measures, Bilu and Linial [Proceedings of the Symposium on Innovations in Computer Science, 2010] proposed analyzing objective based clustering problems under the assumption that the optimum clustering to the objective is preserved under small multiplicative perturbations to distances between points. The hope is that by exploiting the structure in such instances, one can overcome worst case hardness results. In this paper, we provide several results within this framework. For center-based objectives, we present an algorithm that can optimally cluster instances resilient to perturbations of factor (1 + √2), solving an open problem of Awasthi, Blum, and Sheffet [Proceedings of the IEEE Annual Symposium on Foundations of Computer Science, 2010]. For k-median, a center-based objective of special interest, we additionally give algorithms for a more relaxed assumption in which we allow the optimal solution to change in a small fraction of the points after perturbation. We give the first bounds known for k-median under this more realistic and more general assumption. We also provide positive results for min-sum clustering, which is typically a harder objective than center-based objectives from an approximability standpoint. Our algorithms are based on new linkage criteria that may be of independent interest. Additionally, we give sublinear-time algorithms, showing algorithms that can return an implicit clustering from access to only a small random sample.
Persistent Identifierhttp://hdl.handle.net/10722/341173
ISSN
2023 Impact Factor: 1.2
2023 SCImago Journal Rankings: 2.143
ISI Accession Number ID

 

DC FieldValueLanguage
dc.contributor.authorBalcan, Maria Florina-
dc.contributor.authorLiang, Yingyu-
dc.date.accessioned2024-03-13T08:40:44Z-
dc.date.available2024-03-13T08:40:44Z-
dc.date.issued2016-
dc.identifier.citationSIAM Journal on Computing, 2016, v. 45, n. 1, p. 102-155-
dc.identifier.issn0097-5397-
dc.identifier.urihttp://hdl.handle.net/10722/341173-
dc.description.abstractMotivated by the fact that distances between data points in many real-world clustering instances are often based on heuristic measures, Bilu and Linial [Proceedings of the Symposium on Innovations in Computer Science, 2010] proposed analyzing objective based clustering problems under the assumption that the optimum clustering to the objective is preserved under small multiplicative perturbations to distances between points. The hope is that by exploiting the structure in such instances, one can overcome worst case hardness results. In this paper, we provide several results within this framework. For center-based objectives, we present an algorithm that can optimally cluster instances resilient to perturbations of factor (1 + √2), solving an open problem of Awasthi, Blum, and Sheffet [Proceedings of the IEEE Annual Symposium on Foundations of Computer Science, 2010]. For k-median, a center-based objective of special interest, we additionally give algorithms for a more relaxed assumption in which we allow the optimal solution to change in a small fraction of the points after perturbation. We give the first bounds known for k-median under this more realistic and more general assumption. We also provide positive results for min-sum clustering, which is typically a harder objective than center-based objectives from an approximability standpoint. Our algorithms are based on new linkage criteria that may be of independent interest. Additionally, we give sublinear-time algorithms, showing algorithms that can return an implicit clustering from access to only a small random sample.-
dc.languageeng-
dc.relation.ispartofSIAM Journal on Computing-
dc.subjectClustering-
dc.subjectK-median clustering-
dc.subjectMin-sum clustering-
dc.subjectPerturbation resilience-
dc.titleClustering under perturbation resilience-
dc.typeArticle-
dc.description.naturelink_to_subscribed_fulltext-
dc.identifier.doi10.1137/140981575-
dc.identifier.scopuseid_2-s2.0-84960326461-
dc.identifier.volume45-
dc.identifier.issue1-
dc.identifier.spage102-
dc.identifier.epage155-
dc.identifier.eissn1095-7111-
dc.identifier.isiWOS:000371229000006-

Export via OAI-PMH Interface in XML Formats


OR


Export to Other Non-XML Formats