PARC: ultrafast and accurate clustering of phenotypic data of millions of single cells

Stassen, SV; SIU, DMD; LEE, KCM; Ho, JWK; So, HKH; Tsia, KK

Publisher Website: 10.1093/bioinformatics/btaa042
Scopus: eid_2-s2.0-85084379617
PMID: 31971583
WOS: WOS:000537450900018
Appears in Collections:
- Biomedical Sciences: Journal/Magazine Articles
- Electrical & Electronic Engineering: Journal/Magazine Articles

Title	PARC: ultrafast and accurate clustering of phenotypic data of millions of single cells
Authors	Stassen, SV; SIU, DMD; LEE, KCM; Ho, JWK; So, HKH; Tsia, KK
Issue Date	2020
Publisher	Oxford University Press (OUP): Policy B - Oxford Open Option B. The Journal's web site is located at http://bioinformatics.oxfordjournals.org/
Citation	Bioinformatics, 2020, v. 36, p. 2778-2786. DOI: http://dx.doi.org/10.1093/bioinformatics/btaa042
Abstract	MOTIVATION: New single-cell technologies continue to fuel the explosive growth in the scale of heterogeneous single-cell data. However, existing computational methods are inadequately scalable to large datasets and therefore cannot uncover the complex cellular heterogeneity. RESULTS: We introduce a highly scalable graph-based clustering algorithm PARC - Phenotyping by Accelerated Refined Community-partitioning - for large-scale, high-dimensional single-cell data (> 1 million cells). Using large single cell flow and mass cytometry, RNA-seq and imaging-based biophysical data, we demonstrate that PARC consistently outperforms state-of-the-art clustering algorithms without sub-sampling of cells, including Phenograph, FlowSOM, and Flock, in terms of both speed and ability to robustly detect rare cell populations. For example, PARC can cluster a single cell data set of 1.1M cells within 13 minutes, compared to > 2 hours for the next fastest graph-clustering algorithm. Our work presents a scalable algorithm to cope with increasingly large-scale single-cell analysis. AVAILABILITY: https://github.com/ShobiStassen/PARC. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. © The Author(s) 2020. Published by Oxford University Press.
Persistent Identifier	http://hdl.handle.net/10722/281997
ISSN	1367-4803
2023 Impact Factor	4.4
2023 SCImago Journal Rankings	2.574
ISI Accession Number ID	WOS:000537450900018

DC Field	Value	Language
dc.contributor.author	Stassen, SV	-
dc.contributor.author	SIU, DMD	-
dc.contributor.author	LEE, KCM	-
dc.contributor.author	Ho, JWK	-
dc.contributor.author	So, HKH	-
dc.contributor.author	Tsia, KK	-
dc.date.accessioned	2020-04-19T03:33:55Z	-
dc.date.available	2020-04-19T03:33:55Z	-
dc.date.issued	2020	-
dc.identifier.citation	Bioinformatics, 2020, v. 36, p. 2778-2786	-
dc.identifier.issn	1367-4803	-
dc.identifier.uri	http://hdl.handle.net/10722/281997	-
dc.description.abstract	MOTIVATION: New single-cell technologies continue to fuel the explosive growth in the scale of heterogeneous single-cell data. However, existing computational methods are inadequately scalable to large datasets and therefore cannot uncover the complex cellular heterogeneity. RESULTS: We introduce a highly scalable graph-based clustering algorithm PARC - Phenotyping by Accelerated Refined Community-partitioning - for large-scale, high-dimensional single-cell data (> 1 million cells). Using large single cell flow and mass cytometry, RNA-seq and imaging-based biophysical data, we demonstrate that PARC consistently outperforms state-of-the-art clustering algorithms without sub-sampling of cells, including Phenograph, FlowSOM, and Flock, in terms of both speed and ability to robustly detect rare cell populations. For example, PARC can cluster a single cell data set of 1.1M cells within 13 minutes, compared to > 2 hours for the next fastest graph-clustering algorithm. Our work presents a scalable algorithm to cope with increasingly large-scale single-cell analysis. AVAILABILITY: https://github.com/ShobiStassen/PARC. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. © The Author(s) 2020. Published by Oxford University Press.	-
dc.language	eng	-
dc.publisher	Oxford University Press (OUP): Policy B - Oxford Open Option B. The Journal's web site is located at http://bioinformatics.oxfordjournals.org/	-
dc.relation.ispartof	Bioinformatics	-
dc.rights	This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.	-
dc.title	PARC: ultrafast and accurate clustering of phenotypic data of millions of single cells	-
dc.type	Article	-
dc.identifier.email	Ho, JWK: jwkho@hku.hk	-
dc.identifier.email	So, HKH: hso@eee.hku.hk	-
dc.identifier.email	Tsia, KK: tsia@hku.hk	-
dc.identifier.authority	Ho, JWK=rp02436	-
dc.identifier.authority	So, HKH=rp00169	-
dc.identifier.authority	Tsia, KK=rp01389	-
dc.description.nature	published_or_final_version	-
dc.identifier.doi	10.1093/bioinformatics/btaa042	-
dc.identifier.pmid	31971583	-
dc.identifier.scopus	eid_2-s2.0-85084379617	-
dc.identifier.hkuros	309737	-
dc.identifier.hkuros	314128	-
dc.identifier.volume	36	-
dc.identifier.spage	2778	-
dc.identifier.epage	2786	-
dc.identifier.isi	WOS:000537450900018	-
dc.publisher.place	United Kingdom	-
dc.identifier.issnl	1367-4803	-
