Robust probabilistic modeling for single-cell multimodal mosaic integration and imputation via scVAEIT

Du, Jin Hong; Cai, Zhanrui; Roeder, Kathryn

Links for fulltext

Publisher Website: 10.1073/pnas.2214414119
Scopus: eid_2-s2.0-85143464002
PMID: 36459654
WOS: WOS:001036941500003
Appears in Collections:
- Faculty of Business & Economics: Journal/Magazine Articles

Title	Robust probabilistic modeling for single-cell multimodal mosaic integration and imputation via scVAEIT
Authors	Du, Jin Hong; Cai, Zhanrui; Roeder, Kathryn
Keywords	deep generative models; mosaic integration; multiomics; transfer learning
Issue Date	2022
Citation	Proceedings of the National Academy of Sciences of the United States of America, 2022, v. 119, n. 49, article no. e2214414119. DOI: http://dx.doi.org/10.1073/pnas.2214414119
Abstract	Recent advances in single-cell technologies enable joint profiling of multiple omics. These profiles can reveal the complex interplay of different regulatory layers in single cells; still, new challenges arise when integrating datasets with some features shared across experiments and others exclusive to a single source; combining information across these sources is called mosaic integration. The difficulties lie in imputing missing molecular layers to build a self-consistent atlas, finding a common latent space, and transferring learning to new data sources robustly. Existing mosaic integration approaches based on matrix factorization cannot efficiently adapt to nonlinear embeddings for the latent cell space and are not designed for accurate imputation of missing molecular layers. By contrast, we propose a probabilistic variational autoencoder model, scVAEIT, to integrate and impute multimodal datasets with mosaic measurements. A key advance is the use of a missing mask for learning the conditional distribution of unobserved modalities and features, which makes scVAEIT flexible to combine different panels of measurements from multimodal datasets accurately and in an end-to-end manner. Imputing the masked features serves as a supervised learning procedure while preventing overfitting by regularization. Focusing on gene expression, protein abundance, and chromatin accessibility, we validate that scVAEIT robustly imputes the missing modalities and features of cells biologically different from the training data. scVAEIT also adjusts for batch effects while maintaining the biological variation, which provides better latent representations for the integrated datasets. We demonstrate that scVAEIT significantly improves integration and imputation across unseen cell types, different technologies, and different tissues.
Persistent Identifier	http://hdl.handle.net/10722/328843
ISSN	0027-8424 (2023 Impact Factor: 9.4; 2023 SCImago Journal Rankings: 3.737)
ISI Accession Number ID	WOS:001036941500003

DC Field	Value	Language
dc.contributor.author	Du, Jin Hong	-
dc.contributor.author	Cai, Zhanrui	-
dc.contributor.author	Roeder, Kathryn	-
dc.date.accessioned	2023-07-22T06:24:33Z	-
dc.date.available	2023-07-22T06:24:33Z	-
dc.date.issued	2022	-
dc.identifier.citation	Proceedings of the National Academy of Sciences of the United States of America, 2022, v. 119, n. 49, article no. e2214414119	-
dc.identifier.issn	0027-8424	-
dc.identifier.uri	http://hdl.handle.net/10722/328843	-
dc.language	eng	-
dc.relation.ispartof	Proceedings of the National Academy of Sciences of the United States of America	-
dc.subject	deep generative models	-
dc.subject	mosaic integration	-
dc.subject	multiomics	-
dc.subject	transfer learning	-
dc.title	Robust probabilistic modeling for single-cell multimodal mosaic integration and imputation via scVAEIT	-
dc.type	Article	-
dc.description.nature	link_to_subscribed_fulltext	-
dc.identifier.doi	10.1073/pnas.2214414119	-
dc.identifier.pmid	36459654	-
dc.identifier.scopus	eid_2-s2.0-85143464002	-
dc.identifier.volume	119	-
dc.identifier.issue	49	-
dc.identifier.spage	article no. e2214414119	-
dc.identifier.epage	article no. e2214414119	-
dc.identifier.eissn	1091-6490	-
dc.identifier.isi	WOS:001036941500003	-
