Article: Leveraging data-driven self-consistency for high-fidelity gene expression recovery

Title: Leveraging data-driven self-consistency for high-fidelity gene expression recovery
Authors: Islam, MT; Wang, J; Ren, H; Li, X; Khuzani, MB; Sang, S; Yu, L; Shen, L; Zhao, W; Xing, L
Issue Date: 21-Nov-2022
Publisher: Nature Research
Citation: Nature Communications, 2022, v. 13, n. 1
Abstract

Single cell RNA sequencing is a promising technique to determine the states of individual cells and classify novel cell subtypes. In current sequence data analysis, however, genes with low expressions are omitted, which leads to inaccurate gene counts and hinders downstream analysis. Recovering these omitted expression values presents a challenge because of the large size of the data. Here, we introduce a data-driven gene expression recovery framework, referred to as self-consistent expression recovery machine (SERM), to impute the missing expressions. Using a neural network, the technique first learns the underlying data distribution from a subset of the noisy data. It then recovers the overall expression data by imposing a self-consistency on the expression matrix, thus ensuring that the expression levels are similarly distributed in different parts of the matrix. We show that SERM improves the accuracy of gene imputation with orders of magnitude enhancement in computational efficiency in comparison to the state-of-the-art imputation techniques.

Recovering dropout-affected gene expression values is a challenging problem in bioinformatics. Here, the authors propose a data-driven framework that first learns the underlying data distribution and then recovers the expression values by imposing a self-consistency on the expression matrix.


Persistent Identifier: http://hdl.handle.net/10722/331868
ISSN: 2041-1723
2021 Impact Factor: 17.694
2020 SCImago Journal Rankings: 5.559

 

DC Field | Value | Language
dc.contributor.author | Islam, MT | -
dc.contributor.author | Wang, J | -
dc.contributor.author | Ren, H | -
dc.contributor.author | Li, X | -
dc.contributor.author | Khuzani, MB | -
dc.contributor.author | Sang, S | -
dc.contributor.author | Yu, L | -
dc.contributor.author | Shen, L | -
dc.contributor.author | Zhao, W | -
dc.contributor.author | Xing, L | -
dc.date.accessioned | 2023-09-28T04:59:14Z | -
dc.date.available | 2023-09-28T04:59:14Z | -
dc.date.issued | 2022-11-21 | -
dc.identifier.citation | Nature Communications, 2022, v. 13, n. 1 | -
dc.identifier.issn | 2041-1723 | -
dc.identifier.uri | http://hdl.handle.net/10722/331868 | -
dc.description.abstract | Single cell RNA sequencing is a promising technique to determine the states of individual cells and classify novel cell subtypes. In current sequence data analysis, however, genes with low expressions are omitted, which leads to inaccurate gene counts and hinders downstream analysis. Recovering these omitted expression values presents a challenge because of the large size of the data. Here, we introduce a data-driven gene expression recovery framework, referred to as self-consistent expression recovery machine (SERM), to impute the missing expressions. Using a neural network, the technique first learns the underlying data distribution from a subset of the noisy data. It then recovers the overall expression data by imposing a self-consistency on the expression matrix, thus ensuring that the expression levels are similarly distributed in different parts of the matrix. We show that SERM improves the accuracy of gene imputation with orders of magnitude enhancement in computational efficiency in comparison to the state-of-the-art imputation techniques. Recovering dropout-affected gene expression values is a challenging problem in bioinformatics. Here, the authors propose a data-driven framework that first learns the underlying data distribution and then recovers the expression values by imposing a self-consistency on the expression matrix. | -
dc.language | eng | -
dc.publisher | Nature Research | -
dc.relation.ispartof | Nature Communications | -
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | -
dc.title | Leveraging data-driven self-consistency for high-fidelity gene expression recovery | -
dc.type | Article | -
dc.identifier.doi | 10.1038/s41467-022-34595-w | -
dc.identifier.scopus | eid_2-s2.0-85142375367 | -
dc.identifier.volume | 13 | -
dc.identifier.issue | 1 | -
dc.identifier.eissn | 2041-1723 | -
dc.identifier.issnl | 2041-1723 | -
