IDBA-tran: a more robust de novo de Bruijn graph assembler for transcriptomes with uneven expression levels

Peng, Y; Leung, HCM; Yiu, SM; Lv, MJ; Zhu, XG; Chin, FYL

Publisher Website: 10.1093/bioinformatics/btt219
Scopus: eid_2-s2.0-84879912851
PMID: 23813001
WOS: WOS:000321746100036
Appears in Collections:
- Computer Science: Journal/Magazine Articles

Title	IDBA-tran: a more robust de novo de Bruijn graph assembler for transcriptomes with uneven expression levels
Authors	Peng, Y; Leung, HCM; Yiu, SM; Lv, MJ; Zhu, XG; Chin, FYL
Issue Date	2013
Publisher	Oxford University Press. The Journal's web site is located at http://bioinformatics.oxfordjournals.org/
Citation	Bioinformatics, 2013, v. 29 n. 13, p. i326-i334. DOI: http://dx.doi.org/10.1093/bioinformatics/btt219
Abstract	Motivation: RNA sequencing based on next-generation sequencing technology is effective for analyzing transcriptomes. Like de novo genome assembly, de novo transcriptome assembly does not rely on any reference genome or additional annotation information, but is more difficult. In particular, isoforms can have very uneven expression levels (e.g. 1:100), which makes it very difficult to identify low-expressed isoforms. One challenge is to remove erroneous vertices/edges with high multiplicity (produced by high-expressed isoforms) in the de Bruijn graph without removing correct ones with not-so-high multiplicity from low-expressed isoforms. Failing to do so will result in the loss of low-expressed isoforms or in complicated subgraphs with transcripts of different genes mixed together due to erroneous vertices/edges. Contributions: Unlike existing tools, which remove erroneous vertices/edges with multiplicities lower than a global threshold, we use a probabilistic progressive approach to iteratively remove them with local thresholds. This enables us to decompose the graph into disconnected components, each containing a few genes, if not a single gene, while retaining many correct vertices/edges of low-expressed isoforms. Combined with existing techniques, IDBA-tran is able to assemble both high-expressed and low-expressed transcripts and outperform existing assemblers in terms of sensitivity and specificity for both simulated and real data. Availability: http://www.cs.hku.hk/~alse/idba_tran. Supplementary information: Supplementary data are available at Bioinformatics online.
Persistent Identifier	http://hdl.handle.net/10722/187134
ISSN	1367-4803
2023 Impact Factor	4.4
2023 SCImago Journal Rankings	2.574
PubMed Central ID	PMC3694675
ISI Accession Number ID	WOS:000321746100036
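
The contrast the abstract draws between a global multiplicity threshold and local thresholds can be illustrated with a toy graph. The following is a minimal sketch only, not the authors' algorithm: it leaves out the probabilistic model entirely, and the neighbourhood rule (drop an edge whose multiplicity is below `ratio` times the highest multiplicity seen at either of its endpoints), the function names, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def components(edges):
    """Connected components of an undirected graph given as {(u, v): multiplicity}."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            comp.add(node)
            for nxt in adj[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        comps.append(comp)
    return comps


def prune_global(edges, threshold):
    """Baseline: keep only edges whose multiplicity reaches one global cutoff."""
    return {e: m for e, m in edges.items() if m >= threshold}


def prune_local(edges, ratio=0.1):
    """Hypothetical local rule, repeated until no edge is removed.

    An edge is dropped when its multiplicity is below `ratio` times the
    highest multiplicity among the edges touching either of its endpoints.
    """
    edges = dict(edges)
    while True:
        peak = defaultdict(int)  # highest multiplicity seen at each vertex
        for (u, v), m in edges.items():
            peak[u] = max(peak[u], m)
            peak[v] = max(peak[v], m)
        weak = [e for e, m in edges.items()
                if m < ratio * max(peak[e[0]], peak[e[1]])]
        if not weak:
            return edges
        for e in weak:
            del edges[e]


# Toy data (invented): gene A is highly expressed, gene B is expressed at a
# 1:100 ratio, and one erroneous edge with multiplicity 20 joins the two.
toy = {
    ("a1", "a2"): 1000, ("a2", "a3"): 1000, ("a3", "a4"): 1000,
    ("a2", "b1"): 20,  # erroneous edge produced by the highly expressed gene
    ("b1", "b2"): 10, ("b2", "b3"): 10, ("b3", "b4"): 10,
}

global_kept = prune_global(toy, threshold=15)
local_kept = prune_local(toy, ratio=0.1)
local_components = components(local_kept)
```

On this toy input, a global cutoff of 15 discards every edge of gene B while keeping the erroneous edge, because the error's multiplicity (20) is higher than that of the correct low-expression edges (10). The local rule removes only the erroneous edge, which leaves the two genes in separate components.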

DC Field	Value	Language
dc.contributor.author	Peng, Y	-
dc.contributor.author	Leung, HCM	-
dc.contributor.author	Yiu, SM	-
dc.contributor.author	Lv, MJ	-
dc.contributor.author	Zhu, XG	-
dc.contributor.author	Chin, FYL	-
dc.date.accessioned	2013-08-20T12:30:54Z	-
dc.date.available	2013-08-20T12:30:54Z	-
dc.date.issued	2013	-
dc.identifier.citation	Bioinformatics, 2013, v. 29 n. 13, p. i326-i334	-
dc.identifier.issn	1367-4803	-
dc.identifier.uri	http://hdl.handle.net/10722/187134	-
dc.description.abstract	Motivation: RNA sequencing based on next-generation sequencing technology is effective for analyzing transcriptomes. Like de novo genome assembly, de novo transcriptome assembly does not rely on any reference genome or additional annotation information, but is more difficult. In particular, isoforms can have very uneven expression levels (e.g. 1:100), which makes it very difficult to identify low-expressed isoforms. One challenge is to remove erroneous vertices/edges with high multiplicity (produced by high-expressed isoforms) in the de Bruijn graph without removing correct ones with not-so-high multiplicity from low-expressed isoforms. Failing to do so will result in the loss of low-expressed isoforms or in complicated subgraphs with transcripts of different genes mixed together due to erroneous vertices/edges. Contributions: Unlike existing tools, which remove erroneous vertices/edges with multiplicities lower than a global threshold, we use a probabilistic progressive approach to iteratively remove them with local thresholds. This enables us to decompose the graph into disconnected components, each containing a few genes, if not a single gene, while retaining many correct vertices/edges of low-expressed isoforms. Combined with existing techniques, IDBA-tran is able to assemble both high-expressed and low-expressed transcripts and outperform existing assemblers in terms of sensitivity and specificity for both simulated and real data. Availability: http://www.cs.hku.hk/~alse/idba_tran. Supplementary information: Supplementary data are available at Bioinformatics online.	-
dc.language	eng	-
dc.publisher	Oxford University Press. The Journal's web site is located at http://bioinformatics.oxfordjournals.org/	-
dc.relation.ispartof	Bioinformatics	-
dc.rights	This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.	-
dc.title	IDBA-tran: a more robust de novo de Bruijn graph assembler for transcriptomes with uneven expression levels	-
dc.type	Article	-
dc.identifier.email	Leung, HCM: cmleung2@cs.hku.hk	-
dc.identifier.email	Yiu, SM: smyiu@cs.hku.hk	-
dc.identifier.email	Chin, FYL: chin@cs.hku.hk	-
dc.identifier.authority	Leung, HCM=rp00144	-
dc.identifier.authority	Yiu, SM=rp00207	-
dc.identifier.authority	Chin, FYL=rp00105	-
dc.description.nature	published_or_final_version	-
dc.identifier.doi	10.1093/bioinformatics/btt219	-
dc.identifier.pmid	23813001	-
dc.identifier.pmcid	PMC3694675	-
dc.identifier.scopus	eid_2-s2.0-84879912851	-
dc.identifier.hkuros	219091	-
dc.identifier.volume	29	-
dc.identifier.issue	13	-
dc.identifier.spage	i326	-
dc.identifier.epage	i334	-
dc.identifier.isi	WOS:000321746100036	-
dc.publisher.place	United Kingdom	-
dc.identifier.issnl	1367-4803	-
