
Article: IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth

Title: IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth
Authors: Peng, Y; Leung, HCM; Yiu, SM; Chin, FYL
Issue Date: 2012
Publisher: Oxford University Press. The Journal's web site is located at http://bioinformatics.oxfordjournals.org/
Citation: Bioinformatics, 2012, v. 28 n. 11, p. 1420-1428
Abstract: MOTIVATION: Next-generation sequencing allows us to sequence reads from a microbial environment using single-cell sequencing or metagenomic sequencing technologies. However, both technologies suffer from the problem that sequencing depth of different regions of a genome or genomes from different species are highly uneven. Most existing genome assemblers usually have an assumption that sequencing depths are even. These assemblers fail to construct correct long contigs. RESULTS: We introduce the IDBA-UD algorithm that is based on the de Bruijn graph approach for assembling reads from single-cell sequencing or metagenomic sequencing technologies with uneven sequencing depths. Several non-trivial techniques have been employed to tackle the problems. Instead of using a simple threshold, we use multiple depth-relative thresholds to remove erroneous k-mers in both low-depth and high-depth regions. The technique of local assembly with paired-end information is used to solve the branch problem of low-depth short repeat regions. To speed up the process, an error correction step is conducted to correct reads of high-depth regions that can be aligned to high-confident contigs. Comparison of the performances of IDBA-UD and existing assemblers (Velvet, Velvet-SC, SOAPdenovo and Meta-IDBA) for different datasets, shows that IDBA-UD can reconstruct longer contigs with higher accuracy. AVAILABILITY: The IDBA-UD toolkit is available at our website http://www.cs.hku.hk/alse/idba_ud
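
Illustration (not part of the record): the abstract contrasts a single fixed depth threshold with depth-relative thresholds for discarding erroneous k-mers. The toy sketch below shows why the distinction matters when depth is uneven. It is a simplified k-mer-level illustration under stated assumptions, not the IDBA-UD implementation: the function names, the ratio of 0.2, the comparison against the deepest neighboring k-mer, and the example reads are all hypothetical, and reverse complements are ignored.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer occurring in the reads (forward strand only)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def neighbors(kmer, counts):
    """Observed k-mers that overlap `kmer` by k-1 bases (de Bruijn neighbors)."""
    found = set()
    for base in "ACGT":
        for cand in (kmer[1:] + base, base + kmer[:-1]):
            if cand != kmer and cand in counts:
                found.add(cand)
    return found


def filter_fixed(counts, threshold):
    """Baseline: keep k-mers whose depth reaches one global threshold."""
    return {kmer: d for kmer, d in counts.items() if d >= threshold}


def filter_depth_relative(counts, ratio=0.2):
    """Assumed rule: drop a k-mer if its depth is below `ratio` times the
    depth of its deepest neighbor; k-mers without neighbors are kept."""
    kept = {}
    for kmer, depth in counts.items():
        nbrs = neighbors(kmer, counts)
        if nbrs and depth < ratio * max(counts[n] for n in nbrs):
            continue
        kept[kmer] = depth
    return kept


# Hypothetical data: a high-depth region (100 reads), 5 reads carrying an
# error in the last base, and a genuine low-depth region (3 reads).
reads = ["ACGTAC"] * 100 + ["ACGTAG"] * 5 + ["TTGGCCAA"] * 3
counts = count_kmers(reads, k=4)

fixed = filter_fixed(counts, threshold=6)
relative = filter_depth_relative(counts, ratio=0.2)
```

In this toy input, a fixed threshold high enough to remove the erroneous k-mer `GTAG` (depth 5) also removes every k-mer of the genuine low-depth region (depth 3), whereas the relative rule removes `GTAG` because it sits next to a much deeper k-mer and keeps the low-depth region intact.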
Persistent Identifier: http://hdl.handle.net/10722/152505
ISSN: 1367-4803
2023 Impact Factor: 4.4
2023 SCImago Journal Rankings: 2.574
ISI Accession Number ID: WOS:000304537000002
Funding Agency | Grant Number
HKGRF | HKU 7116/08E
HKGRF | HKU 719709E
HKU Genomics SRT |
Funding Information:

This work was supported in part by HKGRF funding (HKU 7116/08E, HKU 719709E) and HKU Genomics SRT funding.


DC Field | Value | Language
dc.contributor.author | Peng, Y | en_US
dc.contributor.author | Leung, HCM | en_US
dc.contributor.author | Yiu, SM | en_US
dc.contributor.author | Chin, FYL | en_US
dc.date.accessioned | 2012-06-26T06:39:47Z | -
dc.date.available | 2012-06-26T06:39:47Z | -
dc.date.issued | 2012 | en_US
dc.identifier.citation | Bioinformatics, 2012, v. 28 n. 11, p. 1420-1428 | en_US
dc.identifier.issn | 1367-4803 | en_US
dc.identifier.uri | http://hdl.handle.net/10722/152505 | -
dc.description.abstract | MOTIVATION: Next-generation sequencing allows us to sequence reads from a microbial environment using single-cell sequencing or metagenomic sequencing technologies. However, both technologies suffer from the problem that sequencing depth of different regions of a genome or genomes from different species are highly uneven. Most existing genome assemblers usually have an assumption that sequencing depths are even. These assemblers fail to construct correct long contigs. RESULTS: We introduce the IDBA-UD algorithm that is based on the de Bruijn graph approach for assembling reads from single-cell sequencing or metagenomic sequencing technologies with uneven sequencing depths. Several non-trivial techniques have been employed to tackle the problems. Instead of using a simple threshold, we use multiple depth-relative thresholds to remove erroneous k-mers in both low-depth and high-depth regions. The technique of local assembly with paired-end information is used to solve the branch problem of low-depth short repeat regions. To speed up the process, an error correction step is conducted to correct reads of high-depth regions that can be aligned to high-confident contigs. Comparison of the performances of IDBA-UD and existing assemblers (Velvet, Velvet-SC, SOAPdenovo and Meta-IDBA) for different datasets, shows that IDBA-UD can reconstruct longer contigs with higher accuracy. AVAILABILITY: The IDBA-UD toolkit is available at our website http://www.cs.hku.hk/alse/idba_ud | en_US
dc.language | eng | en_US
dc.publisher | Oxford University Press. The Journal's web site is located at http://bioinformatics.oxfordjournals.org/ | en_US
dc.relation.ispartof | Bioinformatics | en_US
dc.rights | This is a pre-copy-editing, author-produced PDF of an article accepted for publication in Bioinformatics following peer review. The definitive publisher-authenticated version Bioinformatics, 2012, v. 28 n. 11, p. 1420-1428 is available online at: http://bioinformatics.oxfordjournals.org/content/28/11/1420 | -
dc.title | IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth | en_US
dc.type | Article | en_US
dc.identifier.email | Leung, HCM: cmleung2@cs.hku.hk | en_US
dc.identifier.email | Yiu, SM: smyiu@cs.hku.hk | -
dc.identifier.email | Chin, FYL: chin@cs.hku.hk | -
dc.identifier.authority | Chin, FYL=rp00105 | en_US
dc.description.nature | postprint | en_US
dc.identifier.doi | 10.1093/bioinformatics/bts174 | en_US
dc.identifier.pmid | 22495754 | -
dc.identifier.scopus | eid_2-s2.0-84861760530 | en_US
dc.identifier.hkuros | 202752 | -
dc.relation.references | http://www.scopus.com/mlt/select.url?eid=2-s2.0-84861760530&selection=ref&src=s&origin=recordpage | en_US
dc.identifier.volume | 28 | en_US
dc.identifier.issue | 11 | en_US
dc.identifier.spage | 1420 | en_US
dc.identifier.epage | 1428 | en_US
dc.identifier.eissn | 1460-2059 | -
dc.identifier.isi | WOS:000304537000002 | -
dc.publisher.place | United Kingdom | en_US
dc.relation.project | Algorithms for Inferring k-articulated Phylogenetic Network | -
dc.identifier.scopusauthorid | Chin, FYL=7005101915 | en_US
dc.identifier.scopusauthorid | Yiu, SM=55146840600 | en_US
dc.identifier.scopusauthorid | Leung, HCM=55236908900 | en_US
dc.identifier.scopusauthorid | Peng, Y=30267885400 | en_US
dc.identifier.citeulike | 10559166 | -
