Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1007/978-3-642-12683-3_28
- Scopus: eid_2-s2.0-78650270346
Conference Paper: IDBA - A practical iterative De Bruijn graph De Novo assembler
Title | IDBA - A practical iterative De Bruijn graph De Novo assembler |
---|---|
Authors | Peng, Y; Leung, HCM; Yiu, SM; Chin, FYL |
Keywords | De Bruijn graph; De Novo assembly; High throughput short reads; Mate-pair; String graph |
Issue Date | 2010 |
Publisher | Springer Verlag. The Journal's web site is located at http://springerlink.com/content/105633/ |
Citation | The 14th Annual International Conference on Research in Computational Molecular Biology (RECOMB 2010), Lisbon, Portugal, 25-28 April 2010. In Lecture Notes in Computer Science, 2010, v. 6044, p. 426-440 |
Abstract | The de Bruijn graph assembly approach breaks reads into k-mers before assembling them into contigs. The string graph approach forms contigs by connecting two reads with k or more overlapping nucleotides. Both approaches must deal with the following problems: false-positive vertices, due to erroneous reads; gap problem, due to non-uniform coverage; branching problem, due to erroneous reads and repeat regions. A proper choice of k is crucial but for single k there is always a trade-off: a small k favors the situation of erroneous reads and non-uniform coverage, and a large k favors short repeat regions. We propose an iterative de Bruijn graph approach iterating from small to large k exploring the advantages of the in between values. Our IDBA outperforms the existing algorithms by constructing longer contigs with similar accuracy and using less memory, both with real and simulated data. The running time of the algorithm is comparable to existing algorithms. © Springer-Verlag Berlin Heidelberg 2010. |
Description | LNCS v. 6044 contains the conference proceedings of the 14th RECOMB, 2010 |
Persistent Identifier | http://hdl.handle.net/10722/129571 |
ISSN | 0302-9743 (2020 SCImago Journal Rankings: 0.249) |
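
The trade-off in the choice of k described in the abstract can be illustrated with a toy example. The sketch below is not the IDBA implementation; it is a minimal illustration under stated assumptions. The function names (`kmers`, `build_de_bruijn`, `branching_vertices`, `dead_ends`) and the three sample reads are hypothetical and do not come from the paper. Vertices are k-mers, and an edge joins two k-mers that appear consecutively in some read.

```python
def kmers(read, k):
    """Return the k-mers of a read in order of occurrence."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def build_de_bruijn(reads, k):
    """Map each k-mer vertex to the set of k-mers that follow it in some read."""
    graph = {}
    for read in reads:
        parts = kmers(read, k)
        for u in parts:
            graph.setdefault(u, set())  # register every k-mer as a vertex
        for u, v in zip(parts, parts[1:]):
            graph[u].add(v)  # consecutive k-mers in a read share an edge
    return graph


def branching_vertices(graph):
    """Vertices with more than one successor (the branching problem)."""
    return sorted(u for u, succ in graph.items() if len(succ) > 1)


def dead_ends(graph):
    """Vertices with no successor (contig ends, including gaps)."""
    return sorted(u for u, succ in graph.items() if not succ)


# Hypothetical reads sampled from the toy sequence ACGTTCGTA,
# in which the 3-mer CGT occurs twice (a short repeat).
reads = ["ACGTT", "GTTCG", "TCGTA"]

small = build_de_bruijn(reads, 3)
large = build_de_bruijn(reads, 4)

print(branching_vertices(small), dead_ends(small))
print(branching_vertices(large), dead_ends(large))
```

With k = 3 the repeated 3-mer CGT becomes a branching vertex, while the graph stays connected. With k = 4 the branch disappears, but no read contains two consecutive 4-mers across the read boundaries, so the graph falls apart into three pieces. This is the tension that motivates iterating from small to large k rather than fixing a single value.
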
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Peng, Y | en_HK |
dc.contributor.author | Leung, HCM | en_HK |
dc.contributor.author | Yiu, SM | en_HK |
dc.contributor.author | Chin, FYL | en_HK |
dc.date.accessioned | 2010-12-23T08:39:23Z | - |
dc.date.available | 2010-12-23T08:39:23Z | - |
dc.date.issued | 2010 | en_HK |
dc.identifier.citation | The 14th Annual International Conference on Research in Computational Molecular Biology (RECOMB 2010), Lisbon, Portugal, 25-28 April 2010. In Lecture Notes in Computer Science, 2010, v. 6044, p. 426-440 | en_HK |
dc.identifier.issn | 0302-9743 | en_HK |
dc.identifier.uri | http://hdl.handle.net/10722/129571 | - |
dc.description | LNCS v. 6044 is conference proceedings of 14th RECOMB 2010 | - |
dc.description.abstract | The de Bruijn graph assembly approach breaks reads into k-mers before assembling them into contigs. The string graph approach forms contigs by connecting two reads with k or more overlapping nucleotides. Both approaches must deal with the following problems: false-positive vertices, due to erroneous reads; gap problem, due to non-uniform coverage; branching problem, due to erroneous reads and repeat regions. A proper choice of k is crucial but for single k there is always a trade-off: a small k favors the situation of erroneous reads and non-uniform coverage, and a large k favors short repeat regions. We propose an iterative de Bruijn graph approach iterating from small to large k exploring the advantages of the in between values. Our IDBA outperforms the existing algorithms by constructing longer contigs with similar accuracy and using less memory, both with real and simulated data. The running time of the algorithm is comparable to existing algorithms. © Springer-Verlag Berlin Heidelberg 2010. | en_HK |
dc.language | eng | en_US |
dc.publisher | Springer Verlag. The Journal's web site is located at http://springerlink.com/content/105633/ | en_HK |
dc.relation.ispartof | Lecture Notes in Computer Science | en_HK |
dc.rights | The original publication is available at www.springerlink.com | - |
dc.subject | De Bruijn graph | en_HK |
dc.subject | De Novo assembly | en_HK |
dc.subject | High throughput short reads | en_HK |
dc.subject | Mate-pair | en_HK |
dc.subject | String graph | en_HK |
dc.title | IDBA - A practical iterative De Bruijn graph De Novo assembler | en_HK |
dc.type | Conference_Paper | en_HK |
dc.identifier.email | Leung, HCM:cmleung2@cs.hku.hk | en_HK |
dc.identifier.email | Yiu, SM:smyiu@cs.hku.hk | en_HK |
dc.identifier.email | Chin, FYL:chin@cs.hku.hk | en_HK |
dc.identifier.authority | Leung, HCM=rp00144 | en_HK |
dc.identifier.authority | Yiu, SM=rp00207 | en_HK |
dc.identifier.authority | Chin, FYL=rp00105 | en_HK |
dc.description.nature | postprint | - |
dc.identifier.doi | 10.1007/978-3-642-12683-3_28 | en_HK |
dc.identifier.scopus | eid_2-s2.0-78650270346 | en_HK |
dc.identifier.hkuros | 178332 | en_US |
dc.identifier.hkuros | 169727 | - |
dc.relation.references | http://www.scopus.com/mlt/select.url?eid=2-s2.0-78650270346&selection=ref&src=s&origin=recordpage | en_HK |
dc.identifier.volume | 6044 LNBI | en_HK |
dc.identifier.spage | 426 | en_HK |
dc.identifier.epage | 440 | en_HK |
dc.publisher.place | Germany | en_HK |
dc.description.other | The 14th Annual International Conference on Research in Computational Molecular Biology (RECOMB 2010), Lisbon, Portugal, 25-28 April 2010. In Lecture Notes in Computer Science, 2010, v. 6044, p. 426-440 | - |
dc.identifier.scopusauthorid | Peng, Y=30267885400 | en_HK |
dc.identifier.scopusauthorid | Leung, HCM=35233742700 | en_HK |
dc.identifier.scopusauthorid | Yiu, SM=7003282240 | en_HK |
dc.identifier.scopusauthorid | Chin, FYL=7005101915 | en_HK |
dc.identifier.citeulike | 7896392 | - |
dc.identifier.issnl | 0302-9743 | - |