Conference Paper: PERGA: A Paired-End Read Guided De Novo Assembler for Extending Contigs Using SVM Approach

Title: PERGA: A Paired-End Read Guided De Novo Assembler for Extending Contigs Using SVM Approach
Authors: Zhu, X; Leung, HCM; Chin, FYL; Yiu, SM; Quan, G; Liu, B; Wang, Y
Keywords: Genome assembly; Greedy-like prediction; Look ahead technology; Variable overlap sizes
Issue Date: 2013
Publisher: ACM.
Citation: ACM-BCB2013: Washington, DC, USA, September 22-25, 2013. In Proceedings of the International Conference on Bioinformatics, Computational Biology and Biomedical Informatics (ACM-BCB 2013), p. 161-170
Abstract: Because the reads produced by high-throughput sequencing (HTS) technologies are short, de novo assembly, which plays a significant role in many applications, remains a great challenge. Most state-of-the-art approaches are based on the de Bruijn graph strategy or the overlap-layout strategy. However, these approaches, which depend on k-mers or read overlaps, do not fully utilize the information in single-end and paired-end reads when resolving branches; for example, the number and positions of the reads supporting each possible extension are not taken into account. We present PERGA (Paired-End Reads Guided Assembler), a novel sequence-reads-guided de novo assembly approach that adopts a greedy-like prediction strategy for assembling reads into contigs and scaffolds. Instead of using single-end reads to construct contigs, PERGA uses paired-end reads and different read overlap size thresholds, ranging from Omax to Omin, to resolve gaps and branches. Moreover, by constructing a decision model with a machine learning approach based on branch features, PERGA can determine the correct extension in 99.7% of cases. When the correct extension cannot be determined, PERGA tries to extend the contigs by all feasible extensions and determines the correct extension using look-ahead technology. We evaluated PERGA on both simulated Illumina data sets and real data sets, and it constructed longer and more correct contigs and scaffolds than the other state-of-the-art assemblers IDBA-UD, Velvet, ABySS, SGA and CABOG. Availability: https://github.com/hitbio/PERGA
Persistent Identifier: http://hdl.handle.net/10722/195935
ISBN: 9781450324342
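
The abstract describes extending contigs greedily while relaxing the read overlap threshold from Omax down to Omin. The following is a minimal sketch of that idea only, not PERGA's implementation: it uses single-end reads, and a simple majority-vote rule stands in for the paper's machine-learning decision model, paired-end guidance, and look-ahead step. The function names and the `dominance` and `max_len` parameters are assumptions made for illustration.

```python
from collections import Counter


def next_base_votes(contig, reads, min_overlap):
    """Count next-base votes from reads that overlap the contig end.

    A read votes when its first k bases (k >= min_overlap) equal the
    contig suffix; the vote is the read base that follows the overlap.
    """
    votes = Counter()
    for read in reads:
        for k in range(min_overlap, len(read)):
            if contig.endswith(read[:k]):
                votes[read[k]] += 1
    return votes


def extend_contig(seed, reads, o_max, o_min, max_len=1000, dominance=0.8):
    """Greedily extend `seed` one base at a time (hypothetical helper).

    The overlap threshold starts at o_max and is relaxed toward o_min
    only when no read supports an extension at the stricter threshold.
    """
    contig = seed
    while len(contig) < max_len:
        chosen = None
        for overlap in range(o_max, o_min - 1, -1):
            votes = next_base_votes(contig, reads, overlap)
            if not votes:
                continue  # no support here: relax the overlap threshold
            base, count = votes.most_common(1)[0]
            # Assumed stand-in for the decision model: accept only a
            # clearly dominant base, otherwise treat it as a branch.
            if count / sum(votes.values()) >= dominance:
                chosen = base
            break
        if chosen is None:
            break  # gap or unresolved branch: stop extending
        contig += chosen
    return contig
```

For example, `extend_contig("ACGTTG", ["TGCA"], o_max=3, o_min=2)` finds no read at overlap 3, relaxes to overlap 2, and then continues extending. When two reads disagree with equal support, the sketch stops at the branch rather than guessing.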

 

DC Field | Value | Language
dc.contributor.author | Zhu, X | en_US
dc.contributor.author | Leung, HCM | en_US
dc.contributor.author | Chin, FYL | en_US
dc.contributor.author | Yiu, SM | en_US
dc.contributor.author | Quan, G | en_US
dc.contributor.author | Liu, B | en_US
dc.contributor.author | Wang, Y | en_US
dc.date.accessioned | 2014-03-21T02:23:32Z | -
dc.date.available | 2014-03-21T02:23:32Z | -
dc.date.issued | 2013 | en_US
dc.identifier.citation | ACM-BCB2013: Washington, DC, USA, September 22-25, 2013. In Proceedings of the International Conference on Bioinformatics, Computational Biology and Biomedical Informatics (ACM-BCB 2013), p. 161-170 | en_US
dc.identifier.isbn | 9781450324342 | en_US
dc.identifier.uri | http://hdl.handle.net/10722/195935 | -
dc.description.abstract | Because the reads produced by high-throughput sequencing (HTS) technologies are short, de novo assembly, which plays a significant role in many applications, remains a great challenge. Most state-of-the-art approaches are based on the de Bruijn graph strategy or the overlap-layout strategy. However, these approaches, which depend on k-mers or read overlaps, do not fully utilize the information in single-end and paired-end reads when resolving branches; for example, the number and positions of the reads supporting each possible extension are not taken into account. We present PERGA (Paired-End Reads Guided Assembler), a novel sequence-reads-guided de novo assembly approach that adopts a greedy-like prediction strategy for assembling reads into contigs and scaffolds. Instead of using single-end reads to construct contigs, PERGA uses paired-end reads and different read overlap size thresholds, ranging from Omax to Omin, to resolve gaps and branches. Moreover, by constructing a decision model with a machine learning approach based on branch features, PERGA can determine the correct extension in 99.7% of cases. When the correct extension cannot be determined, PERGA tries to extend the contigs by all feasible extensions and determines the correct extension using look-ahead technology. We evaluated PERGA on both simulated Illumina data sets and real data sets, and it constructed longer and more correct contigs and scaffolds than the other state-of-the-art assemblers IDBA-UD, Velvet, ABySS, SGA and CABOG. Availability: https://github.com/hitbio/PERGA | en_US
dc.language | eng | en_US
dc.publisher | ACM. | en_US
dc.relation.ispartof | Proceedings of the International Conference on Bioinformatics, Computational Biology and Biomedical Informatics (ACM-BCB 2013) | en_US
dc.subject | Genome assembly | -
dc.subject | Greedy-like prediction | -
dc.subject | Look ahead technology | -
dc.subject | Variable overlap sizes | -
dc.title | PERGA: A Paired-End Read Guided De Novo Assembler for Extending Contigs Using SVM Approach | en_US
dc.type | Conference_Paper | en_US
dc.identifier.email | Leung, HCM: cmleung2@cs.hku.hk | en_US
dc.identifier.email | Chin, FYL: chin@cs.hku.hk | en_US
dc.identifier.email | Yiu, SM: smyiu@cs.hku.hk | en_US
dc.identifier.authority | Leung, HCM=rp00144 | en_US
dc.identifier.authority | Chin, FYL=rp00105 | en_US
dc.identifier.authority | Yiu, SM=rp00207 | en_US
dc.identifier.doi | 10.1145/2506583.2506612 | en_US
dc.identifier.scopus | eid_2-s2.0-84888142004 | -
dc.identifier.hkuros | 228333 | en_US
dc.identifier.spage | 161 | en_US
dc.identifier.epage | 170 | en_US
dc.publisher.place | New York, NY | en_US
dc.customcontrol.immutable | yiu 140623 | -