Conference Paper: Practical aspects of compressed suffix arrays and FM-index in searching DNA sequences

Title: Practical aspects of compressed suffix arrays and FM-index in searching DNA sequences
Authors: Hon, WK; Lam, TW; Sung, WK; Tse, WL; Wong, CK; Yiu, SM
Issue Date: 2004
Citation: Proceedings of the Sixth Workshop on Algorithm Engineering and Experiments and the First Workshop on Analytic Algorithms and Combinatorics, 2004, p. 31-38
Abstract: Searching patterns in the DNA sequence is an important step in biological research. To speed up the search process, one can index the DNA sequence. However, classical indexing data structures like suffix trees and suffix arrays are not feasible for indexing DNA sequences due to main memory requirement, as DNA sequences can be very long. In this paper, we evaluate the performance of two compressed data structures, Compressed Suffix Array (CSA) and FM-index, in the context of searching and indexing DNA sequences. Our results show that CSA is better than FM-index for searching long patterns. We also investigate other practical aspects of the data structures such as the memory requirement for building the indexes.
Persistent Identifier: http://hdl.handle.net/10722/93076

DC Field | Value | Language
--- | --- | ---
dc.contributor.author | Hon, WK | en_HK
dc.contributor.author | Lam, TW | en_HK
dc.contributor.author | Sung, WK | en_HK
dc.contributor.author | Tse, WL | en_HK
dc.contributor.author | Wong, CK | en_HK
dc.contributor.author | Yiu, SM | en_HK
dc.date.accessioned | 2010-09-25T14:50:10Z | -
dc.date.available | 2010-09-25T14:50:10Z | -
dc.date.issued | 2004 | en_HK
dc.identifier.citation | Proceedings Of The Sixth Workshop On Algorithm Engineering And Experiments And The First Workshop On Analytic Algorithms And Combinatorics, 2004, p. 31-38 | en_HK
dc.identifier.uri | http://hdl.handle.net/10722/93076 | -
dc.description.abstract | Searching patterns in the DNA sequence is an important step in biological research. To speed up the search process, one can index the DNA sequence. However, classical indexing data structures like suffix trees and suffix arrays are not feasible for indexing DNA sequences due to main memory requirement, as DNA sequences can be very long. In this paper, we evaluate the performance of two compressed data structures, Compressed Suffix Array (CSA) and FM-index, in the context of searching and indexing DNA sequences. Our results show that CSA is better than FM-index for searching long patterns. We also investigate other practical aspects of the data structures such as the memory requirement for building the indexes. | en_HK
dc.language | eng | en_HK
dc.relation.ispartof | Proceedings of the Sixth Workshop on Algorithm Engineering and Experiments and the First Workshop on Analytic Algorithms and Combinatorics | en_HK
dc.title | Practical aspects of compressed suffix arrays and FM-index in searching DNA sequences | en_HK
dc.type | Conference_Paper | en_HK
dc.identifier.email | Lam, TW: twlam@cs.hku.hk | en_HK
dc.identifier.email | Yiu, SM: smyiu@cs.hku.hk | en_HK
dc.identifier.authority | Lam, TW=rp00135 | en_HK
dc.identifier.authority | Yiu, SM=rp00207 | en_HK
dc.description.nature | link_to_subscribed_fulltext | -
dc.identifier.scopus | eid_2-s2.0-8344235972 | en_HK
dc.identifier.hkuros | 103185 | en_HK
dc.relation.references | http://www.scopus.com/mlt/select.url?eid=2-s2.0-8344235972&selection=ref&src=s&origin=recordpage | en_HK
dc.identifier.spage | 31 | en_HK
dc.identifier.epage | 38 | en_HK
dc.identifier.scopusauthorid | Hon, WK=7004282818 | en_HK
dc.identifier.scopusauthorid | Lam, TW=7202523165 | en_HK
dc.identifier.scopusauthorid | Sung, WK=13310059700 | en_HK
dc.identifier.scopusauthorid | Tse, WL=35992065800 | en_HK
dc.identifier.scopusauthorid | Wong, CK=7404953816 | en_HK
dc.identifier.scopusauthorid | Yiu, SM=7003282240 | en_HK
