File Download

There are no files associated with this item.

  Links for fulltext
     (May Require Subscription)
Supplementary

Article: Approximate string matching using compressed suffix arrays

TitleApproximate string matching using compressed suffix arrays
Authors
Issue Date2006
PublisherElsevier BV. The Journal's web site is located at http://www.elsevier.com/locate/tcs
Citation
Theoretical Computer Science, 2006, v. 352 n. 1-3, p. 240-249 How to Cite?
AbstractLet T be a text of length n and P be a pattern of length m, both strings over a fixed finite alphabet A. The k-difference (k-mismatch, respectively) problem is to find all occurrences of P in T that have edit distance (Hamming distance, respectively) at most k from P. In this paper we investigate a well-studied case in which T is fixed and preprocessed into an indexing data structure so that any pattern query can be answered faster. We give a solution using an O(nlogn) bits indexing data structure with O(|A|kmk·max(k,logn) +occ) query time, where occ is the number of occurrences. The best previous result requires O(nlogn) bits indexing data structure and gives O(|A|kmk+2+occ) query time. Our solution also allows us to exploit compressed suffix arrays to reduce the indexing space to O(n) bits, while increasing the query time by an O(logn) factor only. © 2005 Elsevier B.V. All right reserved.
Persistent Identifierhttp://hdl.handle.net/10722/152330
ISSN
2021 Impact Factor: 1.002
2020 SCImago Journal Rankings: 0.464
ISI Accession Number ID
References

 

DC FieldValueLanguage
dc.contributor.authorHuynh, TNDen_US
dc.contributor.authorHon, WKen_US
dc.contributor.authorLam, TWen_US
dc.contributor.authorSung, WKen_US
dc.date.accessioned2012-06-26T06:37:14Z-
dc.date.available2012-06-26T06:37:14Z-
dc.date.issued2006en_US
dc.identifier.citationTheoretical Computer Science, 2006, v. 352 n. 1-3, p. 240-249en_US
dc.identifier.issn0304-3975en_US
dc.identifier.urihttp://hdl.handle.net/10722/152330-
dc.description.abstractLet T be a text of length n and P be a pattern of length m, both strings over a fixed finite alphabet A. The k-difference (k-mismatch, respectively) problem is to find all occurrences of P in T that have edit distance (Hamming distance, respectively) at most k from P. In this paper we investigate a well-studied case in which T is fixed and preprocessed into an indexing data structure so that any pattern query can be answered faster. We give a solution using an O(nlogn) bits indexing data structure with O(|A|kmk·max(k,logn) +occ) query time, where occ is the number of occurrences. The best previous result requires O(nlogn) bits indexing data structure and gives O(|A|kmk+2+occ) query time. Our solution also allows us to exploit compressed suffix arrays to reduce the indexing space to O(n) bits, while increasing the query time by an O(logn) factor only. © 2005 Elsevier B.V. All right reserved.en_US
dc.languageengen_US
dc.publisherElsevier BV. The Journal's web site is located at http://www.elsevier.com/locate/tcsen_US
dc.relation.ispartofTheoretical Computer Scienceen_US
dc.titleApproximate string matching using compressed suffix arraysen_US
dc.typeArticleen_US
dc.identifier.emailLam, TW:twlam@cs.hku.hken_US
dc.identifier.authorityLam, TW=rp00135en_US
dc.description.naturelink_to_subscribed_fulltexten_US
dc.identifier.doi10.1016/j.tcs.2005.11.022en_US
dc.identifier.scopuseid_2-s2.0-32644436921en_US
dc.identifier.hkuros130777-
dc.relation.referenceshttp://www.scopus.com/mlt/select.url?eid=2-s2.0-32644436921&selection=ref&src=s&origin=recordpageen_US
dc.identifier.volume352en_US
dc.identifier.issue1-3en_US
dc.identifier.spage240en_US
dc.identifier.epage249en_US
dc.identifier.isiWOS:000235826900018-
dc.publisher.placeNetherlandsen_US
dc.identifier.scopusauthoridHuynh, TND=8846778300en_US
dc.identifier.scopusauthoridHon, WK=7004282818en_US
dc.identifier.scopusauthoridLam, TW=7202523165en_US
dc.identifier.scopusauthoridSung, WK=13310059700en_US
dc.identifier.issnl0304-3975-

Export via OAI-PMH Interface in XML Formats


OR


Export to Other Non-XML Formats