File Download
Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1016/j.tcs.2011.03.004
- Scopus: eid_2-s2.0-79956348246
- WOS: WOS:000292077200015
- Find via
Supplementary
-
Bookmarks:
- CiteULike: 1
- Citations:
- Appears in Collections:
Article: Cache-oblivious index for approximate string matching
Title | Cache-oblivious index for approximate string matching | ||||||||
---|---|---|---|---|---|---|---|---|---|
Authors | |||||||||
Keywords | Approximate queries Cache-oblivious I/O model Indexing String matching | ||||||||
Issue Date | 2011 | ||||||||
Publisher | Elsevier BV. The Journal's web site is located at http://www.elsevier.com/locate/tcs | ||||||||
Citation | Theoretical Computer Science, 2011, v. 412 n. 29, p. 3579-3588 How to Cite? | ||||||||
Abstract | This paper revisits the problem of indexing a text for approximate string matching. Specifically, given a text T of length n and a positive integer k, we want to construct an index of T such that for any input pattern P, we can find all its k-error matches in T efficiently. This problem is well-studied in the internal-memory setting. Here, we extend some of these recent results to external-memory solutions, which are also cache-oblivious. Our first index occupies O((nlog kn)B) disk pages and finds all k-error matches with O((|P|+occ)B+log knloglog Bn) I/Os, where B denotes the number of words in a disk page. To the best of our knowledge, this index is the first external-memory data structure that does not require Ω (|P|+occ+poly(logn)) I/Os. The second index reduces the space to O((nlogn)B) disk pages, and the I/O complexity is O((|P|+occ)B+log k(k+1)nloglogn) . © 2011 Elsevier B.V. All rights reserved. | ||||||||
Persistent Identifier | http://hdl.handle.net/10722/140789 | ||||||||
ISSN | 2023 Impact Factor: 0.9 2023 SCImago Journal Rankings: 0.570 | ||||||||
ISI Accession Number ID |
Funding Information: A preliminary version appears in Proceedings of 18th Annual Symposium on Combinatorial Pattern Matching (CPM), pages 40-51, 2007. This research is supported in part by Taiwan NSC Grants 96-2221-E-007-082 and 99-2221-E-007-123 (Wing-Kai Hon), Hong Kong RGC Grant 7140/06E (Tak-Wah Lam), and US NSF Grants CCF-0621457 and CCF-1017623 (Rahul Shah and Jeffrey Scott Vitter). | ||||||||
References |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Hon, WK | en_HK |
dc.contributor.author | Lam, TW | en_HK |
dc.contributor.author | Shah, R | en_HK |
dc.contributor.author | Tam, SL | en_HK |
dc.contributor.author | Vitter, JS | en_HK |
dc.date.accessioned | 2011-09-23T06:19:20Z | - |
dc.date.available | 2011-09-23T06:19:20Z | - |
dc.date.issued | 2011 | en_HK |
dc.identifier.citation | Theoretical Computer Science, 2011, v. 412 n. 29, p. 3579-3588 | en_HK |
dc.identifier.issn | 0304-3975 | en_HK |
dc.identifier.uri | http://hdl.handle.net/10722/140789 | - |
dc.description.abstract | This paper revisits the problem of indexing a text for approximate string matching. Specifically, given a text T of length n and a positive integer k, we want to construct an index of T such that for any input pattern P, we can find all its k-error matches in T efficiently. This problem is well-studied in the internal-memory setting. Here, we extend some of these recent results to external-memory solutions, which are also cache-oblivious. Our first index occupies O((nlog kn)B) disk pages and finds all k-error matches with O((|P|+occ)B+log knloglog Bn) I/Os, where B denotes the number of words in a disk page. To the best of our knowledge, this index is the first external-memory data structure that does not require Ω (|P|+occ+poly(logn)) I/Os. The second index reduces the space to O((nlogn)B) disk pages, and the I/O complexity is O((|P|+occ)B+log k(k+1)nloglogn) . © 2011 Elsevier B.V. All rights reserved. | en_HK |
dc.language | eng | en_US |
dc.publisher | Elsevier BV. The Journal's web site is located at http://www.elsevier.com/locate/tcs | en_HK |
dc.relation.ispartof | Theoretical Computer Science | en_HK |
dc.rights | NOTICE: this is the author’s version of a work that was accepted for publication in Theoretical Computer Science. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Theoretical Computer Science, 2011, v. 412 n. 29, p. 3579-3588. DOI: http://dx.doi.org/10.1016/j.tcs.2011.03.004 | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject | Approximate queries | en_HK |
dc.subject | Cache-oblivious | en_HK |
dc.subject | I/O model | en_HK |
dc.subject | Indexing | en_HK |
dc.subject | String matching | en_HK |
dc.title | Cache-oblivious index for approximate string matching | en_HK |
dc.type | Article | en_HK |
dc.identifier.openurl | http://library.hku.hk:4550/resserv?sid=HKU:IR&issn=0304-3975&volume=412&issue=29&spage=3579&epage=3588&date=2011&atitle=Cache-oblivious+index+for+approximate+string+matching | - |
dc.identifier.email | Lam, TW:twlam@cs.hku.hk | en_HK |
dc.identifier.authority | Lam, TW=rp00135 | en_HK |
dc.description.nature | postprint | - |
dc.identifier.doi | 10.1016/j.tcs.2011.03.004 | en_HK |
dc.identifier.scopus | eid_2-s2.0-79956348246 | en_HK |
dc.identifier.hkuros | 192200 | en_US |
dc.relation.references | http://www.scopus.com/mlt/select.url?eid=2-s2.0-79956348246&selection=ref&src=s&origin=recordpage | en_HK |
dc.identifier.volume | 412 | en_HK |
dc.identifier.issue | 29 | en_HK |
dc.identifier.spage | 3579 | en_HK |
dc.identifier.epage | 3588 | en_HK |
dc.identifier.isi | WOS:000292077200015 | - |
dc.publisher.place | Netherlands | en_HK |
dc.identifier.scopusauthorid | Hon, WK=7004282818 | en_HK |
dc.identifier.scopusauthorid | Lam, TW=7202523165 | en_HK |
dc.identifier.scopusauthorid | Shah, R=35365088300 | en_HK |
dc.identifier.scopusauthorid | Tam, SL=14042926200 | en_HK |
dc.identifier.scopusauthorid | Vitter, JS=7005508549 | en_HK |
dc.identifier.citeulike | 9193876 | - |
dc.identifier.issnl | 0304-3975 | - |