Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1504/IJDMB.2017.091354
- Scopus: eid_2-s2.0-85046281894
- WOS: WOS:000434131100001
Article: Accurate annotation of metagenomic data without species-level reference
Title | Accurate annotation of metagenomic data without species-level reference |
---|---|
Authors | Yao, H; Lam, TW; Ting, HF; Yiu, SM; Wang, YD; Liu, B |
Keywords | Accurate; Binning; Fast annotation; Metagenomic data analysis |
Issue Date | 2017 |
Publisher | Inderscience Publishers. The Journal's web site is located at http://www.inderscience.com/ijdmb |
Citation | International Journal of Data Mining and Bioinformatics, 2017, v. 19 n. 4, p. 283-297 |
Abstract | Taxonomic annotation is a critical first step in the analysis of metagenomic data. Although many tools have been developed, their accuracy is still unsatisfactory, particularly when no close species-level reference exists in the database. In this paper, we propose a novel annotation tool, MetaAnnotator, for annotating metagenomic reads, which significantly outperforms all existing tools when only genus-level references exist in the database. In our experiments, MetaAnnotator assigns 87.5% of reads correctly (67.5% of reads are assigned to the exact genus), with only 8.5% of reads wrongly assigned. The best existing tool (MetaCluster-TA) achieves only 73.4% correct read assignment (with only 50.9% of reads assigned to the exact genus and 22.6% of reads wrongly assigned). MetaAnnotator is also the second-fastest tool (1 hour for 20 million reads). The core concepts behind MetaAnnotator are: (i) only exact k-mers in coding regions of the references are considered, as they should be more significant and accurate; (ii) to assign reads to taxonomy nodes, genome- and taxonomy-specific probabilistic models are constructed from the reference database; and (iii) the BWT data structure is used to speed up the k-mer matching process. |
Persistent Identifier | http://hdl.handle.net/10722/260852 |
ISSN | 1748-5673 (2023 Impact Factor: 0.2; 2023 SCImago Journal Rankings: 0.173) |
ISI Accession Number ID | WOS:000434131100001 |
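
The abstract names BWT-based exact k-mer matching as one of the core ideas but gives no implementation detail. The following is a minimal sketch of how exact k-mer lookup against a reference can work with a BWT (FM-index style backward search). It is an illustration only, not MetaAnnotator's code: the function names (`build_fm_index`, `count_exact`, `kmers`), the toy reference string, and the naive suffix-array construction are all assumptions made for this example, and restricting the reference to coding regions is left out.

```python
from collections import Counter


def suffix_array(text):
    # Naive construction by sorting suffixes; acceptable only for toy inputs.
    return sorted(range(len(text)), key=lambda i: text[i:])


def build_fm_index(reference):
    """Build a small FM-index (hypothetical helper, not from the paper)."""
    text = reference + "$"  # "$" is a sentinel that sorts before A, C, G, T
    sa = suffix_array(text)
    # BWT: the character preceding each sorted suffix (index -1 wraps to "$").
    bwt = "".join(text[i - 1] for i in sa)
    # first[c]: number of characters in text that are strictly smaller than c.
    counts = Counter(text)
    first, total = {}, 0
    for c in sorted(counts):
        first[c] = total
        total += counts[c]
    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in counts}
    for i, ch in enumerate(bwt):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)
    return first, occ, len(bwt)


def count_exact(kmer, index):
    """Count exact occurrences of kmer in the reference via backward search."""
    first, occ, n = index
    lo, hi = 0, n  # current suffix-array interval [lo, hi)
    for c in reversed(kmer):
        if c not in first:
            return 0  # character never appears in the reference
        lo = first[c] + occ[c][lo]
        hi = first[c] + occ[c][hi]
        if lo >= hi:
            return 0  # interval is empty: no exact match
    return hi - lo


def kmers(read, k):
    # All overlapping substrings of length k in the read.
    return [read[i:i + k] for i in range(len(read) - k + 1)]


if __name__ == "__main__":
    index = build_fm_index("ACGTACGTGA")  # toy reference, assumed for illustration
    for kmer in kmers("ACGTG", 3):
        print(kmer, count_exact(kmer, index))
```

Each query costs time proportional to the k-mer length regardless of reference size, which is the usual reason for choosing a BWT index for this kind of matching. A production index would use a linear-time suffix-array construction and a sampled occurrence table rather than the full table built here.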
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Yao, H | - |
dc.contributor.author | Lam, TW | - |
dc.contributor.author | Ting, HF | - |
dc.contributor.author | Yiu, SM | - |
dc.contributor.author | Wang, YD | - |
dc.contributor.author | Liu, B | - |
dc.date.accessioned | 2018-09-14T08:48:31Z | - |
dc.date.available | 2018-09-14T08:48:31Z | - |
dc.date.issued | 2017 | - |
dc.identifier.citation | International Journal of Data Mining and Bioinformatics, 2017, v. 19 n. 4, p. 283-297 | - |
dc.identifier.issn | 1748-5673 | - |
dc.identifier.uri | http://hdl.handle.net/10722/260852 | - |
dc.description.abstract | Taxonomic annotation is a critical first step in the analysis of metagenomic data. Although many tools have been developed, their accuracy is still unsatisfactory, particularly when no close species-level reference exists in the database. In this paper, we propose a novel annotation tool, MetaAnnotator, for annotating metagenomic reads, which significantly outperforms all existing tools when only genus-level references exist in the database. In our experiments, MetaAnnotator assigns 87.5% of reads correctly (67.5% of reads are assigned to the exact genus), with only 8.5% of reads wrongly assigned. The best existing tool (MetaCluster-TA) achieves only 73.4% correct read assignment (with only 50.9% of reads assigned to the exact genus and 22.6% of reads wrongly assigned). MetaAnnotator is also the second-fastest tool (1 hour for 20 million reads). The core concepts behind MetaAnnotator are: (i) only exact k-mers in coding regions of the references are considered, as they should be more significant and accurate; (ii) to assign reads to taxonomy nodes, genome- and taxonomy-specific probabilistic models are constructed from the reference database; and (iii) the BWT data structure is used to speed up the k-mer matching process. | - |
dc.language | eng | - |
dc.publisher | Inderscience Publishers. The Journal's web site is located at http://www.inderscience.com/ijdmb | - |
dc.relation.ispartof | International Journal of Data Mining and Bioinformatics | - |
dc.rights | International Journal of Data Mining and Bioinformatics. Copyright © Inderscience Publishers. | - |
dc.subject | Accurate | - |
dc.subject | Binning | - |
dc.subject | Fast annotation | - |
dc.subject | Metagenomic data analysis | - |
dc.title | Accurate annotation of metagenomic data without species-level reference | - |
dc.type | Article | - |
dc.identifier.email | Lam, TW: twlam@cs.hku.hk | - |
dc.identifier.email | Ting, HF: hfting@cs.hku.hk | - |
dc.identifier.email | Yiu, SM: smyiu@cs.hku.hk | - |
dc.identifier.authority | Lam, TW=rp00135 | - |
dc.identifier.authority | Ting, HF=rp00177 | - |
dc.identifier.authority | Yiu, SM=rp00207 | - |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.doi | 10.1504/IJDMB.2017.091354 | - |
dc.identifier.scopus | eid_2-s2.0-85046281894 | - |
dc.identifier.hkuros | 290695 | - |
dc.identifier.volume | 19 | - |
dc.identifier.issue | 4 | - |
dc.identifier.spage | 283 | - |
dc.identifier.epage | 297 | - |
dc.identifier.isi | WOS:000434131100001 | - |
dc.publisher.place | United Kingdom | - |
dc.identifier.issnl | 1748-5673 | - |