postgraduate thesis: Accurate bioinformatic software for metagenomic sequencing
Title | Accurate bioinformatic software for metagenomic sequencing |
---|---|
Authors | Lui, Wui Wang (雷匯宏) |
Advisors | Luo, R; Lam, TW |
Issue Date | 2022 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Lui, W. W. [雷匯宏]. (2022). Accurate bioinformatic software for metagenomic sequencing. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | Microbial identification and accurate antimicrobial resistance (AMR) detection are critical for acute infectious disease treatment. Understanding the composition of pathogens in patient samples allows that composition to be associated with detected antimicrobial resistance genes (ARGs). Variants in antimicrobial resistance genes are critical in the characterization of AMR phenotypes. Oxford Nanopore Technologies (ONT) provides long-read, real-time sequencing for rapid metagenomic screening. Long metagenomic reads provide ultra-discovery power even with limited sequencing throughput. However, performance has been held back by a high per-base error rate. For high accuracy and sensitivity in rapid AMR and pathogen detection, we developed an alignment-based, ONT-specific metagenomic analysis tool called MegaPath-Nano. Our tool performs 1) multi-level filtering against contamination reads and noisy alignments, 2) alignment-based taxonomic classification using RefSeq, with an alignment-reassignment algorithm to tackle the challenge of non-unique alignments, and 3) downstream drug-level AMR detection with five major AMR databases. Compared with existing k-mer and statistical-model compositional analysis methods, alignment-based MegaPath-Nano has proved accurate in taxonomic classification and taxonomic abundance estimation on both standard reference datasets and antimicrobial-susceptibility-tested patient isolate datasets. The AMR detection of MegaPath-Nano was also found to be the most comprehensive and accurate among the state-of-the-art tools. To avoid information loss and the resulting errors when calling variants in high-depth or high-coverage-deviation ONT data from targeted or metagenomic sequencing, the variant caller Clair-ensemble was used to support accurate variant calling on targeted sequencing data with multiple depths or extremely high depth. The multi-model calling function enables precise variant calling. To evaluate its performance and practicality, Clair-ensemble was tested on both TB clinical isolate samples and reference DNA samples. Clair-ensemble achieved >99% recall and accuracy for SNV calling. |
Degree | Master of Philosophy |
Subject | Bioinformatics |
Dept/Program | Computer Science |
Persistent Identifier | http://hdl.handle.net/10722/318384 |
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Luo, R | - |
dc.contributor.advisor | Lam, TW | - |
dc.contributor.author | Lui, Wui Wang | - |
dc.contributor.author | 雷匯宏 | - |
dc.date.accessioned | 2022-10-10T08:18:51Z | - |
dc.date.available | 2022-10-10T08:18:51Z | - |
dc.date.issued | 2022 | - |
dc.identifier.citation | Lui, W. W. [雷匯宏]. (2022). Accurate bioinformatic software for metagenomic sequencing. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/318384 | - |
dc.description.abstract | Microbial identification and accurate antimicrobial resistance (AMR) detection are critical for acute infectious disease treatment. Understanding the composition of pathogens in patient samples allows that composition to be associated with detected antimicrobial resistance genes (ARGs). Variants in antimicrobial resistance genes are critical in the characterization of AMR phenotypes. Oxford Nanopore Technologies (ONT) provides long-read, real-time sequencing for rapid metagenomic screening. Long metagenomic reads provide ultra-discovery power even with limited sequencing throughput. However, performance has been held back by a high per-base error rate. For high accuracy and sensitivity in rapid AMR and pathogen detection, we developed an alignment-based, ONT-specific metagenomic analysis tool called MegaPath-Nano. Our tool performs 1) multi-level filtering against contamination reads and noisy alignments, 2) alignment-based taxonomic classification using RefSeq, with an alignment-reassignment algorithm to tackle the challenge of non-unique alignments, and 3) downstream drug-level AMR detection with five major AMR databases. Compared with existing k-mer and statistical-model compositional analysis methods, alignment-based MegaPath-Nano has proved accurate in taxonomic classification and taxonomic abundance estimation on both standard reference datasets and antimicrobial-susceptibility-tested patient isolate datasets. The AMR detection of MegaPath-Nano was also found to be the most comprehensive and accurate among the state-of-the-art tools. To avoid information loss and the resulting errors when calling variants in high-depth or high-coverage-deviation ONT data from targeted or metagenomic sequencing, the variant caller Clair-ensemble was used to support accurate variant calling on targeted sequencing data with multiple depths or extremely high depth. The multi-model calling function enables precise variant calling. To evaluate its performance and practicality, Clair-ensemble was tested on both TB clinical isolate samples and reference DNA samples. Clair-ensemble achieved >99% recall and accuracy for SNV calling. | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Bioinformatics | - |
dc.title | Accurate bioinformatic software for metagenomic sequencing | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Master of Philosophy | - |
dc.description.thesislevel | Master | - |
dc.description.thesisdiscipline | Computer Science | - |
dc.description.nature | published_or_final_version | - |
dc.date.hkucongregation | 2022 | - |
dc.identifier.mmsid | 991044600196903414 | - |