
postgraduate thesis: Accurate bioinformatic software for metagenomic sequencing

Title: Accurate bioinformatic software for metagenomic sequencing
Authors: Lui, Wui Wang [雷匯宏]
Advisor(s): Luo, R; Lam, TW
Issue Date: 2022
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation: Lui, W. W. [雷匯宏]. (2022). Accurate bioinformatic software for metagenomic sequencing. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract: Microbial identification and accurate antimicrobial resistance (AMR) detection are critical for acute infectious disease treatment. Understanding the composition of pathogens in patient samples allows that composition to be potentially associated with detected antimicrobial resistance genes (ARGs). Variants in ARGs are critical to characterizing AMR phenotypes. Oxford Nanopore Technology (ONT) provides long-read, real-time sequencing for rapid metagenomic screening. Long metagenomic reads provide ultra-discovery power even with limited sequencing throughput. However, performance has been held back by a high per-base error rate. To achieve high accuracy and sensitivity in rapid AMR and pathogen detection, we developed an alignment-based, ONT-specific metagenomic analysis tool called MegaPath-Nano. Our tool performs 1) multi-level filtering against contamination reads and noisy alignments, 2) alignment-based taxonomic classification using RefSeq, with an alignment-reassignment algorithm to tackle the challenge of non-unique alignments, and 3) downstream drug-level AMR detection with five major AMR databases. Compared with existing k-mer and statistical-model compositional analysis methods, alignment-based MegaPath-Nano has proved accurate in taxonomic classification and taxon abundance estimation on both standard reference datasets and antimicrobial-susceptibility-tested patient isolate datasets. The AMR detection of MegaPath-Nano was also found to be the most comprehensive and accurate of the state-of-the-art tools. To avoid information loss and the resulting errors when calling variants in high-depth or high-coverage-deviation ONT data from targeted or metagenomic sequencing, the variant caller Clair-ensemble was used to support accurate variant calling at multiple depths or at extremely high depth in targeted sequencing data. Its multi-model calling function enables precise variant calling.
To evaluate its performance and practicality, Clair-ensemble was tested on both TB clinical isolate and reference DNA samples. Clair-ensemble achieved >99% recall and accuracy for SNV calling.
Degree: Master of Philosophy
Subject: Bioinformatics
Dept/Program: Computer Science
Persistent Identifier: http://hdl.handle.net/10722/318384

 

DC Field | Value | Language
dc.contributor.advisor | Luo, R | -
dc.contributor.advisor | Lam, TW | -
dc.contributor.author | Lui, Wui Wang | -
dc.contributor.author | 雷匯宏 | -
dc.date.accessioned | 2022-10-10T08:18:51Z | -
dc.date.available | 2022-10-10T08:18:51Z | -
dc.date.issued | 2022 | -
dc.identifier.citation | Lui, W. W. [雷匯宏]. (2022). Accurate bioinformatic software for metagenomic sequencing. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | -
dc.identifier.uri | http://hdl.handle.net/10722/318384 | -
dc.description.abstract | Microbial identification and accurate antimicrobial resistance (AMR) detection are critical for acute infectious disease treatment. Understanding the composition of pathogens in patient samples allows that composition to be potentially associated with detected antimicrobial resistance genes (ARGs). Variants in ARGs are critical to characterizing AMR phenotypes. Oxford Nanopore Technology (ONT) provides long-read, real-time sequencing for rapid metagenomic screening. Long metagenomic reads provide ultra-discovery power even with limited sequencing throughput. However, performance has been held back by a high per-base error rate. To achieve high accuracy and sensitivity in rapid AMR and pathogen detection, we developed an alignment-based, ONT-specific metagenomic analysis tool called MegaPath-Nano. Our tool performs 1) multi-level filtering against contamination reads and noisy alignments, 2) alignment-based taxonomic classification using RefSeq, with an alignment-reassignment algorithm to tackle the challenge of non-unique alignments, and 3) downstream drug-level AMR detection with five major AMR databases. Compared with existing k-mer and statistical-model compositional analysis methods, alignment-based MegaPath-Nano has proved accurate in taxonomic classification and taxon abundance estimation on both standard reference datasets and antimicrobial-susceptibility-tested patient isolate datasets. The AMR detection of MegaPath-Nano was also found to be the most comprehensive and accurate of the state-of-the-art tools. To avoid information loss and the resulting errors when calling variants in high-depth or high-coverage-deviation ONT data from targeted or metagenomic sequencing, the variant caller Clair-ensemble was used to support accurate variant calling at multiple depths or at extremely high depth in targeted sequencing data. Its multi-model calling function enables precise variant calling. To evaluate its performance and practicality, Clair-ensemble was tested on both TB clinical isolate and reference DNA samples. Clair-ensemble achieved >99% recall and accuracy for SNV calling. | -
dc.language | eng | -
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | -
dc.relation.ispartof | HKU Theses Online (HKUTO) | -
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | -
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | -
dc.subject.lcsh | Bioinformatics | -
dc.title | Accurate bioinformatic software for metagenomic sequencing | -
dc.type | PG_Thesis | -
dc.description.thesisname | Master of Philosophy | -
dc.description.thesislevel | Master | -
dc.description.thesisdiscipline | Computer Science | -
dc.description.nature | published_or_final_version | -
dc.date.hkucongregation | 2022 | -
dc.identifier.mmsid | 991044600196903414 | -
