postgraduate thesis: HLA typing from next generation sequencing data : strategies, methods, and applications
Title | HLA typing from next generation sequencing data : strategies, methods, and applications |
---|---|
Authors | Huang, Yazhi (黄亚志) |
Issue Date | 2015 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Huang, Y. [黄亚志]. (2015). HLA typing from next generation sequencing data : strategies, methods, and applications. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5736651. |
Abstract | HLA genes play a key role in the human immune system. As a result, HLA typing is very important for both clinical laboratories and biomedical research. However, HLA typing has always been challenging due to the complexity of this group of genes, including the existence of a large number of alleles for most HLA genes, major sequence differences between these alleles, sequence similarity among the paralogous HLA genes, and long-range linkage disequilibrium (LD) in this region.
With the development of next generation sequencing (NGS), large amounts of sequencing data are becoming widely available. Although most of these data were not generated for this purpose, they still provide valuable resources for HLA typing. NGS data might be useful in many respects, such as preliminary screening of potential organ donors, identification of individuals who are potentially susceptible to adverse drug responses, risk prediction for complex diseases, and population genetic studies.
However, due to the complexity of the HLA loci, this large amount of NGS data has not yet been made informative about HLA genotypes. Many efforts have been made to type HLA by mining NGS data, including the alignment-based method, which relies on counting the number of short reads aligned to each specific allele, and the assembly- and scoring-based method, which takes into account good-quality contigs and their scores for each candidate HLA allele. These methods capitalize on the increasing accessibility and affordability of NGS and have greatly reduced the time and cost required to make an HLA call compared with traditional PCR-based solutions. Unfortunately, these methods are only capable of achieving low-digit resolution and perform poorly at higher-digit resolution, which is required for clinical applications.
In this study, I introduce a novel approach for accurate HLA typing at high-digit resolution. The strategy is to compare sequence reads to a comprehensive reference panel containing all the known HLA alleles for high-efficiency mapping, followed by assembly of the mapped reads into contigs, stepwise matching and designation of the contigs to HLA alleles, and a decision on HLA allele calling. Testing of the method on a set of public and internal whole exome sequencing (WES) data demonstrated that this new method is capable of reporting HLA alleles at high-digit resolution with great accuracy.
I also conducted a preliminary analysis of WES data from a set of NGS samples generated in-house. The HLA calling results yielded allele frequencies consistent with those recorded in the Allele Frequency Net Database (AFND) for the same population. In addition, I used the typing results from the NGS samples to design population-specific primers and probes. Finally, two sets of samples, from individuals with Crohn’s disease and with tuberculosis, were studied using this method. The results showed that previous findings relating HLA to these diseases were effectively replicated in our dataset using the proposed method. Certain interesting results that had not yet been reported were also observed in our dataset. These preliminary results highlight the potential applications of this method for HLA calling from NGS data. |
Degree | Doctor of Philosophy |
Subject | HLA histocompatibility antigens; Nucleotide sequence |
Dept/Program | Paediatrics and Adolescent Medicine |
Persistent Identifier | http://hdl.handle.net/10722/235792 |
HKU Library Item ID | b5736651 |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Huang, Yazhi | - |
dc.contributor.author | 黄亚志 | - |
dc.date.accessioned | 2016-10-21T23:26:02Z | - |
dc.date.available | 2016-10-21T23:26:02Z | - |
dc.date.issued | 2015 | - |
dc.identifier.citation | Huang, Y. [黄亚志]. (2015). HLA typing from next generation sequencing data : strategies, methods, and applications. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5736651. | - |
dc.identifier.uri | http://hdl.handle.net/10722/235792 | - |
dc.description.abstract | HLA genes play a key role in the human immune system. As a result, HLA typing is very important for both clinical laboratories and biomedical research. However, HLA typing has always been challenging due to the complexity of this group of genes, including the existence of a large number of alleles for most HLA genes, major sequence differences between these alleles, sequence similarity among the paralogous HLA genes, and long-range linkage disequilibrium (LD) in this region. With the development of next generation sequencing (NGS), large amounts of sequencing data are becoming widely available. Although most of these data were not generated for this purpose, they still provide valuable resources for HLA typing. NGS data might be useful in many respects, such as preliminary screening of potential organ donors, identification of individuals who are potentially susceptible to adverse drug responses, risk prediction for complex diseases, and population genetic studies. However, due to the complexity of the HLA loci, this large amount of NGS data has not yet been made informative about HLA genotypes. Many efforts have been made to type HLA by mining NGS data, including the alignment-based method, which relies on counting the number of short reads aligned to each specific allele, and the assembly- and scoring-based method, which takes into account good-quality contigs and their scores for each candidate HLA allele. These methods capitalize on the increasing accessibility and affordability of NGS and have greatly reduced the time and cost required to make an HLA call compared with traditional PCR-based solutions. Unfortunately, these methods are only capable of achieving low-digit resolution and perform poorly at higher-digit resolution, which is required for clinical applications. In this study, I introduce a novel approach for accurate HLA typing at high-digit resolution. The strategy is to compare sequence reads to a comprehensive reference panel containing all the known HLA alleles for high-efficiency mapping, followed by assembly of the mapped reads into contigs, stepwise matching and designation of the contigs to HLA alleles, and a decision on HLA allele calling. Testing of the method on a set of public and internal whole exome sequencing (WES) data demonstrated that this new method is capable of reporting HLA alleles at high-digit resolution with great accuracy. I also conducted a preliminary analysis of WES data from a set of NGS samples generated in-house. The HLA calling results yielded allele frequencies consistent with those recorded in the Allele Frequency Net Database (AFND) for the same population. In addition, I used the typing results from the NGS samples to design population-specific primers and probes. Finally, two sets of samples, from individuals with Crohn’s disease and with tuberculosis, were studied using this method. The results showed that previous findings relating HLA to these diseases were effectively replicated in our dataset using the proposed method. Certain interesting results that had not yet been reported were also observed in our dataset. These preliminary results highlight the potential applications of this method for HLA calling from NGS data. | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | HLA histocompatibility antigens | - |
dc.subject.lcsh | Nucleotide sequence | - |
dc.title | HLA typing from next generation sequencing data : strategies, methods, and applications | - |
dc.type | PG_Thesis | - |
dc.identifier.hkul | b5736651 | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Paediatrics and Adolescent Medicine | - |
dc.description.nature | published_or_final_version | - |
dc.identifier.doi | 10.5353/th_b5736651 | - |
dc.identifier.mmsid | 991019345039703414 | - |
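
The abstract mentions an alignment-based strategy that counts the short reads aligned to each allele, and it distinguishes low-digit from high-digit resolution. The sketch below illustrates both ideas in Python. It is not the method developed in the thesis: the function names, the `min_fraction` threshold, and the toy read assignments are all hypothetical, and it assumes allele names follow the colon-separated field convention (for example `A*02:01:01:01`), where one field corresponds to what is commonly called 2-digit resolution and two fields to 4-digit resolution.

```python
from collections import Counter


def truncate_allele(allele: str, fields: int) -> str:
    """Keep only the first `fields` colon-separated fields of an allele name.

    Assumes a well-formed name such as "A*02:01:01:01".
    """
    gene, _, rest = allele.partition("*")
    return gene + "*" + ":".join(rest.split(":")[:fields])


def call_by_read_count(read_hits, fields=2, min_fraction=0.25):
    """Naive per-gene genotype call from per-read allele assignments.

    read_hits    : iterable of allele names, one entry per aligned read
    fields       : resolution at which reads are pooled before counting
    min_fraction : hypothetical cut-off; the second-ranked allele is reported
                   only if it has at least this fraction of the top count,
                   otherwise the sample is treated as homozygous
    """
    counts = Counter(truncate_allele(a, fields) for a in read_hits)

    # Group the pooled counts by gene (the part before "*").
    by_gene = {}
    for allele, n in counts.items():
        by_gene.setdefault(allele.split("*")[0], []).append((n, allele))

    calls = {}
    for gene, items in by_gene.items():
        # Highest count first; ties broken alphabetically for determinism.
        items.sort(key=lambda t: (-t[0], t[1]))
        top_n, top = items[0]
        if len(items) > 1 and items[1][0] >= min_fraction * top_n:
            calls[gene] = (top, items[1][1])
        else:
            calls[gene] = (top, top)
    return calls


# Toy input: each string stands for the allele one read aligned to.
hits = (
    ["A*02:01:01:01"] * 5
    + ["A*02:01:02"] * 3
    + ["A*24:02:01"] * 6
    + ["B*07:02:01"] * 4
)
calls = call_by_read_count(hits, fields=2)
```

Pooling at two fields merges the reads for `A*02:01:01:01` and `A*02:01:02` into a single `A*02:01` count, which is one way to see why simple read counting tends to be more reliable at low-digit resolution: alleles that differ only in later fields share most of their sequence, so individual reads rarely discriminate between them.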