postgraduate thesis: Statistical analysis of parent-of-origin effects on the X chromosome and DNA methylation on age prediction
Title | Statistical analysis of parent-of-origin effects on the X chromosome and DNA methylation on age prediction |
---|---|
Authors | Lau, Pui Yin (劉沛彥) |
Advisors | Fung, TWK |
Issue Date | 2019 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Lau, P. Y. [劉沛彥]. (2019). Statistical analysis of parent-of-origin effects on the X chromosome and DNA methylation on age prediction. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | Parent-of-origin effects, in which the expression of a gene depends on its parental origin, are an important phenomenon in epigenetics. Statistical methods for detecting parent-of-origin effects on autosomes have been investigated for 20 years, but the development of statistical methods for detecting parent-of-origin effects on the X chromosome is relatively new. In the literature, the Q-XPAT-type tests are the only tests for parent-of-origin effects for quantitative traits on the X chromosome. In this thesis, two simple and powerful classes of tests are proposed to detect parent-of-origin effects for quantitative trait values on the X chromosome. The proposed tests can accommodate complete and incomplete nuclear families with any number of daughters. The simulation study shows that the proposed tests produce empirical type I error rates that are close to their respective nominal levels, as well as powers that are larger than those of the Q-XPAT-type tests. The proposed tests are applied to a real data set on Turner's syndrome, where they give a more significant finding than the Q-C-XPAT.
In forensic investigation, retrieving biological information from DNA evidence is a promising field of interest. One application is estimating the age of the donor from DNA methylation. A large number of studies have focused on age prediction using the 450K Human Methylation Beadchip, and various marker selection methods and prediction models have been considered. However, there is a lack of research evaluating different high-dimensional variable selection methods for CpG sites in combination with various statistical and machine learning models for age prediction. The aim of this study is to evaluate five variable selection methods (forward selection, LASSO, elastic net, SCAD and ISIS) combined with a classical statistical model and machine learning models, using the mean absolute deviation (MAD) and the root-mean-square error (RMSE) as performance measures. A publicly available 450K data set containing 991 whole blood samples (aged 19 to 101 years) was used. The results showed that the multiple linear regression model with 16 markers selected by forward selection performed very well in age prediction (MAD = 3.76 years and RMSE = 5.01 years). The highly advanced ultrahigh-dimensional variable selection methods and sophisticated machine learning algorithms appeared unnecessary for age prediction based on DNA methylation. |
Degree | Master of Philosophy |
Subject | X chromosome; DNA - Methylation - Statistical methods |
Dept/Program | Statistics and Actuarial Science |
Persistent Identifier | http://hdl.handle.net/10722/286594 |
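
The abstract reports prediction accuracy as MAD and RMSE. As a minimal sketch of how these two measures could be computed, the example below assumes that MAD means the mean absolute difference between predicted and chronological age, which the record itself does not spell out. The function names and the four ages are hypothetical illustrations, not data or code from the thesis.

```python
import math


def mean_absolute_deviation(actual, predicted):
    """Mean of |predicted - actual| over all samples (assumed meaning of MAD)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_square_error(actual, predicted):
    """Square root of the mean squared prediction error."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical chronological and predicted ages in years (not thesis data).
actual_ages = [25.0, 40.0, 63.0, 80.0]
predicted_ages = [28.0, 36.0, 65.0, 75.0]

mad = mean_absolute_deviation(actual_ages, predicted_ages)
rmse = root_mean_square_error(actual_ages, predicted_ages)
print(f"MAD = {mad:.2f} years, RMSE = {rmse:.2f} years")
```

Because RMSE squares each error before averaging, it weights large individual errors more heavily and is never smaller than MAD on the same predictions, which is consistent with the reported RMSE (5.01 years) exceeding the reported MAD (3.76 years).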
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Fung, TWK | - |
dc.contributor.author | Lau, Pui Yin | - |
dc.contributor.author | 劉沛彥 | - |
dc.date.accessioned | 2020-09-02T05:47:32Z | - |
dc.date.available | 2020-09-02T05:47:32Z | - |
dc.date.issued | 2019 | - |
dc.identifier.citation | Lau, P. Y. [劉沛彥]. (2019). Statistical analysis of parent-of-origin effects on the X chromosome and DNA methylation on age prediction. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/286594 | - |
dc.description.abstract | Parent-of-origin effects, in which the expression of a gene depends on its parental origin, are an important phenomenon in epigenetics. Statistical methods for detecting parent-of-origin effects on autosomes have been investigated for 20 years, but the development of statistical methods for detecting parent-of-origin effects on the X chromosome is relatively new. In the literature, the Q-XPAT-type tests are the only tests for parent-of-origin effects for quantitative traits on the X chromosome. In this thesis, two simple and powerful classes of tests are proposed to detect parent-of-origin effects for quantitative trait values on the X chromosome. The proposed tests can accommodate complete and incomplete nuclear families with any number of daughters. The simulation study shows that the proposed tests produce empirical type I error rates that are close to their respective nominal levels, as well as powers that are larger than those of the Q-XPAT-type tests. The proposed tests are applied to a real data set on Turner's syndrome, where they give a more significant finding than the Q-C-XPAT. In forensic investigation, retrieving biological information from DNA evidence is a promising field of interest. One application is estimating the age of the donor from DNA methylation. A large number of studies have focused on age prediction using the 450K Human Methylation Beadchip, and various marker selection methods and prediction models have been considered. However, there is a lack of research evaluating different high-dimensional variable selection methods for CpG sites in combination with various statistical and machine learning models for age prediction.
The aim of this study is to evaluate five variable selection methods (forward selection, LASSO, elastic net, SCAD and ISIS) combined with a classical statistical model and machine learning models, using the mean absolute deviation (MAD) and the root-mean-square error (RMSE) as performance measures. A publicly available 450K data set containing 991 whole blood samples (aged 19 to 101 years) was used. The results showed that the multiple linear regression model with 16 markers selected by forward selection performed very well in age prediction (MAD = 3.76 years and RMSE = 5.01 years). The highly advanced ultrahigh-dimensional variable selection methods and sophisticated machine learning algorithms appeared unnecessary for age prediction based on DNA methylation. | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights, (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | X chromosome | - |
dc.subject.lcsh | DNA - Methylation - Statistical methods | - |
dc.title | Statistical analysis of parent-of-origin effects on the X chromosome and DNA methylation on age prediction | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Master of Philosophy | - |
dc.description.thesislevel | Master | - |
dc.description.thesisdiscipline | Statistics and Actuarial Science | - |
dc.description.nature | published_or_final_version | - |
dc.date.hkucongregation | 2019 | - |
dc.identifier.mmsid | 991044158738403414 | - |