postgraduate thesis: Statistical methods for Mendelian randomization and causal mediation analysis
Title | Statistical methods for Mendelian randomization and causal mediation analysis |
---|---|
Authors | Xu, Siqi (徐思琦) |
Advisors | Fung, TWK; Liu, Z |
Issue Date | 2023 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Xu, S. [徐思琦]. (2023). Statistical methods for Mendelian randomization and causal mediation analysis. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | This thesis focuses on two topics in causal inference: Mendelian randomization (MR) and causal mediation analysis.
MR uses genetic variants as instrumental variables (IVs) to estimate the causal effect of an exposure variable on an outcome of interest, even in the presence of unmeasured confounders. However, estimation and inference for this causal effect can be biased by horizontal pleiotropy in the human genome, a phenomenon in which genetic variants affect the outcome directly rather than through the exposure variable. In this thesis, a novel MR approach named MRCIP is proposed to account simultaneously for two common types of horizontal pleiotropy, namely correlated pleiotropy and idiosyncratic pleiotropy. Correlated pleiotropy is modeled by a random-effect model, and idiosyncratic pleiotropy is handled by a weighting method based on transformed Pearson residuals. The improved performance of MRCIP is demonstrated by extensive simulation studies and by the analysis of two real datasets, one on the causal effect of triglycerides on coronary artery disease and one on the causal effect of low-density lipoprotein cholesterol on Alzheimer’s disease.
In addition to horizontal pleiotropy, MR estimators can also be biased by weak IVs, which arise when the genetic variants are only weakly associated with the exposure variable. The popular inverse-variance weighted (IVW) estimator has been found to suffer from substantial bias in the presence of weak IVs. In this thesis, a penalized IVW (pIVW) estimator is proposed to handle the weak-IV issue. It adjusts the original IVW estimator through a penalized log-likelihood function, which prevents the denominator of the ratio estimator from being close to zero in the presence of weak IVs and thus improves estimation. The theoretical results show that, under some regularity conditions, the pIVW estimator has smaller bias and variance than the recently proposed debiased IVW (dIVW) estimator. This is further verified by extensive simulation studies and by real data analysis of the causal effects of five obesity-related exposures on three COVID-19 outcomes.
Causal mediation analysis decomposes the causal effect of an exposure on an outcome of interest into an indirect effect through a mediator and a direct effect not through the mediator. Tchetgen Tchetgen and Shpitser (2012) proposed multiply-robust estimators for the natural direct and indirect effects based on the efficient influence functions of the potential outcomes, but these estimators require plug-in estimates for some nuisance functions in the efficient influence functions. As a result, the causal effect estimators might be biased when some of the nuisance functions are not correctly specified. This thesis presents a new method called DeepMed, which reduces bias in estimating natural direct and indirect effects by using deep neural networks to cross-fit the nuisance functions in the efficient influence functions. The improved performance of DeepMed over competing machine learning methods is shown by extensive simulation studies and by the analysis of two real datasets, one on the algorithmic fairness of recidivism risk prediction with respect to race and one on income fairness with respect to gender. |
Degree | Doctor of Philosophy |
Subject | Genetics - Statistical methods; Mediation (Statistics) |
Dept/Program | Statistics and Actuarial Science |
Persistent Identifier | http://hdl.handle.net/10722/335054 |
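
The weak-IV problem described in the abstract can be illustrated with a minimal sketch. It uses the textbook summary-statistics forms of the IVW estimator and of the debiased IVW (dIVW) estimator, not the thesis's pIVW estimator, whose penalized log-likelihood adjustment is not given in this record. The function names, the simulated data, and all parameter values (`beta`, `p`, `se`, `seed`) are assumptions made for illustration, and the simulation assumes no horizontal pleiotropy.

```python
import numpy as np


def ivw(gamma_hat, Gamma_hat, se_Gamma):
    """Textbook IVW estimator from per-variant summary statistics.

    gamma_hat: variant-exposure associations
    Gamma_hat: variant-outcome associations
    se_Gamma:  standard errors of Gamma_hat
    """
    w = 1.0 / se_Gamma**2  # inverse-variance weights
    return np.sum(w * gamma_hat * Gamma_hat) / np.sum(w * gamma_hat**2)


def divw(gamma_hat, Gamma_hat, se_gamma, se_Gamma):
    """Debiased IVW: subtract the noise variance of gamma_hat in the denominator.

    With very weak instruments this denominator can be close to zero,
    which is the instability the abstract says pIVW is designed to avoid.
    """
    w = 1.0 / se_Gamma**2
    denom = np.sum(w * (gamma_hat**2 - se_gamma**2))
    return np.sum(w * gamma_hat * Gamma_hat) / denom


def simulate(beta=0.5, p=5000, se=0.03, seed=0):
    """Hypothetical weak-IV data: true effects are on the same scale as the noise."""
    rng = np.random.default_rng(seed)
    gamma = rng.normal(0.0, se, size=p)                     # true variant-exposure effects
    gamma_hat = gamma + rng.normal(0.0, se, size=p)         # noisy exposure estimates
    Gamma_hat = beta * gamma + rng.normal(0.0, se, size=p)  # noisy outcome estimates
    se_vec = np.full(p, se)
    return gamma_hat, Gamma_hat, se_vec, se_vec


gamma_hat, Gamma_hat, se_gamma, se_Gamma = simulate()
print("IVW :", ivw(gamma_hat, Gamma_hat, se_Gamma))
print("dIVW:", divw(gamma_hat, Gamma_hat, se_gamma, se_Gamma))
```

In this setup the IVW estimate is pulled toward zero because the estimation noise in `gamma_hat` inflates its denominator, while the dIVW estimate stays closer to the assumed true effect of 0.5.
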
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Fung, TWK | - |
dc.contributor.advisor | Liu, Z | - |
dc.contributor.author | Xu, Siqi | - |
dc.contributor.author | 徐思琦 | - |
dc.date.accessioned | 2023-10-24T08:58:42Z | - |
dc.date.available | 2023-10-24T08:58:42Z | - |
dc.date.issued | 2023 | - |
dc.identifier.citation | Xu, S. [徐思琦]. (2023). Statistical methods for Mendelian randomization and causal mediation analysis. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/335054 | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Genetics - Statistical methods | - |
dc.subject.lcsh | Mediation (Statistics) | - |
dc.title | Statistical methods for Mendelian randomization and causal mediation analysis | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Statistics and Actuarial Science | - |
dc.description.nature | published_or_final_version | - |
dc.date.hkucongregation | 2023 | - |
dc.identifier.mmsid | 991044731386603414 | - |
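
The decomposition that the abstract attributes to causal mediation analysis can be written out in standard potential-outcome notation. The notation is an assumption, since the record itself gives no formulas: the exposure is taken to be binary, $M(a)$ is the mediator under exposure level $a$, and $Y(a, m)$ is the outcome under exposure $a$ and mediator value $m$.

```latex
\[
\underbrace{E[Y(1, M(1))] - E[Y(0, M(0))]}_{\text{total effect}}
=
\underbrace{E[Y(1, M(0))] - E[Y(0, M(0))]}_{\text{natural direct effect}}
+
\underbrace{E[Y(1, M(1))] - E[Y(1, M(0))]}_{\text{natural indirect effect}}
\]
```

The two terms on the right sum to the total effect because the cross-world quantity $E[Y(1, M(0))]$ cancels.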