
Article: A comparison of principal component methods between multiple phenotype regression and multiple SNP regression in genetic association studies

Title: A comparison of principal component methods between multiple phenotype regression and multiple SNP regression in genetic association studies
Authors: Liu, Z; Barnett, I; Lin, X
Keywords: Dimension reduction; Eigen-values; Hypothesis testing; Minimum p-value test; Multiple phenotypes
Issue Date: 2020
Publisher: Institute of Mathematical Statistics. The Journal's web site is located at http://www.imstat.org/aoas/
Citation: The Annals of Applied Statistics, 2020, v. 14 n. 1, p. 433-451
Abstract: Principal component analysis (PCA) is a popular method for dimension reduction in unsupervised multivariate analysis. However, existing ad hoc uses of PCA in both multivariate regression (multiple outcomes) and multiple regression (multiple predictors) lack theoretical justification. The differences in the statistical properties of PCAs in these two regression settings are not well understood. In this paper we provide theoretical results on the power of PCA in genetic association testings in both multiple phenotype and SNP-set settings. The multiple phenotype setting refers to the case when one is interested in studying the association between a single SNP and multiple phenotypes as outcomes. The SNP-set setting refers to the case when one is interested in studying the association between multiple SNPs in a SNP set and a single phenotype as the outcome. We demonstrate analytically that the properties of the PC-based analysis in these two regression settings are substantially different. We show that the lower order PCs, that is, PCs with large eigenvalues, are generally preferred and lead to a higher power in the SNP-set setting, while the higher-order PCs, that is, PCs with small eigenvalues, are generally preferred in the multiple phenotype setting. We also investigate the power of three other popular statistical methods, the Wald test, the variance component test and the minimum p-value test, in both multiple phenotype and SNP-set settings. We use theoretical power, simulation studies, and two real data analyses to validate our findings.
Persistent Identifier: http://hdl.handle.net/10722/284608
ISSN: 1932-6157
2021 Impact Factor: 1.959
2020 SCImago Journal Rankings: 1.674
ISI Accession Number ID: WOS:000527373000020

 

DC Field: Value
dc.contributor.author: Liu, Z
dc.contributor.author: Barnett, I
dc.contributor.author: Lin, X
dc.date.accessioned: 2020-08-07T09:00:05Z
dc.date.available: 2020-08-07T09:00:05Z
dc.date.issued: 2020
dc.identifier.citation: The Annals of Applied Statistics, 2020, v. 14 n. 1, p. 433-451
dc.identifier.issn: 1932-6157
dc.identifier.uri: http://hdl.handle.net/10722/284608
dc.description.abstract: Principal component analysis (PCA) is a popular method for dimension reduction in unsupervised multivariate analysis. However, existing ad hoc uses of PCA in both multivariate regression (multiple outcomes) and multiple regression (multiple predictors) lack theoretical justification. The differences in the statistical properties of PCAs in these two regression settings are not well understood. In this paper we provide theoretical results on the power of PCA in genetic association testings in both multiple phenotype and SNP-set settings. The multiple phenotype setting refers to the case when one is interested in studying the association between a single SNP and multiple phenotypes as outcomes. The SNP-set setting refers to the case when one is interested in studying the association between multiple SNPs in a SNP set and a single phenotype as the outcome. We demonstrate analytically that the properties of the PC-based analysis in these two regression settings are substantially different. We show that the lower order PCs, that is, PCs with large eigenvalues, are generally preferred and lead to a higher power in the SNP-set setting, while the higher-order PCs, that is, PCs with small eigenvalues, are generally preferred in the multiple phenotype setting. We also investigate the power of three other popular statistical methods, the Wald test, the variance component test and the minimum p-value test, in both multiple phenotype and SNP-set settings. We use theoretical power, simulation studies, and two real data analyses to validate our findings.
dc.language: eng
dc.publisher: Institute of Mathematical Statistics. The Journal's web site is located at http://www.imstat.org/aoas/
dc.relation.ispartof: The Annals of Applied Statistics
dc.subject: Dimension reduction
dc.subject: Eigen-values
dc.subject: Hypothesis testing
dc.subject: Minimum p-value test
dc.subject: Multiple phenotypes
dc.title: A comparison of principal component methods between multiple phenotype regression and multiple SNP regression in genetic association studies
dc.type: Article
dc.identifier.email: Liu, Z: zhhliu@hku.hk
dc.identifier.authority: Liu, Z=rp02429
dc.description.nature: published_or_final_version
dc.identifier.doi: 10.1214/19-AOAS1312
dc.identifier.scopus: eid_2-s2.0-85083693939
dc.identifier.hkuros: 312168
dc.identifier.volume: 14
dc.identifier.issue: 1
dc.identifier.spage: 433
dc.identifier.epage: 451
dc.identifier.isi: WOS:000527373000020
dc.publisher.place: United States
dc.identifier.issnl: 1932-6157
