Article: Evaluating the effective numbers of independent tests and significant p-value thresholds in commercial genotyping arrays and public imputation reference datasets

Title: Evaluating the effective numbers of independent tests and significant p-value thresholds in commercial genotyping arrays and public imputation reference datasets
Authors: Li, MX; Yeung, JMY; Cherny, SS; Sham, PC
Keywords: Biomedicine; Human Genetics; Molecular Medicine; Internal Medicine; Metabolic Diseases
Issue Date: 2012
Publisher: Springer Verlag. The Journal's web site is located at http://link.springer.de/link/service/journals/00439/index.htm
Citation: Human Genetics, 2012, v. 131 n. 5, p. 747-756
Abstract: Current genome-wide association studies (GWAS) use commercial genotyping microarrays that can assay over a million single nucleotide polymorphisms (SNPs). The number of SNPs is further boosted by advanced statistical genotype-imputation algorithms and large SNP databases for reference human populations. The testing of a huge number of SNPs needs to be taken into account in the interpretation of statistical significance in such genome-wide studies, but this is complicated by the non-independence of SNPs because of linkage disequilibrium (LD). Several previous groups have proposed the use of the effective number of independent markers (Me) for the adjustment of multiple testing, but current methods of calculation for Me are limited in accuracy or computational speed. Here, we report a more robust and fast method to calculate Me. Applying this efficient method [implemented in a free software tool named Genetic type 1 error calculator (GEC)], we systematically examined the Me, and the corresponding p-value thresholds required to control the genome-wide type 1 error rate at 0.05, for 13 Illumina or Affymetrix genotyping arrays, as well as for HapMap Project and 1000 Genomes Project datasets which are widely used in genotype imputation as reference panels. Our results suggested the use of a p-value threshold of ∼10⁻⁷ as the criterion for genome-wide significance for early commercial genotyping arrays, but slightly more stringent p-value thresholds ∼5 × 10⁻⁸ for current or merged commercial genotyping arrays, ∼10⁻⁸ for all common SNPs in the 1000 Genomes Project dataset and ∼5 × 10⁻⁸ for the common SNPs only within genes. © The Author(s) 2011.
Persistent Identifier: http://hdl.handle.net/10722/147134
ISSN: 0340-6717
2021 Impact Factor: 5.881
2020 SCImago Journal Rankings: 2.351
ISI Accession Number ID: WOS:000302816700010
Funding Agency | Grant Number
- | HKU7672/06 M
- | 201007176166
European Community | HEALTH-F2-2010-241909
University of Hong Kong Strategic Research Theme on Genomics | -
Funding Information:

We thank the HapMap Project for the LD data and the 1000 Genomes Project for the genotype data used in this project. This work was funded by grant HKU 7672/06 M, the Small Project Funding HKU 201007176166, the European Community's Seventh Framework Programme under grant agreement No. HEALTH-F2-2010-241909 and The University of Hong Kong Strategic Research Theme on Genomics.
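
Illustrative note: the abstract ties each p-value threshold to an effective number of independent tests (Me) at a genome-wide type 1 error rate of 0.05. The sketch below shows that arithmetic only. It assumes a simple Bonferroni-style division (and a Šidák variant for comparison), which may differ from the exact procedure implemented in GEC. The function names and the Me values in the usage loop are hypothetical; the Me values are back-calculated from the approximate thresholds quoted in the abstract and are not figures reported by the article.

```python
import math


def bonferroni_threshold(m_e: float, alpha: float = 0.05) -> float:
    """Per-test p-value threshold alpha / Me (Bonferroni-style; an assumption)."""
    if m_e <= 0:
        raise ValueError("m_e must be positive")
    return alpha / m_e


def sidak_threshold(m_e: float, alpha: float = 0.05) -> float:
    """Per-test threshold 1 - (1 - alpha)**(1 / Me), computed stably."""
    if m_e <= 0:
        raise ValueError("m_e must be positive")
    # expm1/log1p avoid precision loss when the result is very close to zero.
    return -math.expm1(math.log1p(-alpha) / m_e)


def implied_effective_tests(threshold: float, alpha: float = 0.05) -> float:
    """Me implied by a given per-test threshold under the Bonferroni form."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return alpha / threshold


if __name__ == "__main__":
    # Hypothetical Me values, chosen so that alpha / Me matches the
    # approximate thresholds mentioned in the abstract.
    for m_e in (500_000, 1_000_000, 5_000_000):
        print(
            f"Me={m_e:>9,d}  "
            f"Bonferroni={bonferroni_threshold(m_e):.2e}  "
            f"Sidak={sidak_threshold(m_e):.2e}"
        )
```

Under this reading, a threshold of about 5 × 10⁻⁸ corresponds to roughly one million effective independent tests, and a threshold of about 10⁻⁸ to roughly five million.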


DC Field | Value | Language
dc.contributor.author | Li, MX | en_HK
dc.contributor.author | Yeung, JMY | en_HK
dc.contributor.author | Cherny, SS | en_HK
dc.contributor.author | Sham, PC | en_HK
dc.date.accessioned | 2012-05-28T08:20:27Z | -
dc.date.available | 2012-05-28T08:20:27Z | -
dc.date.issued | 2012 | en_HK
dc.identifier.citation | Human Genetics, 2012, v. 131 n. 5, p. 747-756 | en_HK
dc.identifier.issn | 0340-6717 | en_HK
dc.identifier.uri | http://hdl.handle.net/10722/147134 | -
dc.description.abstract | Current genome-wide association studies (GWAS) use commercial genotyping microarrays that can assay over a million single nucleotide polymorphisms (SNPs). The number of SNPs is further boosted by advanced statistical genotype-imputation algorithms and large SNP databases for reference human populations. The testing of a huge number of SNPs needs to be taken into account in the interpretation of statistical significance in such genome-wide studies, but this is complicated by the non-independence of SNPs because of linkage disequilibrium (LD). Several previous groups have proposed the use of the effective number of independent markers (Me) for the adjustment of multiple testing, but current methods of calculation for Me are limited in accuracy or computational speed. Here, we report a more robust and fast method to calculate Me. Applying this efficient method [implemented in a free software tool named Genetic type 1 error calculator (GEC)], we systematically examined the Me, and the corresponding p-value thresholds required to control the genome-wide type 1 error rate at 0.05, for 13 Illumina or Affymetrix genotyping arrays, as well as for HapMap Project and 1000 Genomes Project datasets which are widely used in genotype imputation as reference panels. Our results suggested the use of a p-value threshold of ∼10⁻⁷ as the criterion for genome-wide significance for early commercial genotyping arrays, but slightly more stringent p-value thresholds ∼5 × 10⁻⁸ for current or merged commercial genotyping arrays, ∼10⁻⁸ for all common SNPs in the 1000 Genomes Project dataset and ∼5 × 10⁻⁸ for the common SNPs only within genes. © The Author(s) 2011. | en_HK
dc.language | eng | en_US
dc.publisher | Springer Verlag. The Journal's web site is located at http://link.springer.de/link/service/journals/00439/index.htm | en_HK
dc.relation.ispartof | Human Genetics | en_HK
dc.rights | The Author(s) | en_US
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | en_US
dc.subject | Biomedicine | en_US
dc.subject | Human Genetics | en_US
dc.subject | Molecular Medicine | en_US
dc.subject | Internal Medicine | en_US
dc.subject | Metabolic Diseases | en_US
dc.title | Evaluating the effective numbers of independent tests and significant p-value thresholds in commercial genotyping arrays and public imputation reference datasets | en_HK
dc.type | Article | en_HK
dc.identifier.openurl | http://www.springerlink.com/link-out/?id=2104&code=XU848J7775R2R755&MUD=MP | en_US
dc.identifier.email | Cherny, SS: cherny@hku.hk | en_HK
dc.identifier.email | Sham, PC: pcsham@hku.hk | en_HK
dc.identifier.authority | Cherny, SS=rp00232 | en_HK
dc.identifier.authority | Sham, PC=rp00459 | en_HK
dc.description.nature | published_or_final_version | en_US
dc.identifier.doi | 10.1007/s00439-011-1118-2 | en_HK
dc.identifier.scopus | eid_2-s2.0-84862260334 | en_HK
dc.identifier.hkuros | 202751 | -
dc.relation.references | http://www.scopus.com/mlt/select.url?eid=2-s2.0-84862260334&selection=ref&src=s&origin=recordpage | en_HK
dc.identifier.volume | 131 | en_HK
dc.identifier.issue | 5 | en_HK
dc.identifier.spage | 747 | en_HK
dc.identifier.epage | 756 | en_HK
dc.identifier.eissn | 1432-1203 | en_US
dc.identifier.isi | WOS:000302816700010 | -
dc.publisher.place | Germany | en_HK
dc.description.other | Springer Open Choice, 28 May 2012 | en_US
dc.relation.project | Development of a bioinformatics tool to optimize the experimental design of targeted next-generation sequencing studies | -
dc.identifier.scopusauthorid | Li, MX=35205389900 | en_HK
dc.identifier.scopusauthorid | Yeung, JMY=36818580500 | en_HK
dc.identifier.scopusauthorid | Cherny, SS=7004670001 | en_HK
dc.identifier.scopusauthorid | Sham, PC=34573429300 | en_HK
dc.identifier.citeulike | 10118284 | -
dc.identifier.issnl | 0340-6717 | -
