
Article: FastPval: A fast and memory efficient program to calculate very low P-values from empirical distribution

Title: FastPval: A fast and memory efficient program to calculate very low P-values from empirical distribution
Authors: Li, MJ; Sham, PC; Wang, J
Issue Date: 2010
Publisher: Oxford University Press. The Journal's web site is located at http://bioinformatics.oxfordjournals.org/
Citation: Bioinformatics, 2010, v. 26 n. 22, p. 2897-2899
Abstract:
Motivation: Resampling methods, such as permutation and bootstrap, have been widely used to generate an empirical distribution for assessing the statistical significance of a measurement. However, to obtain a very low P-value, a large size of resampling is required, where computing speed, memory and storage consumption become bottlenecks, and sometimes become impossible, even on a computer cluster.
Results: We have developed a multiple stage P-value calculating program called FastPval that can efficiently calculate very low (up to 10^-9) P-values from a large number of resampled measurements. With only two input files and a few parameter settings from the users, the program can compute P-values from empirical distribution very efficiently, even on a personal computer. When tested on the order of 10^9 resampled data, our method only uses 52.94% of the time used by the conventional method, implemented by standard quicksort and binary search algorithms, and consumes only 0.11% of the memory and storage. Furthermore, our method can be applied to extra large datasets that the conventional method fails to calculate. The accuracy of the method was tested on data generated from Normal, Poisson and Gumbel distributions and was found to be no different from the exact ranking approach. © The Author(s) 2010. Published by Oxford University Press.
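For orientation, the conventional baseline that the abstract compares against (sort the resampled values once, then binary-search each observed value) might be sketched as follows. This is an illustration only, not FastPval's multiple stage algorithm: the function name `empirical_p_value`, the upper-tail definition of the P-value, and the toy data are assumptions, and Python's built-in `sorted` stands in for the quicksort named in the abstract.

```python
from bisect import bisect_left


def empirical_p_value(sorted_null, observed):
    """Upper-tail empirical P-value (assumed definition):
    the fraction of resampled values that are >= the observed value.
    `sorted_null` must already be sorted in ascending order."""
    n = len(sorted_null)
    # bisect_left returns the index of the first element >= observed,
    # so everything from that index onward counts as "at least as extreme".
    n_at_least = n - bisect_left(sorted_null, observed)
    return n_at_least / n


# Toy resampled measurements (hypothetical data), sorted once up front.
null = sorted([0.1, 0.5, 0.9, 1.3, 2.0, 2.4, 3.1, 3.8])

print(empirical_p_value(null, 3.0))  # 2 of 8 values are >= 3.0 -> 0.25
```

With N resampled values, the smallest nonzero P-value this approach can report is 1/N, which is why P-values near 10^-9 need on the order of 10^9 resampled values, and why holding and sorting them all becomes the memory and storage bottleneck the abstract describes.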
Persistent Identifier: http://hdl.handle.net/10722/137123
ISSN: 1367-4803
2021 Impact Factor: 6.931
2020 SCImago Journal Rankings: 3.599
PubMed Central ID: PMC2971576
ISI Accession Number ID: WOS:000283919800014
Funding Agency | Grant Number
CRCG | -
Genomic SRT of the University of Hong Kong | -
Research Grants Council of Hong Kong | GRF 778609M
Research Grants Council of Hong Kong | AoE M-04/04
Funding Information:

Internal funds from the CRCG and the Genomic SRT of the University of Hong Kong; GRF 778609M and AoE M-04/04 from the Research Grants Council of Hong Kong.


DC Field | Value | Language
dc.contributor.author | Li, MJ | en_HK
dc.contributor.author | Sham, PC | en_HK
dc.contributor.author | Wang, J | en_HK
dc.date.accessioned | 2011-08-22T08:34:37Z | -
dc.date.available | 2011-08-22T08:34:37Z | -
dc.date.issued | 2010 | en_HK
dc.identifier.citation | Bioinformatics, 2010, v. 26 n. 22, p. 2897-2899 | en_HK
dc.identifier.issn | 1367-4803 | en_HK
dc.identifier.uri | http://hdl.handle.net/10722/137123 | -
dc.description.abstract | Motivation: Resampling methods, such as permutation and bootstrap, have been widely used to generate an empirical distribution for assessing the statistical significance of a measurement. However, to obtain a very low P-value, a large size of resampling is required, where computing speed, memory and storage consumption become bottlenecks, and sometimes become impossible, even on a computer cluster. Results: We have developed a multiple stage P-value calculating program called FastPval that can efficiently calculate very low (up to 10^-9) P-values from a large number of resampled measurements. With only two input files and a few parameter settings from the users, the program can compute P-values from empirical distribution very efficiently, even on a personal computer. When tested on the order of 10^9 resampled data, our method only uses 52.94% of the time used by the conventional method, implemented by standard quicksort and binary search algorithms, and consumes only 0.11% of the memory and storage. Furthermore, our method can be applied to extra large datasets that the conventional method fails to calculate. The accuracy of the method was tested on data generated from Normal, Poisson and Gumbel distributions and was found to be no different from the exact ranking approach. © The Author(s) 2010. Published by Oxford University Press. | en_HK
dc.language | eng | -
dc.publisher | Oxford University Press. The Journal's web site is located at http://bioinformatics.oxfordjournals.org/ | en_HK
dc.relation.ispartof | Bioinformatics | en_HK
dc.subject.mesh | Computational Biology - methods | -
dc.subject.mesh | Databases, Factual | -
dc.subject.mesh | Models, Statistical | -
dc.subject.mesh | Software | -
dc.title | FastPval: A fast and memory efficient program to calculate very low P-values from empirical distribution | en_HK
dc.type | Article | en_HK
dc.identifier.openurl | http://library.hku.hk:4550/resserv?sid=HKU:IR&issn=1367-4803&volume=26&issue=22&spage=2897&epage=2899&date=2010&atitle=FastPval:+a+fast+and+memory+efficient+program+to+calculate+very+low+P-values+from+empirical+distribution | -
dc.identifier.email | Sham, PC: pcsham@hku.hk | en_HK
dc.identifier.email | Wang, J: junwen@hku.hk | en_HK
dc.identifier.authority | Sham, PC=rp00459 | en_HK
dc.identifier.authority | Wang, J=rp00280 | en_HK
dc.description.nature | published_or_final_version | en_US
dc.identifier.doi | 10.1093/bioinformatics/btq540 | en_HK
dc.identifier.pmid | 20861029 | -
dc.identifier.pmcid | PMC2971576 | -
dc.identifier.scopus | eid_2-s2.0-78149251209 | en_HK
dc.identifier.hkuros | 189643 | -
dc.identifier.hkuros | 192075 | en_US
dc.relation.references | http://www.scopus.com/mlt/select.url?eid=2-s2.0-78149251209&selection=ref&src=s&origin=recordpage | en_HK
dc.identifier.volume | 26 | en_HK
dc.identifier.issue | 22 | en_HK
dc.identifier.spage | 2897 | en_HK
dc.identifier.epage | 2899 | en_HK
dc.identifier.eissn | 1460-2059 | -
dc.identifier.isi | WOS:000283919800014 | -
dc.publisher.place | United Kingdom | en_HK
dc.identifier.scopusauthorid | Li, MJ=37016520600 | en_HK
dc.identifier.scopusauthorid | Sham, PC=34573429300 | en_HK
dc.identifier.scopusauthorid | Wang, J=8950599500 | en_HK
dc.identifier.citeulike | 7911914 | -
