Ultrafast and scalable variant annotation and prioritization with big functional genomics data

Huang, Dandan; Yi, Xianfu; Zhou, Yao; Yao, Hongcheng; Xu, Hang; Wang, Jianhua; Zhang, Shijie; Nong, Wenyan; Wang, Panwen; Shi, Lei; Xuan, Chenghao; Li, Miaoxin; Wang, Junwen; Li, Weidong; Kwan, Hoi Shan; Sham, Pak Chung; Wang, Kai; Li, Mulin Jun

Publisher Website: 10.1101/gr.267997.120
Scopus: eid_2-s2.0-85097112939
PMID: 33060171
WOS: WOS:000596027700001

Citations:
- Scopus: 0
- Web of Science: 0
- PubMed Central: 0
Appears in Collections:
- Faculty of Dentistry: Journal/Magazine Articles

Article: Ultrafast and scalable variant annotation and prioritization with big functional genomics data

Title	Ultrafast and scalable variant annotation and prioritization with big functional genomics data
Authors	Huang, Dandan; Yi, Xianfu; Zhou, Yao; Yao, Hongcheng; Xu, Hang; Wang, Jianhua; Zhang, Shijie; Nong, Wenyan; Wang, Panwen; Shi, Lei; Xuan, Chenghao; Li, Miaoxin; Wang, Junwen; Li, Weidong; Kwan, Hoi Shan; Sham, Pak Chung; Wang, Kai; Li, Mulin Jun
Issue Date	2020
Citation	Genome Research, 2020, v. 30, n. 12, p. 1789-1801. DOI: http://dx.doi.org/10.1101/gr.267997.120
Abstract	The advances of large-scale genomics studies have enabled compilation of cell type–specific, genome-wide DNA functional elements at high resolution. With the growing volume of functional annotation data and sequencing variants, existing variant annotation algorithms lack the efficiency and scalability to process big genomic data, particularly when annotating whole-genome sequencing variants against a huge database with billions of genomic features. Here, we develop VarNote to rapidly annotate genome-scale variants in large and complex functional annotation resources. Equipped with a novel index system and a parallel random-sweep searching algorithm, VarNote shows substantial performance improvements (two to three orders of magnitude) over existing algorithms at different scales. It supports both region-based and allele-specific annotations and introduces advanced functions for the flexible extraction of annotations. By integrating massive base-wise and context-dependent annotations in the VarNote framework, we introduce three efficient and accurate pipelines to prioritize the causal regulatory variants for common diseases, Mendelian disorders, and cancers.
Persistent Identifier	http://hdl.handle.net/10722/324509
ISSN	1088-9051 (2023 Impact Factor: 6.2; 2023 SCImago Journal Rankings: 4.403)
PubMed Central ID	PMC7706736
ISI Accession Number ID	WOS:000596027700001

DC Field	Value	Language
dc.contributor.author	Huang, Dandan	-
dc.contributor.author	Yi, Xianfu	-
dc.contributor.author	Zhou, Yao	-
dc.contributor.author	Yao, Hongcheng	-
dc.contributor.author	Xu, Hang	-
dc.contributor.author	Wang, Jianhua	-
dc.contributor.author	Zhang, Shijie	-
dc.contributor.author	Nong, Wenyan	-
dc.contributor.author	Wang, Panwen	-
dc.contributor.author	Shi, Lei	-
dc.contributor.author	Xuan, Chenghao	-
dc.contributor.author	Li, Miaoxin	-
dc.contributor.author	Wang, Junwen	-
dc.contributor.author	Li, Weidong	-
dc.contributor.author	Kwan, Hoi Shan	-
dc.contributor.author	Sham, Pak Chung	-
dc.contributor.author	Wang, Kai	-
dc.contributor.author	Li, Mulin Jun	-
dc.date.accessioned	2023-02-03T07:03:35Z	-
dc.date.available	2023-02-03T07:03:35Z	-
dc.date.issued	2020	-
dc.identifier.citation	Genome Research, 2020, v. 30, n. 12, p. 1789-1801	-
dc.identifier.issn	1088-9051	-
dc.identifier.uri	http://hdl.handle.net/10722/324509	-
dc.description.abstract	The advances of large-scale genomics studies have enabled compilation of cell type–specific, genome-wide DNA functional elements at high resolution. With the growing volume of functional annotation data and sequencing variants, existing variant annotation algorithms lack the efficiency and scalability to process big genomic data, particularly when annotating whole-genome sequencing variants against a huge database with billions of genomic features. Here, we develop VarNote to rapidly annotate genome-scale variants in large and complex functional annotation resources. Equipped with a novel index system and a parallel random-sweep searching algorithm, VarNote shows substantial performance improvements (two to three orders of magnitude) over existing algorithms at different scales. It supports both region-based and allele-specific annotations and introduces advanced functions for the flexible extraction of annotations. By integrating massive base-wise and context-dependent annotations in the VarNote framework, we introduce three efficient and accurate pipelines to prioritize the causal regulatory variants for common diseases, Mendelian disorders, and cancers.	-
dc.language	eng	-
dc.relation.ispartof	Genome Research	-
dc.rights	This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.	-
dc.title	Ultrafast and scalable variant annotation and prioritization with big functional genomics data	-
dc.type	Article	-
dc.description.nature	published_or_final_version	-
dc.identifier.doi	10.1101/gr.267997.120	-
dc.identifier.pmid	33060171	-
dc.identifier.pmcid	PMC7706736	-
dc.identifier.scopus	eid_2-s2.0-85097112939	-
dc.identifier.volume	30	-
dc.identifier.issue	12	-
dc.identifier.spage	1789	-
dc.identifier.epage	1801	-
dc.identifier.eissn	1549-5469	-
dc.identifier.isi	WOS:000596027700001	-
