Article: Copy number variation analysis based on AluScan sequences

Title: Copy number variation analysis based on AluScan sequences
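Authors: Yang, YF; Ding, XF; Chen, L; Mat, WK; Xu, MZ; Chen, JF; Wang, JM; Xu, L; Poon, WS; Kwong, A; Leung, GKK; Tan, TC; Yu, CH; Ke, YB; Xu, XY; Ke, XY; Ma, RC; Chan, JC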
Keywords: AluScan sequencing
Cancer classification
CNV calling
Machine learning
Issue Date: 2014
Citation: Journal of Clinical Bioinformatics, 2014, v. 4 n. 1, article no. 15
Abstract:
BACKGROUND: AluScan combines inter-Alu PCR, using multiple Alu-based primers with opposite orientations, with next-generation sequencing to capture a large number of Alu-proximal genomic sequences for investigation. Because it requires only sub-microgram quantities of DNA, it facilitates the examination of large numbers of samples. However, the special features of AluScan data make it difficult to call copy number variation (CNV) directly with the calling algorithms designed for whole genome sequencing (WGS) or exome sequencing.
RESULTS: In this study, an AluScanCNV package was assembled for efficient CNV calling from AluScan sequencing data. It employs a Geary-Hinkley transformation (GHT) of read-depth ratios, either between paired test-control samples or between test samples and a reference template constructed from reference samples, to call localized CNVs, followed by a GISTIC-like algorithm to identify recurrent CNVs and circular binary segmentation (CBS) to reveal large extended CNVs. To evaluate the utility of CNVs called from AluScan data, the AluScans from 23 non-cancer and 38 cancer genomes were analyzed. The glioma samples yielded the familiar extended copy-number losses on chromosomes 1p and 9. Also, the recurrent somatic CNVs identified from liver cancer samples were similar to those reported for liver cancer WGS with respect to a striking enrichment of copy-number gains in chromosomes 1q and 8q. When localized or recurrent CNV-features capable of distinguishing between liver and non-liver cancer samples were selected by correlation-based machine learning, a highly accurate separation of the liver and non-liver cancer classes was attained.
CONCLUSIONS: The results obtained from non-cancer and cancerous tissues indicated that the AluScanCNV package can be employed to call localized, recurrent and extended CNVs from AluScan sequences. Moreover, both the localized and recurrent CNVs identified by this method could be subjected to machine-learning selection to yield distinguishing CNV-features capable of separating liver cancers from other types of cancers. Since the method is applicable to any human DNA sample with or without a paired control, it can also be employed to analyze the constitutional CNVs of individuals.
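The Geary-Hinkley transformation mentioned in the abstract maps a ratio of two approximately normal quantities to an approximately standard-normal score. The following is a minimal sketch of that general formula applied to one window's read-depth ratio; it is not the AluScanCNV package's code. The function names, the Poisson assumption (variance equal to mean), the shared expected depth under the no-change hypothesis, and the cutoff of 1.96 are all illustrative assumptions.

```python
import math


def geary_hinkley_z(ratio, mean_test, mean_ctrl, var_test, var_ctrl, rho=0.0):
    """Geary-Hinkley transform of ratio = test / control.

    Returns a score that is approximately standard normal when test and
    control depths are approximately normal with the given means,
    variances and correlation rho.
    """
    sd_test = math.sqrt(var_test)
    sd_ctrl = math.sqrt(var_ctrl)
    numerator = mean_ctrl * ratio - mean_test
    denominator = math.sqrt(
        var_ctrl * ratio ** 2 - 2.0 * rho * sd_test * sd_ctrl * ratio + var_test
    )
    return numerator / denominator


def call_window(test_reads, ctrl_reads, expected_depth, z_cutoff=1.96):
    """Label one window as 'gain', 'loss' or 'neutral' (hypothetical helper).

    Assumptions: read counts are already normalized for library size,
    both samples share expected_depth when there is no copy-number change,
    and counts are Poisson-like, so variance equals mean.
    """
    if ctrl_reads <= 0:
        raise ValueError("control window has no reads; ratio is undefined")
    ratio = test_reads / ctrl_reads
    z = geary_hinkley_z(
        ratio,
        mean_test=expected_depth,
        mean_ctrl=expected_depth,
        var_test=expected_depth,
        var_ctrl=expected_depth,
    )
    if z >= z_cutoff:
        return "gain", z
    if z <= -z_cutoff:
        return "loss", z
    return "neutral", z


if __name__ == "__main__":
    for test, ctrl in [(150, 100), (100, 100), (50, 100)]:
        label, z = call_window(test, ctrl, expected_depth=100)
        print(test, ctrl, label, round(z, 3))
```

In this sketch a ratio of exactly 1 gives a score of 0, and the score grows with both the deviation of the ratio from 1 and the expected depth, so deeper windows need a smaller ratio change to be called.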
Persistent Identifier: http://hdl.handle.net/10722/207761
PubMed Central ID: PMC4273479

 

DC Field | Value | Language
dc.contributor.author | Yang, YF | en_US
dc.contributor.author | Ding, XF | en_US
dc.contributor.author | Chen, L | en_US
dc.contributor.author | Mat, WK | en_US
dc.contributor.author | Xu, MZ | en_US
dc.contributor.author | Chen, JF | en_US
dc.contributor.author | Wang, JM | en_US
dc.contributor.author | Xu, L | en_US
dc.contributor.author | Poon, WS | en_US
dc.contributor.author | Kwong, A | en_US
dc.contributor.author | Leung, GKK | en_US
dc.contributor.author | Tan, TC | en_US
dc.contributor.author | Yu, CH | en_US
dc.contributor.author | Ke, YB | en_US
dc.contributor.author | Xu, XY | en_US
dc.contributor.author | Ke, XY | en_US
dc.contributor.author | Ma, RC | en_US
dc.contributor.author | Chan, JC | en_US
dc.date.accessioned | 2015-01-19T10:16:33Z | -
dc.date.available | 2015-01-19T10:16:33Z | -
dc.date.issued | 2014 | en_US
dc.identifier.citation | Journal of Clinical Bioinformatics, 2014, v. 4 n. 1, article no. 15 | en_US
dc.identifier.uri | http://hdl.handle.net/10722/207761 | -
dc.language | eng | en_US
dc.relation.ispartof | Journal of Clinical Bioinformatics | en_US
dc.subject | AluScan sequencing | -
dc.subject | Cancer classification | -
dc.subject | CNV calling | -
dc.subject | Machine learning | -
dc.title | Copy number variation analysis based on AluScan sequences | en_US
dc.type | Article | en_US
dc.identifier.email | Kwong, A: avakwong@hkucc.hku.hk | en_US
dc.identifier.email | Leung, GKK: gilberto@hku.hk | en_US
dc.identifier.authority | Kwong, A=rp01734 | en_US
dc.identifier.authority | Leung, GKK=rp00522 | en_US
dc.description.nature | published_or_final_version | -
dc.identifier.doi | 10.1186/s13336-014-0015-z | en_US
dc.identifier.pmid | 25558350 | -
dc.identifier.pmcid | PMC4273479 | -
dc.identifier.scopus | eid_2-s2.0-84988838320 | -
dc.identifier.hkuros | 242127 | en_US
dc.identifier.volume | 4 | en_US
dc.identifier.issue | 1 | en_US
dc.identifier.spage | article no. 15 | en_US
dc.identifier.epage | article no. 15 | en_US
dc.identifier.eissn | 2043-9113 | -
dc.identifier.issnl | 2043-9113 | -
