Robust Benchmark Structural Variant Calls of An Asian Using the State-of-Art Long Fragment Sequencing Technologies
By Xiao Du, Lili Li, Fan Liang, Sanyang Liu, Wenxin Zhang, Shuai Sun, Yuhui Sun, Fei Fan, Linying Wang, Xinming Liang, Weijin Qiu, Guangyi Fan, Ou Wang, Weifei Yang, Jiezhong Zhang, Yuhui Xiao, Yang Wang, Depeng Wang, Shoufang Qu, Fang Chen, Jie Huang
Posted 12 Aug 2020
bioRxiv DOI: 10.1101/2020.08.10.245308
The importance of structural variants (SVs) for phenotypes and human disease is now recognized. Although a variety of SV detection platforms and strategies that vary in sensitivity and specificity have been developed, few benchmarking procedures are available to confidently assess their performance in biological and clinical research. To facilitate the validation and application of those approaches, our work established an Asian reference material comprising identified benchmark regions and high-confidence SV calls. We established a high-confidence SV callset with 8,938 SVs in an EBV-immortalized B lymphocyte line by integrating four alignment-based SV callers [from 109x PacBio continuous long read (CLR), 22x PacBio circular consensus sequencing (CCS) reads, 104x Oxford Nanopore long reads, and 114x optical mapping platform (Bionano)] and one de novo assembly-based SV caller using CCS reads. A total of 544 randomly selected SVs were validated by PCR and Sanger sequencing, confirming the robustness of our SV calls. By incorporating trio-binning-based haplotype assemblies, we established an SV benchmark for identifying false negatives and false positives by constructing continuous high-confidence regions (CHCRs), which cover 1.46 Gb and 6,882 SVs supported by at least one diploid haplotype assembly. Establishing high-confidence SV calls for a benchmark sample that has been characterized by multiple technologies provides a valuable resource for investigating SVs in human biology, disease, and clinical diagnosis.

### Competing Interest Statement
The authors have declared no competing interest.
Download data
- Downloaded 699 times
- Download rankings, all-time:
  - Site-wide: 55,072
  - In genomics: 4,020
- Year to date:
  - Site-wide: 64,375
- Since beginning of last month:
  - Site-wide: 49,274