Genotyping of structural variation using PacBio high-fidelity sequencing
By Zhiliang Zhang, Jijin Zhang, Lipeng Kang, Xuebing Qiu, Beirui Niu, Aoyue Bi, Xuebo Zhao, Daxing Xu, Jing Wang, Changbin Yin, Xiangdong Fu, Fei Lu
Posted 31 Oct 2021
bioRxiv DOI: 10.1101/2021.10.28.466362
Background: Structural variations (SVs) pervade the genome and contribute substantially to the phenotypic diversity of species. However, most SVs have been poorly assayed because of the complexity of plant genomes and the limitations of sequencing technologies. Recent advances in third-generation sequencing, particularly PacBio high-fidelity (HiFi) sequencing, which generates reads that are both long and highly accurate, offer an unprecedented opportunity to characterize SVs and reveal their function. Because HiFi sequencing is new, it is crucial to evaluate how well HiFi reads perform in SV detection before applying the technology at scale.

Results: We sequenced wheat genomes using HiFi, then conducted a comprehensive evaluation of SV detection using mainstream long-read aligners and SV callers. The accuracy of SV discovery depended more on the aligner than on the caller. Among aligners, pbmm2 gave the most accurate results for detecting deletions and NGMLR for detecting insertions. Among SV callers, cuteSV and SVIM achieved the best performance. We demonstrated that combining these aligners and callers is optimal for SV detection. Furthermore, we evaluated the impact of sequencing depth on the accuracy of SV detection. The results showed that low-coverage HiFi sequencing can produce high-quality SV genotypes.

Conclusions: This study provides a robust benchmark of SV discovery with HiFi reads, showing the remarkable potential of long-read sequencing for investigating structural variations in plant genomes. The high accuracy of SV discovery from low-coverage HiFi sequencing indicates that skim HiFi sequencing is an ideal approach for studying structural variations at the population level.
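The abstract does not say how accuracy was scored, so the following is only a minimal sketch of one common way to benchmark an SV call set against a truth set: a call counts as a true positive when it matches a truth SV in chromosome and type, lies within a breakpoint distance, and has a similar length. The `SV` record, the `is_match` and `score` helpers, the thresholds (`max_dist=500`, `min_size_ratio=0.7`), and the toy coordinates are all illustrative assumptions, not the authors' method or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SV:
    """Hypothetical minimal SV record used only for this illustration."""
    chrom: str
    pos: int
    svtype: str  # e.g. "DEL" or "INS"
    length: int  # absolute SV length in bp


def is_match(call, truth, max_dist=500, min_size_ratio=0.7):
    """Return True if a call and a truth SV agree (assumed thresholds)."""
    if call.chrom != truth.chrom or call.svtype != truth.svtype:
        return False
    if abs(call.pos - truth.pos) > max_dist:
        return False
    small, large = sorted((call.length, truth.length))
    return large > 0 and small / large >= min_size_ratio


def score(calls, truths, **kwargs):
    """Greedy one-to-one matching, then precision, recall and F1."""
    unmatched = list(truths)
    tp = 0
    for call in calls:
        for truth in unmatched:
            if is_match(call, truth, **kwargs):
                unmatched.remove(truth)  # each truth SV is matched at most once
                tp += 1
                break
    fp = len(calls) - tp
    fn = len(unmatched)
    precision = tp / (tp + fp) if calls else 0.0
    recall = tp / (tp + fn) if truths else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f1": f1}


# Toy data (made up): three truth SVs and three calls.
truths = [
    SV("chr1A", 1000, "DEL", 300),
    SV("chr1A", 5000, "INS", 120),
    SV("chr2B", 800, "DEL", 1000),
]
calls = [
    SV("chr1A", 1010, "DEL", 290),  # matches the first truth SV
    SV("chr1A", 5200, "INS", 115),  # matches the second truth SV
    SV("chr3D", 40, "DEL", 60),     # no counterpart: false positive
]

result = score(calls, truths)
print(result)
```

Running the same scoring on call sets produced from each aligner and caller pair, or from reads downsampled to different depths, would give comparable per-configuration numbers. Dedicated benchmarking tools apply more careful matching rules than this greedy sketch.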
Download data
- Downloaded 443 times
- Download rankings, all-time:
- Site-wide: 95,529
- In bioinformatics: 9,405
- Year to date:
- Site-wide: 17,435
- Since beginning of last month:
- Site-wide: 21,445