Rxivist combines preprints from bioRxiv with data from Twitter to help you find the papers being discussed in your field. Currently indexing 66,889 bioRxiv papers from 294,495 authors.
Efficient genotype compression and analysis of large genetic variation datasets
The economy of human genome sequencing has catalyzed ambitious efforts to interrogate the genomes of large cohorts in search of new insight into the genetic basis of disease. This manuscript introduces Genotype Query Tools (GQT) as a new indexing strategy and toolset that addresses an analytical bottleneck by enabling interactive analyses based on genotypes, phenotypes and sample relationships. Speed improvements are achieved by operating directly on a compressed genotype index without decompression. GQT's data compression ratios increase favorably with cohort size, and relative analysis performance improves in kind. We demonstrate substantial performance improvements over state-of-the-art tools using datasets from the 1000 Genomes Project (46-fold), the Exome Aggregation Consortium (443-fold), and simulated datasets of up to 100,000 genomes (218-fold). Furthermore, we show that this indexing strategy facilitates population and statistical genetics measures such as principal component analysis and burden tests. Based on its computational efficiency and by complementing existing toolsets, GQT provides a flexible framework for current and future analyses of massive genome datasets.
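The idea of querying a compressed index without decompressing it can be illustrated with a small sketch. The code below is not GQT's implementation. It assumes, purely for illustration, a per-sample bitmap over variants that is stored with plain run-length encoding. The helper names (`rle_encode`, `rle_and`, `rle_count`) and the toy data are hypothetical. The sketch shows that a logical AND across samples can be computed run by run, so the cost depends on the number of runs rather than on the number of variants.

```python
from itertools import groupby


def rle_encode(bits):
    """Compress a sequence of 0/1 values into (bit, run_length) pairs."""
    return [(bit, len(list(group))) for bit, group in groupby(bits)]


def rle_and(a, b):
    """AND two run-length-encoded bitmaps of equal total length.

    The inputs are walked run by run; neither is expanded back into
    individual bits.
    """
    out = []
    i = j = 0
    rem_a = a[0][1] if a else 0  # bits left in the current run of a
    rem_b = b[0][1] if b else 0  # bits left in the current run of b
    while i < len(a) and j < len(b):
        step = min(rem_a, rem_b)
        bit = a[i][0] & b[j][0]
        # Merge with the previous output run when the bit value repeats.
        if out and out[-1][0] == bit:
            out[-1] = (bit, out[-1][1] + step)
        else:
            out.append((bit, step))
        rem_a -= step
        rem_b -= step
        if rem_a == 0:
            i += 1
            if i < len(a):
                rem_a = a[i][1]
        if rem_b == 0:
            j += 1
            if j < len(b):
                rem_b = b[j][1]
    return out


def rle_count(runs):
    """Count set bits directly from the runs."""
    return sum(length for bit, length in runs if bit)


# Hypothetical toy data: one bitmap per sample, one position per variant;
# 1 means the sample carries a non-reference allele at that variant.
carriers = {
    "sample_A": [0, 0, 0, 1, 1, 1, 0, 0],
    "sample_B": [0, 0, 1, 1, 1, 0, 0, 0],
    "sample_C": [0, 0, 0, 1, 1, 1, 1, 0],
}
index = {name: rle_encode(bits) for name, bits in carriers.items()}

# Variants carried by all three samples, computed on the compressed form.
shared = index["sample_A"]
for name in ("sample_B", "sample_C"):
    shared = rle_and(shared, index[name])

shared_count = rle_count(shared)
```

Long runs of identical genotypes compress into a single pair here, so the encoded bitmaps hold fewer entries and each query touches fewer of them. That is one plausible reading of why compression and query speed could improve together in a scheme of this kind.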
- Downloaded 3,135 times
- Download rankings, all-time:
  - Site-wide: 1,080 out of 66,916
  - In genomics: 259 out of 4,551
- Year to date:
  - Site-wide: 23,996 out of 66,916
- Since beginning of last month:
  - Site-wide: 15,499 out of 66,916