Alignathon: A competitive assessment of whole genome alignment methods.
Robert S. Harris,
Brian J. Raney,
Aaron E. Darling,
Posted 10 Mar 2014
bioRxiv DOI: 10.1101/003285 (published DOI: 10.1101/gr.174920.114)
Posted 10 Mar 2014
Background: Multiple sequence alignments (MSAs) are a prerequisite for a wide variety of evolutionary analyses. Published assessments and benchmark datasets for protein and, to a lesser extent, global nucleotide MSAs are available, but less effort has been made to establish benchmarks in the more general problem of whole genome alignment (WGA). Results: Using the same model as the successful Assemblathon competitions, we organized a competitive evaluation in which teams submitted their alignments, and assessments were performed collectively after all the submissions were received. Three datasets were used: two of simulated primate and mammalian phylogenies, and one of 20 real fly genomes. In total 35 submissions were assessed, submitted by ten teams using 12 different alignment pipelines. Conclusions: We found agreement between independent simulation-based and statistical assessments, indicating that there are substantial accuracy differences between contemporary alignment tools. We saw considerable difference in the alignment quality of differently annotated regions, and found few tools aligned the duplications analysed. We found many tools worked well at shorter evolutionary distances, but fewer performed competitively at longer distances. We provide all datasets, submissions and assessment programs for further study, and provide, as a resource for future benchmarking, a convenient repository of code and data for reproducing the simulation assessments.
- Downloaded 3,303 times
- Download rankings, all-time:
- Site-wide: 3,379
- In bioinformatics: 350
- Year to date:
- Site-wide: 62,329
- Since beginning of last month:
- Site-wide: 94,046
Downloads over time
Distribution of downloads per paper, site-wide
- 27 Nov 2020: The website and API now include results pulled from medRxiv as well as bioRxiv.
- 18 Dec 2019: We're pleased to announce PanLingua, a new tool that enables you to search for machine-translated bioRxiv preprints using more than 100 different languages.
- 21 May 2019: PLOS Biology has published a community page about Rxivist.org and its design.
- 10 May 2019: The paper analyzing the Rxivist dataset has been published at eLife.
- 1 Mar 2019: We now have summary statistics about bioRxiv downloads and submissions.
- 8 Feb 2019: Data from Altmetric is now available on the Rxivist details page for every preprint. Look for the "donut" under the download metrics.
- 30 Jan 2019: preLights has featured the Rxivist preprint and written about our findings.
- 22 Jan 2019: Nature just published an article about Rxivist and our data.
- 13 Jan 2019: The Rxivist preprint is live!