HAMAP rules as SPARQL - A portable annotation pipeline for genomes and proteomes

By Jerven Bolleman, Edouard de Castro, Delphine Baratin, Sebastien Gehant, Beatrice A. Cuche, Andrea H. Auchincloss, Elisabeth Coudert, Chantal Hulo, Patrick Masson, Ivo Pedruzzi, Catherine Rivoire, Ioannis Xenarios, Nicole Redaschi, Alan Bridge

Posted 24 Apr 2019
bioRxiv DOI: 10.1101/615294 (published DOI: 10.1093/gigascience/giaa003)

Motivation: Genome and proteome annotation pipelines are generally custom built and therefore not easily reusable by other groups, which leads to duplication of effort, increased costs, and suboptimal results. One cost-effective way to increase the data quality in public databases is to encourage the adoption of annotation standards and technological solutions that enable the sharing of biological knowledge and tools for genome and proteome annotation.

Results: We have translated the rules of our HAMAP proteome annotation pipeline to queries in the W3C standard SPARQL 1.1 syntax and applied them with two off-the-shelf SPARQL engines to UniProtKB/Swiss-Prot protein sequences described in RDF format. This approach is applicable to any genome or proteome annotation pipeline and greatly simplifies its reuse.

Availability: HAMAP SPARQL rules and documentation are freely available for download from the HAMAP FTP site ftp://ftp.expasy.org/databases/hamap/hamap_sparql.tar.gz under a CC-BY-ND 4.0 license. The annotations generated by the rules are under the CC-BY 4.0 license.
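
To make the idea of expressing an annotation rule as a SPARQL 1.1 query more concrete, the following is a minimal sketch in Python that renders one rule as a CONSTRUCT query string. It is not taken from the HAMAP distribution: the rule contents, the `ex:` vocabulary, the example.org URIs, and the `rule_to_sparql` helper are all hypothetical and chosen only for illustration. The sketch assumes a rule can be reduced to "if a protein matches a signature and its organism falls under a taxon, attach an annotation".

```python
from string import Template

# Hypothetical rule. None of these identifiers come from HAMAP;
# they only illustrate the shape of a condition/annotation pair.
RULE = {
    "id": "EXAMPLE_RULE_1",
    "signature": "http://example.org/signature/SIG0001",
    "taxon": "http://example.org/taxon/Bacteria",
    "annotation": "Example protein name",
}

# SPARQL 1.1 CONSTRUCT template. The ex: vocabulary is an assumption
# made for this sketch, not the vocabulary used by UniProt or HAMAP.
QUERY_TEMPLATE = Template("""\
PREFIX ex: <http://example.org/vocab/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
CONSTRUCT {
  ?protein ex:recommendedName "$annotation" ;
           ex:annotatedBy "$rule_id" .
}
WHERE {
  ?protein ex:matchesSignature <$signature> ;
           ex:organism ?organism .
  ?organism rdfs:subClassOf* <$taxon> .
}
""")


def escape_literal(text):
    """Escape backslashes and double quotes for a SPARQL string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def rule_to_sparql(rule):
    """Render a hypothetical rule dict as a SPARQL CONSTRUCT query string."""
    return QUERY_TEMPLATE.substitute(
        rule_id=escape_literal(rule["id"]),
        signature=rule["signature"],
        taxon=rule["taxon"],
        annotation=escape_literal(rule["annotation"]),
    )


query = rule_to_sparql(RULE)
print(query)
```

Because the result is plain SPARQL text, such a query could in principle be submitted to any standards-compliant SPARQL engine holding the protein data as RDF, which is the portability argument the abstract makes.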

Download data

  • Downloaded 334 times
  • Download rankings, all-time:
    • Site-wide: 45,903 out of 85,229
    • In bioinformatics: 5,345 out of 8,149
  • Year to date:
    • Site-wide: 43,721 out of 85,229
  • Since beginning of last month:
    • Site-wide: 74,701 out of 85,229


