The impact of super-spreaders in COVID-19: mapping genome variation worldwide

By Alberto Gómez-Carballa, Xabier Bello, Jacobo Pardo-Seco, Federico Martinón-Torres, Antonio Salas

Posted 19 May 2020
bioRxiv DOI: 10.1101/2020.05.19.097410

The human pathogen severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is responsible for the major pandemic of the 21st century. We analyzed >4,700 SARS-CoV-2 genomes and associated metadata retrieved from public repositories. SARS-CoV-2 genomes share a high sequence identity (>99.9%), which drops to >96% when they are compared with bat coronavirus. We built a mutation-annotated reference SARS-CoV-2 phylogeny with two main macro-haplogroups, A and B, both of Asian origin, and >160 sub-branches representing virus strains of variable geographical origins worldwide. The phylogeny reveals a uniform mutation occurrence along branches that could complicate the design of future vaccines. The root of the SARS-CoV-2 genomes is located at the Chinese haplogroup B1, with a time to the most recent common ancestor (TMRCA) dating to 12 November 2019, which matches epidemiological records. Sub-haplogroup A2a originates in China and represents the major non-Asian outbreak. Multiple founder effects, most likely associated with super-spreader hosts, explain the COVID-19 pandemic to a large extent.

Competing Interest Statement: The authors have declared no competing interest.

Download data

  • Downloaded 6,125 times
  • Download rankings, all-time:
    • Site-wide: 530 out of 89,147
    • In genomics: 110 out of 5,693
  • Year to date:
    • Site-wide: 150 out of 89,147
  • Since beginning of last month:
    • Site-wide: 93 out of 89,147


