Motivation: Clustering of antigen-specific T cell receptor repertoire (TCRR) sequences is challenging. The recently published tool GLIPH aims to solve this problem. However, clustering large repertoires takes several days to weeks, making its use impractical in larger studies. In addition, the methodology used in GLIPH suffers from several shortcomings, including non-determinism, potential loss of significant antigen-specific sequences, and inclusion of too many unspecific sequences.

Results: We present an algorithm for clustering TCRR sequences that scales efficiently to large repertoires. We clustered 26 real datasets with up to 62,000 unique CDR3β sequences using both GLIPH and an implementation of our method called ting. While GLIPH required multiple weeks, ting needed only about one hour for the same task. In addition, we found that in naïve repertoires, where no or very few antigen-specific CDR3 sequences or clusters should exist, our method indeed selects fewer sequences.

Availability: Our method has been implemented in Python as a tool called ting, using numpy and NetworkX. It is available on GitHub (https://github.com/FelixMoelder/ting) and on PyPI under the MIT license.

### Competing Interest Statement

The authors have declared no competing interest.