
Rxivist combines preprints from bioRxiv with data from Twitter to help you find the papers being discussed in your field. Currently indexing 84,482 bioRxiv papers from 363,664 authors.

Most downloaded bioRxiv papers, all time

in category genomics

5,344 results found.

61: Pooled optical screens in human cells

Posted to bioRxiv 02 Aug 2018

8,335 downloads genomics

David Feldman, Avtar Singh, Jonathan L. Schmid-Burgk, Anja Mezger, Anthony J Garrity, Rebecca J Carlson, Feng Zhang, Paul C. Blainey

Large-scale genetic screens play a key role in the systematic discovery of genes underlying cellular phenotypes. Pooling of genetic perturbations greatly increases screening throughput, but has so far been limited to screens of enrichments defined by cell fitness and flow cytometry, or to comparatively low-throughput single cell gene expression profiles. Although microscopy is a rich source of spatial and temporal information about mammalian cells, high-content imaging screens have been restricted to much less efficient arrayed formats. Here, we introduce an optical method to link perturbations and their phenotypic outcomes at the single-cell level in a pooled setting. Barcoded perturbations are read out by targeted in situ sequencing following image-based phenotyping. We apply this technology to screen a focused set of 952 genes across >3 million cells for involvement in NF-κB activation by imaging the translocation of RelA (p65) to the nucleus, recovering 20 known pathway components and 3 novel candidate positive regulators of IL-1β and TNFα-stimulated immune responses.

62: Minor allele frequency thresholds strongly affect population structure inference with genomic datasets

Posted to bioRxiv 14 Sep 2017

8,244 downloads genomics

Ethan Linck, C.J. Battey

Across the genome, the effects of different evolutionary processes and historical events can result in different classes of genetic variants (or alleles) characterized by their relative frequency in a given population. As a result, population genetic inference can be strongly affected by biases in laboratory and bioinformatics treatments that affect the site frequency spectrum, or SFS. Yet despite the widespread use of reduced-representation genomic datasets with nonmodel organisms, the potential consequences of these biases for downstream analyses remain poorly examined. Here, we assess the influence of minor allele frequency (MAF) thresholds implemented during variant detection on inference of population structure. We use simulated and empirical datasets to evaluate the effect of MAF thresholds on the ability to discriminate among populations and quantify admixture with both model-based and non-model-based clustering methods. We find model-based inference of population structure is highly sensitive to choice of MAF, and may be confounded by either including singletons or excluding all rare alleles. In contrast, non-model-based clustering is largely robust to MAF choice. Our results suggest that model-based inference of population structure can fail due to either natural demographic processes or assembly artifacts, with broad consequences for phylogeographic and population genetic studies using NGS data. We propose a simple hypothesis to explain this behavior and recommend a set of best practices for researchers seeking to describe population structure using reduced-representation libraries.
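As an illustration of what a MAF threshold does to a dataset, the following minimal Python sketch computes the minor allele frequency at each site and drops sites below a cutoff. It is not the authors' pipeline: the genotype encoding (0/1/2 alternate-allele counts per diploid individual, None for a missing call) and the function names are assumptions made for this example.

```python
def minor_allele_frequency(genotypes):
    """Return the minor allele frequency for one biallelic site.

    genotypes: alternate-allele counts (0, 1 or 2) per diploid individual,
    with None marking a missing call (an encoding assumed for this sketch).
    Returns None if no individual was called.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return None
    alt_freq = sum(called) / (2 * len(called))
    # The minor allele is whichever of the two alleles is rarer.
    return min(alt_freq, 1 - alt_freq)


def filter_by_maf(sites, threshold):
    """Keep only the sites whose MAF is at least `threshold`."""
    kept = {}
    for name, genotypes in sites.items():
        maf = minor_allele_frequency(genotypes)
        if maf is not None and maf >= threshold:
            kept[name] = genotypes
    return kept


if __name__ == "__main__":
    sites = {
        "singleton": [0, 0, 0, 1],   # one alternate allele among eight
        "common": [0, 1, 1, 2],      # four alternate alleles among eight
    }
    print(sorted(filter_by_maf(sites, 0.2)))
```

With four diploid individuals, a singleton has a MAF of 1/8 = 0.125, so a threshold of 0.2 removes it while the common variant (MAF 0.5) is kept.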

63: Multi-platform discovery of haplotype-resolved structural variation in human genomes

Posted to bioRxiv 23 Sep 2017

8,138 downloads genomics

Mark J.P. Chaisson, Ashley D. Sanders, Xuefang Zhao, Ankit Malhotra, David Porubsky, Tobias Rausch, Eugene J. Gardner, Oscar Rodriguez, Li Guo, Ryan L. Collins, Xian Fan, Jia Wen, Robert E Handsaker, Susan Fairley, Zev N. Kronenberg, Xiangmeng Kong, Fereydoun Hormozdiari, Dillon Lee, Aaron M. Wenger, Alex Hastie, Danny Antaki, Peter Audano, Harrison Brand, Stuart Cantsilieris, Han Cao, Eliza Cerveira, Chong Chen, Xintong Chen, Chen-Shan Chin, Zechen Chong, Nelson T. Chuang, Christine C. Lambert, Deanna M Church, Laura Clarke, Andrew Farrell, Joey Flores, Timur Galeev, David U. Gorkin, Madhusudan Gujral, Victor Guryev, William Haynes Heaton, Jonas Korlach, Sushant Kumar, Jee Young Kwon, Jong Eun Lee, Joyce Lee, Wan-Ping Lee, Sau Peng Lee, Shantao Li, Patrick Marks, Karine Viaud-Martinez, Sascha Meiers, Katherine M. Munson, Fabio Navarro, Bradley J Nelson, Conor Nodzak, Amina Noor, Sofia Kyriazopoulou-Panagiotopoulou, Andy Pang, Yunjiang Qiu, Gabriel Rosanio, Mallory Ryan, Adrian Stütz, Diana C.J. Spierings, Alistair Ward, AnneMarie E. Welch, Ming Xiao, Wei Xu, Chengsheng Zhang, Qihui Zhu, Xiangqun Zheng-Bradley, Ernesto Lowy, Sergei Yakneen, Steven McCarroll, Goo Jun, Li Ding, Chong Lek Koh, Bing Ren, Paul Flicek, Ken Chen, Mark B. Gerstein, Pui-Yan Kwok, Peter M. Lansdorp, Gabor Marth, Jonathan Sebat, Xinghua Shi, Ali Bashir, Kai Ye, Scott E. Devine, Michael Talkowski, Ryan E. Mills, Tobias Marschall, Jan O. Korbel, Evan E. Eichler, Charles Lee

The incomplete identification of structural variants (SVs) from whole-genome sequencing data limits studies of human genetic diversity and disease association. Here, we apply a suite of long-read, short-read, and strand-specific sequencing technologies, optical mapping, and variant discovery algorithms to comprehensively analyze three human parent-child trios to define the full spectrum of human genetic variation in a haplotype-resolved manner. We identify 818,054 indel variants (<50 bp) and 27,622 SVs (≥50 bp) per human genome. We also discover 156 inversions per genome - most of which previously escaped detection. Fifty-eight of the inversions we discovered intersect with the critical regions of recurrent microdeletion and microduplication syndromes. Taken together, our SV callsets represent a sevenfold increase in SV detection compared to most standard high-throughput sequencing studies, including those from the 1000 Genomes Project. The method and the dataset serve as a gold standard for the scientific community and we make specific recommendations for maximizing structural variation sensitivity for future large-scale genome sequencing studies.

64: Nanopore native RNA sequencing of a human poly(A) transcriptome

Posted to bioRxiv 09 Nov 2018

8,109 downloads genomics

Rachael E. Workman, Alison D Tang, Paul S. Tang, Miten Jain, John R Tyson, Philip C Zuzarte, Timothy Gilpatrick, Roham Razaghi, Joshua Quick, Norah Sadowski, Nadine Holmes, Jaqueline Goes de Jesus, Karen L. Jones, Terrance P Snutch, Nicholas J Loman, Benedict Paten, Matthew Loose, Jared T Simpson, Hugh E Olsen, Angela N. Brooks, Mark Akeson, Winston Timp

High-throughput RNA sequencing technologies have dramatically advanced our understanding of transcriptome complexity and regulation. However, these cDNA-based methods lose information contained in biological RNA because the copied reads are short or because modifications are not carried forward in cDNA. Here we address these limitations using a native poly(A) RNA sequencing strategy developed by Oxford Nanopore Technologies (ONT). Our study focused on poly(A) RNA isolated from the human cell line GM12878, from which we sequenced approximately 9.9 million individual aligned strands. These native RNA sequence reads had an N50 length of 1,334 bases and a maximum length of 22,000 bases. A total of 78,199 high-confidence isoforms were identified by combining long nanopore reads with short, higher-accuracy Illumina reads. Among these isoforms, over 50% are not present in GENCODE v24. We describe strategies for assessing 3' poly(A) tail length, base modifications and transcript haplotypes using this single-molecule technology. Together, these nanopore-based techniques are poised to deliver new insights into RNA biology.
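For readers unfamiliar with the N50 statistic quoted in this abstract, here is a minimal Python sketch. It assumes the common definition (the length L such that reads of length L or longer together contain at least half of all sequenced bases); the function name is hypothetical, and the abstract does not say which tool the authors used.

```python
def n50(lengths):
    """Return the N50 of a collection of read lengths.

    Reads are visited from longest to shortest; the N50 is the length of
    the read at which the running total first reaches half of all bases.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)
    half_of_bases = sum(ordered) / 2
    running_total = 0
    for length in ordered:
        running_total += length
        if running_total >= half_of_bases:
            return length


if __name__ == "__main__":
    # 32 bases in total; the two 8-base reads already cover half of them.
    print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

Because the statistic is weighted by bases rather than by reads, a few long reads can pull the N50 well above the median read length.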

65: Tools and best practices for allelic expression analysis

Posted to bioRxiv 06 Mar 2015

7,950 downloads genomics

Stephane E. Castel, Ami Levy Moonshine, Pejman Mohammadi, Eric Banks, Tuuli Lappalainen

Allelic expression (AE) analysis has become an important tool for integrating genome and transcriptome data to characterize various biological phenomena such as cis-regulatory variation and nonsense-mediated decay. In this paper, we systematically analyze the properties of AE read count data and technical sources of error, such as low-quality or double-counted RNA-seq reads, genotyping errors, allelic mapping bias, and technical covariates due to sample preparation and sequencing, and variation in total read depth. We provide guidelines for correcting and filtering for such errors, and show that the resulting AE data has extremely low technical noise. Finally, we introduce novel software for high-throughput production of AE data from RNA-sequencing data, implemented in the GATK framework. These improved tools and best practices for AE analysis yield higher quality AE data by reducing technical bias. This provides a practical framework for wider adoption of AE analysis by the genomics community.

66: MULTI-seq: Scalable sample multiplexing for single-cell RNA sequencing using lipid-tagged indices

Posted to bioRxiv 08 Aug 2018

7,770 downloads genomics

Christopher S. McGinnis, David M Patterson, Juliane Winkler, Marco Y. Hein, Vasudha Srivastava, Daniel N Conrad, Lyndsay M Murrow, Jonathan S. Weissman, Zena Werb, Eric D. Chow, Zev J. Gartner

We describe MULTI-seq: A rapid, modular, and universal scRNA-seq sample multiplexing strategy using lipid-tagged indices. MULTI-seq reagents can barcode any cell type from any species with an accessible plasma membrane. The method is compatible with enzymatic tissue dissociation, and also preserves viability and endogenous gene expression patterns. We leverage these features to multiplex the analysis of multiple solid tissues comprising human and mouse cells isolated from patient-derived xenograft mouse models. We also utilize MULTI-seq's modular design to perform a 96-plex perturbation experiment with human mammary epithelial cells. MULTI-seq also enables robust doublet identification, which improves data quality and increases scRNA-seq cell throughput by minimizing the negative effects of Poisson loading. We anticipate that the sample throughput and reagent savings enabled by MULTI-seq will expand the purview of scRNA-seq and democratize the application of these technologies within the scientific community.

67: Comparative analysis of commercially available single-cell RNA sequencing platforms for their performance in complex human tissues

Posted to bioRxiv 05 Feb 2019

7,726 downloads genomics

Yue J Wang, Jonathan Schug, Jerome Lin, Zhiping Wang, Andrew Kossenkov, the HPAP Consortium, Klaus H. Kaestner

The past five years have witnessed a tremendous growth of single-cell RNA-seq methodologies. Currently, there are three major commercial platforms for single-cell RNA-seq: Fluidigm C1, Clontech iCell8 (formerly Wafergen) and 10x Genomics Chromium. Here, we provide a systematic comparison of the throughput, sensitivity, cost and other performance statistics for these three platforms using single cells from primary human islets. The primary human islets represent a complex biological system where multiple cell types coexist, with varying cellular abundance, diverse transcriptomic profiles and differing total RNA contents. We apply standard pipelines optimized for each system to derive gene expression matrices. We further evaluate the performance of each system by benchmarking single-cell data with bulk RNA-seq data from sorted cell fractions. Our analyses can be generalized to a variety of complex biological systems and serve as a guide to newcomers to the field of single-cell RNA-seq when selecting platforms.

68: LeafCutter: Annotation-free quantification of RNA splicing

Posted to bioRxiv 16 Mar 2016

7,719 downloads genomics

Yang I. Li, David A. Knowles, Jack Humphrey, Alvaro N. Barbeira, Scott P. Dickinson, Hae Kyung Im, Jonathan K. Pritchard

The excision of introns from pre-mRNA is an essential step in mRNA processing. We developed LeafCutter to study sample and population variation in intron splicing. LeafCutter identifies variable intron splicing events from short-read RNA-seq data and finds alternative splicing events of high complexity. Our approach obviates the need for transcript annotations and circumvents the challenges in estimating relative isoform or exon usage in complex splicing events. LeafCutter can be used both for detecting differential splicing between sample groups, and for mapping splicing quantitative trait loci (sQTLs). Compared to contemporary methods, we find 1.4-2.1 times more sQTLs, many of which help us ascribe molecular effects to disease-associated variants. Strikingly, transcriptome-wide associations between LeafCutter intron quantifications and 40 complex traits increased the number of associated disease genes at 5% FDR by an average of 2.1-fold as compared to using gene expression levels alone. LeafCutter is fast, scalable, easy to use, and available at https://github.com/davidaknowles/leafcutter.

69: Accurate Genomic Prediction Of Human Height

Posted to bioRxiv 18 Sep 2017

7,610 downloads genomics

Louis Lello, Steven G. Avery, Laurent Tellier, Ana I. Vazquez, G. de los Campos, Stephen D. H. Hsu

We construct genomic predictors for heritable and extremely complex human quantitative traits (height, heel bone density, and educational attainment) using modern methods in high dimensional statistics (i.e., machine learning). Replication tests show that these predictors capture, respectively, ~40, 20, and 9 percent of total variance for the three traits. For example, predicted heights correlate ~0.65 with actual height; actual heights of most individuals in validation samples are within a few cm of the prediction. The variance captured for height is comparable to the estimated SNP heritability from GCTA (GREML) analysis, and seems to be close to its asymptotic value (i.e., as sample size goes to infinity), suggesting that we have captured most of the heritability for the SNPs used. Thus, our results resolve the common SNP portion of the "missing heritability" problem - i.e., the gap between prediction R-squared and SNP heritability. The ~20k activated SNPs in our height predictor reveal the genetic architecture of human height, at least for common SNPs. Our primary dataset is the UK Biobank cohort, comprised of almost 500k individual genotypes with multiple phenotypes. We also use other datasets and SNPs found in earlier GWAS for out-of-sample validation of our results.

70: Generation of high-resolution a priori Y-chromosome phylogenies using “next-generation” sequencing data

Posted to bioRxiv 22 Nov 2013

7,537 downloads genomics

Gregory R Magoon, Raymond H Banks, Christian Rottensteiner, Bonnie E Schrack, Vincent O Tilroe, Terry Robb, Andrew J Grierson

An approach for generating high-resolution a priori maximum parsimony Y-chromosome (“chrY”) phylogenies based on SNP and small INDEL variant data from massively-parallel short-read (“next-generation”) sequencing data is described; the tree-generation methodology produces annotations localizing mutations to individual branches of the tree, along with indications of mutation placement uncertainty in cases for which "no-calls" (through lack of mapped reads or otherwise) at particular sites precludes precise phylogenetic placement of mutations. The approach leverages careful variant site filtering and a novel iterative reweighting procedure to generate high-accuracy trees while considering variants in regions of chrY that had previously been excluded from analyses based on short-read sequencing data. It is argued that the proposed approach is also superior to previous region-based filtering approaches in that it adapts to the quality of the underlying data and will automatically allow the scope of sites considered to expand as the underlying data quality improves (e.g. through longer read lengths). Key related issues, including calling of genotypes for the hemizygous chrY, reliability of variant results, read mismappings and "heterozygous" genotype calls, and the mutational stability of different variants are discussed and taken into account. The methodology is demonstrated through application to a dataset consisting of 1292 male samples from diverse populations and haplogroups, with the majority coming from low-coverage sequencing by the 1000 Genomes Project. Application of the tree-generation approach to these data produces a tree involving over 120,000 chrY variant sites (about 45,000 sites if “singletons” are excluded). The utility of this approach in refining the Y-chromosome phylogenetic tree is demonstrated by examining results for several haplogroups. 
The results indicate a number of new branches on the Y-chromosome phylogenetic tree, many of them subdividing known branches, but also including some that inform the presence of additional levels along the “trunk” of the tree. Finally, opportunities for extensions of this phylogenetic analysis approach to other types of genetic data are noted.

71: Single Molecule Sequencing Of M13 Virus Genome Without Amplification

Posted to bioRxiv 03 May 2017

7,470 downloads genomics

Luyang Zhao, Liwei Deng, Gailing Li, Huan Jin, Jinsen Cai, Huan Shang, Yan Li, Haomin Wu, Weibin Xu, Lidong Zeng, Renli Zhang, Huan Zhao, Ping Wu, Zhiliang Zhou, Jiao Zheng, Pierre Ezanno, Qin Yan, Michael Deem, Jiankui He

Third generation sequencing is a direct measurement of DNA/RNA sequences at the single molecule level without amplification. In this study, we report sequencing of the genome of the M13 virus by a new single molecule sequencing platform. Our platform detects single molecule fluorescence by the total internal reflection microscope technique, with sequencing-by-synthesis chemistry. We sequenced the genome of M13 to a depth of 316x and 100% coverage. The consensus sequence accuracy is 100%. We demonstrated that single molecule sequencing has no significant GC bias.

72: CUT&Tag for efficient epigenomic profiling of small samples and single cells

Posted to bioRxiv 06 Mar 2019

7,384 downloads genomics

Hatice S. Kaya-Okur, Steven J. Wu, Christine A. Codomo, Erica S. Pledger, Terri D Bryson, Jorja G. Henikoff, Kami Ahmad, Steven Henikoff

Many chromatin features play critical roles in regulating gene expression. A complete understanding of gene regulation will require the mapping of specific chromatin features in small samples of cells at high resolution. Here we describe Cleavage Under Targets and Tagmentation (CUT&Tag), an enzyme-tethering strategy that provides efficient high-resolution sequencing libraries for profiling diverse chromatin components. In CUT&Tag, a chromatin protein is bound in situ by a specific antibody, which then tethers a protein A-Tn5 transposase fusion protein. Activation of the transposase efficiently generates fragment libraries with high resolution and exceptionally low background. All steps from live cells to sequencing-ready libraries can be performed in a single tube on the benchtop or a microwell in a high-throughput pipeline, and the entire procedure can be performed in one day. We demonstrate the utility of CUT&Tag by profiling histone modifications, RNA Polymerase II and transcription factors on low cell numbers and single cells.

73: Recovering signals of ghost archaic introgression in African populations

Posted to bioRxiv 21 Mar 2018

7,382 downloads genomics

Arun Durvasula, Sriram Sankararaman

While introgression from Neanderthals and Denisovans has been well-documented in modern humans outside Africa, the contribution of archaic hominins to the genetic variation of present-day Africans remains poorly understood. Using 405 whole-genome sequences from four sub-Saharan African populations, we provide complementary lines of evidence for archaic introgression into these populations. Our analyses of site frequency spectra indicate that these populations derive 2-19% of their genetic ancestry from an archaic population that diverged prior to the split of Neanderthals and modern humans. Using a method that can identify segments of archaic ancestry without the need for reference archaic genomes, we built genome-wide maps of archaic ancestry in the Yoruba and the Mende populations that recover about 482 and 502 megabases of archaic sequence, respectively. Analyses of these maps reveal segments of archaic ancestry at high frequency in these populations that represent potential targets of adaptive introgression. Our results reveal the substantial contribution of archaic ancestry in shaping the gene pool of present-day African populations.

74: Unraveling the polygenic architecture of complex traits using blood eQTL meta-analysis

Posted to bioRxiv 19 Oct 2018

7,378 downloads genomics

Urmo Võsa, A. Claringbould, Harm-Jan Westra, Marc Jan Bonder, Patrick Deelen, Biao Zeng, Holger Kirsten, Ashis Saha, Roman Kreuzhuber, Silva Kasela, Natalia Pervjakova, Isabel Alvaes, Marie-Julie Fave, Mawusse Agbessi, Mark Christiansen, Rick Jansen, Ilkka Seppälä, Lin Tong, Alexander Teumer, Katharina Schramm, Gibran Hemani, Joost Verlouw, Hanieh Yaghootkar, Reyhan Sönmez, Andrew Brown, Viktorija Kukushkina, Anette Kalnapenkis, Sina Rüeger, Eleonora Porcu, Jaanika Kronberg-Guzman, Johannes Kettunen, Joseph Powell, Bernett Lee, Futao Zhang, Wibowo Arindrarto, Frank Beutner, BIOS Consortium, Harm Brugge, i2QTL Consortium, Julia Dmitreva, Mahmoud Elansary, Benjamin P Fairfax, Michel Georges, Bastiaan T. Heijmans, Mika Kähönen, Yungil Kim, Julian C Knight, Peter Kovacs, Knut Krohn, Shuang Li, Markus Loeffler, Urko M Marigorta, Hailang Mei, Yukihide Momozawa, Martina Müller-Nurasyid, Matthias Nauck, Michel Nivard, Brenda Penninx, Jonathan Pritchard, Olli Raitakari, Olaf Rotzchke, Eline P Slagboom, Coen D.A. Stehouwer, Michael Stumvoll, Patrick Sullivan, Peter A.C. ‘t Hoen, Joachim Thiery, Anke Tönjes, Jenny van Dongen, Maarten van Iterson, Jan Veldink, Uwe Völker, C. Wijmenga, Morris Swertz, Anand Andiappan, Grant W. Montgomery, Samuli Ripatti, Markus Perola, Zoltán Kutalik, Emmanouil Dermitzakis, Sven Bergmann, Timothy Frayling, Joyce van Meurs, Holger Prokisch, Habibul Ahsan, Brandon Pierce, Terho Lehtimäki, Dorret I. Boomsma, Bruce M. Psaty, Sina A. Gharib, Philip Awadalla, Lili Milani, Willem Ouwehand, Kate Downes, Oliver Stegle, Alexis Battle, Jian Yang, Peter M. Visscher, Markus Scholz, Gregory Gibson, Tõnu Esko, L. Franke

While many disease-associated variants have been identified through genome-wide association studies, their downstream molecular consequences remain unclear. To identify these effects, we performed cis- and trans-expression quantitative trait locus (eQTL) analysis in blood from 31,684 individuals through the eQTLGen Consortium. We observed that cis-eQTLs can be detected for 88% of the studied genes, but that they have a different genetic architecture compared to disease-associated variants, limiting our ability to use cis-eQTLs to pinpoint causal genes within susceptibility loci. In contrast, trans-eQTLs (detected for 37% of 10,317 studied trait-associated variants) were more informative. Multiple unlinked variants, associated to the same complex trait, often converged on trans-genes that are known to play central roles in disease etiology. We observed the same when ascertaining the effect of polygenic scores calculated for 1,263 genome-wide association study (GWAS) traits. Expression levels of 13% of the studied genes correlated with polygenic scores, and many resulting genes are known to drive these traits.

75: Adapterama I: Universal stubs and primers for 384 unique dual-indexed or 147,456 combinatorially-indexed Illumina libraries (iTru & iNext)

Posted to bioRxiv 15 Jun 2016

7,361 downloads genomics

Travis C. Glenn, Roger A. Nilsen, Troy J. Kieran, Jon G Sanders, Natalia J. Bayona-Vásquez, John W. Finger, Todd W. Pierson, Kerin E. Bentley, Sandra L. Hoffberg, Swarnali Louha, Francisco J. García-De León, Miguel Angel Del Río-Portilla, Kurt D. Reed, Jennifer L. Anderson, Jennifer K. Meece, Samuel E. Aggrey, Romdhane Rekaya, Magdy Alabady, Myriam Bélanger, Kevin Winker, Brant C. Faircloth

Next-generation DNA sequencing (NGS) offers many benefits, but major factors limiting NGS include reducing costs of: 1) start-up (i.e., doing NGS for the first time); 2) buy-in (i.e., getting the smallest possible amount of data from a run); and 3) sample preparation. Reducing sample preparation costs is commonly addressed, but start-up and buy-in costs are rarely addressed. We present dual-indexing systems to address all three of these issues. By breaking the library construction process into universal, re-usable, combinatorial components, we reduce all costs, while increasing the number of samples and the variety of library types that can be combined within runs. We accomplish this by extending the Illumina TruSeq dual-indexing approach to 768 (384 + 384) indexed primers that produce 384 unique dual-indexes or 147,456 (384 x 384) unique combinations. We maintain eight nucleotide indexes, with many that are compatible with Illumina index sequences. We synthesized these indexing primers, purifying them with only standard desalting and placing small aliquots in replicate plates. In qPCR validation tests, 206 of 208 primers tested passed (99% success). We then created hundreds of libraries in various scenarios. Our approach reduces start-up and per-sample costs by requiring only one universal adapter that works with indexed PCR primers to uniquely identify samples. Our approach reduces buy-in costs because: 1) relatively few oligonucleotides are needed to produce a large number of indexed libraries; and 2) the large number of possible primers allows researchers to use unique primer sets for different projects, which facilitates pooling of samples during sequencing. Our libraries make use of standard Illumina sequencing primers and index sequence length and are demultiplexed with standard Illumina software, thereby minimizing customization headaches. 
In subsequent Adapterama papers, we use these same primers with different adapter stubs to construct amplicon and restriction-site associated DNA libraries, but their use can be expanded to any type of library sequenced on Illumina platforms.
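The index counts quoted in this abstract follow from simple counting, sketched below in Python. This is an illustration only: the function names and the toy primer labels are hypothetical, and the sketch assumes that "unique dual-indexes" means each primer is used in exactly one pair, while "combinatorial" means every primer from one set may be paired with every primer from the other.

```python
from itertools import product


def unique_dual_index_count(n_first, n_second):
    """Pairs available when each primer is used in at most one pair."""
    return min(n_first, n_second)


def combinatorial_index_count(n_first, n_second):
    """Pairs available when any primer from one set can join any from the other."""
    return n_first * n_second


def enumerate_pairs(first_set, second_set):
    """List every combinatorial pairing of two sets of primer labels."""
    return list(product(first_set, second_set))


if __name__ == "__main__":
    # Figures from the abstract: 384 + 384 = 768 indexed primers.
    print(unique_dual_index_count(384, 384))
    print(combinatorial_index_count(384, 384))
    # Toy example with made-up labels: 2 x 3 primers give 6 combinations.
    print(enumerate_pairs(["A1", "A2"], ["B1", "B2", "B3"]))
```

The multiplication is what drives the cost argument: 768 synthesized primers are enough for 147,456 distinguishable libraries, whereas unique pairing uses the same primers for 384.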

76: Cell freezing protocol optimized for ATAC-Seq on motor neurons derived from human induced pluripotent stem cells

Posted to bioRxiv 15 Jan 2016

7,321 downloads genomics

Pamela Milani, Renan Escalante-Chong, Brandon C. Shelley, Natasha L. Patel-Murray, Xiaofeng Xin, Miriam Adam, Berhan Mandefro, Dhruv Sareen, Clive N. Svendsen, Ernest Fraenkel

In recent years, the assay for transposase-accessible chromatin using sequencing (ATAC-Seq) has become a fundamental tool of epigenomic research. However, it has proven difficult to perform this technique on frozen samples because freezing cells before extracting nuclei impairs nuclear integrity and alters chromatin structure. We describe a protocol for freezing cells that is compatible with ATAC-Seq, producing results that compare well with those generated from fresh cells. We found that while flash-frozen samples are not suitable for ATAC-Seq, the assay is successful with slow-cooled cryopreserved samples. Using this method, we were able to isolate high quality, intact nuclei, and we verified that epigenetic results from fresh and cryopreserved samples agree quantitatively. We developed our protocol on a disease-relevant cell type, namely motor neurons differentiated from induced pluripotent stem cells from a patient affected by spinal muscular atrophy.

77: Specific ACE2 Expression in Cholangiocytes May Cause Liver Damage After 2019-nCoV Infection

Posted to bioRxiv 04 Feb 2020

7,308 downloads genomics

Xiaoqiang Chai, Longfei Hu, Yan Zhang, Weiyu Han, Zhou Lu, Aiwu Ke, Jian Zhou, Guoming Shi, Nan Fang, Jia Fan, Jiabin Cai, Jue Fan, Fei Lan

A newly identified coronavirus, 2019-nCoV, has been posing significant threats to public health since December 2019. ACE2, the host cell receptor for severe acute respiratory syndrome coronavirus (SARS), has recently been shown to mediate 2019-nCoV infection. Interestingly, besides effects on the respiratory system, a substantial proportion of SARS and 2019-nCoV patients showed signs of various degrees of liver damage, the mechanism and implications of which have not yet been determined. Here, we performed an unbiased evaluation of cell-type-specific expression of ACE2 in healthy liver tissues using single-cell RNA-seq data from two independent cohorts, and identified specific expression in cholangiocytes. The results indicate that the virus might directly bind to ACE2-positive cholangiocytes but not necessarily hepatocytes. This finding suggests that the liver abnormalities of SARS and 2019-nCoV patients may be due not to hepatocyte damage but to cholangiocyte dysfunction and other causes, such as drug-induced liver injury and liver injury induced by systemic inflammatory response. Our findings indicate that special care for liver dysfunction should be taken in treating 2019-nCoV patients during hospitalization and shortly after cure.

78: Ultra-high throughput single-cell RNA sequencing by combinatorial fluidic indexing

Posted to bioRxiv 18 Dec 2019

7,261 downloads genomics

Paul Datlinger, André F. Rendeiro, Thorina Boenke, Thomas Krausgruber, Daniele Barreca, Christoph Bock

Cell atlas projects and single-cell CRISPR screens hit the limits of current technology, as they require cost-effective profiling for millions of individual cells. To satisfy these enormous throughput requirements, we developed "single-cell combinatorial fluidic indexing" (scifi) and applied it to single-cell RNA sequencing. The resulting scifi-RNA-seq assay combines one-step combinatorial pre-indexing of single-cell transcriptomes with subsequent single-cell RNA-seq using widely available droplet microfluidics. Pre-indexing allows us to load multiple cells per droplet, which increases the throughput of droplet-based single-cell RNA-seq up to 15-fold, and it provides a straightforward way of multiplexing hundreds of samples in a single scifi-RNA-seq experiment. Compared to multi-round combinatorial indexing, scifi-RNA-seq provides an easier, faster, and more efficient workflow, thereby enabling massive-scale scRNA-seq experiments for a broad range of applications ranging from population genomics to drug screens with scRNA-seq readout. We benchmarked scifi-RNA-seq on various human and mouse cell lines, and we demonstrated its feasibility for human primary material by profiling TCR activation in T cells.

79: Two novel lncRNAs discovered in human mitochondrial DNA using PacBio full-length transcriptome data

Posted to bioRxiv 06 Oct 2016

Two novel lncRNAs discovered in human mitochondrial DNA using PacBio full-length transcriptome data
7,220 downloads genomics

Shan Gao, Xiaoxuan Tian, Yu Sun, Zhenfeng Wu, Zhi Cheng, Pengzhi Dong, Qiang Zhao, Bingjun He, Jishou Ruan, Wenjun Bu

In this study, we introduced a general framework to use PacBio full-length transcriptome sequencing for the investigation of fundamental problems in mitochondrial biology, e.g. genome arrangement, heteroplasmy, RNA processing and the regulation of transcription or replication. As a result, we produced the first full-length human mitochondrial transcriptome from the MCF7 cell line based on the PacBio platform and characterized the human mitochondrial transcriptome with more comprehensive and accurate information. The most important finding was two novel lncRNAs, hsa-MDL1 and hsa-MDL1AS, which are encoded by the mitochondrial D-loop regions. We propose that hsa-MDL1 and hsa-MDL1AS, as the precursors of transcription initiation RNAs (tiRNAs), belong to a novel class of long non-coding RNAs (lncRNAs), which we name long tiRNAs (ltiRNAs). Based on the mitochondrial RNA processing model, the primary tiRNAs, precursors and mature tiRNAs could be discovered to completely reveal tiRNAs from their origins to functions. The MDL1 and MDL1AS lncRNAs and their regulation mechanisms exist ubiquitously from insects to humans.

80: Targeted Nanopore Sequencing with Cas9 for studies of methylation, structural variants, and mutations

Posted to bioRxiv 11 Apr 2019

Targeted Nanopore Sequencing with Cas9 for studies of methylation, structural variants, and mutations
7,175 downloads genomics

Timothy Gilpatrick, Isac Lee, James E. Graham, Etienne Raimondeau, Rebecca Bowen, Andrew Heron, Fritz J. Sedlazeck, Winston Timp

Nanopore sequencing technology can rapidly and directly interrogate native DNA molecules. Often we are interested only in interrogating specific areas at high depth, but conventional enrichment methods have thus far proved unsuitable for long reads [1]. Existing strategies are currently limited by high input DNA requirements, low yield, short (<5kb) reads, time-intensive protocols, and/or amplification or cloning (losing base modification information). In this paper, we describe a technique utilizing the ability of Cas9 to introduce cuts at specific locations and ligating nanopore sequencing adaptors directly to those sites, a method we term ‘nanopore Cas9 Targeted-Sequencing’ (nCATS). We have demonstrated this using an Oxford Nanopore MinION flow cell (Capacity >10Gb+) to generate a median 165X coverage at 10 genomic loci with a median length of 18kb, representing a several hundred-fold improvement over the 2-3X coverage achieved without enrichment. We performed a pilot run on the smaller Flongle flow cell (Capacity ~1Gb), generating a median coverage of 30X at 11 genomic loci with a median length of 18kb. Using panels of guide RNAs, we show that the high coverage data from this method enables us to (1) profile DNA methylation patterns at cancer driver genes, (2) detect structural variations at known hot spots, and (3) survey for the presence of single nucleotide mutations. Together, this provides a low-cost method that can be applied even in low resource settings to directly examine cellular DNA. This technique has extensive clinical applications for assessing medically relevant genes and has the versatility to be a rapid and comprehensive diagnostic tool. We demonstrate applications of this technique by examining the well-characterized GM12878 cell line as well as three breast cell lines (MCF-10A, MCF-7, MDA-MB-231) with varying tumorigenic potential as a model for cancer.

Contributions: TG and WT constructed the study. TG performed the experiments. TG, IL, and FS analyzed the data. TG, JG, ER, RB and AH developed the method. TG and WT wrote the paper.
