
Rxivist combines preprints from bioRxiv with data from Twitter to help you find the papers being discussed in your field. Currently indexing 84,482 bioRxiv papers from 363,664 authors.

Most downloaded bioRxiv papers, all time

in category genomics

5,344 results found.

4121: pdxBlacklist: Identifying artefactual variants in patient-derived xenograft samples

Posted to bioRxiv 25 Aug 2017

307 downloads genomics

Max Salm, Sven-Eric Schelhorn, Lee Lancashire, Thomas Grombacher

Patient-derived tumor xenograft (PDX) samples typically represent a mixture of mouse and human tissue. Variant call sets derived from sequencing such samples are commonly contaminated with false positive variants that arise when mouse-derived reads are mapped to the human genome. pdxBlacklist is a novel approach designed to rapidly identify these false-positive variants, and thus significantly improve variant call set quality. Availability: pdxBlacklist is freely available on GitHub: https://github.com/MaxSalm/pdxBlacklist.

4122: A High HIV-1 Strain Variability in London, UK, Revealed by Full-Genome Analysis: Results from the ICONIC Project

Posted to bioRxiv 17 Jul 2017

306 downloads genomics

Gonzalo Yebra, Dan Frampton, Tiziano Gallo Cassarino, Jade Raffle, Jonathan Hubb, R Bridget Ferns, Zisis Kozlakidis, Andrew Hayward, Paul Kellam, Deenan Pillay, Duncan Clark, Eleni Nastouli, Andrew J. Leigh Brown, on behalf of the ICONIC consortium

Background & Methods. The ICONIC project has developed an automated high-throughput pipeline to generate HIV nearly full-length genomes (NFLG, i.e. from gag to nef) from next-generation sequencing (NGS) data. The pipeline was applied to 420 HIV samples collected at University College London Hospital and Barts Health NHS Trust (London) and sequenced using an Illumina MiSeq at the Wellcome Trust Sanger Institute (Cambridge). Consensus genomes were generated and subtyped using COMET, and unique recombinants were studied with jpHMM and SimPlot. Maximum-likelihood phylogenetic trees were constructed using RAxML to identify transmission networks using the Cluster Picker. Results. The pipeline generated sequences of at least 1Kb of length (median=7.4Kb) for 375 out of the 420 samples (89%), with 174 (46.4%) being NFLG. A total of 365 sequences (169 of them NFLG) corresponded to unique subjects and were included in the down-stream analyses. The most frequent HIV subtypes were B (n=149, 40.8%) and C (n=77, 21.1%) and the circulating recombinant form CRF02_AG (n=32, 8.8%). We found 14 different CRFs (n=66, 18.1%) and multiple URFs (n=32, 8.8%) that involved recombination between 12 different subtypes/CRFs. The most frequent URFs were B/CRF01_AE (4 cases) and A1/D, B/C, and B/CRF02_AG (3 cases each). Most URFs (19/26, 73%) lacked breakpoints in the PR+RT pol region, rendering them undetectable if only that was sequenced. Twelve (37.5%) of the URFs could have emerged within the UK, whereas the rest were probably imported from sub-Saharan Africa, South East Asia and South America. For 2 URFs we found highly similar pol sequences circulating in the UK. We detected 31 phylogenetic clusters using the full dataset: 25 pairs (mostly subtypes B and C), 4 triplets and 2 quadruplets. Some of these were not consistent across different genes due to inter- and intra-subtype recombination. Clusters involved 70 sequences, 19.2% of the dataset. Conclusions. 
The initial analysis of genome sequences detected substantial hidden variability in the London HIV epidemic. Analysing full genome sequences, as opposed to only PR+RT, identified previously undetected recombinants. It provided a more reliable description of CRFs (that would be otherwise misclassified) and transmission clusters.

4123: Integrative transcriptomic analysis of SLE reveals IFN-driven cross-talk between immune cells

Posted to bioRxiv 29 Apr 2020

306 downloads genomics

Bharat Panwar, Benjamin J Schmiedel, Shu Liang, Brandie White, Enrique Rodriguez, Kenneth Kalunian, Andrew J McKnight, Rachel Soloff, Gregory Seumois, Pandurangan Vijayanand, Ferhat Ay

Systemic lupus erythematosus (SLE) is an incurable autoimmune disease that disproportionately affects women and may lead to damage in multiple organs. The marked heterogeneity in its clinical manifestations is a major obstacle in finding targeted treatments, and the involvement of multiple immune cell types further increases this complexity. Thus, identifying molecular subtypes that best correlate with disease heterogeneity and severity as well as deducing molecular cross-talk among major immune cell types that lead to disease progression are critical steps in the development of more informed therapies for SLE. Here we profile and analyze gene expression of six major circulating immune cell types from patients with well-characterized SLE (classical monocytes (n=64), T cells (n=24), neutrophils (n=24), B cells (n=20), conventional (n=20) and plasmacytoid (n=22) dendritic cells) and from healthy control subjects. Our results show that the interferon (IFN) response signature was the major molecular feature that classified SLE patients into two distinct groups: IFN-signature negative (IFNneg) and positive (IFNpos). We show that the gene expression signature of IFN response was consistent (i) across all immune cell types, (ii) all single cells profiled from three IFNpos donors using single-cell RNA-seq, and (iii) longitudinal samples of the same patient. For a better understanding of molecular differences of IFNpos versus IFNneg patients, we combined differential gene expression analysis with differential Weighted Gene Co-expression Network Analysis (WGCNA), which revealed a relatively small list of genes from classical monocytes including two known immune modulators, one the target of an approved therapeutic for SLE (TNFSF13B/BAFF: belimumab) and one itself a therapeutic for Rheumatoid Arthritis (IL1RN: anakinra).
For a more integrative understanding of the cross-talk among different cell types and to identify potentially novel gene or pathway connections, we also developed a novel gene co-expression analysis method for joint analysis of multiple cell types named integrated WGCNA (iWGCNA). This method revealed an interesting cross-talk between T and B cells highlighted by a significant enrichment in the expression of known markers of T follicular helper cells (Tfh), which also correlate with disease severity in the context of IFNpos patients. Interestingly, higher expression of BAFF from all myeloid cells also shows a strong correlation with enrichment in the expression of genes in T cells that may mark circulating Tfh cells or related memory cell populations. These cell types have been shown to promote B cell class-switching and antibody production, which are well-characterized in SLE patients. In summary, we generated a large-scale gene expression dataset from sorted immune cell populations and present a novel computational approach to analyze such data in an integrative fashion in the context of an autoimmune disease. Our results reveal the power of a hypothesis-free and data-driven approach to discover drug targets and reveal novel cross-talk among multiple immune cell types specific to a subset of SLE patients. This approach is immediately useful for studying autoimmune diseases and is applicable in other contexts where gene expression profiling is possible from multiple cell types within the same tissue compartment. Competing Interest Statement: Rachel Soloff, Andrew McKnight and Enrique Rodriguez are employed by Kyowa Kirin Pharmaceutical Research, Inc. This does not alter the authors’ adherence to this journal’s policies on sharing data and materials. There are no patents, products in development or marketed products associated with this research to declare.

4124: Myotis rufoniger Genome Sequence And Analyses: M. rufoniger's Genomic Feature And The Decreasing Effective Population Size Of Myotis Bats

Posted to bioRxiv 28 Apr 2017

306 downloads genomics

Youngjune Bhak, Yeonsu Jeon, Sungwon Jeon, Oksung Chung, Sungwoong Jho, JeHoon Jun, Hak-Min Kim, Yongsoo Cho, Changhan Yoon, Seungwoo Lee, Jung-Hoon Kang, Jong-Deock Lim, Junghwa An, Yun Sung Cho, Doug-Young Ryu, Jong Bhak

Myotis rufoniger is a vesper bat in the genus Myotis. Here we report the whole genome sequence and analyses of M. rufoniger. We generated 124 Gb of short-read DNA sequences with an estimated genome size of 1.88 Gb at a sequencing depth of 66×. The sequences were aligned to the M. brandtii bat reference genome at a mapping rate of 96.50%, covering 95.71% of the coding sequence region at 10× coverage. The divergence time of the Myotis bat family is estimated to be 11.5 million years, and the divergence time between M. rufoniger and its closest species M. davidii is estimated to be 10.4 million years. We found 1,239 function-altering M. rufoniger specific amino acid sequences from 929 genes compared to other Myotis bat and mammalian genomes. The functional enrichment test of the 929 genes detected amino acid changes in melanin associated DCT, SLC45A2, TYRP1, and OCA2 genes possibly responsible for M. rufoniger's red fur color and a general coloration in Myotis. The N6AMT1 gene, associated with arsenic resistance, showed a high degree of function alteration in M. rufoniger. We further confirmed that M. rufoniger also has bat-specific sequences within FSHB, GHR, IGF1R, TP53, MDM2, SLC45A2, RGS7BP, RHO, OPN1SW, and CNGB3 genes that have already been published to be related to bat reproduction, lifespan, flight, low vision, and echolocation. Additionally, our demographic history analysis found that the effective population size of the Myotis clade has been consistently decreasing since ~30k years ago. M. rufoniger's effective population size was the lowest in Myotis bats, confirming its relatively low genetic diversity.
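As a quick consistency check on the figures quoted in this abstract, average sequencing depth is commonly estimated as total sequenced bases divided by genome size. The sketch below applies that general relationship to the abstract's 124 Gb and 1.88 Gb; the function name is illustrative and is not taken from the paper.

```python
def sequencing_depth(total_bases_gb: float, genome_size_gb: float) -> float:
    """Estimate average depth as total sequenced bases / genome size."""
    if genome_size_gb <= 0:
        raise ValueError("genome size must be positive")
    return total_bases_gb / genome_size_gb


# Figures quoted in the abstract: 124 Gb of reads, 1.88 Gb estimated genome.
depth = sequencing_depth(124.0, 1.88)
print(f"approximate depth: {depth:.1f}x")
```

The result rounds to the 66× depth reported in the abstract.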

4125: Genome Report: De novo assembly of a high-quality reference genome for the Horned Lark (Eremophila alpestris)

Posted to bioRxiv 21 Oct 2019

306 downloads genomics

Nicholas A. Mason, Paulo Pulgarin, Carlos Daniel Cadena, Irby J. Lovette

The Horned Lark (Eremophila alpestris) is a species of small songbird that exhibits remarkable geographic variation in appearance and habitat across an expansive distribution. While E. alpestris and related species have been the focus of many ecological and evolutionary studies, we still lack a highly contiguous genome assembly for horned larks and related taxa (Alaudidae). Here, we present CLO_EAlp_1.0, a highly contiguous assembly for horned larks generated from blood samples of a wild, male bird captured in the Altiplano Cundiboyacense of Colombia. By combining short-insert and mate-pair libraries with the ALLPATHS-LG genome assembly pipeline, we generated a 1.04 Gb assembly comprised of 2708 contigs with an N50 of 10.58 Mb and an L50 of 29. After polishing the genome, we were able to identify 94.5% of single-copy gene orthologs from an Aves data set and 97.7% of single-copy gene orthologs from a vertebrata data set, indicating that our de novo assembly is near complete. We anticipate that this genomic resource will be useful to the broader ornithological community and those interested in studying the evolutionary history and ecological interactions of a widespread, yet understudied lineage of songbirds.
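For readers unfamiliar with the N50 and L50 statistics quoted in this abstract, here is a minimal sketch of how they are conventionally defined. Sort contigs from longest to shortest; N50 is the length of the contig at which the running total first reaches half the assembly size, and L50 is the number of contigs needed to get there. The contig lengths are made-up toy values, not data from the paper.

```python
from typing import Sequence, Tuple


def n50_l50(lengths: Sequence[int]) -> Tuple[int, int]:
    """Return (N50, L50) for a collection of contig lengths."""
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # The first contig that pushes the running sum to half the
        # assembly size defines both statistics.
        if running >= half_total:
            return length, count
    raise AssertionError("unreachable for non-empty input")


# Hypothetical toy assembly (lengths in arbitrary units), total = 300.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = n50_l50(toy_contigs)
print(n50, l50)
```

On the toy input, the two longest contigs already cover half of the total, so N50 is 70 and L50 is 2. The abstract's L50 of 29 means that the 29 longest contigs cover half of the 1.04 Gb assembly.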

4126: Single-molecule sequencing of long DNA molecules allows high contiguity de novo genome assembly for the fungus fly, Sciara coprophila

Posted to bioRxiv 25 Feb 2020

306 downloads genomics

John M. Urban, Michael S Foulk, Jacob E. Bliss, C. Michelle Coleman, Nanyan Lu, Reza Mazloom, Susan J Brown, Allan C. Spradling, Susan A Gerbi

The lower Dipteran fungus fly, Sciara coprophila, has many unique biological features. For example, Sciara undergoes paternal chromosome elimination and maternal X chromosome nondisjunction during spermatogenesis, paternal X elimination during embryogenesis, intrachromosomal DNA amplification of DNA puff loci during larval development, and germline-limited chromosome elimination from all somatic cells. Paternal chromosome elimination in Sciara was the first observation of imprinting, though the mechanism remains a mystery. Here, we present the first draft genome sequence for Sciara coprophila to take a large step forward in aiding these studies. We approached assembling the Sciara genome using multiple sequencing technologies: PacBio, Oxford Nanopore MinION, and Illumina. To find an optimal assembly using these datasets, we generated 44 Illumina assemblies using 7 short-read assemblers and 50 long-read assemblies of PacBio and MinION sequence data using 6 long-read assemblers. We ranked assemblies using a battery of reference-free metrics, and scaffolded a subset of the highest-ranking assemblies using BioNano Genomics optical maps. RNA-seq datasets from multiple life stages and both sexes facilitated genome annotation. Moreover, we anchored nearly half of the Sciara genome sequence into chromosomes. Finally, we used the signal level of both the PacBio and Oxford Nanopore data to explore the presence or absence of DNA modifications in the Sciara genome since DNA modifications may play a role in imprinting in Sciara, as they do in mammals. These data serve as the foundation for future research by the growing community studying the unique features of this emerging model system.

4127: Deconvolution of Nucleic-acid Length Distributions: A Gel Electrophoresis Analysis Tool and Applications

Posted to bioRxiv 15 May 2019

306 downloads genomics

Riccardo Ziraldo, Massa J Shoura, Andrew Z. Fire, Stephen D. Levene

Next-generation DNA-sequencing (NGS) technologies, which are designed to streamline the acquisition of massive amounts of sequencing data, are nonetheless dependent on various preparative steps to generate DNA fragments of required concentration, purity, and average size (molecular weight). Current automated electrophoresis systems for DNA- and RNA-sample quality control, such as Agilent's Bioanalyzer® and TapeStation® products, are costly to acquire and use; they also provide limited information for samples having broad size distributions. Here we describe a software tool that helps determine the size distribution of DNA fragments in an NGS library, or other DNA sample, based on gel-electrophoretic line profiles. The software, developed as an ImageJ plug-in, allows for straightforward processing of gel images, including lane selection and fitting of univariate functions to intensity distributions. The user selects the option of fitting either discrete profiles in cases where discrete gel bands are visible, or continuous profiles, having multiple bands buried under a single broad peak. The method requires only modest imaging capabilities and is a cost-effective, rigorous alternative characterization method to augment existing techniques for library quality control.

4128: Trochodendron aralioides, the first chromosome-level draft genome in Trochodendrales and a valuable resource for basal eudicot research

Posted to bioRxiv 26 May 2019

305 downloads genomics

Joeri S. Strijk, Damien D. Hinsinger, Feng-Ping Zhang, KunFang Cao

Background: The wheel tree (Trochodendron aralioides) is one of only two species in the basal eudicot order Trochodendrales. Together with Tetracentron sinense, the family is unique in having secondary xylem without vessel elements, long considered to be a primitive character also found in Amborella and Winteraceae. Recent studies, however, have shown that Trochodendraceae belong to basal eudicots and demonstrate this represents an evolutionary reversal for the group. Trochodendron aralioides is widespread in cultivation and popular for use in gardens and parks. Findings: We assembled the T. aralioides genome using a total of 679.56 Gb of clean reads that were generated using both PacBio and Illumina short-reads in combination with 10XGenomics and Hi-C data. Nineteen scaffolds corresponding to 19 chromosomes were assembled to a final size of 1.614 Gb with a scaffold N50 of 73.37 Mb in addition to 1,534 contigs. Repeat sequences accounted for 64.226% of the genome, and 35,328 protein-coding genes with an average of 5.09 exons per gene were annotated using de novo, RNA-seq, and homology-based approaches. According to a phylogenetic analysis of protein-coding genes, T. aralioides diverged in a basal position relative to core eudicots, approximately 121.8-125.8 million years ago. Conclusions: Trochodendron aralioides is the first chromosome-scale genome assembled in the order Trochodendrales. It represents the largest genome assembled to date in the basal eudicot grade, as well as the closest order relative to the core-eudicots, as the position of Buxales remains unresolved. This genome will support further studies of wood morphology and floral evolution, and will be an essential resource for understanding rapid changes that took place at the base of the Eudicot tree. Finally, it can serve as a valuable source to aid both the acceleration of genome-assisted improvement for cultivation and conservation efforts of the wheel tree.

4129: Positive selection in Europeans and East-Asians at the ABCA12 gene

Posted to bioRxiv 16 Aug 2018

305 downloads genomics

Roberto Sirica, Marianna Buonaiuto, Valeria Petrella, Lucia Sticco, Donatella Tramontano, Dario Antonini, Caterina Missero, Ombretta Guardiola, Gennaro Andolfi, Heerman Kumar, Qasim Ayub, Yali Xue, Chris Tyler-Smith, Marco Salvemini, Giovanni D’Angelo, Vincenza Colonna

Natural selection acts on genetic variants by increasing the frequency of alleles responsible for a cellular function that is favorable in a certain environment. In a previous genome-wide scan for positive selection in contemporary humans, we identified a signal of positive selection in Europeans and Asians at the genetic variant rs10180970. The variant is located in the second intron of the ABCA12 gene, which is implicated in lipid barrier formation and down-regulated by UVB radiation. We studied the signal of selection in the genomic region surrounding rs10180970 in a larger dataset that includes DNA sequences from ancient samples. We also investigated the functional consequences on gene expression of the alleles of rs10180970 and another genetic variant in its proximity in healthy volunteers exposed to similar UV radiation. We confirmed the selection signal and refined its location, which extends over 35 kb and includes the first intron, the first two exons and the transcription starting site of ABCA12. We found no obvious effect of rs10180970 alleles on ABCA12 gene expression. We reconstructed the trajectory of the T allele over the last 80,000 years to discover that it was specific to H. sapiens and frequent among non-Africans already 45,000 years ago.

4130: Elucidating the functional role of predicted miRNAs in post-transcriptional gene regulation along with symbiosis in Medicago truncatula

Posted to bioRxiv 13 Jun 2018

305 downloads genomics

Roy Chowdhury Moumita, Jolly Basak, Ranjit Prasad Bahadur

Non-coding RNAs (ncRNAs) are found to be important regulators of gene expression because of their ability to modulate post-transcriptional processes. microRNAs are small ncRNAs which inhibit translational and post-transcriptional processes whereas long ncRNAs are found to regulate both transcriptional and post-transcriptional gene expression. Medicago truncatula is a well-known model plant for studying legume biology and is also used as a forage crop. In spite of its importance in nitrogen fixation and soil fertility improvement, little information is available about Medicago ncRNAs that play an important role in symbiosis. To understand the role of Medicago ncRNAs in symbiosis and regulation of transcription factors, we have identified novel miRNAs and tried to establish an interaction model with their targets. 149 novel miRNAs are predicted along with their 770 target proteins. We have shown that 51 of these novel miRNAs are targeting 282 lncRNAs. We have analyzed the interactions between miRNAs and their target mRNAs as well as their targets on lncRNAs. The role of Medicago miRNAs in the regulation of various transcription factors was also elucidated. Knowledge gained from this study will have a positive impact on the nitrogen fixing ability of this important model plant, which in turn will improve the soil fertility.

4131: Revealing the impact of recurrent and rare structural variants in multiple myeloma

Posted to bioRxiv 19 Dec 2019

305 downloads genomics

Even H Rustad, Venkata D. Yellapantula, Dominik Glodzik, Kylee H Maclachlan, Benjamin Diamond, Eileen M Boyle, Cody Ashby, Patrick Blaney, Gunes Gundem, Malin Hultcrantz, Daniel Leongamornlert, Nicos Angelopoulos, Daniel Auclair, Yanming Zhang, Ahmet Dogan, Niccolò Bolli, Elli Papaemmanuil, Kenneth C. Anderson, Philippe Moreau, Herve Avet-Loiseau, Nikhil Munshi, Jonathan Keats, Peter J. Campbell, Gareth J Morgan, Ola Landgren, Francesco Maura

The landscape of structural variants (SVs) in multiple myeloma remains poorly understood. Here, we performed comprehensive classification and analysis of SVs in multiple myeloma, interrogating a large cohort of 762 patients with whole genome and RNA sequencing. We identified 100 SV hotspots involving 31 new candidate driver genes, including drug targets BCMA (TNFRSF17) and SLAMF7. Complex SVs, including chromothripsis and templated insertions, were present in 61% of patients and frequently resulted in the simultaneous acquisition of multiple drivers. After accounting for all recurrent events, 63% of SVs remained unexplained. Intriguingly, these rare SVs were associated with up to 7-fold enrichment for outlier gene expression, indicating that many rare driver SVs remain unrecognized and are likely important in the biology of individual tumors.

4132: Buffet-Style Expression Factor-Adjusted Discovery Increases the Yield of Robust Expression Quantitative Trait Loci

Posted to bioRxiv 11 Oct 2015

305 downloads genomics

Peter Castaldi, Ma’en Obeidat, Eitan Halper-Stromberg, Andrew Lamb, Margaret Parker, Robert Chase, Vincent Carey, Ruth Tal-Singer, Edwin Silverman, Don Sin, Peter D. Paré, Craig Hersh

Expression quantitative trait locus (eQTL) analysis relates genetic variation to gene expression, and it has been shown that power to detect eQTLs is substantially increased by adjustment for measures of expression variability derived from singular value decomposition-based procedures (referred to as expression factors, or EFs). A potential downside to this approach is that power will be reduced for eQTL that are correlated with one or more EFs, but these approaches are commonly used in human eQTL studies on the assumption that this risk is low for cis (i.e. local) eQTL associations. Using two independent blood eQTL datasets, we show that this assumption is incorrect and that, in fact, 10-25% of eQTL that are significant without adjustment for EFs are no longer detected after EF adjustment. In addition, the majority of these lost eQTLs replicate in independent data, indicating that they are not spurious associations. Thus, in the ideal case, EFs would be re-estimated for each eQTL association test, as has been suggested by others; however, this is computationally infeasible for large datasets with densely imputed genotype data. We propose an alternative, buffet-style approach in which a series of EF and non-EF eQTL analyses are performed and significant eQTL discoveries are collected across these analyses. We demonstrate that standard methods to control the false discovery rate perform similarly between the single EF and buffet-style approaches, and we provide biological support for eQTL discovered by this approach in terms of immune cell-type specific enhancer enrichment in Roadmap Epigenomics and ENCODE cell lines.
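The collection step described in this abstract can be sketched roughly as follows: run several eQTL analyses (with and without EF adjustment), call significant associations in each, and take the union. This sketch makes several assumptions that are not from the paper: the Benjamini-Hochberg procedure stands in for the unspecified "standard methods" of FDR control, it is applied separately to each analysis, and all analysis names, SNP and gene labels, and p-values are invented toy values.

```python
from typing import Dict, Hashable, Set


def benjamini_hochberg(pvalues: Dict[Hashable, float], alpha: float = 0.05) -> Set[Hashable]:
    """Return the tests rejected by the BH step-up procedure at level alpha."""
    ranked = sorted(pvalues.items(), key=lambda item: item[1])
    m = len(ranked)
    cutoff_rank = 0
    for rank, (_, p) in enumerate(ranked, start=1):
        # Largest rank k with p_(k) <= alpha * k / m sets the cutoff.
        if p <= alpha * rank / m:
            cutoff_rank = rank
    return {key for key, _ in ranked[:cutoff_rank]}


def buffet_discoveries(analyses: Dict[str, Dict[Hashable, float]], alpha: float = 0.05) -> Set[Hashable]:
    """Union of significant SNP-gene pairs across all analyses (assumed design)."""
    found: Set[Hashable] = set()
    for pvalues in analyses.values():
        found |= benjamini_hochberg(pvalues, alpha)
    return found


# Hypothetical p-values for three SNP-gene pairs under two analyses.
toy_analyses = {
    "no_EF": {("rs1", "GENE_A"): 0.001, ("rs2", "GENE_B"): 0.01, ("rs3", "GENE_C"): 0.8},
    "with_EFs": {("rs1", "GENE_A"): 0.0005, ("rs2", "GENE_B"): 0.6, ("rs3", "GENE_C"): 0.004},
}
discoveries = buffet_discoveries(toy_analyses)
print(sorted(discoveries))
```

In the toy data, the rs2/GENE_B association is significant only without EF adjustment and rs3/GENE_C only with it. The union keeps both, which is the kind of "lost" eQTL the abstract says a single EF-adjusted analysis would miss.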

4133: Transcriptomic changes resulting from STK32B overexpression identifies pathways potentially relevant to essential tremor

Posted to bioRxiv 18 Feb 2019

305 downloads genomics

Calwing Liao, Faezeh Sarayloo, Veikko Vuokila, Daniel Rochefort, Fulya Akçimen, Simone Diamond, Alexandre D Laporte, Dan Spiegelman, Qin He, Hélène Catoire, Patrick A Dion, Guy Rouleau

Essential tremor (ET) is a common movement disorder that has a high heritability. A number of genetic studies have associated different genes and loci with ET, but few have investigated the biology of any of these genes. STK32B was significantly associated with ET in a large GWAS and was found to be overexpressed in ET cerebellar tissue. Here, we overexpressed STK32B in human cerebellar DAOY cells and used an RNA-Seq approach to identify differentially expressed genes by comparing the transcriptome profile of these cells to that of control DAOY cells. Pathway and gene ontology enrichment identified axon guidance, olfactory signalling and calcium-voltage channels as significant. Additionally, we show that overexpressing STK32B affects transcript levels of previously implicated ET genes such as FUS. Our results investigate the effects of overexpressed STK32B and suggest that it may be involved in relevant ET pathways and genes.

4134: The Bovine Genome Variation Database (BGVD): Integrated Web-database for Bovine Sequencing Variations and Selective Signatures

Posted to bioRxiv 13 Oct 2019

304 downloads genomics

Ningbo Chen, Weiwei Fu, Jianbang Zhao, Jiafei Shen, Qiuming Chen, Zhuqing Zheng, Hong Chen, Tad S. Sonstegard, Chuzhao Lei, Yu Jiang

Next-generation sequencing has yielded a vast amount of cattle genomic data for the global characterization of population genetic diversity and the identification of regions of the genome under natural and artificial selection. However, efficient storage, querying and visualization of such large datasets remain challenging. Here, we developed a comprehensive Bovine Genome Variation Database (BGVD, http://animal.nwsuaf.edu.cn/BosVar) that provides six main functionalities: Gene Search, Variation Search, Genomic Signature Search, Genome Browser, Alignment Search Tools and the Genome Coordinate Conversion Tool. The BGVD contains information on genomic variations comprising ~60.44 M SNPs, ~6.86 M indels, 76,634 CNV regions and signatures of selective sweeps in 432 samples from modern cattle worldwide. Users can quickly retrieve distribution patterns of these variations for 54 cattle breeds through an interactive source of breed origin map using a given gene symbol or genomic region for any of the three versions of the bovine reference genomes (ARS-UCD1.2, UMD3.1.1, and Btau 5.0.1). Signals of selection are displayed as Manhattan plots and Genome Browser tracks. To further investigate and visualize the relationships between variants and signatures of selection, the Genome Browser integrates all variations, selection data and resources from NCBI, the UCSC Genome Browser and AnimalQTLdb. Collectively, all these features make the BGVD a useful archive for in-depth data mining and analyses of cattle biology and cattle breeding on a global scale.

4135: Candidate SNP analyses integrated with mRNA expression and hormone levels reveal influence on mammographic density and breast cancer risk

Posted to bioRxiv 02 Feb 2018

304 downloads genomics

M. Biong, M. Suderman, VD. Haakensen, B. Kulle, PR. Berg, I.T. Gram, V. Dumeaux, G. Ursin, Å Helland, M. H Hallett, AL Børresen-Dale, V.N. Kristensen

Background: Mammographic density (MD) is a well-known risk factor for breast cancer. Genetic factors may account for as much as 30-60% of variation in MD, but the specific genes responsible for MD remain largely unknown. In the current study, we use a candidate gene approach to identify genes with a putative effect on MD. Genotypic profiles of single nucleotide polymorphisms (SNPs) within these genes were obtained and tested for association with MD. In addition, expression profiles and hormone data were used to further investigate these associations. Methods: We have analyzed 257 SNPs in 165 genes in two sample materials (n=454) using a discovery and verification approach in order to identify SNP markers of MD. Hormone and, when available, gene expression levels obtained from biopsies taken from breasts with varying density were also included in the analyses in order to investigate the functional role of the identified genetic factors in MD. Results: We identified 28 SNPs associated with MD in both datasets, ten of which have a p-value ≤ 0.05. Of these ten, seven are associated in cis (p ≤ 0.05) with mRNA expression levels measured from breast biopsies, of which four are directly involved in the signalling, metabolism and regulation of estradiol. Conclusion: SNPs residing in genes belonging to the estradiol-signaling pathway were found associated with MD in two cohorts of Norwegian postmenopausal women. Coupled with gene expression, these results aid in the understanding of the molecular signature in mammographically dense breasts.

4136: Draft Genome Sequence of the Asian Pear Scab Pathogen, Venturia nashicola

Posted to bioRxiv 22 Jun 2018

304 downloads genomics

Shakira Johnson, Dan Jones, Amali H. Thrimawithana, Cecilia H Deng, Joanna K. Bowen, Carl H. Mesarich, Hideo Ishii, Kyungho Won, Vincent G.M. Bus, Kim M. Plummer

Venturia nashicola, which causes scab disease of Asian pear, is a host-specific, biotrophic fungus, with a sexual stage that occurs during saprobic growth. V. nashicola is endemic to Asia and is regarded as a quarantine threat to Asian pear production outside of this continent. Currently, fungicide applications are routinely used to control scab disease. However, fungicide resistance in V. nashicola, as in other fungal pathogens, is an ongoing challenge and alternative control or prevention measures that include, for example, the deployment of durable host resistance, are required. A close relative of V. nashicola, V. pirina, causes scab disease of European pear. European pear displays non-host resistance (NHR) to V. nashicola and Asian pears are non-hosts of V. pirina. It is anticipated that the host specificity of these two fungi is governed by differences in their effector arsenals, with a subset responsible for activating NHR. The Pyrus-Venturia pathosystems provide a unique opportunity to dissect the underlying genetics of non-host interactions and to understand coevolution in relation to this potentially more durable form of resistance. Here, we present the first V. nashicola draft whole genome sequence (WGS), which is made up of 40,800 scaffolds (totalling 45 Mb) and 11,094 predicted genes. Of these genes, 1,232 are predicted to encode a secreted protein by SignalP, with 273 of these predicted to be effectors by EffectorP. The V. nashicola WGS will enable comparison to the WGSs of other Venturia spp. to identify effectors that potentially activate NHR in the pear scab pathosystems.

4137: The common origin of symmetry and structure in genetic sequences

Posted to bioRxiv 06 Oct 2017

304 downloads genomics

G. Cristadoro, M. Degli Esposti, E.G. Altmann

When exploring statistical properties of genetic sequences two main features stand out: the existence of non-random structures at various scales (e.g., long-range correlations) and the presence of symmetries (e.g., Chargaff parity rules). In the last decades, numerous studies investigated the origin and significance of each of these features separately. Here we show that both symmetry and structure have to be considered as the outcome of the same biological processes, whose cumulative effect can be quantitatively measured on extant genomes. We present a novel analysis (based on a minimal model) that not only explains and reproduces previous observations but also predicts the existence of a nested hierarchy of symmetries emerging at different structural scales. Our genome-wide analysis of H. sapiens confirms the theoretical predictions.
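
As a rough illustration of the symmetry mentioned above, Chargaff's second parity rule says that, within a single strand, a short word and its reverse complement occur at nearly equal frequencies. The sketch below is not the authors' analysis or their minimal model; `parity_deviation` is a hypothetical helper, and the score it returns (0 for perfect symmetry, 1 for none) is an assumption chosen for readability.

```python
from collections import Counter

# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def kmer_counts(seq: str, k: int) -> Counter:
    """Count all overlapping words of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def parity_deviation(seq: str, k: int) -> float:
    """Score how far seq is from k-mer reverse-complement symmetry.

    Returns sum |n(w) - n(rc(w))| / sum (n(w) + n(rc(w))) over every
    observed word w and its reverse complement; the value lies in [0, 1].
    """
    counts = kmer_counts(seq, k)
    words = set(counts) | {reverse_complement(w) for w in counts}
    diff = sum(abs(counts[w] - counts[reverse_complement(w)]) for w in words)
    total = sum(counts[w] + counts[reverse_complement(w)] for w in words)
    return diff / total if total else 0.0


if __name__ == "__main__":
    strand = "ACGGTTAC"
    # A strand joined to its own reverse complement is symmetric by construction.
    symmetric = strand + reverse_complement(strand)
    print(parity_deviation(symmetric, 2))
    # A homopolymer has no reverse-complement symmetry at all.
    print(parity_deviation("AAAA", 1))
```

Running the same score for increasing `k` on a real chromosome would show at which word lengths the symmetry starts to break down, which is the kind of scale dependence the abstract refers to.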

4138: Correlation between the aberrant human testicular germ-cell gene expression and disruption of spermatogenesis leading to male infertility

Posted to bioRxiv 17 Aug 2018

304 downloads genomics

Arka Baksi, Ruchi Jain, Ravi Manjithaya, S S Vasan, Paturu Kondaiah, Rajan R Dighe

Spermatogenesis is characterized by sequential gene expression at precise stages in the progression of differentiation of the germ cells. Any alteration in expression of the critical genes is responsible for arrest of spermatogenesis associated with infertility. In spite of advances, the differential gene expression accompanying spermatogenesis, the corresponding regulatory mechanisms and their correlation to human infertility have not been clearly established. This study aims to identify the gene expression pattern of the human testicular germ cells from patients with either obstructive azoospermia with complete intra-testicular spermatogenesis or non-obstructive azoospermia with spermatogenesis arrested at different stages and correlate the same to infertility. The testicular transcriptomes of 3 OA and 8 NOA patients and pooled testicular RNA (commercial source) were analyzed for their differential gene expression to identify potential regulators of spermatogenesis and the results were further validated in all of the 44 patients clinically diagnosed with azoospermia undergoing sperm retrieval surgery over the study period and 4 control samples included in this study. Analyses of the differential transcriptome led to identification of genes enriched in a specific testicular cell type and subsequently, several regulators of the diploid-double-diploid-haploid transitions in human spermatogenesis were identified. Perturbations in the expression of these genes were identified as the potential causes of the spermatogenic arrest seen in azoospermia and thus the potential mediators of human male infertility. Another interesting observation was the increased autophagy in the testes of patients with non-obstructive azoospermia. The present study suggests that the regulation of the diploid-double-diploid-haploid transition is multigenic with the tandem alteration of several genes resulting in infertility.
In conclusion, this study identified some of the genetic regulators controlling spermatogenesis using comparative transcriptome analyses of testicular tissues from azoospermic individuals and showed how alterations in several genes result in disruption of spermatogenesis and subsequent infertility. This study also provides interesting insights into the gene expression patterns of the Indian population that were not available earlier.

4139: β-cell dedifferentiation is associated with epithelial-mesenchymal transition triggered by miR-7-mediated repression of mSwi/Snf complex

Posted to bioRxiv 01 Oct 2019

304 downloads genomics

Tracy CS Mak, Yorrick von Ohlen, Yi Fang Wang, Eva Kane, Kaste Jurgaityte, Pedro Ervilha, Pauline Chabosseau, Walter Distaso, Victoria Salem, Alejandra Tomas, Markus Stoffel, Piero Marchetti, AM James Shapiro, Guy A. Rutter, Mathieu Latreille

β-cell dedifferentiation has been revealed as a pathological mechanism underlying pancreatic dysfunction in diabetes. However, little is known about the genetic and epigenetic changes linked with the dedifferentiation of β-cells. We now report that β-cell dedifferentiation is associated with epithelial to mesenchymal transition (EMT) triggered by miR-7-mediated repression of Smarca4/Brg1 expression, a catalytic subunit of the mSwi/Snf chromatin remodeling complexes essential for β-cell transcription factors (β-TFs) activity. miR-7-mediated repression of Brg1 expression in diabetes causes an overall compaction of chromatin structure preventing β-TFs from accessing and transactivating genes maintaining the functional and epithelial identity of β-cells. Concomitantly, loss of β-cell identity impairs the ability of β-TFs Pdx1, Nkx6-1, Neurod1 to repress non-β-cell genes enriched selectively in mesenchymal cells leading to EMT, change in islet microenvironment, and fibrosis. Remarkably, anti-EMT agents normalized glucose tolerance of diabetic mice, thus revealing mesenchymal reprogramming of β-cells as a novel therapeutic target in diabetes. This study sheds light on the genetic signature of dedifferentiated β-cells and highlights how loss of mSwi/Snf activity in diabetes initiates a step-wise remodeling of epigenetic landscapes of β-cells leading to the induction of an EMT process reminiscent of a response to tissue injury.

4140: MIP-MAP: High-Throughput Mapping of Caenorhabditis elegans Temperature Sensitive Mutants via Molecular Inversion Probes

Posted to bioRxiv 22 Jun 2017

304 downloads genomics

CA Mok, V Au, OA Thompson, ML Edgley, L Gevirtzman, J Yochem, J Lowry, N Memar, M Wallenfang, D Rasoloson, B Bowerman, R Schnabel, G Seydoux, DG Moerman, RH Waterston

Temperature sensitive (TS) alleles are important tools for the genetic and functional analysis of essential genes in many model organisms. While isolating TS alleles is not difficult, determining the TS-conferring mutation can be problematic. Even with whole-genome sequencing (WGS) data there is a paucity of predictive methods for identifying TS alleles from DNA sequence alone. We assembled 173 TS lethal mutants of Caenorhabditis elegans and used WGS to identify several hundred mutations per strain. We leveraged single molecule molecular inversion probes (MIPs) to sequence variant sites at high depth in the cross-progeny of TS mutants and a mapping strain with identified sequence variants but no apparent phenotypic differences from the reference N2 strain. By sampling for variants at ~1Mb intervals across the genome we genetically mapped mutant alleles at a resolution comparable to current standards in a process we call MIP-MAP. The MIP-MAP protocol, however, permits high-throughput sequencing of multiple TS mutation mapping libraries at less than 200K reads per library. Using MIP-MAP on a subset of TS mutants, via a competitive selection assay and standard recombinant mutant selection, we defined TS-associated intervals of 3Mb or less. Our results suggest this collection of strains contains a diverse library of TS alleles for genes involved in development and reproduction. MIP-MAP is a robust method to genetically map mutations in both viable and essential genes. The MIPs protocol should allow high-throughput tracking of genetic variants in any mixed population.
