2020
Authors
Teixeira, M; Martins, L; Fernandes, C; Chaves, C; Pinto, J; Tavares, F; Fonseca, NA;
Publication
Microbiology Resource Announcements
Abstract
We present the complete genome sequences of two Xanthomonas euroxanthea strains isolated from buds of a walnut tree. The whole-genome sequences of strains CPBF 367 and CPBF 426 consist of two circular chromosomes of 4,923,218 bp and 4,883,254 bp and two putative plasmids of 45,241 bp and 17,394 bp, respectively. These data may contribute to the understanding of Xanthomonas species-specific adaptations to walnut.
2020
Authors
Papatheodorou, I; Moreno, P; Manning, J; Fuentes, AMP; George, N; Fexova, S; Fonseca, NA; Fullgrabe, A; Green, M; Huang, N; Huerta, L; Iqbal, H; Jianu, M; Mohammed, S; Zhao, LY; Jarnuczak, AF; Jupp, S; Marioni, J; Meyer, K; Petryszak, R; Medina, CAP; Talavera Lopez, C; Teichmann, S; Vizcaino, JA; Brazma, A;
Publication
Nucleic Acids Research
Abstract
Expression Atlas is EMBL-EBI's resource for gene and protein expression. It sources and compiles data on the abundance and localisation of RNA and proteins in various biological systems and contexts and provides open access to this data for the research community. With the increased availability of single cell RNA-Seq datasets in the public archives, we have now extended Expression Atlas with a new added-value service to display gene expression in single cells. Single Cell Expression Atlas was launched in 2018 and currently includes 123 single cell RNA-Seq studies from 12 species. The website can be searched by genes within or across species to reveal experiments, tissues and cell types where this gene is expressed or under which conditions it is a marker gene. Within each study, cells can be visualized using a pre-calculated t-SNE plot and can be coloured by different features or by cell clusters based on gene expression. Within each experiment, there are links to downloadable files, such as RNA quantification matrices, clustering results, reports on protocols and associated metadata, such as assigned cell types.
2020
Authors
Calabrese, C; PCAWG Transcriptome Core Group; Davidson, NR; Demircioglu, D; Fonseca, NA; He, Y; Kahles, A; Lehmann, K; Liu, F; Shiraishi, Y; Soulette, CM; Urban, L; Greger, L; Li, S; Liu, D; Perry, MD; Xiang, Q; Zhang, F; Zhang, J; Bailey, P; Erkek, S; Hoadley, KA; Hou, Y; Huska, MR; Kilpinen, H; Korbel, JO; Marin, MG; Markowski, J; Nandi, T; Pan-Hammarström, Q; Pedamallu, CS; Siebert, R; Stark, SG; Su, H; Tan, P; Waszak, SM; Yung, C; Zhu, S; Awadalla, P; Creighton, CJ; Meyerson, M; Ouellette, BFF; Wu, K; Yang, H; Brazma, A; Brooks, AN; Göke, J; Rätsch, G; Schwarz, RF; Stegle, O; Zhang, Z; PCAWG Transcriptome Working Group; PCAWG Consortium;
Publication
Nature
Abstract
Transcript alterations often result from somatic changes in cancer genomes¹. Various forms of RNA alterations have been described in cancer, including overexpression², altered splicing³ and gene fusions⁴; however, it is difficult to attribute these to underlying genomic changes owing to heterogeneity among patients and tumour types, and the relatively small cohorts of patients for whom samples have been analysed by both transcriptome and whole-genome sequencing. Here we present, to our knowledge, the most comprehensive catalogue of cancer-associated gene alterations to date, obtained by characterizing tumour transcriptomes from 1,188 donors of the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA)⁵. Using matched whole-genome sequencing data, we associated several categories of RNA alterations with germline and somatic DNA alterations, and identified probable genetic mechanisms. Somatic copy-number alterations were the major drivers of variations in total gene and allele-specific expression. We identified 649 associations of somatic single-nucleotide variants with gene expression in cis, of which 68.4% involved associations with flanking non-coding regions of the gene. We found 1,900 splicing alterations associated with somatic mutations, including the formation of exons within introns in proximity to Alu elements. In addition, 82% of gene fusions were associated with structural variants, including 75 of a new class, termed ‘bridged’ fusions, in which a third genomic location bridges two genes. We observed transcriptomic alteration signatures that differ between cancer types and have associations with variations in DNA mutational signatures. This compendium of RNA alterations in the genomic context provides a rich resource for identifying genes and mechanisms that are functionally implicated in cancer. © 2020, The Author(s).
2020
Authors
Zhang, Y; Chen, F; Fonseca, NA; He, Y; Fujita, M; Nakagawa, H; Zhang, Z; Brazma, A; Creighton, CJ;
Publication
Nature Communications
Abstract
The impact of somatic structural variants (SVs) on gene expression in cancer is largely unknown. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole-genome sequencing data and RNA sequencing from a common set of 1220 cancer cases, we report hundreds of genes for which the presence within 100 kb of an SV breakpoint associates with altered expression. For the majority of these genes, expression increases rather than decreases with corresponding breakpoint events. Up-regulated cancer-associated genes impacted by this phenomenon include TERT, MDM2, CDK4, ERBB2, CD274, PDCD1LG2, and IGF2. TERT-associated breakpoints involve ~3% of cases, most frequently in liver biliary, melanoma, sarcoma, stomach, and kidney cancers. SVs associated with up-regulation of PD1 and PDL1 genes involve ~1% of non-amplified cases. For many genes, SVs are significantly associated with increased numbers or greater proximity of enhancer regulatory elements near the gene. DNA methylation near the promoter is often increased with nearby SV breakpoint, which may involve inactivation of repressor elements. © 2020, The Author(s).
2018
Authors
Tello Ruiz, MK; Naithani, S; Stein, JC; Gupta, P; Campbell, M; Olson, A; Wei, S; Preece, J; Geniza, MJ; Jiao, Y; Lee, YK; Wang, B; Mulvaney, J; Chougule, K; Elser, J; Al Bader, N; Kumari, S; Thomason, J; Kumar, V; Bolser, DM; Naamati, G; Tapanari, E; Fonseca, N; Huerta, L; Iqbal, H; Keays, M; Munoz Pomer Fuentes, A; Tang, A; Fabregat, A; D'Eustachio, P; Weiser, J; Stein, LD; Petryszak, R; Papatheodorou, I; Kersey, PJ; Lockhart, P; Taylor, C; Jaiswal, P; Ware, D;
Publication
Nucleic Acids Research
Abstract
Gramene (http://www.gramene.org) is a knowledgebase for comparative functional analysis in major crops and model plant species. The current release, #54, includes over 1.7 million genes from 44 reference genomes, most of which were organized into 62,367 gene families through orthologous and paralogous gene classification, whole-genome alignments, and synteny. Additional gene annotations include ontology-based protein structure and function; genetic, epigenetic, and phenotypic diversity; and pathway associations. Gramene's Plant Reactome provides a knowledgebase of cellular-level plant pathway networks. Specifically, it uses curated rice reference pathways to derive pathway projections for an additional 66 species based on gene orthology, and facilitates display of gene expression, gene-gene interactions, and user-defined omics data in the context of these pathways. As a community portal, Gramene integrates best-of-class software and infrastructure components including the Ensembl genome browser, Reactome pathway browser, and Expression Atlas widgets, and undergoes periodic data and software upgrades. Via powerful, intuitive search interfaces, users can easily query across various portals and interactively analyze search results by clicking on diverse features such as genomic context, highly augmented gene trees, gene expression anatomograms, associated pathways, and external informatics resources. All data in Gramene are accessible through both visual and programmatic interfaces. © Published by Oxford University Press on behalf of Nucleic Acids Research 2017.
2020
Authors
Rheinbay, E; PCAWG Drivers and Functional Interpretation Working Group; Nielsen, MM; Abascal, F; Wala, JA; Shapira, O; Tiao, G; Hornshøj, H; Hess, JM; Juul, RI; Lin, Z; Feuerbach, L; Sabarinathan, R; Madsen, T; Kim, J; Mularoni, L; Shuai, S; Lanzós, A; Herrmann, C; Maruvka, YE; Shen, C; Amin, SB; Bandopadhayay, P; Bertl, J; Boroevich, KA; Busanovich, J; Carlevaro-Fita, J; Chakravarty, D; Chan, CWY; Craft, D; Dhingra, P; Diamanti, K; Fonseca, NA; Gonzalez-Perez, A; Guo, Q; Hamilton, MP; Haradhvala, NJ; Hong, C; Isaev, K; Johnson, TA; Juul, M; Kahles, A; Kahraman, A; Kim, Y; Komorowski, J; Kumar, K; Kumar, S; Lee, D; Lehmann, K; Li, Y; Liu, EM; Lochovsky, L; Park, K; Pich, O; Roberts, ND; Saksena, G; Schumacher, SE; Sidiropoulos, N; Sieverling, L; Sinnott-Armstrong, N; Stewart, C; Tamborero, D; Tubio, JMC; Umer, HM; Uusküla-Reimand, L; Wadelius, C; Wadi, L; Yao, X; Zhang, C; Zhang, J; Haber, JE; Hobolth, A; Imielinski, M; Kellis, M; Lawrence, MS; von Mering, C; Nakagawa, H; Raphael, BJ; Rubin, MA; Sander, C; Stein, LD; Stuart, JM; Tsunoda, T; Wheeler, DA; Johnson, R; Reimand, J; Gerstein, M; Khurana, E; Campbell, PJ; López-Bigas, N; Weischenfeldt, J; Beroukhim, R; Martincorena, I; Pedersen, JS; Getz, G; PCAWG Structural Variation Working Group; PCAWG Consortium;
Publication
Nature
Abstract
The discovery of drivers of cancer has traditionally focused on protein-coding genes¹⁻⁴. Here we present analyses of driver point mutations and structural variants in non-coding regions across 2,658 genomes from the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium⁵ of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA). For point mutations, we developed a statistically rigorous strategy for combining significance levels from multiple methods of driver discovery that overcomes the limitations of individual methods. For structural variants, we present two methods of driver discovery, and identify regions that are significantly affected by recurrent breakpoints and recurrent somatic juxtapositions. Our analyses confirm previously reported drivers⁶,⁷, raise doubts about others and identify novel candidates, including point mutations in the 5' region of TP53, in the 3' untranslated regions of NFKBIZ and TOB1, focal deletions in BRD4 and rearrangements in the loci of AKR1C genes. We show that although point mutations and structural variants that drive cancer are less frequent in non-coding genes and regulatory sequences than in protein-coding genes, additional examples of these drivers will be found as more cancer genomes become available. © 2020, The Author(s).