Publications

Publications by LIAAD

2020

Butler enables rapid cloud-based analysis of thousands of human genomes

Authors
Yakneen, S; Waszak, SM; Gertz, M; Korbel, JO; Aminou, B; Bartolome, J; Boroevich, KA; Boyce, R; Brooks, AN; Buchanan, A; Buchhalter, I; Butler, AP; Byrne, NJ; Cafferkey, A; Campbell, PJ; Chen, ZH; Cho, S; Choi, W; Clapham, P; Davis Dusenbery, BN; De La Vega, FM; Demeulemeester, J; Dow, MT; Dursi, LJ; Eils, J; Eils, R; Ellrott, K; Farcas, C; Favero, F; Fayzullaev, N; Ferretti, V; Flicek, P; Fonseca, NA; Gelpi, JL; Getz, G; Gibson, B; Grossman, RL; Harismendy, O; Heath, AP; Heinold, MC; Hess, JM; Hofmann, O; Hong, JH; Hudson, TJ; Hutter, B; Hutter, CM; Hubschmann, D; Imoto, S; Ivkovic, S; Jeon, SH; Jiao, W; Jung, J; Kabbe, R; Kahles, A; Kerssemakers, JNA; Kim, HL; Kim, H; Kim, J; Kim, Y; Kleinheinz, K; Koscher, M; Koures, A; Kovacevic, M; Lawerenz, C; Leshchiner, I; Liu, J; Livitz, D; Mihaiescu, GL; Mijalkovic, S; Lazic, AM; Miyano, S; Miyoshi, N; Nahal Bose, HK; Nakagawa, H; Nastic, M; Newhouse, SJ; Nicholson, J; O'Connor, BD; Ocana, D; Ohi, K; Ohno Machado, L; Omberg, L; Ouellette, BFF; Paramasivam, N; Perry, MD; Pihl, TD; Prinz, M; Puiggros, M; Radovic, P; Raine, KM; Rheinbay, E; Rosenberg, M; Royo, R; Ratsch, G; Saksena, G; Schlesner, M; Shorser, SI; Short, C; Sofia, HJ; Spring, J; Stein, LD; Struck, AJ; Tiao, G; Tijanic, N; Torrents, D; Van Loo, P; Vazquez, M; Vicente, D; Wala, JA; Wang, ZN; Weischenfeldt, J; Werner, J; Williams, A; Woo, Y; Wright, AJ; Xiang, Q; Yang, LM; Yuen, D; Yung, CK; Zhang, JJ;

Publication
Nature Biotechnology

Abstract
Efficient, large-scale genomic analysis is facilitated on the cloud by a computational tool with error-diagnosing and self-healing capabilities. We present Butler, a computational tool that facilitates large-scale genomic analyses on public and academic clouds. Butler includes innovative anomaly detection and self-healing functions that improve the efficiency of data processing and analysis by 43% compared with current approaches. Butler enabled processing of a 725-terabyte cancer genome dataset from the Pan-Cancer Analysis of Whole Genomes (PCAWG) project in a time-efficient and uniform manner.

2020

Cancer LncRNA Census reveals evidence for deep functional conservation of long noncoding RNAs in tumorigenesis

Authors
Carlevaro Fita, J; Lanzós, A; Feuerbach, L; Hong, C; Mas Ponte, D; Pedersen, JS; Abascal, F; Amin, SB; Bader, GD; Barenboim, J; Beroukhim, R; Bertl, J; Boroevich, KA; Brunak, S; Campbell, PJ; Carlevaro Fita, J; Chakravarty, D; Chan, CWY; Chen, K; Choi, JK; Deu Pons, J; Dhingra, P; Diamanti, K; Feuerbach, L; Fink, JL; Fonseca, NA; Frigola, J; Gambacorti Passerini, C; Garsed, DW; Gerstein, M; Getz, G; Gonzalez Perez, A; Guo, Q; Gut, IG; Haan, D; Hamilton, MP; Haradhvala, NJ; Harmanci, AO; Helmy, M; Herrmann, C; Hess, JM; Hobolth, A; Hodzic, E; Hong, C; Hornshøj, H; Isaev, K; Izarzugaza, JMG; Johnson, R; Johnson, TA; Juul, M; Juul, RI; Kahles, A; Kahraman, A; Kellis, M; Khurana, E; Kim, J; Kim, JK; Kim, Y; Komorowski, J; Korbel, JO; Kumar, S; Lanzós, A; Larsson, E; Lawrence, MS; Lee, D; Lehmann, KV; Li, S; Li, X; Lin, Z; Liu, EM; Lochovsky, L; Lou, S; Madsen, T; Marchal, K; Martincorena, I; Martinez Fundichely, A; Maruvka, YE; McGillivray, PD; Meyerson, W; Muiños, F; Mularoni, L; Nakagawa, H; Nielsen, MM; Paczkowska, M; Park, K; Park, K; Pedersen, JS; Pich, O; Pons, T; Pulido Tamayo, S; Raphael, BJ; Reimand, J; Reyes Salazar, I; Reyna, MA; Rheinbay, E; Rubin, MA; Rubio Perez, C; Sabarinathan, R; Sahinalp, SC; Saksena, G; Salichos, L; Sander, C; Schumacher, SE; Shackleton, M; Shapira, O; Shen, C; Shrestha, R; Shuai, S; Sidiropoulos, N; Sieverling, L; Sinnott Armstrong, N; Stein, LD; Stuart, JM; Tamborero, D; Tiao, G; Tsunoda, T; Umer, HM; Uusküla Reimand, L; Valencia, A; Vazquez, M; Verbeke, LPC; Wadelius, C; Wadi, L; Wang, J; Warrell, J; Waszak, SM; Weischenfeldt, J; Wheeler, DA; Wu, G; Yu, J; Zhang, J; Zhang, X; Zhang, Y; Zhao, Z; Zou, L; von Mering, C; Johnson, R;

Publication
Communications Biology

Abstract
Joana Carlevaro-Fita, Andres Lanzos et al. present the Cancer LncRNA Census (CLC), a manually curated dataset of 122 long noncoding RNAs (lncRNAs) with experimentally-validated functions in cancer based on data from the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium. CLC lncRNAs have unique gene features, and a number display evidence for cancer-driving functions that are conserved from humans to mice. Long non-coding RNAs (lncRNAs) are a growing focus of cancer genomics studies, creating the need for a resource of lncRNAs with validated cancer roles. Furthermore, it remains debated whether mutated lncRNAs can drive tumorigenesis, and whether such functions could be conserved during evolution. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, we introduce the Cancer LncRNA Census (CLC), a compilation of 122 GENCODE lncRNAs with causal roles in cancer phenotypes. In contrast to existing databases, CLC requires strong functional or genetic evidence. CLC genes are enriched amongst driver genes predicted from somatic mutations, and display characteristic genomic features. Strikingly, CLC genes are enriched for driver mutations from unbiased, genome-wide transposon-mutagenesis screens in mice. We identified 10 tumour-causing mutations in orthologues of 8 lncRNAs, including LINC-PINT and NEAT1, but not MALAT1. Thus CLC represents a dataset of high-confidence cancer lncRNAs. Mutagenesis maps are a novel means for identifying deeply-conserved roles of lncRNAs in tumorigenesis.

2020

Combined burden and functional impact tests for cancer driver discovery using DriverPower

Authors
Shuai, S; Abascal, F; Amin, SB; Bader, GD; Bandopadhayay, P; Barenboim, J; Beroukhim, R; Bertl, J; Boroevich, KA; Brunak, S; Campbell, PJ; Carlevaro Fita, J; Chakravarty, D; Chan, CWY; Chen, K; Choi, JK; Deu Pons, J; Dhingra, P; Diamanti, K; Feuerbach, L; Fink, JL; Fonseca, NA; Frigola, J; Gambacorti Passerini, C; Garsed, DW; Gerstein, M; Getz, G; Guo, Q; Gut, IG; Haan, D; Hamilton, MP; Haradhvala, NJ; Harmanci, AO; Helmy, M; Herrmann, C; Hess, JM; Hobolth, A; Hodzic, E; Hong, C; Hornshøj, H; Isaev, K; Izarzugaza, JMG; Johnson, R; Johnson, TA; Juul, M; Juul, RI; Kahles, A; Kahraman, A; Kellis, M; Khurana, E; Kim, J; Kim, JK; Kim, Y; Komorowski, J; Korbel, JO; Kumar, S; Lanzós, A; Larsson, E; Lawrence, MS; Lee, D; Lehmann, KV; Li, S; Li, X; Lin, Z; Liu, EM; Lochovsky, L; Lou, S; Madsen, T; Marchal, K; Martincorena, I; Martinez Fundichely, A; Maruvka, YE; McGillivray, PD; Meyerson, W; Muiños, F; Mularoni, L; Nakagawa, H; Nielsen, MM; Paczkowska, M; Park, K; Park, K; Pedersen, JS; Pons, T; Pulido Tamayo, S; Raphael, BJ; Reimand, J; Reyes Salazar, I; Reyna, MA; Rheinbay, E; Rubin, MA; Rubio Perez, C; Sahinalp, SC; Saksena, G; Salichos, L; Sander, C; Schumacher, SE; Shackleton, M; Shapira, O; Shen, C; Shrestha, R; Shuai, S; Sidiropoulos, N; Sieverling, L; Sinnott Armstrong, N; Stein, LD; Stuart, JM; Tamborero, D; Tiao, G; Tsunoda, T; Umer, HM; Uusküla Reimand, L; Valencia, A; Vazquez, M; Verbeke, LPC; Wadelius, C; Wadi, L; Wang, J; Warrell, J; Waszak, SM; Weischenfeldt, J; Wheeler, DA; Wu, G; Yu, J; Zhang, J; Zhang, X; Zhang, Y; Zhao, Z; Zou, L; von Mering, C; Gallinger, S; Stein, L;

Publication
Nature Communications

Abstract
The discovery of driver mutations is one of the key motivations for cancer genome sequencing. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2658 cancers across 38 tumour types, we describe DriverPower, a software package that uses mutational burden and functional impact evidence to identify driver mutations in coding and non-coding sites within cancer whole genomes. Using a total of 1373 genomic features derived from public sources, DriverPower’s background mutation model explains up to 93% of the regional variance in the mutation rate across multiple tumour types. By incorporating functional impact scores, we are able to further increase the accuracy of driver discovery. Testing across a collection of 2583 cancer genomes from the PCAWG project, DriverPower identifies 217 coding and 95 non-coding driver candidates. Comparing to six published methods used by the PCAWG Drivers and Functional Interpretation Working Group, DriverPower has the highest F1 score for both coding and non-coding driver discovery. This demonstrates that DriverPower is an effective framework for computational driver discovery. © 2020, The Author(s).

2020

Integrative pathway enrichment analysis of multivariate omics data

Authors
Paczkowska, M; Barenboim, J; Sintupisut, N; Fox, NS; Zhu, H; Abd Rabbo, D; Mee, MW; Boutros, PC; Abascal, F; Amin, SB; Bader, GD; Beroukhim, R; Bertl, J; Boroevich, KA; Brunak, S; Campbell, PJ; Carlevaro Fita, J; Chakravarty, D; Chan, CWY; Chen, K; Choi, JK; Deu Pons, J; Dhingra, P; Diamanti, K; Feuerbach, L; Fink, JL; Fonseca, NA; Frigola, J; Gambacorti Passerini, C; Garsed, DW; Gerstein, M; Getz, G; Gonzalez Perez, A; Guo, Q; Gut, IG; Haan, D; Hamilton, MP; Haradhvala, NJ; Harmanci, AO; Helmy, M; Herrmann, C; Hess, JM; Hobolth, A; Hodzic, E; Hong, C; Hornshøj, H; Isaev, K; Izarzugaza, JMG; Johnson, R; Johnson, TA; Juul, M; Juul, RI; Kahles, A; Kahraman, A; Kellis, M; Khurana, E; Kim, J; Kim, JK; Kim, Y; Komorowski, J; Korbel, JO; Kumar, S; Lanzós, A; Lawrence, MS; Lee, D; Lehmann, KV; Li, S; Li, X; Lin, Z; Liu, EM; Lochovsky, L; Lou, S; Madsen, T; Marchal, K; Martincorena, I; Martinez Fundichely, A; Maruvka, YE; McGillivray, PD; Meyerson, W; Muiños, F; Mularoni, L; Nakagawa, H; Nielsen, MM; Park, K; Park, K; Pedersen, JS; Pich, O; Pons, T; Pulido Tamayo, S; Raphael, BJ; Reyes Salazar, I; Reyna, MA; Rheinbay, E; Rubin, MA; Rubio Perez, C; Sabarinathan, R; Sahinalp, SC; Saksena, G; Salichos, L; Sander, C; Schumacher, SE; Shackleton, M; Shapira, O; Shen, C; Shrestha, R; Shuai, S; Sidiropoulos, N; Sieverling, L; Sinnott Armstrong, N; Stein, LD; Stuart, JM; Tamborero, D; Tiao, G; Tsunoda, T; Umer, HM; Uusküla Reimand, L; Valencia, A; Vazquez, M; Verbeke, LPC; Wadelius, C; Wadi, L; Wang, J; Warrell, J; Waszak, SM; Weischenfeldt, J; Wheeler, DA; Wu, G; Yu, J; Zhang, J; Zhang, X; Zhang, Y; Zhao, Z; Zou, L; von Mering, C; Reimand, J;

Publication
Nature Communications

Abstract
Multi-omics datasets represent distinct aspects of the central dogma of molecular biology. Such high-dimensional molecular profiles pose challenges to data interpretation and hypothesis generation. ActivePathways is an integrative method that discovers significantly enriched pathways across multiple datasets using statistical data fusion, rationalizes contributing evidence and highlights associated genes. As part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2658 cancers across 38 tumor types, we integrated genes with coding and non-coding mutations and revealed frequently mutated pathways and additional cancer genes with infrequent mutations. We also analyzed prognostic molecular pathways by integrating genomic and transcriptomic features of 1780 breast cancers and highlighted associations with immune response and anti-apoptotic signaling. Integration of ChIP-seq and RNA-seq data for master regulators of the Hippo pathway across normal human tissues identified processes of tissue regeneration and stem cell regulation. ActivePathways is a versatile method that improves systems-level understanding of cellular organization in health and disease through integration of multiple molecular datasets and pathway annotations. © 2020, The Author(s).

2020

Speeding up the detection of invasive aquatic species using environmental DNA and nanopore sequencing

Authors
Egeter, B; Veríssimo, J; Lopes-Lima, M; Chaves, C; Pinto, J; Riccardi, N; Beja, P; Fonseca, NA;

Publication

Abstract
Traditional detection of aquatic invasive species via morphological identification is often time-consuming and can require a high level of taxonomic expertise, leading to delayed mitigation responses. Environmental DNA (eDNA) detection approaches of multiple species using Illumina-based sequencing technology have been used to overcome these hindrances, but sample processing is often lengthy. More recently, portable nanopore sequencing technology has become available, which has the potential to make molecular detection of invasive species more widely accessible and to substantially decrease sample turnaround times. However, nanopore-sequenced reads have a much higher error rate than those produced by Illumina platforms, which has so far hindered the adoption of this technology. We provide a detailed laboratory protocol and bioinformatic tools to increase the reliability of nanopore sequencing to detect invasive species, and we test its application using invasive bivalves. We sampled water from sites with pre-existing bivalve occurrence and abundance data, and contrasting bivalve communities, in Italy and Portugal. We extracted, amplified and sequenced eDNA with a turnaround of 3.5 days. The majority of processed reads were ≥ 99% identical to reference sequences. There were no taxa detected other than those known to occur. The lack of detections of some species at some sites could be explained by their known low abundances. This is the first reported use of MinION to detect aquatic invasive species from eDNA samples. The approach can be easily adapted for other metabarcoding applications, such as biodiversity assessment, ecosystem health assessment and diet studies.

2020

Tumour gene expression signature in primary melanoma predicts long-term outcomes: A prospective multicentre study

Authors
Garg, M; Couturier, D; Nsengimana, J; Fonseca, NA; Wongchenko, M; Yan, Y; Lauss, M; Jönsson, GB; Newton-Bishop, J; Parkinson, C; Middleton, MR; Bishop, T; Corrie, P; Adams, DJ; Brazma, A; Rabbie, R;

Publication

Abstract
Purpose: Predicting outcomes after resection of primary melanoma remains crude, primarily based on tumour thickness. We explored gene expression signatures for their ability to better predict outcomes.
Methods: Differential expression analysis of 194 primary melanomas resected from patients who either developed distant metastasis (n=89) or did not (n=105) was performed. We identified 121 metastasis-associated genes that were included in our prognostic signature, “Cam_121”. Several machine learning classification models were trained using nested leave-one-out cross validation (LOOCV) to test the signature’s capacity to predict metastases, as well as regression models to predict survival. The prognostic accuracy was externally validated in two independent datasets.
Results: Cam_121 performed significantly better in predicting distant metastases than any of the models trained with the clinical covariates alone (pAccuracy=4.92×10⁻³), as well as those trained with two published prognostic signatures. Cam_121 expression score was strongly associated with progression-free survival (HR=1.7, p=3.44×10⁻⁶), overall survival (HR=1.73, p=7.71×10⁻⁶) and melanoma-specific survival (HR=1.59, p=0.02). Cam_121 expression score also negatively correlated with measures of immune cell infiltration (ρ=-0.73, p<2.2×10⁻¹⁶), with a higher score representing reduced tumour lymphocytic infiltration and a higher absolute 5-year risk of death in stage II melanoma.
Conclusions: The Cam_121 primary melanoma gene expression signature outperformed currently available alternatives in predicting the risk of distant recurrence. The signature confirmed (using unbiased approaches) the central prognostic importance of immune cell infiltration in long-term patient outcomes and could be used to identify stage II melanoma patients at highest risk of metastases and poor survival who might benefit most from adjuvant therapies.
Translational relevance: Predicting outcomes after resection of primary melanoma is currently based on traditional histopathological staging; however, survival outcomes within these disease stages vary markedly. Since adjuvant systemic therapies are now being used routinely, accurate prognostic information is needed to better risk-stratify patients and avoid unnecessary use of high-cost, potentially harmful drugs, as well as to inform future adjuvant strategies. The Cam_121 gene expression signature appears to have this capability and warrants evaluation in prospective clinical trials.
