Publications

Publications by LIAAD

2014

High-resolution mapping of transcriptional dynamics across tissue development reveals a stable mRNA-tRNA interface

Authors
Schmitt, BM; Rudolph, KLM; Karagianni, P; Fonseca, NA; White, RJ; Talianidis, L; Odom, DT; Marioni, JC; Kutter, C;

Publication
GENOME RESEARCH

Abstract
The genetic code is an abstraction of how mRNA codons and tRNA anticodons molecularly interact during protein synthesis; the stability and regulation of this interaction remain largely unexplored. Here, we characterized the expression of mRNA and tRNA genes quantitatively at multiple time points in two developing mouse tissues. We discovered that mRNA codon pools are highly stable over development and simply reflect the genomic background; in contrast, precise regulation of tRNA gene families is required to create the corresponding tRNA transcriptomes. The dynamic regulation of tRNA genes during development is controlled in order to generate an anticodon pool that closely corresponds to messenger RNAs. Thus, across development, the pools of mRNA codons and tRNA anticodons are invariant and highly correlated, revealing a stable molecular interaction interlocking transcription and translation.

2014

AND Parallelism for ILP: The APIS System

Authors
Camacho, R; Ramos, R; Fonseca, NA;

Publication
INDUCTIVE LOGIC PROGRAMMING: 23RD INTERNATIONAL CONFERENCE

Abstract
Inductive Logic Programming (ILP) is a well-known approach to Multi-Relational Data Mining. ILP systems may take a long time to analyze the data, mainly because the search (hypothesis) spaces are often very large and the evaluation of each hypothesis, which involves theorem proving, may be quite time consuming in some domains. To address these efficiency issues of ILP systems we propose the APIS (And ParallelISm for ILP) system, which uses results from Logic Programming AND-parallelism. The approach enables the partition of the search space into sub-spaces of two kinds: sub-spaces where clause evaluation requires theorem proving; and sub-spaces where clause evaluation is performed quite efficiently without resorting to a theorem prover. We have also defined a new type of redundancy (Coverage-equivalent redundancy) that enables the pruning of significant parts of the search space. The new type of pruning, together with the partition of the hypothesis space, considerably improved the performance of the APIS system. An empirical evaluation of the APIS system on standard ILP data sets shows considerable speedups without a loss of accuracy in the models constructed.

2014

Expression Atlas update - a database of gene and transcript expression from microarray- and sequencing-based functional genomics experiments

Authors
Petryszak, R; Burdett, T; Fiorelli, B; Fonseca, NA; Gonzalez Porta, M; Hastings, E; Huber, W; Jupp, S; Keays, M; Kryvych, N; McMurry, J; Marioni, JC; Malone, J; Megy, K; Rustici, G; Tang, AY; Taubert, J; Williams, E; Mannion, O; Parkinson, HE; Brazma, A;

Publication
NUCLEIC ACIDS RESEARCH

Abstract
Expression Atlas (http://www.ebi.ac.uk/gxa) is a value-added database providing information about gene, protein and splice variant expression in different cell types, organism parts, developmental stages, diseases and other biological and experimental conditions. The database consists of selected high-quality microarray and RNA-sequencing experiments from ArrayExpress that have been manually curated, annotated with Experimental Factor Ontology terms and processed using standardized microarray and RNA-sequencing analysis methods. The new version of Expression Atlas introduces the concept of 'baseline' expression, i.e. gene and splice variant abundance levels in healthy or untreated conditions, such as tissues or cell types. Differential gene expression data benefit from an in-depth curation of experimental intent, resulting in biologically meaningful 'contrasts', i.e. instances of differential pairwise comparisons between two sets of biological replicates. Other novel aspects of Expression Atlas are its strict quality control of raw experimental data, up-to-date RNA-sequencing analysis methods, expression data at the level of gene sets as well as genes, and a more powerful search interface designed to maximize the biological value provided to the user.

2014

RNA-Seq Gene Profiling - A Systematic Empirical Comparison

Authors
Fonseca, NA; Marioni, J; Brazma, A;

Publication
PLOS ONE

Abstract
Accurately quantifying gene expression levels is a key goal of experiments using RNA-sequencing to assay the transcriptome. This typically requires aligning the short reads generated to the genome or transcriptome before quantifying expression of pre-defined sets of genes. Differences in the alignment/quantification tools can have a major effect upon the expression levels found, with important consequences for biological interpretation. Here we address two main issues: do different analysis pipelines affect the gene expression levels inferred from RNA-seq data? And, how close are the expression levels inferred to the "true" expression levels? We evaluate fifty gene profiling pipelines in experimental and simulated data sets with different characteristics (e.g., read length and sequencing depth). In the absence of knowledge of the 'ground truth' in real RNAseq data sets, we used simulated data to assess the differences between the "true" expression and those reconstructed by the analysis pipelines. Even though this approach does not take into account all known biases present in RNAseq data, it still allows us to estimate the accuracy of the gene expression values inferred by different analysis pipelines. The results show that i) overall there is a high correlation between the expression levels inferred by the best pipelines and the true quantification values; ii) the error in the estimated gene expression values can vary considerably across genes; and iii) a small set of genes have expression estimates with consistently high error (across data sets and methods). Finally, although the mapping software is important, the quantification method makes a greater difference to the results.

2014

Long-range enhancers regulating Myc expression are required for normal facial morphogenesis

Authors
Uslu, VV; Petretich, M; Ruf, S; Langenfeld, K; Fonseca, NA; Marioni, JC; Spitz, F;

Publication
NATURE GENETICS

Abstract
Cleft lip with or without cleft palate (CL/P) is one of the most common congenital malformations observed in humans, with 1 occurrence in every 500-1,000 births(1,2). A 640-kb noncoding interval at 8q24 has been associated with increased risk of non-syndromic CL/P in humans(3-5), but the genes and pathways involved in this genetic susceptibility have remained elusive. Using a large series of rearrangements engineered over the syntenic mouse region, we show that this interval contains very remote cis-acting enhancers that control Myc expression in the developing face. Deletion of this interval leads to mild alteration of facial morphology in mice and, sporadically, to CL/P. At the molecular level, we identify misexpression of several downstream genes, highlighting combined impact on the craniofacial developmental network and the general metabolic capacity of cells contributing to the future upper lip. This dual molecular etiology may account for the prominent influence of variants in the 8q24 region on human facial dysmorphologies.

2014

iRAP - an integrated RNA-seq Analysis Pipeline

Authors
Fonseca, NA; Petryszak, R; Marioni, J; Brazma, A;

Publication

Abstract
RNA-sequencing (RNA-Seq) has become the technology of choice for whole-transcriptome profiling. However, processing the millions of sequence reads generated requires considerable bioinformatics skills and computational resources. At each step of the processing pipeline many tools are available, each with specific advantages and disadvantages. While using a specific combination of tools might be desirable, integrating the different tools can be time consuming, often due to specificities in the formats of input/output files required by the different programs. Here we present iRAP, an integrated RNA-seq analysis pipeline that allows the user to select and apply their preferred combination of existing tools for mapping reads, quantifying expression, and testing for differential expression. iRAP also includes multiple tools for gene set enrichment analysis and generates web-browsable reports of the results obtained in the different stages of the pipeline. Depending upon the application, iRAP can be used to quantify expression at the gene, exon or transcript level. iRAP is aimed at a broad group of users with basic bioinformatics training and requires little experience with the command line. Despite this, it also provides more advanced users with the ability to customise the options used by their chosen tools.
