Publications

Publications by LIAAD

2013

Boosting the Detection of Transposable Elements Using Machine Learning

Authors
Loureiro, T; Camacho, R; Vieira, J; Fonseca, NA;

Publication
Advances in Intelligent Systems and Computing

Abstract
Transposable Elements (TE) are sequences of DNA that move and transpose within a genome. TEs, as mutation agents, are quite important for their role in both genome alteration diseases and on species evolution. Several tools have been developed to discover and annotate TEs but no single one achieves good results on all different types of TEs. In this paper we evaluate the performance of several TEs detection and annotation tools and investigate if Machine Learning techniques can be used to improve their overall detection accuracy. The results of an in silico evaluation of TEs detection and annotation tools indicate that their performance can be improved by using machine learning classifiers. © Springer International Publishing Switzerland 2013.

2013

Improving the performance of Transposable Elements detection tools

Authors
Loureiro, T; Camacho, R; Vieira, J; Fonseca, NA;

Publication
J. Integrative Bioinformatics

Abstract
Transposable Elements (TE) are sequences of DNA that move and transpose within a genome. TEs, as mutation agents, are quite important for their role in both genome alteration diseases and on species evolution. Several tools have been developed to discover and annotate TEs but no single tool achieves good results on all different types of TEs. In this paper we evaluate the performance of several TEs detection and annotation tools and investigate if Machine Learning techniques can be used to improve their overall detection accuracy. The results of an in silico evaluation of TEs detection and annotation tools indicate that their performance can be improved by using machine learning constructed classifiers.

2013

Drosophila americana as a Model Species for Comparative Studies on the Molecular Basis of Phenotypic Variation

Authors
Fonseca, NA; Morales Hojas, R; Reis, M; Rocha, H; Vieira, CP; Nolte, V; Schloetterer, C; Vieira, J;

Publication
GENOME BIOLOGY AND EVOLUTION

Abstract
Understanding the molecular basis of within and between species phenotypic variation is one of the main goals of Biology. In Drosophila, most of the work regarding this issue has been performed in D. melanogaster, but other distantly related species must also be studied to verify the generality of the findings obtained for this species. Here, we make the case for D. americana, a species of the virilis group of Drosophila that has been diverging from the model species, D. melanogaster, for approximately 40 Myr. To determine the suitability of this species for such studies, polymorphism and recombination estimates are presented for D. americana based on the largest nucleotide sequence polymorphism data set so far analyzed (more than 100 data sets) for this species. The polymorphism estimates are also compared with those obtained from the comparison of the genome assembly of two D. americana strains (H5 and W11) here reported. As an example of the general utility of these resources, we perform a preliminary study on the molecular basis of lifespan differences in D. americana. First, we show that there are lifespan differences between D. americana populations from different regions of the distribution range. Then, we perform five F2 association experiments using markers for 21 candidate genes previously identified in D. melanogaster. Significant associations are found between polymorphism at two genes (hep and Lim3) and lifespan. For the F2 association study involving the two sequenced strains (H5 and W11), we identify amino acid differences at Lim3 and Hep that could be responsible for the observed changes in lifespan. For both genes, no large gene expression differences were observed between the two strains.

2013

Assemblathon 2: evaluating de novo methods of genome assembly in three vertebrate species

Authors
Bradnam, KR; Fass, JN; Alexandrov, A; Baranay, P; Bechner, M; Birol, I; Boisvert, S; Chapman, JA; Chapuis, G; Chikhi, R; Chitsaz, H; Chou, WC; Corbeil, J; Del Fabbro, C; Docking, TR; Durbin, R; Earl, D; Emrich, S; Fedotov, P; Fonseca, NA; Ganapathy, G; Gibbs, RA; Gnerre, S; Godzaridis, E; Goldstein, S; Haimel, M; Hall, G; Haussler, D; Hiatt, JB; Ho, IY; Howard, J; Hunt, M; Jackman, SD; Jaffe, DB; Jarvis, ED; Jiang, H; Kazakov, S; Kersey, PJ; Kitzman, JO; Knight, JR; Koren, S; Lam, TW; Lavenier, D; Laviolette, F; Li, YR; Li, ZY; Liu, BH; Liu, Y; Luo, R; MacCallum, I; MacManes, MD; Maillet, N; Melnikov, S; Naquin, D; Ning, Z; Otto, TD; Paten, B; Paulo, OS; Phillippy, AM; Pina Martins, F; Place, M; Przybylski, D; Qin, X; Qu, C; Ribeiro, FJ; Richards, S; Rokhsar, DS; Ruby, JG; Scalabrin, S; Schatz, MC; Schwartz, DC; Sergushichev, A; Sharpe, T; Shaw, TI; Shendure, J; Shi, YJ; Simpson, JT; Song, H; Tsarev, F; Vezzi, F; Vicedomini, R; Vieira, BM; Wang, J; Worley, KC; Yin, SY; Yiu, SM; Yuan, JY; Zhang, GJ; Zhang, H; Zhou, S; Korf, IF;

Publication
GIGASCIENCE

Abstract
Background: The process of generating raw genome sequence data continues to become cheaper, faster, and more accurate. However, assembly of such data into high-quality, finished genome sequences remains challenging. Many genome assembly tools are available, but they differ greatly in terms of their performance (speed, scalability, hardware requirements, acceptance of newer read technologies) and in their final output (composition of assembled sequence). More importantly, it remains largely unclear how to best assess the quality of assembled genome sequences. The Assemblathon competitions are intended to assess current state-of-the-art methods in genome assembly. Results: In Assemblathon 2, we provided a variety of sequence data to be assembled for three vertebrate species (a bird, a fish, and a snake). This resulted in a total of 43 submitted assemblies from 21 participating teams. We evaluated these assemblies using a combination of optical map data, Fosmid sequences, and several statistical methods. From over 100 different metrics, we chose ten key measures by which to assess the overall quality of the assemblies. Conclusions: Many current genome assemblers produced useful assemblies, containing a significant representation of their genes and overall genome structure. However, the high degree of variability between the entries suggests that there is still much room for improvement in the field of genome assembly and that approaches which work well in assembling the genome of one species may not necessarily work well for another.

2013

The Drosophila melanogaster methuselah Gene: A Novel Gene with Ancient Functions

Authors
Araujo, AR; Reis, M; Rocha, H; Aguiar, B; Morales Hojas, R; Macedo Ribeiro, S; Fonseca, NA; Reboiro Jato, D; Reboiro Jato, M; Fdez Riverola, F; Vieira, CP; Vieira, J;

Publication
PLOS ONE

Abstract
The Drosophila melanogaster G protein-coupled receptor gene, methuselah (mth), has been described as a novel gene that is less than 10 million years old. Nevertheless, it shows a highly specific expression pattern in embryos, larvae, and adults, and has been implicated in larval development, stress resistance, and in the setting of adult lifespan, among others. Although mth belongs to a gene subfamily with 16 members in D. melanogaster, there is no evidence for functional redundancy in this subfamily. Therefore, it is surprising that a novel gene influences so many traits. Here, we explore the alternative hypothesis that mth is an old gene. Under this hypothesis, in species distantly related to D. melanogaster, there should be a gene with features similar to those of mth. By performing detailed phylogenetic, synteny, protein structure, and gene expression analyses we show that the D. virilis GJ12490 gene is the ortholog of mth in species distantly related to D. melanogaster. We also show that, in D. americana (a species of the virilis group of Drosophila), a common amino acid polymorphism at the GJ12490 orthologous gene is significantly associated with developmental time, size, and lifespan differences. Our results imply that GJ12490 orthologous genes are candidates for developmental time and lifespan differences in Drosophila in general.

2013

Patterns of evolution at the gametophytic self-incompatibility Sorbus aucuparia (Pyrinae) S pollen genes support the non-self recognition by multiple factors model

Authors
Aguiar, B; Vieira, J; Cunha, AE; Fonseca, NA; Reboiro Jato, D; Reboiro Jato, M; Fdez Riverola, F; Raspe, O; Vieira, CP;

Publication
JOURNAL OF EXPERIMENTAL BOTANY

Abstract
S-RNase-based gametophytic self-incompatibility evolved once before the split of the Asteridae and Rosidae. In Prunus (tribe Amygdaloideae of Rosaceae), the self-incompatibility S-pollen is a single F-box gene that presents the expected evolutionary signatures. In Malus and Pyrus (subtribe Pyrinae of Rosaceae), however, clusters of F-box genes (called SFBBs) have been described that are expressed in pollen only and are linked to the S-RNase gene. Although polymorphic, SFBB genes present levels of diversity lower than those of the S-RNase gene. They have been suggested as putative S-pollen genes, in a system of non-self recognition by multiple factors. Subsets of allelic products of the different SFBB genes interact with non-self S-RNases, marking them for degradation, and allowing compatible pollinations. This study performed a detailed characterization of SFBB genes in Sorbus aucuparia (Pyrinae) to address three predictions of the non-self recognition by multiple factors model. As predicted, the number of SFBB genes was large to account for the many S-RNase specificities. Secondly, like the S-RNase gene, the SFBB genes were old. Thirdly, amino acids under positive selection (those that could be involved in specificity determination) were identified when intra-haplotype SFBB genes were analysed using codon models. Overall, the findings reported here support the non-self recognition by multiple factors model.
