Publications

Publications by Ricardo Rocha

2010

Threads and or-parallelism unified

Authors
Costa, VS; Dutra, I; Rocha, R;

Publication
THEORY AND PRACTICE OF LOGIC PROGRAMMING

Abstract
One of the main advantages of Logic Programming (LP) is that it provides an excellent framework for the parallel execution of programs. In this work we investigate novel techniques to efficiently exploit parallelism from real-world applications in low cost multi-core architectures. To achieve these goals, we revive and redesign the YapOr system to exploit or-parallelism based on a multi-threaded implementation. Our new approach takes full advantage of the state-of-the-art fast and optimized YAP Prolog engine and shares the underlying execution environment, scheduler and most of the data structures used to support YapOr's model. Initial experiments with our new approach consistently achieve almost linear speedups for most of the applications, proving itself as a good alternative for exploiting implicit parallelism in the currently available low cost multi-core architectures.

CloseRead Abstract

2012

An Efficient and Scalable Memory Allocator for Multithreaded Tabled Evaluation of Logic Programs

Authors
Areias, M; Rocha, R;

Publication
PROCEEDINGS OF THE 2012 IEEE 18TH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED SYSTEMS (ICPADS 2012)

Abstract
Despite the availability of both multithreading and tabling in some Prolog systems, the implementation of these two features, such that they work together, implies complex ties to one another and to the underlying engine. In recent work, we have proposed an approach to combine multithreading with tabling, implemented on top of the Yap Prolog system, whose primary goal was to reduce memory usage for the table space. Regarding the execution times, we observed some problems related to Yap's memory allocator, which is based on the operating system's default memory allocator, when running programs that allocate a higher number of data structures in the table space. In this paper, we propose a more efficient and scalable memory allocator for multithreaded tabled evaluation of logic programs. Our goal is to minimize the performance degradation that the system suffers when it is exposed to simultaneous memory requests made by multiple threads. For that, we propose a memory allocator based on local and global pages, to split memory among specific data structures and different threads, together with a strategy where data structures of the same type are pre-allocated within a page. Experimental results show that our new memory allocator can effectively reduce the execution time and scale better, when increasing the number of threads, than the original allocator.

CloseRead Abstract

2009

mtDNA GeneExtractor: A computer tool for mtDNA gene/region information extraction

Authors
Freitas, F; Oliveira, S; Rocha, R; Pereira, L;

Publication
MITOCHONDRION

Abstract
The analysis of considerable numbers of DNA sequences is largely dependent on the development of simple software tools for automatically process the genetic data deposited on public databases. However, there are some difficulties in the automation process due to diverse synonyms being used as qualifiers for genes and some inconsistencies in gene locations between related Primate species, this fact happening even in the carefully curated database RefSeq. Here, we present mtDNA GeneEXtractor, a Windows based computer tool developed for the extraction of information for particular gene/regions from mammal mitochondrial DNA sequences deposited under GenBank format. The tool was quite efficient in retrieving organized information for comparative mtDNA gene/region diversity analyses when tested for the evaluation of transition/transversion ratios in humans and between Primates. Taking phylogenetic information into account to avoid redundancy due to ancestry-sharing, the transition/transversion ratios in the 13 protein-coding genes had a mean value of 12.46 for Primates (from 6.46 in ND2 to 17.04 in COX1) and higher (34.74) but more heterogeneous (ranging from 17.30 in ND5 to 74.39 in ND4) in a worldwide human database. The similar patterns of transition/transversion ratios in all positions and in only four fold degenerate positions show no evidence for selection in the 13 mtDNA protein-coding genes.

CloseRead Abstract

2006

RepeatAround: A software tool for finding and visualizing repeats in circular genomes and its application to a human mtDNA database

Authors
Goios, A; Meirinhos, J; Rocha, R; Lopes, R; Amorim, A; Pereira, L;

Publication
MITOCHONDRION

Abstract
RepeatAround is a Windows based software tool designed to find "direct repeats", "inverted repeats", "mirror repeats" and "complementary repeats", from 3 to 64 bp length, in circular genomes. It processes input files directly extracted from GenBank database, providing visualisation of the repeats location in the genomic structure, so that for instance, in most mtDNAs the user can check if the repeats are located in coding or non-coding region (and in the first case in which gene), and how far apart the repeat pair(s) are. Besides the visual tool, it provides other outputs in a spreadsheet containing information on the number and location of the repeats, facilitating graphic analyses. Several genomes can be inputed simultaneously, for phylogenetic comparison purposes. Other capabilities of the software are the generation of random circular genomes, for statistical evaluation of comparison between observed repeats distributions with their shuffled counterparts, as well as the search for specific motifs, allowing an easy confirmation of repeats flanking a newly detected rearrangement. As an example of the programme's applications we analysed the Direct Repeats distribution in a large human mtDNA database. Results showed that Direct Repeats, even the larger ones, are evenly distributed among the human mtDNA haplogroups, enabling us to state that, based only on the repetitive motifs, no haplogroup is particularly more or less prone to mtDNA macrodeletions.

CloseRead Abstract

2009

The Diversity Present in 5140 Human Mitochondrial Genomes

Authors
Pereira, L; Freitas, F; Fernandes, V; Pereira, JB; Costa, MD; Costa, S; Maximo, V; Macaulay, V; Rocha, R; Samuels, DC;

Publication
AMERICAN JOURNAL OF HUMAN GENETICS

Abstract
We analyzed the current status (as of the end of August 2008) of human mitochondrial genomes deposited in GenBank, amounting to 5140 complete or coding-region sequences, in order to present an overall picture of the diversity present in the mitochondrial DNA of the global human population. To perform this task, we developed mtDNA-GeneSyn, a computer tool that identifies and exhaustedly classifies the diversity present in large genetic data sets. The diversity observed in the 5140 human mitochondrial genomes was compared with all possible transitions and transversions from the standard human mitochondrial reference genome. This comparison showed that tRNA and rRNA secondary structures have a large effect in limiting the diversity of the human mitochondrial sequences, whereas for the protein-coding genes there is a bias toward less variation at the second codon positions. The analysis of the observed amino acid variations showed a tolerance of variations that convert between the amino acids V, 1, A, M, and T. This defines a group of amino acids with similar chemical properties that can interconvert by a single transition.

CloseRead Abstract

2011

A Subterm-Based Global Trie for Tabled Evaluation of Logic Programs

Authors
Raimundo, J; Rocha, R;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE

Abstract
Tabling is an implementation technique that overcomes some limitations of traditional Prolog systems in dealing with redundant sub-computations and recursion. A critical component in the implementation of an efficient tabling system is the design of the table space. The most popular and successful data structure for representing tables is based on a two-level trie data structure, where one trie level stores the tabled subgoal calls and the other stores the computed answers. The Global Trie (GT) is an alternative table space organization designed with the intent to reduce the tables's memory usage, namely by storing terms in a global trie, thus preventing repeated representations of the same term in different trie data structures. In this paper, we propose an extension to the GT organization, named Global Trie for Subterms (GT-ST), where compound subterms in term arguments are represented as unique entries in the GT. Experimental results using the Yap Tab tabling system show that GT-ST support has potential to achieve significant reductions on memory usage, for programs with increasing compound subterms in term arguments, without compromising the execution time for other programs.

CloseRead Abstract