Publications

Publications by Nuno Fonseca

2010

Predicting the Start of Protein alpha-Helices Using Machine Learning Algorithms

Authors
Camacho, R; Ferreira, R; Rosa, N; Guimaraes, V; Fonseca, NA; Costa, VS; de Sousa, M; Magalhaes, A;

Publication
ADVANCES IN BIOINFORMATICS

Abstract

2009

Comparative Study of Classification Algorithms Using Molecular Descriptors in Toxicological DataBases

Authors
Pereira, M; Costa, VS; Camacho, R; Fonseca, NA; Simoes, C; Brito, RMM;

Publication
ADVANCES IN BIOINFORMATICS AND COMPUTATIONAL BIOLOGY, PROCEEDINGS

Abstract
The rational development of new drugs is a complex and expensive process, comprising several steps. Typically, it starts by screening databases of small organic molecules for chemical structures with potential of binding to a target receptor and prioritizing the most promising ones. Only a few of these will be selected for biological evaluation and further refinement through chemical synthesis. Despite the accumulated knowledge by pharmaceutical companies that continually improve the process of finding new drugs, a myriad of factors affect the activity of putative candidate molecules in vivo and the propensity for causing adverse and toxic effects is recognized as the major hurdle behind the current "target-rich, lead-poor" scenario. In this study we evaluate the use of several Machine Learning algorithms to find useful rules to the elucidation and prediction of toxicity using ID and 2D molecular descriptors. The results indicate that: i) Machine Learning algorithms can effectively use ID molecular descriptors to construct accurate and simple models; ii) extending the set of descriptors to include 2D descriptors improve the accuracy of the models.

CloseRead Abstract

2009

Visually Guiding and Controlling the Search While Mining Chemical Structures

Authors
Pereira, M; Costa, VS; Camacho, R; Fonseca, NA;

Publication
DISTRIBUTED COMPUTING, ARTIFICIAL INTELLIGENCE, BIOINFORMATICS, SOFT COMPUTING, AND AMBIENT ASSISTED LIVING, PT II, PROCEEDINGS

Abstract
In this paper we present the work in progress on LogCHEM, an ILP based tool for discriminative interactive mining of chemical fragments. In particular, we describe the integration with a molecule visualisation software that allows the chemist to graphically control the search for interesting patterns in chemical fragments. Furthermore, we show how structured information, such as rings, functional groups like carboxyl, amine, methyl, ester, etc are integrated and exploited in LogCHEM.

CloseRead Abstract

2009

Partitional Clustering of Protein Sequences - An Inductive Logic Programming Approach

Authors
Fonseca, NA; Costa, VS; Camacho, R; Vieira, C; Vieira, J;

Publication
DISTRIBUTED COMPUTING, ARTIFICIAL INTELLIGENCE, BIOINFORMATICS, SOFT COMPUTING, AND AMBIENT ASSISTED LIVING, PT II, PROCEEDINGS

Abstract
We present a novel approach to cluster sets of protein sequences, based on Inductive Logic Programming (ILP). Preliminary results show that; the method proposed Produces understand able descriptions/explanations of the clusters. Furthermore, it can be used as a knowledge elicitation tool to explain clusters proposed by other clustering approaches, such as standard phylogenetic programs.

CloseRead Abstract

2008

k-RNN: k-Relational Nearest Neighbour Algorithm

Authors
Fonseca, NA; Costa, VS; Rocha, R; Camacho, R;

Publication
APPLIED COMPUTING 2008, VOLS 1-3

Abstract
The amount of data collected and stored in databases is growing considerably in almost all areas of human activity. In complex applications the data involves several relations and proposionalization is not a suitable approach. Multi-Relational Data Mining algorithms can analyze data from multiple relations, with no need to transform the data into a single table, but are computationally more expensive. In this paper a novel relational classification algorithm based on the k-nearest neighbour algorithm is presented and evaluated.

CloseRead Abstract

2007

Efficient and scalable induction of logic programs using a deductive database system

Authors
Ferreira, M; Fonseca, NA; Rocha, R; Scares, T;

Publication
Inductive Logic Programming

Abstract
A consequence of ILP systems being implemented in Prolog or using Prolog libraries is that, usually, these systems use a Prolog internal database to store and manipulate data. However, in real-world problems, the original data is rarely in Prolog format. In fact, the data is often kept in Relational Database Management Systems (RDBMS) and then converted to a format acceptable by the ILP system. Therefore, a more interesting approach is to link the ILP system to the RDBMS and manipulate the data without converting it. This scheme has the advantage of being more scalable since the whole data does not need to be loaded into memory by the ILP system. In this paper we study several approaches of coupling ILP systems with RDBMS systems and evaluate their impact on performance. We propose to use a Deductive Database (DDB) system to transparently translate the hypotheses to relational algebra expressions. The empirical evaluation performed shows that the execution time of ILP algorithms can be effectively reduced using a DDB and that the size of the problems can be increased due to a non-memory storage of the data.

CloseRead Abstract