Publications

Publications by Pedro Gabriel Ferreira

2005

Protein sequence classification through relevant sequence mining and Bayes Classifiers

Authors
Ferreira, PG; Azevedo, PJ;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS

Abstract
We tackle the problem of sequence classification using relevant subsequences found in a dataset of labelled protein sequences. A subsequence is relevant if it is frequent and has a minimal length. For each query sequence a vector of features is obtained. The features consist of the number and average length of the relevant subsequences shared with each of the protein families. Classification is performed by combining these features in a Bayes Classifier. The combination of these characteristics results in a multi-class and multi-domain method that requires neither data transformation nor background knowledge. We illustrate the performance of our method using three collections of protein datasets. The tests showed that the method performs on par with state-of-the-art methods in protein classification.
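The feature vector described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the toy family pattern sets, and the choice to treat a subsequence as a contiguous substring are all assumptions made for illustration, and the Bayes Classifier step is left out.

```python
from typing import Dict, Set, Tuple


def shared_subsequence_features(
    query: str, family_patterns: Dict[str, Set[str]]
) -> Dict[str, Tuple[int, float]]:
    """For each family, return (count, average length) of the family's
    patterns that occur in the query as contiguous substrings.

    Assumption: "shared" means the pattern appears verbatim in the query.
    """
    features: Dict[str, Tuple[int, float]] = {}
    for family, patterns in family_patterns.items():
        shared = [p for p in patterns if p in query]
        count = len(shared)
        avg_len = sum(len(p) for p in shared) / count if count else 0.0
        features[family] = (count, avg_len)
    return features


# Toy, made-up pattern sets; real ones would come from a mining step.
family_patterns = {
    "family_A": {"ACD", "KLM", "GHIK"},
    "family_B": {"WWY", "CDE"},
}
features = shared_subsequence_features("ACDEFGHIKLM", family_patterns)
```

Each family contributes two numbers per query, so a classifier built on top would see a vector of length twice the number of families.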

2005

Protein sequence pattern mining with constraints

Authors
Ferreira, PG; Azevedo, PJ;

Publication
KNOWLEDGE DISCOVERY IN DATABASES: PKDD 2005

Abstract
Considering the characteristics of biological sequence databases, which typically have a small alphabet, very long sequences and a relatively small size (several hundred sequences), we propose a new sequence mining algorithm (gIL). gIL was developed for linear sequence pattern mining and results from the combination of some of the most efficient techniques used in sequence and itemset mining. The algorithm is highly adaptable, allowing a smooth and direct introduction of various types of features into the mining process, namely the extraction of rigid and arbitrary gap patterns. Both breadth-first and depth-first traversals are possible. The experimental evaluation, on synthetic and real-life protein databases, has shown that our algorithm has superior performance to state-of-the-art algorithms. The use of constraints has also proved to be a very useful tool for specifying patterns of interest to the user.
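The distinction between rigid and arbitrary gap patterns mentioned in this abstract can be sketched with regular expressions. This is not gIL itself, only an illustration of what the two pattern types match and how their support could be counted. The helper names, the gap encoding as (min, max) pairs, and the toy sequences are assumptions.

```python
import re
from typing import List, Tuple


def gap_pattern_regex(symbols: str, gaps: List[Tuple[int, int]]) -> re.Pattern:
    """Build a regex for symbols separated by gaps of (min, max) residues.

    A rigid gap has min == max; an arbitrary gap has min < max.
    """
    if not symbols or len(gaps) != len(symbols) - 1:
        raise ValueError("need one gap between each pair of adjacent symbols")
    parts = [re.escape(symbols[0])]
    for (lo, hi), sym in zip(gaps, symbols[1:]):
        parts.append(".{%d,%d}" % (lo, hi))  # any residues, lo to hi of them
        parts.append(re.escape(sym))
    return re.compile("".join(parts))


def support(pattern: re.Pattern, sequences: List[str]) -> int:
    """Number of sequences that contain at least one match."""
    return sum(1 for s in sequences if pattern.search(s))


# Toy, made-up sequences.
sequences = ["MKACDEC", "ACC", "AXXC", "CCCA"]
rigid = gap_pattern_regex("AC", [(2, 2)])      # A, exactly two residues, C
flexible = gap_pattern_regex("AC", [(0, 3)])   # A, zero to three residues, C
rigid_support = support(rigid, sequences)
flexible_support = support(flexible, sequences)
```

A frequency constraint would then keep only patterns whose support reaches a user-chosen threshold.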

2007

A closer look on protein unfolding Simulations through hierarchical clustering

Authors
Ferreira, PG; Silva, CG; Brito, RMM; Azevedo, PJ;

Publication
2007 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology

Abstract
Understanding protein folding and unfolding mechanisms is a central problem in molecular biology. Data obtained from molecular dynamics unfolding simulations may provide valuable insights for a better understanding of these mechanisms. Here, we propose the application of an augmented version of hierarchical clustering analysis to detect clusters of amino-acid residues with similar behavior in protein unfolding simulations. The residues in each cluster show a similar global pattern of solvent accessible surface area (SASA) variation in unfolding simulations of the protein Transthyretin (TTR). Classical hierarchical clustering was applied to build a dendrogram based on the SASA variation of each amino-acid residue. The dendrogram was enriched with background information on the amino-acid residues, enabling the extraction of sub-clusters with well-differentiated characteristics.
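The classical clustering step in this abstract can be illustrated with a small agglomerative procedure over per-residue SASA profiles. This is a sketch under stated assumptions, not the authors' augmented method: average linkage, Euclidean distance, stopping at a fixed number of clusters instead of building a full dendrogram, and the residue names and values are all hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def agglomerate(profiles: Dict[str, List[float]], n_clusters: int) -> List[List[str]]:
    """Average-linkage agglomerative clustering, stopped at n_clusters."""
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    clusters = [[name] for name in profiles]

    def linkage(c1: List[str], c2: List[str]) -> float:
        # Mean pairwise distance between members of the two clusters.
        total = sum(euclidean(profiles[a], profiles[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Merge the closest pair of clusters.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return [sorted(c) for c in clusters]


# Made-up SASA values over four simulation frames.
sasa = {
    "res_10": [0.1, 0.2, 0.8, 0.9],  # exposure rises during unfolding
    "res_11": [0.1, 0.3, 0.7, 0.9],
    "res_50": [0.9, 0.8, 0.9, 0.8],  # exposed throughout
    "res_51": [0.8, 0.9, 0.8, 0.9],
}
groups = agglomerate(sasa, n_clusters=2)
```

Recording each merge and its linkage distance, instead of stopping early, would yield the dendrogram that the abstract describes enriching with residue background information.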

2006

Query Driven Sequence Pattern Mining

Authors
Ferreira, PG; Azevedo, PJ;

Publication
XXI Simpósio Brasileiro de Banco de Dados, 16-20 de Outubro, Florianópolis, Santa Catarina, Brasil, Anais/Proceedings

Abstract

2009

Deterministic Motif Mining in Protein Databases

Authors
Ferreira, PG; Azevedo, PJ;

Publication
Database Technologies: Concepts, Methodologies, Tools, and Applications (4 Volumes)

Abstract

2005

A Hybrid Method for Discovering Distance-Enhanced Inter-Transactional Rules

Authors
Ferreira, PG; Alves, R; Azevedo, PJ; Belo, O;

Publication
Actas de las X Jornadas de Ingeniería del Software y Bases de Datos (JISBD 2005), September 14-16, 2005, Granada, Spain

Abstract
