Publications

Publications by Carlos Ferreira

2012

Identifying Relationships in Transactional Data

Authors
Rodrigues, M; Gama, J; Ferreira, CA;

Publication
ADVANCES IN ARTIFICIAL INTELLIGENCE - IBERAMIA 2012

Abstract
Association rules are the traditional way to study market basket or transactional data. One drawback of this analysis is the huge number of rules generated. As a complement to association rules, the Association Rules Network (ARN), based on Social Network Analysis (SNA), has been proposed by several researchers. In this work we study a real market basket analysis problem from a Belgian supermarket using ARNs. We learn ARNs by considering the relationships between items that appear more often in the consequent of the association rules. Moreover, we propose a more compact variant of ARNs: the Maximal Itemsets Social Network. In order to assess the quality of these structures, we compute SNA-based metrics, such as weighted degree and utility of community.
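As a rough illustration of the idea in this abstract, the sketch below computes support and confidence for single-item rules and returns them as weighted directed edges, which is the kind of structure a rule network could be built from. It is not the ARN construction from the paper: the toy baskets, the function names, and the thresholds are all assumptions made for the example.

```python
from itertools import combinations

# Hypothetical toy baskets; not the Belgian supermarket data from the paper.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent) over the transactions."""
    joint = support(set(antecedent) | set(consequent), transactions)
    return joint / support(antecedent, transactions)

def rule_edges(transactions, min_support, min_confidence):
    """Single-item rules a -> b as directed edges (a, b, confidence)."""
    items = sorted(set().union(*transactions))
    edges = []
    for a, b in combinations(items, 2):
        for ante, cons in ((a, b), (b, a)):
            if support({ante, cons}, transactions) >= min_support:
                conf = confidence({ante}, {cons}, transactions)
                if conf >= min_confidence:
                    edges.append((ante, cons, conf))
    return edges

print(rule_edges(baskets, min_support=0.5, min_confidence=0.9))
```

Raising the thresholds shrinks the edge list, which is one simple way to see why the number of generated rules is a practical concern.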


2008

RUSE-WARMR: Rule Selection for Classifier Induction in Multi-Relational Data-Sets

Authors
Ferreira, CA; Gama, J; Costa, VS;

Publication
20TH IEEE INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, VOL 1, PROCEEDINGS

Abstract
One of the major challenges in knowledge discovery is how to extract meaningful and useful knowledge from the complex structured data that one finds in Scientific and Technological applications. One approach is to explore the logic relations in the database and, using, say, an Inductive Logic Programming (ILP) algorithm, find descriptive and expressive patterns. These patterns can then be used as features to characterize the target concept. The effectiveness of these algorithms depends both upon the algorithm we use to generate the patterns and upon the classifier. Rule mining provides an excellent framework for efficiently mining the interesting patterns that are relevant. We propose a novel method to select discriminative patterns and evaluate the effectiveness of this method on a complex discovery application of practical interest.


2012

Predictive sequence miner in ILP learning

Authors
Ferreira, CA; Gama, J; Santos Costa, V;

Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Abstract
This work presents an optimized version of XMuSer, an ILP-based framework suitable for exploring temporal patterns available in multi-relational databases. XMuSer's main idea consists of exploiting frequent sequence mining, an efficient method to learn temporal patterns in the form of sequences. The efficiency of the XMuSer framework is grounded on a new coding methodology for temporal data and on the use of a predictive sequence miner. The framework selects and maps the most interesting sequential patterns into a new table, the sequence relation. In the last step of our framework, we use an ILP algorithm to learn a classification theory on the enlarged relational database that consists of the original multi-relational database and the new sequence relation. We evaluate our framework by addressing three classification problems and mapping each of three different types of sequential patterns: frequent, closed or maximal. The experiments show that our ILP-based framework gains both from the descriptive power of the ILP algorithms and the efficiency of the sequential miners. © 2012 Springer-Verlag Berlin Heidelberg.


2010

Sequential Pattern Mining in Multi-relational Datasets

Authors
Ferreira, CA; Gama, J; Costa, VS;

Publication
CURRENT TOPICS IN ARTIFICIAL INTELLIGENCE

Abstract
We present a framework designed to mine sequential temporal patterns from multi-relational databases. In order to exploit logic-relational information without using aggregation methodologies, we convert the multi-relational dataset into what we name a multi-sequence database. Each example in a multi-relational target table is coded into a sequence that combines intra-table and inter-table relational temporal information. This allows us to find heterogeneous temporal patterns through standard sequence miners. Our framework is grounded in the excellent results achieved by previous propositionalization strategies. We follow a pipelined approach, where we first use a sequence miner to find frequent sequences in the multi-sequence database. Next, we select the most interesting findings to augment the representational space of the examples. The most interesting sequence patterns are discriminative and class correlated. In the final step we build a classifier model by taking an enlarged target table as input to a classifier algorithm. We evaluate the performance of this work through a motivating application, the hepatitis multi-relational dataset. We prove the effectiveness of our methodology by addressing two problems of the hepatitis dataset.


2012

Predicting Ramp Events with a Stream-Based HMM Framework

Authors
Ferreira, CA; Gama, J; Costa, VS; Miranda, V; Botterud, A;

Publication
Discovery Science - 15th International Conference, DS 2012, Lyon, France, October 29-31, 2012. Proceedings

Abstract
The motivation for this work is the study and prediction of wind ramp events occurring in a large-scale wind farm located in the US Midwest. In this paper we introduce the SHRED framework, a stream-based model that continuously learns a discrete HMM from wind power and wind speed measurements. We use a supervised learning algorithm to learn HMM parameters from discretized data, where ramp events are HMM states and discretized wind speed data are HMM observations. The discretization of the historical data is obtained by running the SAX algorithm over the first-order variations in the original signal. SHRED updates the HMM using the most recent historical data and includes a forgetting mechanism to model natural time dependence in wind patterns. To forecast ramp events we use recent wind speed forecasts and the Viterbi algorithm, which incrementally finds the most probable ramp event to occur. We compare the SHRED framework against a Persistence baseline in predicting ramp events occurring over short time horizons, ranging from 30 minutes to 90 minutes. SHRED consistently exhibits more accurate and cost-effective results than the baseline. © 2012 Springer-Verlag Berlin Heidelberg.
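The decoding step mentioned in this abstract can be pictured with a textbook Viterbi sketch over a discrete HMM, where hidden states stand for ramp events and observations stand for discretized wind speed symbols. This is not the SHRED implementation: the two states, the three symbols, and every probability below are invented for illustration, and the streaming update and forgetting mechanism are left out.

```python
import math

# Hypothetical toy model; all names and numbers are assumptions.
STATES = ["no_ramp", "ramp"]
START_P = {"no_ramp": 0.8, "ramp": 0.2}
TRANS_P = {
    "no_ramp": {"no_ramp": 0.9, "ramp": 0.1},
    "ramp": {"no_ramp": 0.3, "ramp": 0.7},
}
# Observations are discretized wind speed symbols, here "a", "b", "c".
EMIT_P = {
    "no_ramp": {"a": 0.7, "b": 0.25, "c": 0.05},
    "ramp": {"a": 0.1, "b": 0.3, "c": 0.6},
}

def _log(p):
    """Log-probability, mapping zero to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")

def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable hidden state sequence for the observations."""
    # Scores are kept in log space to avoid underflow on long sequences.
    best = [{s: _log(start_p[s]) + _log(emit_p[s][observations[0]])
             for s in states}]
    back = [{}]
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda r: best[-1][r] + _log(trans_p[r][s]))
            scores[s] = (best[-1][prev] + _log(trans_p[prev][s])
                         + _log(emit_p[s][obs]))
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the highest-scoring final state.
    state = max(states, key=lambda s: best[-1][s])
    path = [state]
    for pointers in reversed(back[1:]):
        state = pointers[state]
        path.append(state)
    return path[::-1]

print(viterbi(["a", "c", "c"], STATES, START_P, TRANS_P, EMIT_P))
```

With these toy parameters the decoded path for the symbols a, c, c is no_ramp, ramp, ramp: two consecutive high-variation symbols outweigh the strong prior toward staying in no_ramp.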


2023

Modeling the Ink Tuning Process Using Machine Learning

Authors
Costa, C; Ferreira, CA;

Publication
Intelligent Data Engineering and Automated Learning - IDEAL 2023 - 24th International Conference, Évora, Portugal, November 22-24, 2023, Proceedings

Abstract
Paint bases are the essence of the color palette, allowing for the creation of a wide range of tones by combining them in different proportions. In this paper, an Artificial Neural Network is developed incorporating a pre-trained Decoder to predict the proportion of each paint base in an ink mixture in order to achieve the desired color. Color coordinates in the CIELAB space and the final finish are considered as input parameters. The proposed model is compared with commonly used models such as Linear Regression, Random Forest and Artificial Neural Network. It is important to note that the Artificial Neural Network was implemented with the same architecture as the proposed model but without incorporating the pre-trained Decoder. Experimental results demonstrate that the Artificial Neural Network with a pre-trained Decoder consistently outperforms the other models in predicting the proportions of paint bases for color tuning. This model exhibits lower Mean Absolute Error and Root Mean Square Error values across multiple objectives, indicating its superior accuracy in capturing the complexities of color relationships. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2023.
