Publications - INESC TEC

Publications by João Gama

2008

Knowledge discovery from sensor data (SensorKDD)

Authors
Vatsavai, RR; Omitaomu, OA; Gama, J; Chawla, NV; Gaber, MM; Ganguly, AR;

Publication
SIGKDD Explorations

Abstract

2011

Adaptive windowing for online learning from multiple inter-related data streams

Authors
Ikonomovska, E; Driessens, K; Dzeroski, S; Gama, J;

Publication
Proceedings - IEEE International Conference on Data Mining, ICDM

Abstract
Relational reinforcement learning is a promising branch of reinforcement learning research that deals with structured environments. In these environments, states and actions are differentiated by the presence of certain types of objects, the relations between them, and the objects involved in the actions. This makes it well suited for tasks that require the manipulation of multiple, interacting objects, such as tasks that a future household robot can be expected to perform, like clearing a dinner table or putting away clean dishes. However, the application of relational reinforcement learning to robotics has been hindered by assumptions such as discrete and atomic state observations. Typical robotic observation systems work in a streaming setup, where objects are discovered and recognized and their placement within their surroundings is determined in a quasi-continuous manner instead of a state-based one. The resulting information stream can be compared to a set of multiple inter-related data streams. In this paper, we propose an adaptive windowing strategy for generating a stream of learning examples and enabling relational learning from this kind of data. Our approach is independent of the learning algorithm and is based on a gradient search over the space of parameter values, i.e., window sizes, guided by the estimation of the testing error. The proposed algorithm operates online and is data-driven and flexible. To the best of our knowledge, this is the first work addressing this problem. Our ideas are empirically supported by an extensive experimental evaluation in a controlled setup using artificial data. © 2011 IEEE.
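
The abstract gives no algorithmic detail beyond a search over window sizes guided by estimated testing error. The following is a minimal sketch of that general idea only, using a discrete local search as a stand-in for the gradient search. The function `adapt_window`, its parameters, and the toy error function are all hypothetical and are not taken from the paper.

```python
def adapt_window(size, estimate_error, step=1, min_size=1, max_size=1000):
    """One search step: move to the neighbouring window size with the
    lowest estimated error (hypothetical helper, not the paper's method)."""
    candidates = [
        s for s in (size - step, size, size + step)
        if min_size <= s <= max_size
    ]
    return min(candidates, key=estimate_error)


def toy_error(window_size):
    # Assumed stand-in for an online estimate of the testing error;
    # here the error is lowest for a window of 7 examples.
    return (window_size - 7) ** 2


window = 3
history = [window]
for _ in range(10):
    window = adapt_window(window, toy_error)
    history.append(window)
```

In a streaming setting the error estimate would be recomputed as new examples arrive, so the selected window size can keep drifting with the data instead of settling permanently.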

2005

A study on Error Correcting Output Codes

Authors
Pimenta, E; Gama, J;

Publication
2005 Portuguese Conference on Artificial Intelligence, Proceedings

Abstract
Recent work points towards advantages in decomposing multi-class decision problems into multiple binary problems. There are several strategies for this decomposition. The most used and studied are All-vs-All, One-vs-All and Error Correcting Output Codes (ECOCs). ECOCs originated in telecommunications, where they are valued for their capacity to correct transmission errors. This capacity comes from introducing redundancy when encoding messages. ECOCs are binary words and can be adapted for use in classification problems. They must, however, respect some specific constraints. The binary words must be as far apart as possible. Equal or complementary columns cannot exist, and no column can be constant (either 1 or 0). Given two ECOCs satisfying these constraints, which one is more appropriate for classification purposes? In this work we suggest a function for evaluating the quality of ECOCs. This function is used to guide the search in the persecution algorithm, a new method to generate ECOCs for classification purposes. The binary words that form an ECOC can have several dimensions for the same number of classes that it is intended to represent. The number of possible dimensions grows exponentially with the number of classes of the multi-class problem. In this paper we present a method to choose the dimension of the ECOC that ensures a good tradeoff between redundancy and error correction capacity. The method is evaluated on a set of benchmark classification problems. Experimental results are competitive against standard decomposition methods.
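
The constraints listed in the abstract can be checked mechanically. The sketch below is an illustration under stated assumptions: the code matrix is a list of rows with one binary word per class, and the helper names `min_row_distance` and `columns_are_valid` are hypothetical. It does not reproduce the quality function or the persecution algorithm proposed in the paper.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two binary words differ."""
    return sum(x != y for x, y in zip(a, b))


def min_row_distance(code):
    """Smallest Hamming distance between any two class codewords (rows)."""
    return min(hamming(r1, r2) for r1, r2 in combinations(code, 2))


def columns_are_valid(code):
    """True if no column is constant and no two columns are equal
    or complementary."""
    seen = set()
    for col in zip(*code):
        if len(set(col)) == 1:          # constant column (all 0s or all 1s)
            return False
        complement = tuple(1 - bit for bit in col)
        if col in seen or complement in seen:
            return False
        seen.add(col)
    return True


# Illustrative 3-class code matrix (assumed example, one row per class).
code = [
    (0, 0, 1),
    (0, 1, 0),
    (1, 0, 0),
]
```

A larger minimum row distance means more bit errors from the binary classifiers can be tolerated before the nearest codeword changes.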

2005

Partition incremental discretization

Authors
Pinto, C; Gama, J;

Publication
2005 Portuguese Conference on Artificial Intelligence, Proceedings

Abstract
In this paper we propose a new method to perform incremental discretization. The approach consists of splitting the task into two layers. The first layer receives the sequence of input data and stores statistics of this data, using a higher number of intervals than is usually required. The final discretization is generated by the second layer, based on the statistics stored by the first layer. The proposed architecture processes streaming examples in a single scan, in constant time and space even for infinite sequences of examples. We demonstrate with examples that incremental discretization achieves better results than batch discretization, maintaining the performance of learning algorithms. The proposed method is much more appropriate for evaluating incremental algorithms, and for problems where data flows continuously, as in most recent data mining applications.
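
The two-layer idea can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes a known value range, equal-width bins in the first layer, and equal-frequency merging in the second layer. The class `TwoLayerDiscretizer` and its method names are hypothetical.

```python
class TwoLayerDiscretizer:
    """Layer 1 keeps counts in many fine bins; layer 2 derives a coarser
    discretization from those counts on demand (simplified sketch)."""

    def __init__(self, low, high, n_fine_bins=100):
        self.low = low
        self.width = (high - low) / n_fine_bins
        self.counts = [0] * n_fine_bins
        self.total = 0

    def update(self, x):
        """Layer 1: constant-time update for one streaming example."""
        idx = int((x - self.low) / self.width)
        idx = min(max(idx, 0), len(self.counts) - 1)  # clamp out-of-range values
        self.counts[idx] += 1
        self.total += 1

    def cut_points(self, n_intervals):
        """Layer 2: approximate equal-frequency cut points built only
        from the layer-1 counts, without revisiting the data."""
        if self.total == 0:
            return []
        target = self.total / n_intervals
        cuts, cumulative, next_k = [], 0, 1
        # Only interior bin edges can become cut points.
        for i, count in enumerate(self.counts[:-1]):
            cumulative += count
            if next_k < n_intervals and cumulative >= target * next_k:
                cuts.append(self.low + (i + 1) * self.width)
                while next_k < n_intervals and cumulative >= target * next_k:
                    next_k += 1
        return cuts


disc = TwoLayerDiscretizer(low=0, high=10, n_fine_bins=10)
for k in range(10):
    disc.update(k + 0.5)  # one example per fine bin
```

Memory use depends only on the number of fine bins, not on the number of examples seen, which matches the constant-space property claimed in the abstract.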

2005

EKDB&W'05: Workshop on extraction of knowledge from databases and warehouses

Authors
Gama, J; Pires, JM; Cardoso, M; Marques, NC; Cavique, L;

Publication
2005 Portuguese Conference on Artificial Intelligence, Proceedings

Abstract

2012

Identifying Relationships in Transactional Data

Authors
Rodrigues, M; Gama, J; Ferreira, CA;

Publication
ADVANCES IN ARTIFICIAL INTELLIGENCE - IBERAMIA 2012

Abstract
Association rules are the traditional way to study market basket or transactional data. One drawback of this analysis is the huge number of rules generated. As a complement to association rules, the Association Rules Network (ARN), based on Social Network Analysis (SNA), has been proposed by several researchers. In this work we study a real market basket analysis problem, using data from a Belgian supermarket, with ARNs. We learn ARNs by considering the relationships between items that appear more often in the consequent of the association rules. Moreover, we propose a more compact variant of ARNs: the Maximal Itemsets Social Network. In order to assess the quality of these structures, we compute SNA-based metrics, like weighted degree and utility of community.
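
As a rough illustration of turning rules into a network and computing a weighted degree, the sketch below links each antecedent item to the rule's consequent and weights the edge by rule confidence. The rules, the weighting scheme, and the helper names are assumptions made for this example. The paper's exact ARN construction and its community utility metric are not reproduced here.

```python
from collections import defaultdict

# Hypothetical rules: (antecedent items, consequent item, confidence).
rules = [
    ({"bread", "butter"}, "milk", 0.8),
    ({"beer"}, "chips", 0.6),
    ({"milk"}, "cereal", 0.5),
]


def build_network(rules):
    """Directed edges item -> consequent, weighted by summed confidence."""
    edges = defaultdict(float)
    for antecedent, consequent, confidence in rules:
        for item in antecedent:
            edges[(item, consequent)] += confidence
    return dict(edges)


def weighted_degree(edges):
    """Sum of the weights of all edges touching each node."""
    degree = defaultdict(float)
    for (source, target), weight in edges.items():
        degree[source] += weight
        degree[target] += weight
    return dict(degree)


network = build_network(rules)
degrees = weighted_degree(network)
```

Items with a high weighted degree, such as `milk` in this toy data, are the ones that participate in many strong rules.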
