Publications

Publications by João Gama

2011

Adaptive windowing for online learning from multiple inter-related data streams

Authors
Ikonomovska, E; Driessens, K; Dzeroski, S; Gama, J;

Publication
Proceedings - IEEE International Conference on Data Mining, ICDM

Abstract
Relational reinforcement learning is a promising branch of reinforcement learning research that deals with structured environments. In these environments, states and actions are differentiated by the presence of certain types of objects, the relations between them, and the objects that are involved in the actions. This makes it ideally suited for tasks that require the manipulation of multiple, interacting objects, such as the tasks a future household robot can be expected to perform, like cleaning up a dinner table or putting away washed dishes. However, the application of relational reinforcement learning to robotics has been hindered by assumptions such as discrete and atomic state observations. Typical robotic observation systems work in a streaming setup, where objects are discovered and recognized and their placement within their surroundings is determined in a quasi-continuous manner instead of a state-based one. The resulting information stream can be compared to a set of multiple inter-related data streams. In this paper, we propose an adaptive windowing strategy for generating a stream of learning examples and enabling relational learning from this kind of data. Our approach is independent of the learning algorithm and is based on a gradient search over the space of parameter values, i.e., window sizes, guided by an estimate of the testing error. The proposed algorithm operates online and is data-driven and flexible. To the best of our knowledge, this is the first work addressing this problem. Our ideas are empirically supported by an extensive experimental evaluation in a controlled setup using artificial data. © 2011 IEEE.
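
As a rough illustration of searching over window sizes under the guidance of an error estimate, the sketch below performs a discrete local search, used here as a simple stand-in for the gradient search the abstract mentions. The functions `adapt_window` and `search_window`, their parameters, and the toy error function are hypothetical and are not taken from the paper.

```python
def adapt_window(size, estimate_error, step=1, min_size=1, max_size=1000):
    """Return the neighbouring window size with the lowest estimated error."""
    candidates = [s for s in (size - step, size, size + step)
                  if min_size <= s <= max_size]
    return min(candidates, key=estimate_error)


def search_window(start, estimate_error, max_iters=100):
    """Repeat local steps until the window size stops changing."""
    size = start
    for _ in range(max_iters):
        new_size = adapt_window(size, estimate_error)
        if new_size == size:
            break
        size = new_size
    return size


# Toy error estimate (assumption): the error is lowest at a window size of 20.
toy_error = lambda w: (w - 20) ** 2
best = search_window(10, toy_error)
```

In a streaming setting the error estimate would itself be updated from incoming examples, so the search would be rerun or continued as the data changes.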

2005

A study on Error Correcting Output Codes

Authors
Pimenta, E; Gama, J;

Publication
2005 Portuguese Conference on Artificial Intelligence, Proceedings

Abstract
Recent work points towards advantages in decomposing multi-class decision problems into multiple binary problems. There are several strategies for this decomposition. The most used and studied are All-vs-All, One-vs-All, and Error-Correcting Output Codes (ECOCs). ECOCs originated in telecommunications, thanks to their capacity to correct transmission errors. This capacity comes from introducing redundancy when encoding messages. ECOCs are binary words and can be adapted for use in classification problems. They must, however, respect some specific constraints. The binary words must be as far apart as possible. Equal or complementary columns cannot exist, and no column can be constant (either 1 or 0). Given two ECOCs satisfying these constraints, which one is more appropriate for classification purposes? In this work we suggest a function for evaluating the quality of ECOCs. This function is used to guide the search in the persecution algorithm, a new method to generate ECOCs for classification purposes. For a given number of classes, the binary words that form an ECOC can have several possible dimensions. The number of possible dimensions grows exponentially with the number of classes of the multi-class problem. In this paper we present a method to choose the dimension of the ECOC that ensures a good tradeoff between redundancy and error correction capacity. The method is evaluated on a set of benchmark classification problems. Experimental results are competitive with standard decomposition methods.
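
The constraints listed in the abstract can be checked mechanically. The sketch below is a minimal illustration: `check_ecoc` is a hypothetical helper, and the minimum Hamming distance between code words is used only as a common proxy for "as far apart as possible"; it is not the evaluation function proposed in the paper.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions in which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def check_ecoc(matrix):
    """matrix holds one binary code word (row) per class.

    Returns (columns_ok, min_row_distance).
    """
    columns = list(zip(*matrix))
    # No column may be constant (all 0 or all 1).
    no_constant = all(len(set(col)) > 1 for col in columns)
    # Distance 0 means equal columns; distance len(col) means complementary.
    no_equal_or_complement = all(
        0 < hamming(c1, c2) < len(c1) for c1, c2 in combinations(columns, 2)
    )
    min_row_distance = min(hamming(r1, r2) for r1, r2 in combinations(matrix, 2))
    return no_constant and no_equal_or_complement, min_row_distance


# Exhaustive 7-bit code for 4 classes.
exhaustive = [
    [1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0, 0, 1],
    [0, 1, 0, 1, 0, 1, 0],
]
# One-vs-All expressed as a code matrix, for comparison.
one_vs_all = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
```

With a minimum distance d between code words, up to floor((d - 1) / 2) wrong binary predictions can be corrected, which is why the longer code is more redundant than the One-vs-All matrix.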

2005

Partition incremental discretization

Authors
Pinto, C; Gama, J;

Publication
2005 Portuguese Conference on Artificial Intelligence, Proceedings

Abstract
In this paper we propose a new method to perform incremental discretization. The approach splits the task into two layers. The first layer receives the sequence of input data and stores statistics about it, using more intervals than are usually required. The second layer generates the final discretization from the statistics stored by the first layer. The proposed architecture processes streaming examples in a single scan, in constant time and space, even for infinite sequences of examples. We demonstrate with examples that incremental discretization achieves better results than batch discretization, maintaining the performance of learning algorithms. The proposed method is much more appropriate for evaluating incremental algorithms and for problems where data flows continuously, as in most recent data mining applications.
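
The two-layer idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' algorithm: the class `TwoLayerDiscretizer` is hypothetical, the first layer uses a fixed value range with equal-width bins, and the second layer derives equal-frequency cut points from the stored counts.

```python
class TwoLayerDiscretizer:
    def __init__(self, low, high, n_fine_bins=100):
        # Layer 1: many fine-grained equal-width bins over an assumed range.
        self.low = low
        self.width = (high - low) / n_fine_bins
        self.counts = [0] * n_fine_bins

    def update(self, x):
        """Layer 1: constant-time update of the bin counts for one example."""
        idx = int((x - self.low) / self.width)
        idx = min(max(idx, 0), len(self.counts) - 1)  # clamp out-of-range values
        self.counts[idx] += 1

    def equal_frequency_cuts(self, k):
        """Layer 2: merge fine bins into k intervals of roughly equal frequency."""
        total = sum(self.counts)
        if total == 0:
            return []
        cuts, seen, target = [], 0, 1
        for i, count in enumerate(self.counts):
            seen += count
            while target < k and seen >= target * total / k:
                cut = self.low + (i + 1) * self.width
                if not cuts or cuts[-1] != cut:  # skip duplicate cut points
                    cuts.append(cut)
                target += 1
        return cuts


disc = TwoLayerDiscretizer(low=0.0, high=100.0, n_fine_bins=100)
for i in range(100):
    disc.update(i + 0.5)  # one value per fine bin
cuts = disc.equal_frequency_cuts(4)
```

Memory use depends only on the number of fine bins, not on the number of examples seen, which matches the single-scan, constant-space property described in the abstract.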

2005

EKDB&W'05: Workshop on extraction of knowledge from databases and warehouses

Authors
Gama, J; Pires, JM; Cardoso, M; Marques, NC; Cavique, L;

Publication
2005 Portuguese Conference on Artificial Intelligence, Proceedings

Abstract

2012

Identifying Relationships in Transactional Data

Authors
Rodrigues, M; Gama, J; Ferreira, CA;

Publication
ADVANCES IN ARTIFICIAL INTELLIGENCE - IBERAMIA 2012

Abstract
Association rules are the traditional way to study market basket or transactional data. One drawback of this analysis is the huge number of rules generated. As a complement to association rules, the Association Rules Network (ARN), based on Social Network Analysis (SNA), has been proposed by several researchers. In this work we study a real market basket analysis problem from a Belgian supermarket using ARNs. We learn ARNs by considering the relationships between items that appear more often in the consequent of the association rules. Moreover, we propose a more compact variant of ARNs: the Maximal Itemsets Social Network. In order to assess the quality of these structures, we compute SNA-based metrics, such as weighted degree and utility of community.
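
To make the weighted degree metric concrete, the sketch below builds a simple item graph from a few rules and sums the edge weights at each node. It is a deliberate simplification and not the ARN construction from the paper: `weighted_degrees` is a hypothetical helper, each rule is assumed to contribute one edge from every antecedent item to the consequent item, the rule confidence is assumed as the edge weight, and the example rules are invented.

```python
from collections import defaultdict


def weighted_degrees(rules):
    """rules: iterable of (antecedent_items, consequent_item, weight)."""
    edges = defaultdict(float)
    for antecedent, consequent, weight in rules:
        for item in antecedent:
            # Weights of repeated edges are summed (assumption).
            edges[(item, consequent)] += weight
    degree = defaultdict(float)
    for (src, dst), w in edges.items():
        degree[src] += w
        degree[dst] += w
    return dict(degree)


# Invented example rules: (antecedent, consequent, confidence).
rules = [
    ({"bread"}, "butter", 0.8),
    ({"milk"}, "butter", 0.6),
    ({"bread", "milk"}, "eggs", 0.5),
]
degrees = weighted_degrees(rules)
```

Items with a high weighted degree, such as `butter` in this toy example, are the ones most strongly connected to other items in the rule set.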

2003

Adaptation to drifting concepts

Authors
Castillo, G; Gama, J; Medas, P;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE

Abstract
Most supervised learning algorithms assume that the target concept is stable over time. Nevertheless, in many real user modeling systems, where the data is collected over an extended period of time, the learning task can be complicated by changes in the distribution underlying the data. This problem is known in machine learning as concept drift. The main idea behind Statistical Quality Control is to monitor the stability of one or more quality characteristics in a production process, which generally shows some variation over time. In this paper we present a method for handling concept drift based on Shewhart P-Charts in an on-line framework for supervised learning. We explore the use of two alternative P-Charts, which differ only in the way they estimate the target value used to set the center line. Experiments with simulated concept drift scenarios in the context of a user modeling prediction task compare the proposed method with other adaptive approaches. The results show that both P-Charts consistently recognize concept changes, and that the learner can adapt quickly to these changes to maintain its performance level.
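
A Shewhart P-Chart places control limits around a center line p at p ± k·sqrt(p(1 − p)/n), where n is the sample size and k is usually 3. The sketch below applies this to batch error rates of a learner. It is only an illustration: `detect_drift`, the warm-up length, and the choice of the running mean of past error rates as the center line are assumptions, and the paper's two center-line estimators are not reproduced here.

```python
import math


def p_chart_limits(center, n, k=3.0):
    """Lower and upper control limits for a proportion with sample size n."""
    sigma = math.sqrt(center * (1.0 - center) / n)
    return max(0.0, center - k * sigma), min(1.0, center + k * sigma)


def detect_drift(batch_error_rates, n, k=3.0, warmup=5):
    """Return the index of the first batch above the upper limit, else None."""
    history = []
    for t, rate in enumerate(batch_error_rates):
        if len(history) >= warmup:
            # Center line: mean of the error rates seen so far (assumption).
            center = sum(history) / len(history)
            _, upper = p_chart_limits(center, n, k)
            if rate > upper:
                return t
        history.append(rate)
    return None


# Simulated error rates (invented): stable at 10%, then a jump to 40%.
rates = [0.10] * 10 + [0.40] * 5
drift_at = detect_drift(rates, n=100)
```

After a drift signal, an adaptive learner would typically relearn its model from recent examples and restart the chart with a fresh center line.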
