Publications

Publications by João Gama

2007

Incremental discretization, application to data with concept drift

Authors
Pinto, C; Gama, J;

Publication
APPLIED COMPUTING 2007, VOL 1 AND 2

Abstract
In this paper we present a method for incremental discretization that can adapt to gradual changes in the target concept. The proposed method is based on Partition Incremental Discretization (PiD for short). The algorithm divides the discretization task into two layers. The first layer receives the sequence of input data and retains some statistics of the data using more intervals than required. The second layer computes the final discretization, based on the statistics stored by the first layer. The method is able to process streaming examples in a single scan, in constant time and space even for infinite sequences of examples. In dynamic environments the target concept can gradually change over time. Past examples may not reflect the current status of the problem. To accommodate concept drift we use an exponential decay that smoothly reduces the importance of older examples. Experimental evaluation on a benchmark problem for drift environments clearly illustrates the benefits of the example-weighting technique.
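The two-layer idea with exponential decay can be illustrated with a minimal sketch. This is not the authors' PiD implementation: the class name `DecayedHistogram`, the fixed equal-width first layer, the parameters `n_bins` and `decay`, and the equal-frequency second layer are all assumptions made for illustration.

```python
class DecayedHistogram:
    """Layer 1 (illustrative): fine-grained equal-width counts with exponential decay."""

    def __init__(self, lo, hi, n_bins=100, decay=0.99):
        self.lo = lo
        self.width = (hi - lo) / n_bins
        self.counts = [0.0] * n_bins  # more intervals than the final discretization needs
        self.decay = decay            # hypothetical fading factor in (0, 1]

    def update(self, x):
        # Fade every stored count, then add the new example with weight 1.
        # Cost per example depends on n_bins only, not on the stream length.
        self.counts = [c * self.decay for c in self.counts]
        i = int((x - self.lo) / self.width)
        i = min(max(i, 0), len(self.counts) - 1)  # clamp out-of-range values
        self.counts[i] += 1.0

    def equal_frequency_cuts(self, k):
        """Layer 2 (illustrative): merge fine bins into about k equal-frequency intervals."""
        total = sum(self.counts)
        target = total / k
        cuts, acc = [], 0.0
        for i, c in enumerate(self.counts):
            acc += c
            if len(cuts) < k - 1 and acc >= target * (len(cuts) + 1):
                cuts.append(self.lo + (i + 1) * self.width)
        return cuts


h = DecayedHistogram(0.0, 10.0, n_bins=10, decay=0.5)
for _ in range(20):
    h.update(1.5)   # old concept
for _ in range(20):
    h.update(8.5)   # new concept after drift
cuts = h.equal_frequency_cuts(2)
```

With a strong decay, the weight of the old values near 1.5 fades almost to zero, so the cut point computed by the second layer follows the recent data.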

2012

A density-based clustering approach for behavior change detection in data streams

Authors
Vallim, RMM; Filho, JAA; Carvalho, ACPLF; Gama, J;

Publication
Proceedings - Brazilian Symposium on Neural Networks, SBRN

Abstract
Mining data streams poses many challenges to existing Machine Learning algorithms. Algorithms designed to learn in this scenario need to constantly update their decision models in accordance with current data behavior. Therefore, the ability to detect when the behavior of the stream is changing is an important feature of any learning technique approaching data streams. This work is concerned with unsupervised behavior change detection. It suggests the use of density-based clustering and an entropy measurement for change detection that is independent of the number and format of clusters. The proposed approach uses a modified version of the DenStream algorithm that is designed to better cope with the entropy calculation. Experimental results using synthetic data provide insight into how clustering and novelty detection algorithms can be used for change detection in data streams. © 2012 IEEE.
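The abstract does not give the entropy measurement itself. As a generic illustration only, the sketch below computes the Shannon entropy of the relative cluster weights and flags a change when it moves by more than a threshold between two snapshots. The function names and the `threshold` parameter are hypothetical and are not taken from the paper.

```python
import math


def cluster_entropy(weights):
    """Shannon entropy (in bits) of the relative weights of the current clusters."""
    total = sum(weights)
    probs = [w / total for w in weights if w > 0]
    return -sum(p * math.log2(p) for p in probs)


def behavior_changed(prev_weights, curr_weights, threshold=0.5):
    """Flag a change when entropy shifts by more than a hypothetical threshold."""
    return abs(cluster_entropy(curr_weights) - cluster_entropy(prev_weights)) > threshold


balanced = cluster_entropy([1, 1, 1, 1])   # four equally weighted clusters
single = cluster_entropy([4])              # all weight in one cluster
changed = behavior_changed([1, 1, 1, 1], [4])
```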

2011

Data stream mining algorithms for building decision models in a computer roleplaying game simulation

Authors
Vallim, RMM; De Carvalho, ACPLF; Gama, J;

Publication
Proceedings - 2010 Brazilian Symposium on Games and Digital Entertainment, SBGames 2010

Abstract
Computer games are attracting increasing interest in the Artificial Intelligence (AI) research community, mainly because games involve reasoning, planning and learning [1]. One area of particular interest in recent years is the creation of adaptive game AI. Adaptive game AI is the implementation of AI in computer games that holds the ability to adapt to changing circumstances, i.e., to exhibit adaptive behavior during play. This kind of adaptation can be created using Machine Learning techniques, such as neural networks, reinforcement learning and bioinspired methods. In order to learn online, a system needs to overcome the main difficulties imposed by games: processing time and memory requirements. Learning in a game needs to be fast, and the memory available is usually not enough to store a large number of training examples for a traditional Machine Learning technique. In this context, methods for mining data streams seem to be a natural approach. Data streams are, by definition, sequences of training examples that arrive over time [2]. In the data stream scenario, algorithms are usually incremental and capable of adapting the decision model when a change in the distribution of the training examples is detected. One particularly interesting algorithm for mining data streams is the Very Fast Decision Tree (VFDT) [3]. VFDTs are incremental decision trees designed specifically to meet the data stream problem requirements. In this paper, we analyse the use of VFDTs in the task of learning in a Computer RolePlaying Game context. First, we simulate data from manually designed tactics for a Computer RolePlaying Game, based on Spronck's static tactics [4], and test the suitability of VFDT to rapidly learn these tactics. Afterwards, we conduct an experiment in order to simulate a data stream of examples where changes of tactics occur over time, and analyse how VFDT and some of its variations respond to these changes in the target concept. © 2010 IEEE.
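VFDT decides when a leaf has seen enough examples to split by using the Hoeffding bound. The sketch below shows only that split test. The function names, the example gains and the parameter values are chosen for illustration and do not come from the paper's experiments.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Epsilon such that, with probability 1 - delta, the observed mean of n samples
    of a variable with the given range is within epsilon of its true mean."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain, second_gain, value_range, delta, n):
    """Split when the best attribute beats the runner-up by more than epsilon."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


eps = hoeffding_bound(1.0, 1e-7, 1000)            # about 0.09 after 1000 examples
clear_winner = should_split(0.30, 0.15, 1.0, 1e-7, 1000)
too_close = should_split(0.30, 0.25, 1.0, 1e-7, 1000)
```

Because epsilon shrinks as n grows, a leaf with two close candidate attributes waits for more examples before splitting, which keeps memory and time per example bounded.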

2012

Mobile data stream mining: From algorithms to applications

Authors
Krishnaswamy, S; Gama, J; Gaber, MM;

Publication
Proceedings - 2012 IEEE 13th International Conference on Mobile Data Management, MDM 2012

Abstract
This paper presents an overview of the current state-of-the-art in mobile data stream mining. This area of mobile data stream mining is significant for a number of new application domains such as mobile crowd sensing and mobile activity recognition. The paper presents the strategies and techniques for adaptation that are essential in order to perform real-time, continuous data mining on mobile devices. We present an overview of the algorithms research in this area. Finally, we discuss the key toolkits, systems and applications of mobile data stream mining. © 2012 IEEE.

2008

RUSE-WARMR: Rule Selection for Classifier Induction in Multi-Relational Data-Sets

Authors
Ferreira, CA; Gama, J; Costa, VS;

Publication
20TH IEEE INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, VOL 1, PROCEEDINGS

Abstract
One of the major challenges in knowledge discovery is how to extract meaningful and useful knowledge from the complex structured data that one finds in Scientific and Technological applications. One approach is to explore the logic relations in the database and, using, say, an Inductive Logic Programming (ILP) algorithm, find descriptive and expressive patterns. These patterns can then be used as features to characterize the target concept. The effectiveness of these algorithms depends both upon the algorithm we use to generate the patterns and upon the classifier. Rule mining provides an excellent framework for efficiently mining the interesting patterns that are relevant. We propose a novel method to select discriminative patterns and evaluate the effectiveness of this method on a complex discovery application of practical interest.

2008

Online reliability estimates for individual predictions in data streams

Authors
Rodrigues, PP; Gama, J; Bosnic, Z;

Publication
Proceedings - IEEE International Conference on Data Mining Workshops, ICDM Workshops 2008

Abstract
Several predictive systems are nowadays vital for operations and decision support. The quality of these systems is most of the time defined by their average accuracy, which carries little or no information about the estimated error of each individual prediction. In many sensitive applications, users should be allowed to associate a measure of reliability with each prediction. In the case of batch systems, reliability measures have already been defined, mostly empirical measures such as estimation using local sensitivity analysis. However, with the advent of data streams, these reliability estimates should also be computed online, based only on available data and the current model's state. In this paper we define empirical measures to perform online estimation of the reliability of individual predictions when made in the context of online learning systems. We present preliminary results and evaluate the estimators in two different problems. © 2008 IEEE.
