2006
Authors
Gama, J; Pinto, C;
Publication
Proceedings of the ACM Symposium on Applied Computing
Abstract
In this paper we propose a new method to perform incremental discretization. The basic idea is to perform the task in two layers. The first layer receives the sequence of input data and keeps statistics on the data using many more intervals than required. Based on the statistics stored by the first layer, the second layer creates the final discretization. The proposed architecture processes streaming examples in a single scan, in constant time and space, even for infinite sequences of examples. We experimentally demonstrate that incremental discretization maintains the performance of learning algorithms in comparison with batch discretization. The proposed method is much more appropriate for incremental learning and for problems where data flows continuously, as in most recent data mining applications.
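A minimal sketch of the two-layer idea follows. It is illustrative only: the class name, the equal-width first layer, and the equal-frequency second layer are assumptions, not the paper's reference implementation.

class TwoLayerDiscretizer:
    def __init__(self, lo, hi, n_layer1=200, n_final=10):
        # Layer 1 keeps many more intervals than required (here: equal width).
        self.lo, self.hi = lo, hi
        self.n1, self.n2 = n_layer1, n_final
        self.counts = [0] * n_layer1
        self.width = (hi - lo) / n_layer1

    def update(self, x):
        # Single scan, constant time and space per example.
        i = int((x - self.lo) / self.width)
        self.counts[max(0, min(self.n1 - 1, i))] += 1

    def final_cutpoints(self):
        # Layer 2: derive an equal-frequency discretization from the
        # statistics stored by layer 1.
        total = sum(self.counts)
        if total == 0:
            return []
        per_bin, cuts, acc = total / self.n2, [], 0
        for i, c in enumerate(self.counts):
            acc += c
            if acc >= per_bin * (len(cuts) + 1):
                cuts.append(self.lo + (i + 1) * self.width)
        return cuts[: self.n2 - 1]

A stream consumer would call update(x) once per arriving example and final_cutpoints() whenever the current discretization is needed.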
2008
Authors
Gama, J; Carvalho, A; Aguilar-Ruiz, J;
Publication
Proceedings of the ACM Symposium on Applied Computing
Abstract
2011
Authors
Cardoso, DO; Lima, PMV; De Gregorio, M; Gama, J; Franca, FMG;
Publication
ESANN 2011 proceedings, 19th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning
Abstract
Producing good-quality clustering of data streams in real time is a difficult problem, since the analysis of data points arriving continuously must be performed with quite limited computational resources. The incremental and evolving nature of the resulting clustering structures must reflect the dynamics of the target data stream. The WiSARD weightless perceptron and its associated DRASiW extension are intrinsically capable of, respectively, performing one-shot learning and producing prototypes of the learnt categories. This work introduces a simple generalization of RAM-based neurons in order to explore both weightless neural models in the data stream clustering problem.
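A minimal sketch of a WiSARD-style discriminator with counter-based RAM nodes is shown below; the random tuple mapping, names, and sizes are illustrative assumptions, and the counters (rather than 0/1 bits) stand in for the DRASiW-like generalization the abstract mentions.

import random

class Discriminator:
    def __init__(self, input_bits, tuple_size, seed=0):
        rng = random.Random(seed)
        order = list(range(input_bits))
        rng.shuffle(order)  # pseudo-random mapping of inputs to RAM nodes
        self.tuples = [order[i:i + tuple_size]
                       for i in range(0, input_bits, tuple_size)]
        self.rams = [{} for _ in self.tuples]  # counters generalize 0/1 bits

    def _addr(self, bits, tup):
        return tuple(bits[j] for j in tup)

    def train(self, bits):
        # One-shot learning: a single pass increments the addressed counters.
        for ram, tup in zip(self.rams, self.tuples):
            a = self._addr(bits, tup)
            ram[a] = ram.get(a, 0) + 1

    def response(self, bits):
        # Recall: how many RAM nodes have already seen this sub-pattern.
        return sum(1 for ram, tup in zip(self.rams, self.tuples)
                   if self._addr(bits, tup) in ram)

Because the RAM contents are counters rather than booleans, they can be aggregated into per-category prototypes, which is the DRASiW-style capability the abstract refers to.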
2009
Authors
Marques de Sa, JPM; Gama, J; Sebastiao, R; Alexandre, LA;
Publication
COMPUTER ANALYSIS OF IMAGES AND PATTERNS, PROCEEDINGS
Abstract
Binary decision trees based on univariate splits have traditionally employed so-called impurity functions as a means of searching for the best node splits. Such functions use estimates of the class distributions. In the present paper we introduce a new concept in binary tree design: instead of working with the class distributions of the data, we work directly with the distribution of the errors originated by the node splits. Concretely, we search for the best splits using a minimum entropy-of-error (MEE) strategy. This strategy has recently been applied with success in other areas (e.g. regression, clustering, blind source separation, neural network training). We show that MEE trees are capable of producing good results with often simpler trees, have interesting generalization properties and, in the many experiments we have performed, could be used without pruning.
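A minimal sketch of split selection by entropy of the error follows; the -1/+1 class coding, the majority-vote branch labelling, and the function names are assumptions for illustration, not the paper's exact formulation.

from collections import Counter
from math import log2

def error_entropy(y_true, y_pred):
    # Shannon entropy of the error variable e = y_true - y_pred.
    n = len(y_true)
    errs = Counter(yt - yp for yt, yp in zip(y_true, y_pred))
    return -sum((c / n) * log2(c / n) for c in errs.values())

def mee_split(xs, ys):
    # Scan candidate thresholds on one attribute and keep the one whose
    # resulting errors have minimum entropy (instead of minimum impurity).
    best = (float("inf"), None)
    for t in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        majority = lambda part: max(set(part), key=part.count)
        preds = [majority(left) if x <= t else majority(right) for x in xs]
        h = error_entropy(ys, preds)
        if h < best[0]:
            best = (h, t)
    return best  # (entropy of error, threshold)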
2011
Authors
Marques de Sa, JPM; Sebastiao, R; Gama, J; Fontes, T;
Publication
PROGRESS IN PATTERN RECOGNITION, IMAGE ANALYSIS, COMPUTER VISION, AND APPLICATIONS
Abstract
We present new results on the performance of Minimum Error Entropy (MEE) decision trees, which use a novel node split criterion. The results were obtained in a comparative study with popular alternative algorithms on 42 real-world datasets. Careful validation and statistical methods were used. The evidence gathered from this body of results shows that the error performance of MEE trees compares well with that of alternative algorithms. An important aspect to emphasize is that MEE trees generalize better on average without sacrificing error performance.
2004
Authors
Gama, J;
Publication
MACHINE LEARNING
Abstract
In the context of classification problems, algorithms that generate multivariate trees are able to explore multiple representation languages by using decision tests based on a combination of attributes. In the regression setting, model tree algorithms explore multiple representation languages by using linear models at leaf nodes. In this work we study the effects of using combinations of attributes at decision nodes, at leaf nodes, or at both, in regression and classification tree learning. In order to study the use of functional nodes at different places and for different types of modeling, we introduce a simple unifying framework for multivariate tree learning. This framework combines a univariate decision tree with a linear function by means of constructive induction. Decision trees derived from the framework are able to use decision nodes with multivariate tests and leaf nodes that make predictions using linear functions. Multivariate decision nodes are built when growing the tree, while functional leaves are built when pruning the tree. We experimentally evaluate a univariate tree, a multivariate tree using linear combinations at inner and leaf nodes, and two simplified versions restricting linear combinations to inner nodes or to leaves. The experimental evaluation shows that all functional tree variants exhibit similar performance, with advantages on different datasets. In this study the full model has a marginal advantage. These results lead us to study the role of functional leaves and nodes. We use the bias-variance decomposition of the error, cluster analysis, and learning curves as tools for analysis. We observe that, in the datasets under study and for both classification and regression, the use of multivariate decision nodes has more impact on the bias component of the error, while the use of multivariate decision leaves has more impact on the variance component.
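The constructive-induction step can be sketched as follows; this is illustrative only, and the use of scikit-learn's LinearRegression together with a squared-error split criterion is an assumption, not the exact algorithm of the paper.

import numpy as np
from sklearn.linear_model import LinearRegression

def augment_with_linear_attribute(X, y):
    # Constructive induction: fit a linear model at the node and append its
    # prediction as a new attribute, so that an ordinary univariate test on
    # this column behaves as a multivariate (oblique) test.
    lin = LinearRegression().fit(X, y)
    return np.hstack([X, lin.predict(X).reshape(-1, 1)]), lin

def best_univariate_split(X, y):
    # Plain regression-tree criterion: minimize the summed squared error of
    # the two branch means over all (attribute, threshold) pairs.
    best = (float("inf"), None, None)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j])[:-1]:
            m = X[:, j] <= t
            sse = ((y[m] - y[m].mean()) ** 2).sum() + \
                  ((y[~m] - y[~m].mean()) ** 2).sum()
            if sse < best[0]:
                best = (sse, j, t)
    return best

Splitting on the constructed column yields a multivariate decision node; fitting the same kind of model at a leaf during pruning yields a functional leaf.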