Publications - INESC TEC

Publications by João Gama

2000

Cascade generalization

Authors
Gama, J; Brazdil, P

Publication
MACHINE LEARNING

Abstract
Using multiple classifiers to increase learning accuracy is an active research area. In this paper we present two related methods for merging classifiers. The first method, Cascade Generalization, couples classifiers loosely and belongs to the family of stacking algorithms. Its basic idea is to apply the set of classifiers sequentially, at each step extending the original data with new attributes. The new attributes are derived from the class probability distribution given by a base classifier. This constructive step extends the representational language of the high-level classifiers, relaxing their bias. The second method couples classifiers tightly by applying Cascade Generalization locally. At each iteration of a divide-and-conquer algorithm, the instance space is reconstructed by adding new attributes. Each new attribute represents the probability, given by a base classifier, that an example belongs to a class. We have implemented three Local Generalization Algorithms. The first merges a linear discriminant with a decision tree, the second merges a naive Bayes with a decision tree, and the third merges a linear discriminant and a naive Bayes with a decision tree. All the algorithms show an increase in performance when compared with the corresponding single models. Cascade also outperforms other methods for combining classifiers, such as Stacked Generalization, and competes well against Boosting at statistically significant confidence levels.
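The loose coupling described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class and function names (`GaussianNB`, `Stump`, `extend`, `cascade_fit`, `cascade_predict`), the choice of a Gaussian naive Bayes as base classifier, a one-level decision tree as high-level classifier, and the toy data are all assumptions made for illustration.

```python
import math


class GaussianNB:
    """Tiny Gaussian naive Bayes, used here as the (assumed) base classifier."""

    def fit(self, X, y):
        self.classes = sorted(set(y))
        self.stats = {}
        for c in self.classes:
            rows = [x for x, label in zip(X, y) if label == c]
            cols = list(zip(*rows))
            means = [sum(col) / len(rows) for col in cols]
            # Small constant keeps the variance strictly positive.
            variances = [sum((v - m) ** 2 for v in col) / len(rows) + 1e-6
                         for col, m in zip(cols, means)]
            self.stats[c] = (len(rows) / len(X), means, variances)
        return self

    def predict_proba(self, x):
        log_scores = []
        for c in self.classes:
            prior, means, variances = self.stats[c]
            s = math.log(prior)
            for v, m, var in zip(x, means, variances):
                s += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            log_scores.append(s)
        top = max(log_scores)
        exps = [math.exp(s - top) for s in log_scores]
        total = sum(exps)
        return [e / total for e in exps]


class Stump:
    """One-level decision tree, used here as the (assumed) high-level classifier."""

    def fit(self, X, y):
        best = None
        for j in range(len(X[0])):
            for t in sorted(set(row[j] for row in X)):
                left = [label for row, label in zip(X, y) if row[j] <= t]
                right = [label for row, label in zip(X, y) if row[j] > t]
                if not left or not right:
                    continue
                l_major = max(set(left), key=left.count)
                r_major = max(set(right), key=right.count)
                errors = (sum(l != l_major for l in left)
                          + sum(r != r_major for r in right))
                if best is None or errors < best[0]:
                    best = (errors, j, t, l_major, r_major)
        self.errors, self.attr, self.threshold, self.left, self.right = best
        return self

    def predict(self, x):
        return self.left if x[self.attr] <= self.threshold else self.right


def extend(X, base):
    """Constructive step: append the base classifier's class probabilities."""
    return [list(x) + base.predict_proba(x) for x in X]


def cascade_fit(X, y):
    base = GaussianNB().fit(X, y)
    top = Stump().fit(extend(X, base), y)  # high-level learner sees extended data
    return base, top


def cascade_predict(model, x):
    base, top = model
    return top.predict(list(x) + base.predict_proba(x))


# Toy data (illustrative only): two attributes, two classes.
X = [[1.0, 1.2], [1.5, 0.8], [0.9, 1.0], [3.0, 3.2], [3.4, 2.9], [2.8, 3.1]]
y = [0, 0, 0, 1, 1, 1]
model = cascade_fit(X, y)
print(cascade_predict(model, [1.0, 1.0]))
```

Because the high-level learner searches the original attributes plus the appended probabilities, its training error in this sketch can never exceed that of the same learner trained on the original attributes alone.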

2007

Editorial

Authors
Cardoso, MGMS; Gama, J; Carvalho, A

Publication
Journal of Retailing and Consumer Services

2001

Functional Trees for Regression

Authors
Gama, J

Publication
Advances in Intelligent Data Analysis, 4th International Conference, IDA 2001, Cascais, Portugal, September 13-15, 2001, Proceedings

Abstract
In this paper we present and evaluate a new algorithm for supervised regression problems. The algorithm combines a univariate regression tree with a linear regression function by means of constructive induction. When growing the tree, a linear regression function creates one new attribute at each internal node. This new attribute is the value of the regression function for each example that falls at this node. The extended instance space is propagated down through the tree. Tests based on those new attributes correspond to an oblique decision surface. Our approach can be seen as a hybrid model that combines linear regression, known to have low variance, with a regression tree, known to have low bias. Our algorithm was compared against its components, two simplified versions, and M5 on 16 benchmark datasets. The experimental evaluation shows that our algorithm has clear advantages in generalization ability when compared against its components and competes well against the state of the art in regression trees. © Springer-Verlag Berlin Heidelberg 2001.
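The constructive step described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the paper's algorithm: the function names (`solve`, `fit_linear`, `apply_linear`, `grow`, `predict`), the small ridge term, the squared-error split criterion, the depth and size limits, the mean-valued leaves, and the toy data are all hypothetical choices.

```python
def solve(A, b):
    """Solve A w = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * p for a, p in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_linear(X, y, ridge=1e-6):
    """Least squares with intercept; the ridge term (an assumption) keeps the
    system solvable once appended attributes become collinear."""
    rows = [[1.0] + list(x) for x in X]
    d = len(rows[0])
    A = [[sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    b = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(d)]
    return solve(A, b)


def apply_linear(w, x):
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))


def sse(values):
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def grow(X, y, depth=0, max_depth=2, min_size=4):
    mean = sum(y) / len(y)
    if depth >= max_depth or len(y) < min_size:
        return {"leaf": mean}  # assumption: leaves predict the mean target
    w = fit_linear(X, y)
    # Constructive step: one new attribute holding the regression value.
    X = [list(x) + [apply_linear(w, x)] for x in X]
    best = None
    for j in range(len(X[0])):
        for t in sorted(set(x[j] for x in X))[:-1]:
            left = [i for i, x in enumerate(X) if x[j] <= t]
            right = [i for i, x in enumerate(X) if x[j] > t]
            cost = sse([y[i] for i in left]) + sse([y[i] for i in right])
            if best is None or cost < best[0]:
                best = (cost, j, t, left, right)
    if best is None:
        return {"leaf": mean}
    _, j, t, left, right = best
    # The extended instance space is passed down to both children.
    return {"w": w, "attr": j, "thr": t,
            "left": grow([X[i] for i in left], [y[i] for i in left],
                         depth + 1, max_depth, min_size),
            "right": grow([X[i] for i in right], [y[i] for i in right],
                          depth + 1, max_depth, min_size)}


def predict(node, x):
    while "leaf" not in node:
        x = list(x) + [apply_linear(node["w"], x)]  # rebuild the new attribute
        node = node["left"] if x[node["attr"]] <= node["thr"] else node["right"]
    return node["leaf"]


# Toy data (illustrative only): a linear trend plus a step along x0 + x1.
X = [[float(i), float(j)] for i in range(5) for j in range(5)]
y = [2 * a + b + (5.0 if a + b > 4 else 0.0) for a, b in X]
tree = grow(X, y)
print(predict(tree, [1.0, 1.0]))
```

A test on an appended attribute thresholds a linear combination of the original attributes, which is why such a split corresponds to an oblique decision surface in the original instance space.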

2007

Electricity load forecast using data streams techniques

Authors
Rodrigues, PP; Gama, J

Publication
Modulad

2008

Issues and Challenges in Learning from Data Streams

Authors
Gama, J

Publication
Next Generation of Data Mining.

2008

Research Challenges in Ubiquitous Knowledge Discovery

Authors
May, M; Berendt, B; Cornuéjols, A; Gama, J; Giannotti, F; Hotho, A; Malerba, D; Menasalvas, E; Morik, K; Pedersen, RU; Saitta, L; Saygin, Y; Schuster, A; Vanhoof, K

Publication
Next Generation of Data Mining.
