Publications

Publications by João Gama

2017

Computational Models for Social and Technical Interactions

Authors
Gama, J; Oliveira, E; Cardoso, HL;

Publication
NEW GENERATION COMPUTING

2017

Clustering from Data Streams

Authors
Gama, J;

Publication
Encyclopedia of Machine Learning and Data Mining

2013

Preface

Authors
Rodrigues, PP; Pechenizkiy, M; Gama, J; Correia, RC; Liu, J; Traina, A; Lucas, P; Soda, P;

Publication
Proceedings - IEEE Symposium on Computer-Based Medical Systems

2016

An Overview of Concept Drift Applications

Authors
Žliobaite, I; Pechenizkiy, M; Gama, J;

Publication
Studies in Big Data

Abstract
In most challenging data analysis applications, data evolve over time and must be analyzed in near real time. Patterns and relations in such data often change as well, so models built to analyze them quickly become obsolete. In machine learning and data mining this phenomenon is referred to as concept drift. The objective is to deploy models that diagnose themselves and adapt to changing data over time. This chapter provides an application-oriented view of concept drift research, with a focus on supervised learning tasks. First, we survey and categorize application tasks for which the problem of concept drift is particularly relevant. Then we construct a reference framework for positioning application tasks within a spectrum of problems related to concept drift. Finally, we discuss some promising research directions from the application perspective, and present recommendations for application-driven concept drift research and development.

2015

Distributed Reasoning

Authors
Rodrigues, P; Gama, J;

Publication
MATHEMATICS OF ENERGY AND CLIMATE CHANGE

Abstract
This paper discusses the problem of learning a global model from local information. We consider ubiquitous streaming data sources, such as sensor networks, and discuss efficient distributed learning algorithms. We present a generic framework for distributed data sources, an illustrative algorithm that monitors the global state of the network using limited communication between peers, and an efficient distributed clustering algorithm.

2015

Evaluation of Multiclass Novelty Detection Algorithms for Data Streams

Authors
de Faria, ER; Goncalves, IR; Gama, J; de Leon Ferreira Carvalho, ACPDF;

Publication
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING

Abstract
Data stream mining is an emergent research area that investigates knowledge extraction from large amounts of continuously generated data, produced by non-stationary distributions. Novelty detection, the ability to identify new or previously unknown situations, is a useful ability for learning systems, especially when dealing with data streams, where concepts may appear, disappear, or evolve over time. Several studies are currently investigating the application of novelty detection techniques to data streams. However, there is no consensus on how to evaluate the performance of these techniques. In this study, we propose a new evaluation methodology for multiclass novelty detection in data streams that is able to deal with: i) unsupervised learning, which generates novelty patterns without an association with the true classes, where one class may be composed of a novelty set; ii) a confusion matrix that grows over time; iii) a confusion matrix with a column representing unknown examples, i.e., those not explained by the model; and iv) the representation of the evaluation measures over time. We also propose a new methodology to associate the novelty patterns detected by the algorithm, in an unsupervised fashion, with the true classes. Finally, we evaluate the performance of the proposed methodology using known novelty detection algorithms with artificial and real data sets.
