2017
Authors
Gama, J; Oliveira, E; Cardoso, HL;
Publication
NEW GENERATION COMPUTING
Abstract
2017
Authors
Gama, J;
Publication
Encyclopedia of Machine Learning and Data Mining
Abstract
2013
Authors
Rodrigues, PP; Pechenizkiy, M; Gama, J; Correia, RC; Liu, J; Traina, A; Lucas, P; Soda, P;
Publication
Proceedings - IEEE Symposium on Computer-Based Medical Systems
Abstract
2016
Authors
Žliobaite I.; Pechenizkiy M.; Gama J.;
Publication
Studies in Big Data
Abstract
In most challenging data analysis applications, data evolve over time and must be analyzed in near real time. Patterns and relations in such data often change as well, so models built for analyzing such data quickly become obsolete. In machine learning and data mining this phenomenon is referred to as concept drift. The objective is to deploy models that diagnose themselves and adapt to changing data over time. This chapter provides an application-oriented view of concept drift research, with a focus on supervised learning tasks. First, we overview and categorize application tasks for which the problem of concept drift is particularly relevant. Then we construct a reference framework for positioning application tasks within a spectrum of problems related to concept drift. Finally, we discuss some promising research directions from the application perspective, and present recommendations for application-driven concept drift research and development.
2015
Authors
Rodrigues, P; Gama, J;
Publication
MATHEMATICS OF ENERGY AND CLIMATE CHANGE
Abstract
This paper discusses the problem of learning a global model from local information. We consider ubiquitous streaming data sources, such as sensor networks, and discuss efficient distributed learning algorithms. We present a generic framework for distributed sources of data, an illustrative algorithm to monitor the global state of the network using limited communication between peers, and an efficient distributed clustering algorithm.
2015
Authors
de Faria, ER; Goncalves, IR; Gama, J; de Leon Ferreira Carvalho, ACPDF;
Publication
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
Abstract
Data stream mining is an emergent research area that investigates knowledge extraction from large amounts of continuously generated data, produced by non-stationary distributions. Novelty detection, the ability to identify new or previously unknown situations, is a useful ability for learning systems, especially when dealing with data streams, where concepts may appear, disappear, or evolve over time. There are several studies currently investigating the application of novelty detection techniques in data streams. However, there is no consensus regarding how to evaluate the performance of these techniques. In this study, we propose a new evaluation methodology for multiclass novelty detection in data streams that is able to deal with: i) unsupervised learning, which generates novelty patterns without an association with the true classes, where one class may be composed of a novelty set, ii) a confusion matrix that increases over time, iii) a confusion matrix with a column representing unknown examples, i.e., those not explained by the model, and iv) the representation of the evaluation measures over time. We propose a new methodology to associate the novelty patterns detected by the algorithm, in an unsupervised fashion, with the true classes. Finally, we evaluate the performance of the proposed methodology through the use of known novelty detection algorithms with artificial and real data sets.
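The bookkeeping described in this abstract can be illustrated with a minimal sketch. It is not the authors' methodology: the class name `StreamConfusionMatrix`, the `"unknown"` label, the pattern names `NP1`/`NP2`, and in particular the majority-overlap rule used to associate a novelty pattern with a true class are all assumptions made for illustration. The sketch only shows a confusion matrix that grows as new labels appear, with a dedicated column for examples the model leaves unexplained.

```python
from collections import Counter, defaultdict

UNKNOWN = "unknown"  # hypothetical label for examples not explained by the model


class StreamConfusionMatrix:
    """Confusion matrix that grows as new true classes or predicted labels arrive."""

    def __init__(self):
        # counts[true_class][predicted_label] -> number of examples seen so far
        self.counts = defaultdict(Counter)

    def update(self, true_class, predicted):
        # A prediction of None means the model could not explain the example.
        label = UNKNOWN if predicted is None else predicted
        self.counts[true_class][label] += 1

    def associate_novelty_patterns(self, known_classes):
        # Assumed rule: map each novelty pattern to the true class
        # that contributes most of its examples.
        per_pattern = defaultdict(Counter)
        for true_class, row in self.counts.items():
            for label, n in row.items():
                if label != UNKNOWN and label not in known_classes:
                    per_pattern[label][true_class] += n
        return {p: c.most_common(1)[0][0] for p, c in per_pattern.items()}

    def unknown_rate(self):
        # Fraction of all examples that fell into the unknown column.
        total = sum(sum(row.values()) for row in self.counts.values())
        unknown = sum(row[UNKNOWN] for row in self.counts.values())
        return unknown / total if total else 0.0


# Toy stream of (true class, predicted label) pairs; "A" is the only known class.
stream = [("A", "A"), ("B", "NP1"), ("B", "NP1"),
          ("A", "NP1"), ("B", None), ("C", "NP2")]

cm = StreamConfusionMatrix()
for true_class, predicted in stream:
    cm.update(true_class, predicted)

mapping = cm.associate_novelty_patterns(known_classes={"A"})
rate = cm.unknown_rate()
```

Calling `unknown_rate()` at regular checkpoints along the stream, rather than once at the end, would give a simple view of a measure over time.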