Publications

Publications by LIAAD

2014

Recurrent concepts in data streams classification

Authors
Gama, J; Kosina, P;

Publication
KNOWLEDGE AND INFORMATION SYSTEMS

Abstract
This work addresses the problem of mining data streams generated in dynamic environments where the distribution underlying the observations may change over time. We present a system that monitors the evolution of the learning process. The system is able to self-diagnose degradations of this process, using change detection mechanisms, and self-repair the decision models. The system uses meta-learning techniques that characterize the domain of applicability of previously learned models. The meta-learner can detect recurrence of contexts, using unlabeled examples, and take pro-active actions by activating previously learned models. The experimental evaluation on three text mining problems demonstrates the main advantages of the proposed system: it provides information about the recurrence of concepts and rapidly adapts decision models when drift occurs.

2014

Challenges in Learning from Streaming Data

Authors
Gama, J;

Publication
ADVANCES IN DATABASES AND INFORMATION SYSTEMS (ADBIS 2014)

2014

Data stream mining in ubiquitous environments: state-of-the-art and current directions

Authors
Gaber, MM; Gama, J; Krishnaswamy, S; Gomes, JB; Stahl, F;

Publication
WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY

Abstract
In this article, we review the state-of-the-art techniques in mining data streams for mobile and ubiquitous environments. We start the review with a concise background of data stream processing, presenting the building blocks for mining data streams. In a wide range of applications, data streams are required to be processed on small ubiquitous devices like smartphones and sensor devices. Mobile and ubiquitous data mining targets these applications with tailored techniques and approaches addressing scarcity of resources and mobility issues. Two categories can be identified for mobile and ubiquitous mining of streaming data: single-node and distributed. This survey covers both categories. Mining mobile and ubiquitous data requires algorithms with the ability to monitor and adapt the working conditions to the available computational resources. We identify the key characteristics of these algorithms and present illustrative applications. Distributed data stream mining in the mobile environment is then discussed, presenting the Pocket Data Mining framework. Mobility of users stimulates the adoption of context-awareness in this area of research. Context-awareness and collaboration are discussed in Collaborative Data Stream Mining, where agents share knowledge to learn adaptive, accurate models. Conflict of interest: The authors have declared no conflicts of interest for this article.

2014

Challenges in Learning from Streaming Data Extended Abstract

Authors
Gama, J;

Publication
ICT Innovations 2014 - World of Data, Ohrid, Macedonia, 1-4 October, 2014

Abstract
Machine learning studies automatic methods for acquisition of domain knowledge with the goal of improving system performance as the result of experience. In the past two decades, machine learning research and practice have focused on batch learning, usually with small data sets. The rationale behind this practice is that examples are generated at random according to some stationary probability distribution. Most learners use a greedy, hill-climbing search in the space of models. They are prone to overfitting, local maxima, etc. Data are scarce and statistical estimates have high variance. A paradigmatic example is the TDIT algorithm to learn decision trees [14]. As the tree grows, fewer and fewer examples are available to compute the sufficient statistics, and variance increases, leading to model instability. Moreover, the growing process re-uses the same data, exacerbating the overfitting problem. Regularization and pruning mechanisms are mandatory. © Springer International Publishing Switzerland 2015.

2014

Ensembles of Adaptive Model Rules from High-Speed Data Streams

Authors
Duarte, J; Gama, J;

Publication
Proceedings of the 3rd International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications, BigMine 2014, New York City, USA, August 24, 2014

2014

Keynote speakers

Authors
Gama, J;

Publication
IEEE Symposium on Computers and Communications, ISCC 2014, Funchal, Madeira, Portugal, June 23-26, 2014
