Publications

Publications by João Gama

2015

Keynote speaker 2: Real time data mining

Authors
Gama, J;

Publication
2015 IEEE International Conference on Evolving and Adaptive Intelligent Systems, EAIS 2015, Douai, France, December 1-3, 2015

Abstract

2015

Online tree-based ensembles and option trees for regression on evolving data streams

Authors
Ikonomovska, E; Gama, J; Dzeroski, S;

Publication
NEUROCOMPUTING

Abstract
The emergence of ubiquitous sources of streaming data has given rise to the popularity of algorithms for online machine learning. In that context, Hoeffding trees represent the state-of-the-art algorithms for online classification. Their popularity stems in large part from their ability to process large quantities of data with a speed that goes beyond the processing power of any other streaming or batch learning algorithm. As a consequence, Hoeffding trees have often been used as base models of many ensemble learning algorithms for online classification. However, despite the existence of many algorithms for online classification, ensemble learning algorithms for online regression do not exist. In particular, the field of online any-time regression analysis seems to have experienced a serious lack of attention. In this paper, we address this issue through a study and an empirical evaluation of a set of online algorithms for regression, which includes the baseline Hoeffding-based regression trees, online option trees, and an online least mean squares filter. We also design, implement and evaluate two novel ensemble learning methods for online regression: online bagging with Hoeffding-based model trees, and an online RandomForest method in which we have used a randomized version of the online model tree learning algorithm as a basic building block. Within the study presented in this paper, we evaluate the proposed algorithms along several dimensions: predictive accuracy and quality of models, time and memory requirements, bias-variance and bias-variance-covariance decomposition of the error, and responsiveness to concept drift.

2013

Novelty detection algorithm for data streams multi-class problems

Authors
Faria, ER; Gama, J; Carvalho, APLF;

Publication
Proceedings of the ACM Symposium on Applied Computing

Abstract
Novelty detection has been presented in the literature as a one-class problem. In this case, new examples are classified as either belonging to the target class or not. The examples not explained by the model are detected as belonging to a class named novelty. However, novelty detection is much more general, especially in data stream scenarios, where the number of classes might be unknown before learning and new classes can appear at any time. In this case, the novelty concept is composed of different classes. This work presents a new algorithm to address novelty detection in multi-class data stream problems, the MINAS algorithm. Moreover, we also present a new experimental methodology to evaluate novelty detection methods in multi-class problems. The data used in the experiments include artificial and real data sets. Experimental results show that MINAS is able to discover novelties in multi-class problems. Copyright 2013 ACM.

2014

Data stream mining in ubiquitous environments: state-of-the-art and current directions

Authors
Gaber, MM; Gama, J; Krishnaswamy, S; Gomes, JB; Stahl, F;

Publication
WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY

Abstract
In this article, we review the state-of-the-art techniques in mining data streams for mobile and ubiquitous environments. We start the review with a concise background of data stream processing, presenting the building blocks for mining data streams. In a wide range of applications, data streams are required to be processed on small ubiquitous devices like smartphones and sensor devices. Mobile and ubiquitous data mining targets these applications with tailored techniques and approaches addressing scarcity of resources and mobility issues. Two categories can be identified for mobile and ubiquitous mining of streaming data: single-node and distributed. This survey will cover both categories. Mining mobile and ubiquitous data requires algorithms with the ability to monitor and adapt the working conditions to the available computational resources. We identify the key characteristics of these algorithms and present illustrative applications. Distributed data stream mining in the mobile environment is then discussed, presenting the Pocket Data Mining framework. Mobility of users stimulates the adoption of context-awareness in this area of research. Context-awareness and collaboration are discussed in the Collaborative Data Stream Mining, where agents share knowledge to learn adaptive accurate models. Conflict of interest: The authors have declared no conflicts of interest for this article.

2015

Data Mining Frequent Temporal Events In Agrieconomic Time Series

Authors
Correa, FE; Gama, J; Correa, PLP; Alves, LRA;

Publication
IEEE LATIN AMERICA TRANSACTIONS

Abstract
Agricultural commodities are important to the economies of several countries, especially Brazil. Despite the amount of money involved, it is known that agribusiness activities lack accurate information throughout the process. Therefore some research centers in Brazil, such as the Center for Advanced Studies on Applied Economics (CEPEA), collect and provide daily price indices for several agricultural products and disseminate this information to researchers, markets, producers, and public policy makers. The idea is to understand the evolution and patterns of the time series of grain price indices over seven years. The aim of this paper is to find common patterns in the time series, i.e., to highlight events that happen frequently over seven years of daily grain price quotations for several products. The results give an understanding of the dynamics of these grain time series; one important aspect detected was that these products compete for crop fields.

2013

Data stream mining: The bounded rationality

Authors
Gama, J;

Publication
Informatica (Slovenia)

Abstract
Developments in information and communication technologies are dramatically changing data collection and processing methods. Data mining is now moving to the era of bounded rationality. In this work we discuss the implications of the resource constraints imposed by the data stream computational model on the design of learning algorithms. We analyze the behavior of stream mining algorithms and present future research directions, including ubiquitous stream mining and self-adaptive models.
