Publications

Publications by LIAAD

2015

Links between Scores, Real Default and Pricing: Evidence from the Freddie Mac’s Loan-Level Dataset

Authors
Rocha Sousa, M; Gama, J; Brandão, E;

Publication
Journal of Economics, Business and Management

Abstract

2015

Special track on data streams

Authors
Rodrigues, PP; Bifet, A; Krishnaswamy, S; Gama, J;

Publication
Proceedings of the ACM Symposium on Applied Computing

Abstract

2015

Keynote speaker 2: Real time data mining

Authors
Gama, J;

Publication
2015 IEEE International Conference on Evolving and Adaptive Intelligent Systems, EAIS 2015, Douai, France, December 1-3, 2015

Abstract

2015

Online tree-based ensembles and option trees for regression on evolving data streams

Authors
Ikonomovska, E; Gama, J; Dzeroski, S;

Publication
NEUROCOMPUTING

Abstract
The emergence of ubiquitous sources of streaming data has given rise to the popularity of algorithms for online machine learning. In that context, Hoeffding trees represent the state-of-the-art algorithms for online classification. Their popularity stems in large part from their ability to process large quantities of data with a speed that goes beyond the processing power of any other streaming or batch learning algorithm. As a consequence, Hoeffding trees have often been used as base models of many ensemble learning algorithms for online classification. However, despite the existence of many algorithms for online classification, ensemble learning algorithms for online regression do not exist. In particular, the field of online any-time regression analysis seems to have experienced a serious lack of attention. In this paper, we address this issue through a study and an empirical evaluation of a set of online algorithms for regression, which includes the baseline Hoeffding-based regression trees, online option trees, and an online least mean squares filter. We also design, implement and evaluate two novel ensemble learning methods for online regression: online bagging with Hoeffding-based model trees, and an online RandomForest method in which we have used a randomized version of the online model tree learning algorithm as a basic building block. Within the study presented in this paper, we evaluate the proposed algorithms along several dimensions: predictive accuracy and quality of models, time and memory requirements, bias-variance and bias-variance-covariance decomposition of the error, and responsiveness to concept drift.
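As a rough illustration of the online bagging idea mentioned in the abstract, the sketch below follows the common Poisson(1) resampling scheme: each incoming example is shown to each ensemble member k times, with k drawn from a Poisson distribution with mean 1. It is not the authors' implementation. To keep the example short, the base learner is a plain online least mean squares filter rather than a Hoeffding-based model tree, and all class names, function names, and parameter values (`LMSRegressor`, `OnlineBagging`, `poisson1`, the learning rate, the synthetic stream in `demo`) are assumptions made for illustration.

```python
import math
import random


class LMSRegressor:
    """Online least mean squares filter: a linear model updated one example at a time."""

    def __init__(self, n_features, learning_rate=0.05):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate  # assumed value, not taken from the paper

    def predict(self, x):
        return self.bias + sum(w * xi for w, xi in zip(self.weights, x))

    def learn_one(self, x, y):
        # One stochastic gradient step on the squared error of this example.
        error = y - self.predict(x)
        self.bias += self.learning_rate * error
        self.weights = [w + self.learning_rate * error * xi
                        for w, xi in zip(self.weights, x)]


def poisson1(rng):
    """Draw from a Poisson distribution with mean 1 (Knuth's multiplication method)."""
    threshold = math.exp(-1.0)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


class OnlineBagging:
    """Online bagging: each member sees each example Poisson(1) times."""

    def __init__(self, n_models, n_features, seed=0):
        self.rng = random.Random(seed)
        self.models = [LMSRegressor(n_features) for _ in range(n_models)]

    def predict(self, x):
        # The ensemble prediction is the average of the member predictions.
        return sum(m.predict(x) for m in self.models) / len(self.models)

    def learn_one(self, x, y):
        for model in self.models:
            for _ in range(poisson1(self.rng)):
                model.learn_one(x, y)


def demo():
    """Train on a synthetic, noise-free linear stream and predict one point."""
    rng = random.Random(42)
    ensemble = OnlineBagging(n_models=10, n_features=2, seed=1)
    for _ in range(5000):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        y = 2.0 * x[0] - 1.0 * x[1] + 0.5  # hypothetical target function
        ensemble.learn_one(x, y)
    return ensemble.predict([0.5, -0.5])
```

On the synthetic stream in `demo`, the ensemble prediction for `[0.5, -0.5]` should approach the true value 2.0. Swapping `LMSRegressor` for an incremental tree learner would bring the sketch closer to the setting described in the abstract.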

2015

Data Mining Frequent Temporal Events In Agrieconomic Time Series

Authors
Correa, FE; Gama, J; Correa, PLP; Alves, LRA;

Publication
IEEE LATIN AMERICA TRANSACTIONS

Abstract
Agricultural commodities are important to the economies of several countries, especially Brazil. Despite the amount of money involved, it is known that agribusiness activities lack accurate information across the whole process. Therefore some research centers in Brazil, such as the Center for Advanced Studies on Applied Economics (CEPEA), collect and provide daily price indices for several agricultural products and disseminate this information to researchers, markets, producers, and public policy makers. The idea is to understand the evolution and patterns of the time series of grain price indices over seven years. The aim of this paper is to find common patterns in these time series, i.e., to highlight events that happen frequently over seven years of daily grain price quotations for several products. The results give an understanding of the dynamics of these grain time series; one important aspect detected is that these products compete for crop fields.

2015

Prediction Intervals for Electric Load Forecast: Evaluation for Different Profiles

Authors
Almeida, V; Gama, J;

Publication
2015 18TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEM APPLICATION TO POWER SYSTEMS (ISAP)

Abstract
Electricity industries throughout the world have been using load profiles for many years. Electrical load data contain valuable information that can be useful for both electricity producers and consumers. Load forecasting is a fundamental task for operating power systems efficiently and economically. Prediction intervals (PIs) are becoming increasingly important relative to point forecasts, which cannot properly handle forecast uncertainty, because PIs can balance informativeness and correctness. This paper aims to demonstrate that different demand profiles clearly influence PI reliability and width. The evaluation is performed using data from different customers, grouped by their electricity consumption behavior using hierarchical clustering, with the Kullback-Leibler divergence as the distance metric. PIs are obtained using two different strategies: (1) the dual perturb and combine algorithm and (2) conformal prediction. It was possible to demonstrate that different demand profiles clearly influence PI reliability and width for both models. The knowledge retrieved from the analysis of the load patterns can support the selection of the best method for interval forecasting at a specific location. It can also support the selection of an optimum confidence level, considering that an overly wide PI conveys little information and is of no use for decision making.
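To make the two evaluation criteria named in the abstract (PI reliability and width) concrete, here is a minimal sketch of one common conformal variant, split conformal prediction, together with an empirical coverage and mean width check. The abstract does not say which conformal variant the authors used, so this choice is an assumption, and the function names (`split_conformal_interval`, `coverage_and_width`) and the example numbers are hypothetical.

```python
import math


def split_conformal_interval(calibration_residuals, point_forecast, confidence=0.9):
    """Symmetric interval around a point forecast from held-out forecast residuals."""
    scores = sorted(abs(r) for r in calibration_residuals)
    n = len(scores)
    # Finite-sample rank used by split conformal prediction.
    rank = math.ceil((n + 1) * confidence)
    if rank > n:
        raise ValueError("too few calibration points for this confidence level")
    half_width = scores[rank - 1]
    return point_forecast - half_width, point_forecast + half_width


def coverage_and_width(intervals, actuals):
    """Empirical reliability (share of actual values inside the PI) and mean PI width."""
    hits = sum(lower <= y <= upper for (lower, upper), y in zip(intervals, actuals))
    mean_width = sum(upper - lower for lower, upper in intervals) / len(intervals)
    return hits / len(actuals), mean_width


# Hypothetical calibration residuals (actual load minus forecast) for one customer profile.
residuals = [-1, 2, -3, 4, -5, 6, -7, 8, -9, 10]
interval = split_conformal_interval(residuals, point_forecast=100.0, confidence=0.8)
```

Running the same two functions separately on each cluster of customers would give a per-profile view of reliability and width, which is the kind of comparison the abstract describes.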
