Publications

Publications by LIAAD

2012

Holistic distributed stream clustering for smart grids

Authors
Rodrigues, PP; Gama, J;

Publication
CEUR Workshop Proceedings

Abstract
Smart grids consist of millions of automated electronic meters that will be installed in electricity distribution networks and connected to servers that will manage grid supervision, billing and customer services. World sustainability regarding energy management will definitely rely on such grids, so smart grids also need to be sustainable themselves. This sustainability depends on several research problems that emerge from this new setting (from power balance to energy markets), requiring new approaches for knowledge discovery and decision support. This paper presents a holistic distributed stream clustering view of possible solutions for those problems, supported by previous research in related domains. The approach is based on two orthogonal clustering algorithms, combined for a holistic clustering of the grid. Experimental results are included to illustrate the benefits of each algorithm, while the proposal is discussed in terms of application to smart grid problems. This holistic approach could be used to help solve some of the research problems of the smart grid intelligent layer, thus improving global sustainability.

2012

Semi-supervised learning: Predicting activities in Android environment

Authors
Lopes, A; Mendes Moreira, J; Gama, J;

Publication
CEUR Workshop Proceedings

Abstract
Predicting activities from data gathered with sensors has gained importance over the years, with the objective of getting a better understanding of the human body. The purpose of this paper is to show that predicting activities on an Android phone is possible. We take into consideration different classifiers, their accuracy using different approaches (hierarchical and one-step classification) and limitations of the mobile device itself, such as battery and memory usage. A semi-supervised learning approach is taken in order to compare its results against supervised learning. The objective is to discover whether the application can be adapted to the user, providing a better solution for this problem. The activities predicted are the most usual in everyday life: walking, running, standing idle and sitting. An Android prototype, embedding the software MOA, was developed to experimentally evaluate the ideas proposed here.

2012

A survey on learning from data streams: current and future trends

Authors
Gama, J;

Publication
Progress in AI

Abstract
Nowadays, there are applications in which the data are modeled best not as persistent tables, but rather as transient data streams. In this article, we discuss the limitations of current machine learning and data mining algorithms. We discuss the fundamental issues in learning in dynamic environments, such as continuously maintaining learning models that evolve over time, learning and forgetting, concept drift and change detection. Data streams produce a huge amount of data that introduces new constraints in the design of learning algorithms: limited computational resources in terms of memory, CPU power, and communication bandwidth. We present some illustrative algorithms, designed to take these constraints into account, for decision-tree learning, hierarchical clustering and frequent pattern mining. We identify the main issues and current challenges that emerge in learning from data streams and that open research lines for further developments. © 2011 Springer-Verlag.

2012

A framework to monitor clusters evolution applied to economy and finance problems

Authors
Oliveira, M; Gama, J;

Publication
INTELLIGENT DATA ANALYSIS

Abstract
The study of evolution has become an important research issue, especially in the last decade, due to our ability to collect and store highly detailed and time-stamped data. The need for describing and understanding the behavior of a given phenomenon over time led to the emergence of new frameworks and methods focused on the temporal evolution of data and models. In this paper we address the problem of monitoring the evolution of clusters over time and propose the MEC framework. MEC traces evolution through the detection and categorization of cluster transitions, such as births, deaths and merges, and enables their visualization through bipartite graphs. It includes a taxonomy of transitions, a tracking method based on the computation of conditional probabilities, and a transition detection algorithm. We use MEC with two main goals: to determine the general evolution trends and to detect abnormal behavior or rare events. To demonstrate the applicability of our framework we present real-world economic and financial case studies, using datasets extracted from the Banco de Portugal Central Balance-Sheet Database and The Data Page of New York University - Leonard N. Stern School of Business. The results allow us to draw interesting conclusions about the evolution of activity sectors and European companies.
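
As a rough illustration of the tracking idea in this abstract, the sketch below estimates the conditional probability that members of an old cluster land in a new cluster, then labels births, deaths, merges and survivals. It is not the authors' MEC algorithm: the function names, the dictionary-of-sets representation, and the threshold `tau=0.5` are all assumptions made for this example.

```python
from collections import defaultdict


def conditional_probabilities(old, new):
    """P(new cluster j | old cluster i) = |C_i ∩ C_j| / |C_i| (illustrative estimate)."""
    return {
        (i, j): len(a & b) / len(a)
        for i, a in old.items()
        for j, b in new.items()
        if a  # skip empty old clusters to avoid division by zero
    }


def detect_transitions(old, new, tau=0.5):
    """Label transitions between two clusterings given as {name: set_of_object_ids}.

    tau is a hypothetical matching threshold, not a value taken from the paper.
    """
    probs = conditional_probabilities(old, new)
    matches = defaultdict(list)  # new cluster -> old clusters mapped onto it
    deaths = []
    for i in old:
        # Best candidate successor of old cluster i.
        best_j = max(new, key=lambda j: probs.get((i, j), 0.0), default=None)
        if best_j is not None and probs.get((i, best_j), 0.0) >= tau:
            matches[best_j].append(i)
        else:
            deaths.append(i)  # no new cluster retains enough of its members
    births = [j for j in new if j not in matches]
    merges = {j: olds for j, olds in matches.items() if len(olds) > 1}
    survivals = {olds[0]: j for j, olds in matches.items() if len(olds) == 1}
    return {"births": births, "deaths": deaths, "merges": merges, "survivals": survivals}


old = {"A": {1, 2, 3, 4}, "B": {5, 6, 7}, "C": {8, 9}, "D": {20, 21}}
new = {"X": {1, 2, 3, 4, 5, 6, 7}, "Y": {8, 9, 10}, "Z": {11, 12}}
result = detect_transitions(old, new)
print(result)
```

In this toy input, clusters A and B are absorbed into X, C continues as Y, D disappears, and Z appears with no predecessor. Splits are left out to keep the sketch short.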

2012

An overview of social network analysis

Authors
Oliveira, M; Gama, J;

Publication
WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY

Abstract
Data mining is being increasingly applied to social networks. Two relevant reasons are the growing availability of large volumes of relational data, boosted by the proliferation of social media web sites, and the intuition that an individual's connections can yield richer information than his/her isolated attributes. This synergistic combination can prove germane to a variety of applications such as churn prediction, fraud detection and marketing campaigns. This paper attempts to provide a general and succinct overview of the essentials of social network analysis for those interested in taking a first look at this area and oriented to use data mining in social networks. © 2012 Wiley Periodicals, Inc.

2012

Estimating reliability for assessing and correcting individual streaming predictions

Authors
Rodrigues, PPE; Bosnic, Z; Gama, J; Kononenko, I;

Publication
Reliable Knowledge Discovery

Abstract
Several predictive systems are nowadays vital for operations and decision support. The quality of these systems is most of the time defined by their average accuracy, which carries little or no information about the estimated error of each individual prediction. In these cases, users should be allowed to associate a measure of reliability with each prediction. However, with the advent of data streams, batch state-of-the-art reliability estimates need to be redefined. In this chapter we adapt and evaluate five empirical measures for online reliability estimation of individual predictions: similarity-based (k-NN) error, local sensitivity (bias and variance) and online bagging predictions (bias and variance). Evaluation is performed with a neural network base model on two different problems, with results showing that online bagging and k-NN estimates are consistently correlated with the error of the base model. Furthermore, we propose an approach for correcting individual predictions based on the CNK reliability estimate. Evaluation is done on a real-world problem (prediction of the electricity load for a selected European geographical region), using two different regression models: a neural network and the k-nearest neighbors algorithm. Comparison is performed with corrections based on the Kalman filter. The results show that our method performs better than the Kalman filter, significantly improving the original predictions to more accurate values.
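
The abstract does not define the CNK estimate. The sketch below assumes the common reading of it as a local error estimate: the mean label of the k nearest labeled examples minus the model's prediction. The correction step, its `weight` parameter and all function names are hypothetical and are not the chapter's method.

```python
import math


def cnk_estimate(query, prediction, labeled_examples, k=3):
    """Signed local error estimate: mean label of the k nearest neighbors minus the prediction.

    labeled_examples is a list of (feature_tuple, label) pairs (assumed layout).
    """
    if not labeled_examples:
        raise ValueError("need at least one labeled example")
    neighbors = sorted(labeled_examples, key=lambda ex: math.dist(ex[0], query))[:k]
    mean_label = sum(label for _, label in neighbors) / len(neighbors)
    return mean_label - prediction


def corrected_prediction(prediction, cnk, weight=0.5):
    """Shift the prediction toward the neighborhood mean; weight is a hypothetical knob."""
    return prediction + weight * cnk


examples = [((0.0,), 1.0), ((1.0,), 2.0), ((2.0,), 3.0), ((10.0,), 50.0)]
estimate = cnk_estimate((1.0,), prediction=1.0, labeled_examples=examples, k=3)
print(estimate, corrected_prediction(1.0, estimate))
```

In a streaming setting the labeled examples would typically come from a bounded buffer of recent observations, not the full history; that choice is also an assumption of this sketch.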
