Publications

Publications by LIAAD

2013

On Predicting the Taxi-Passenger Demand: A Real-Time Approach

Authors
Moreira Matias, L; Gama, J; Ferreira, M; Mendes Moreira, J; Damas, L;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE, EPIA 2013

Abstract
Informed driving is becoming a key feature to increase the sustainability of taxi companies. Some recent works explore the data broadcast by each vehicle to provide live information for decision making. In this paper, we propose a method that employs a learning model based on historical GPS data in a real-time environment. Our goal is to predict the spatiotemporal distribution of taxi-passenger demand over a short time horizon. We did so by using learning concepts originally proposed for a well-known online algorithm: the perceptron [1]. The results were promising: we achieved satisfactory performance in producing the next prediction using a small amount of resources.


2013

Special track on data streams

Authors
Rodrigues, PP; Bifet, A; Krishnaswamy, S; Gama, J;

Publication
Proceedings of the ACM Symposium on Applied Computing

Abstract

2013

Adaptive model rules from data streams

Authors
Almeida, E; Ferreira, C; Gama, J;

Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Abstract
Decision rules are one of the most expressive languages for machine learning. In this paper we present Adaptive Model Rules (AMRules), the first streaming rule learning algorithm for regression problems. In AMRules the antecedent of a rule is a conjunction of conditions on the attribute values, and the consequent is a linear combination of attribute values. Each rule uses a Page-Hinkley test to detect changes in the process generating data and react to changes by pruning the rule set. In the experimental section we report the results of AMRules on benchmark regression problems, and compare the performance of our system with other streaming regression algorithms. © 2013 Springer-Verlag.
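The Page-Hinkley test mentioned in the abstract is a standard sequential change detector. The following is a minimal sketch of the textbook form of the test, not the AMRules implementation: it assumes the monitored signal is a stream of per-example errors, and the class name `PageHinkley`, the helper `first_alarm`, and the values of `delta` and `threshold` are illustrative assumptions that do not come from the paper.

```python
class PageHinkley:
    """Textbook Page-Hinkley test for an upward shift in the mean of a stream.

    delta and threshold are illustrative defaults (assumptions), not values
    taken from the AMRules paper.
    """

    def __init__(self, delta=0.005, threshold=50.0):
        self.delta = delta          # tolerated magnitude of change
        self.threshold = threshold  # alarm threshold (often written lambda)
        self.reset()

    def reset(self):
        self.n = 0          # number of observations seen
        self.mean = 0.0     # running mean of the observations
        self.cum = 0.0      # cumulative deviation m_t
        self.min_cum = 0.0  # minimum of m_t seen so far (M_t)

    def update(self, x):
        """Add one observation; return True if a change is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.min_cum = min(self.min_cum, self.cum)
        return self.cum - self.min_cum > self.threshold


def first_alarm(stream, **kwargs):
    """Return the index of the first alarm in stream, or None if none occurs."""
    detector = PageHinkley(**kwargs)
    for i, x in enumerate(stream):
        if detector.update(x):
            return i
    return None


if __name__ == "__main__":
    # Hypothetical error stream: stable at 1.0, then jumping to 10.0.
    errors = [1.0] * 200 + [10.0] * 200
    print(first_alarm(errors))
```

In a rule learner of the kind described, one such detector could be kept per rule and fed that rule's prediction error, with an alarm triggering removal of the rule; that wiring is an assumption here, since the abstract only states that changes lead to pruning the rule set.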


2013

Contextual Anomalies in Medical Data

Authors
Vasco, D; Rodrigues, PP; Gama, J;

Publication
2013 IEEE 26TH INTERNATIONAL SYMPOSIUM ON COMPUTER-BASED MEDICAL SYSTEMS (CBMS)

Abstract
Anomalies in data can cause many problems in data analysis. Thus, it is necessary to improve data quality by detecting and eliminating errors and inconsistencies in the data, a process known as data cleaning [1]. Since detecting and correcting anomalies requires detailed domain knowledge, the involvement of experts in the field is essential to the success of data cleaning. However, given the size of the data to be processed, the process should be as automatic as possible so as to minimize the time spent [1]. © 2013 IEEE.


2013

Random rules from data streams

Authors
Almeida, E; Kosina, P; Gama, J;

Publication
Proceedings of the ACM Symposium on Applied Computing

Abstract
Existing works suggest that random inputs and random features produce good results in classification. In this paper we study the problem of generating random rule sets from data streams. One of the most interpretable and flexible models for data stream mining prediction tasks is the Very Fast Decision Rules learner (VFDR). In this work we extend the VFDR algorithm using random rules from data streams. The proposed algorithm generates several sets of rules. Each rule set is associated with a set of Natt attributes. The proposed algorithm maintains all properties required when learning from stationary data streams: online and any-time classification, processing each example once. Copyright 2013 ACM.


2013

Avoiding Anomalies in Data Stream Learning

Authors
Gama, J; Kosina, P; Almeida, E;

Publication
DISCOVERY SCIENCE

Abstract
The presence of anomalies in data compromises data quality and can reduce the effectiveness of learning algorithms. Standard data mining methodologies treat data cleaning as a pre-processing step before the learning task. The problem of data cleaning is exacerbated when learning in the computational model of data streams. In this paper we present a streaming algorithm for learning classification rules that is able to detect contextual anomalies in the data. Contextual anomalies are surprising attribute values in the context defined by the conditional part of the rule. For each example we compute the degree of anomaliness based on the probability of the attribute values given the conditional part of the rule covering the example. Examples with a high degree of anomaliness are signaled to the user and not used to train the classifier. The experimental evaluation on real-world data sets shows the ability to discover anomalous examples in the data. The main advantage of the proposed method is the ability to report the context and explain why the anomaly occurs.
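The idea of scoring an example by the probability of its attribute values within a rule's context can be illustrated with a small sketch. The abstract does not give the scoring formula, so everything below is an assumption made for illustration: the class name `RuleContext`, the restriction to categorical attributes, the Laplace-smoothed frequency estimate, and the choice of averaging `1 - P(value | rule)` over attributes.

```python
from collections import Counter, defaultdict


class RuleContext:
    """Value counts for the examples covered by one rule (hypothetical helper)."""

    def __init__(self):
        self.n = 0                          # examples covered so far
        self.counts = defaultdict(Counter)  # attribute -> value -> count

    def update(self, example):
        """Record a covered example given as a dict of attribute -> value."""
        self.n += 1
        for attr, value in example.items():
            self.counts[attr][value] += 1

    def prob(self, attr, value):
        """Laplace-smoothed estimate of P(attr = value | rule covers example)."""
        k = len(self.counts[attr]) + 1  # seen values plus one slot for unseen
        return (self.counts[attr][value] + 1) / (self.n + k)

    def anomaly_degree(self, example):
        """Assumed score: mean of 1 - P(value | rule) over the attributes."""
        scores = [1.0 - self.prob(a, v) for a, v in example.items()]
        return sum(scores) / len(scores)


if __name__ == "__main__":
    ctx = RuleContext()
    for _ in range(50):
        ctx.update({"color": "red", "size": "small"})
    print(ctx.anomaly_degree({"color": "red", "size": "small"}))
    print(ctx.anomaly_degree({"color": "blue", "size": "small"}))
```

Under this assumed score, an example whose values are rare among those the rule has covered receives a higher degree, and the per-attribute probabilities indicate which value was surprising, which is the kind of explanation the abstract refers to.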
