Publications

Publications by Carlos Manuel Soares

2015

Improving the accuracy of long-term travel time prediction using heterogeneous ensembles

Authors
Mendes Moreira, J; Jorge, AM; de Sousa, JF; Soares, C;

Publication
NEUROCOMPUTING

Abstract
This paper is about long-term travel time prediction in public transportation. However, it can be useful for a wider area of applications. It follows a heterogeneous ensemble approach with dynamic selection. A vast set of experiments with a pool of 128 tuples of algorithms and parameter sets (a&ps) has been conducted for each of the six studied routes. Three different algorithms, namely, random forest, projection pursuit regression and support vector machines, were used. Then, ensembles of different sizes were obtained after a pruning step. The best approach to combine the outputs is also addressed. Finally, the best ensemble approach for each of the six routes is compared with the best individual a&ps. The results confirm that heterogeneous ensembles are adequate for long-term travel time prediction. Namely, they achieve both higher accuracy and robustness over time than state-of-the-art learners.


2015

Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2015, Porto, Portugal, September 7-11, 2015, Proceedings, Part I

Authors
Appice, A; Rodrigues, PP; Costa, VS; Soares, C; Gama, J; Jorge, A;

Publication
ECML/PKDD (1)


2015

Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2015, Porto, Portugal, September 7-11, 2015, Proceedings, Part II

Authors
Appice, A; Rodrigues, PP; Costa, VS; Gama, J; Jorge, A; Soares, C;

Publication
ECML/PKDD (2)


2015

Metalearning for multiple-domain transfer learning

Authors
Félix, C; Soares, C; Jorge, A;

Publication
CEUR Workshop Proceedings

Abstract
Machine learning processes consist of collecting data, obtaining a model and applying it to a given task. Given a new task, the standard approach is to restart the learning process and obtain a new model. However, previous learning experience can be exploited to assist the new learning process. The two most studied approaches for this are metalearning and transfer learning. Metalearning can be used for selecting the predictive model to use on a given dataset. Transfer learning allows the reuse of knowledge from previous tasks. Our aim is to use metalearning to support transfer learning and reduce the computational cost without loss in terms of performance, as well as the user effort needed for the algorithm selection. In this paper we propose some methods for mapping the transfer of weights between neural networks to improve the performance of the target network, and describe some experiments performed in order to test our hypothesis.


2015

Metalearning to Choose the Level of Analysis in Nested Data: A Case Study on Error Detection in Foreign Trade Statistics

Authors
Zarmehri, MN; Soares, C;

Publication
2015 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN)

Abstract
Traditionally, a single model is developed for a data mining task. As more data is being collected at a more detailed level, organizations are becoming more interested in having specific models for distinct parts of data (e.g., customer segments). From the business perspective, data can be divided naturally into different dimensions. Each of these dimensions is usually hierarchically organized (e.g., country, city, zip code), which means that, when developing a model for a given part of the problem (e.g., a zip code), the training data may be collected at different levels of this nested hierarchy (e.g., the same zip code, the city and the country it is located in). Selecting different levels of granularity may change the performance of the whole process, so the question is which level to use for a given part. We propose a metalearning model which recommends a level of granularity for the training data to learn the model that is expected to obtain the best performance. We apply decision tree and random forest algorithms for metalearning. At the base level, our experiment uses results obtained by outlier detection methods on the problem of detecting errors in foreign trade transactions. The results show that using metalearning helps find the best level of granularity.


2016

Meta-learning to select the best meta-heuristic for the Traveling Salesman Problem: A comparison of meta-features

Authors
Kanda, J; de Carvalho, A; Hruschka, E; Soares, C; Brazdil, P;

Publication
NEUROCOMPUTING

Abstract
The Traveling Salesman Problem (TSP) is one of the most studied optimization problems. Various meta-heuristics (MHs) have been proposed and investigated on many instances of this problem. It is widely accepted that the best MH varies for different instances. Ideally, one should be able to recommend the best MHs for a new TSP instance without having to execute them. However, this is a very difficult task. We address this task by using a meta-learning approach based on label ranking algorithms. These algorithms build a mapping that relates the characteristics of those instances (i.e., the meta-features) with the relative performance (i.e., the ranking) of MHs, based on (meta-)data extracted from TSP instances that have been already solved by those MHs. The success of this approach depends on the quality of the meta-features that describe the instances. In this work, we investigate four different sets of meta-features based on different measurements of the properties of TSP instances: edge and vertex measures, complex network measures, properties from the MHs, and subsampling landmarkers properties. The models are investigated in four different TSP scenarios presenting symmetry and connection strength variations. The experimental results indicate that meta-learning models can accurately predict rankings of MHs for different TSP scenarios. Good solutions for the investigated TSP instances can be obtained from the prediction of rankings of MHs, regardless of the learning algorithm used at the meta level. The experimental results also show that the definition of the set of meta-features has an important impact on the quality of the solutions obtained.
