Publications

Publications by Carlos Manuel Soares

2004

A meta-learning method to select the kernel width in Support Vector Regression

Authors
Soares, C; Brazdil, PB; Kuba, P;

Publication
MACHINE LEARNING

Abstract
The Support Vector Machine algorithm is sensitive to the choice of parameter settings. If these are not set correctly, the algorithm may have a substandard performance. Suggesting a good setting is thus an important problem. We propose a meta-learning methodology for this purpose and exploit information about the past performance of different settings. The methodology is applied to set the width of the Gaussian kernel. We carry out an extensive empirical evaluation, including comparisons with other methods (fixed default ranking; selection based on cross-validation and a heuristic method commonly used to set the width of the SVM kernel). We show that our methodology can select settings with low error while providing significant savings in time. Further work should be carried out to see how the methodology could be adapted to different parameter setting tasks.

2012

Combining meta-learning and search techniques to select parameters for support vector machines

Authors
Gomes, TAF; Prudencio, RBC; Soares, C; Rossi, ALD; Carvalho, A;

Publication
NEUROCOMPUTING

Abstract
Support Vector Machines (SVMs) have achieved very good performance on different learning problems. However, the success of SVMs depends on the adequate choice of the values of a number of parameters (e.g., the kernel and regularization parameters). In the current work, we propose the combination of meta-learning and search algorithms to deal with the problem of SVM parameter selection. In this combination, given a new problem to be solved, meta-learning is employed to recommend SVM parameter values based on parameter configurations that have been successfully adopted in previous similar problems. The parameter values returned by meta-learning are then used as initial search points by a search technique, which will further explore the parameter space. In this proposal, we envisioned that the initial solutions provided by meta-learning are located in good regions of the search space (i.e. they are closer to optimum solutions). Hence, the search algorithm would need to evaluate a lower number of candidate solutions when looking for an adequate solution. In this work, we investigate the combination of meta-learning with two search algorithms: Particle Swarm Optimization and Tabu Search. The implemented hybrid algorithms were used to select the values of two SVM parameters in the regression domain. These combinations were compared with the use of the search algorithms without meta-learning. The experimental results on a set of 40 regression problems showed that, on average, the proposed hybrid methods obtained lower error rates when compared to their components applied in isolation.
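
The warm-start idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the meta-feature vectors, the nearest-neighbour recommendation, the function names, and the toy objective are all assumptions, and a simple hill climber stands in for Particle Swarm Optimization and Tabu Search.

```python
import math

def nearest_configs(new_meta, past_problems, k=2):
    """Return the best-known parameter settings of the k past problems whose
    meta-features are closest (Euclidean distance) to those of the new problem."""
    ranked = sorted(past_problems, key=lambda p: math.dist(new_meta, p["meta"]))
    return [p["best_params"] for p in ranked[:k]]

def local_search(objective, start, step=0.5, iters=20):
    """Coordinate-wise hill climbing; a stand-in for PSO or Tabu Search."""
    best, best_err = tuple(start), objective(tuple(start))
    for _ in range(iters):
        improved = False
        for i in range(len(best)):
            for delta in (-step, step):
                cand = list(best)
                cand[i] += delta
                cand = tuple(cand)
                err = objective(cand)
                if err < best_err:
                    best, best_err = cand, err
                    improved = True
        if not improved:
            step /= 2  # refine the neighbourhood when no move helps
    return best, best_err

def warm_started_search(objective, new_meta, past_problems, k=2):
    """Start the search from each recommended setting and keep the best result."""
    starts = nearest_configs(new_meta, past_problems, k)
    return min((local_search(objective, s) for s in starts), key=lambda r: r[1])

# Hypothetical meta-data: meta-features and best (parameter 1, parameter 2)
# found on previously solved problems.
past = [
    {"meta": (0.0, 0.0), "best_params": (0.0, 0.0)},
    {"meta": (1.0, 1.0), "best_params": (1.0, -1.0)},
    {"meta": (5.0, 5.0), "best_params": (8.0, 8.0)},
]

def toy_error(params):
    """Toy error surface; in practice this would be a cross-validated SVM error."""
    return (params[0] - 1.0) ** 2 + (params[1] + 2.0) ** 2

best_params, best_error = warm_started_search(toy_error, (0.9, 1.1), past)
```

Because the search starts from settings that worked on similar problems, it would be expected to need fewer evaluations than a search started from arbitrary points, which is the premise stated in the abstract.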

2012

Integrating data mining and optimization techniques on surgery scheduling

Authors
Gomes, C; Almada Lobo, B; Borges, J; Soares, C;

Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Abstract
This paper presents a combination of optimization and data mining techniques to address the surgery scheduling problem. In this approach, we first develop a model to predict the duration of the surgeries using a data mining algorithm. The prediction model outcomes are then used by a mathematical optimization model to schedule surgeries in an optimal way. In this paper, we present the results of using three different data mining algorithms to predict the duration of surgeries and compare them with the estimates made by surgeons. The results obtained by the data mining models show an improvement in estimation accuracy of 36%. We also compare the schedules generated by the optimization model based on the estimates made by the prediction models against reality. Our approach enables an increase in the number of surgeries performed in the operating theater, thus allowing a reduction in the average waiting time for surgery and a reduction in the overtime and undertime per surgery performed. These results indicate that the proposed approach can help the hospital significantly improve the efficiency of resource usage and increase service levels. © Springer-Verlag 2012.

2012

Multilayer perceptron for label ranking

Authors
Ribeiro, G; Duivesteijn, W; Soares, C; Knobbe, A;

Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Abstract
Label Ranking problems are receiving increasing attention in machine learning. The goal is to predict not just a single value from a finite set of labels, but rather the permutation of that set that applies to a new example (e.g., the ranking of a set of financial analysts in terms of the quality of their recommendations). In this paper, we adapt a multilayer perceptron algorithm for label ranking. We focus on the adaptation of the Back-Propagation (BP) mechanism. Six approaches are proposed to estimate the error signal that is propagated by BP. The methods are discussed and empirically evaluated on a set of benchmark problems. © 2012 Springer-Verlag.

2010

Empirical evaluation of ranking prediction methods for gene expression data classification

Authors
De Souza, BF; De Carvalho, ACPLF; Soares, C;

Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Abstract
Recently, meta-learning techniques have been applied to the problem of algorithm recommendation for gene expression data classification. The advice provided to the user took the form of rankings, which are flexible enough to express a preference order of Machine Learning algorithms according to their expected relative performance. Thus, choosing how to learn accurate rankings arises as a key research issue. In this work, the authors empirically evaluated two general approaches for ranking prediction and extended them. The results obtained for 49 publicly available microarray datasets indicate that the extensions introduced were very beneficial to the quality of the predicted rankings. © 2010 Springer-Verlag.

2010

Intelligent Document Routing as a First Step towards Workflow Automation: A Case Study Implemented in SQL

Authors
Soares, C; Calejo, M;

Publication
LEVERAGING APPLICATIONS OF FORMAL METHODS, VERIFICATION, AND VALIDATION, PT I

Abstract
In large and complex organizations, the development of workflow automation projects is hard. In some cases, a first important step in that direction is the automation of the routing of incoming documents. In this paper, we describe a project to develop a system for the first routing of incoming letters to the right department within a large, public Portuguese institution. We followed a data mining approach, where data representing previous routings were analyzed to obtain a model that can be used to route future documents. The approach followed was strongly influenced by some of the limitations imposed by the customer: the budget available was small and the solution had to be developed in SQL to facilitate integration with the existing system. The system developed was able to obtain satisfactory results. However, as in any Data Mining project, most of the effort was dedicated to activities other than modelling (e.g., data preparation), which means that there is still plenty of room for improvement.
