Publications

Publications by Carlos Manuel Soares

2007

Applications of Data Mining in E-Business and Finance: Introduction

Authors
Soares, Carlos; Peng, Yonghong; Meng, Jun; Washio, Takashi; Zhou, ZhiHua;

Publication
Applications of Data Mining in E-Business and Finance

Abstract
This chapter introduces the volume on Applications of Data Mining in E-Business and Finance. It discusses how application-specific issues can affect the development of a data mining project. An overview of the chapters in the book is then given to guide the reader.

2011

Selection of algorithms to solve traveling salesman problems using meta-learning

Authors
Kanda, J; Carvalho, ACPLFd; Hruschka, ER; Soares, C;

Publication
Int. J. Hybrid Intell. Syst.

Abstract

2000

A comparison of ranking methods for classification algorithm selection

Authors
Brazdil, PB; Soares, C;

Publication
MACHINE LEARNING: ECML 2000

Abstract
We investigate the problem of using past performance information to select an algorithm for a given classification problem. We present three ranking methods for that purpose: average ranks, success rate ratios and significant wins. We also analyze the problem of evaluating and comparing these methods. The evaluation technique used is based on a leave-one-out procedure. On each iteration, the method generates a ranking using the results obtained by the algorithms on the training datasets. This ranking is then evaluated by calculating its distance from the ideal ranking built using the performance information on the test dataset. The distance measure adopted here, average correlation, is based on Spearman's rank correlation coefficient. To compare ranking methods, a combination of Friedman's test and Dunn's multiple comparison procedure is adopted. When applied to the methods presented here, these tests indicate that the success rate ratios and average ranks methods perform better than significant wins.
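The evaluation loop described in this abstract can be illustrated with a minimal sketch. It covers only the average ranks method (not success rate ratios or significant wins) and omits the Friedman and Dunn tests. The function names and the accuracy table are hypothetical, chosen for illustration, and are not taken from the paper.

```python
from statistics import mean

def ranks(scores):
    """Rank items by score (higher is better); rank 1 is best, ties share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of items tied with the item at position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(r1, r2):
    """Spearman's rank correlation from squared rank differences (exact only without ties)."""
    n = len(r1)
    d2 = sum((a - b) ** 2 for a, b in zip(r1, r2))
    return 1 - 6 * d2 / (n * (n * n - 1))

def average_ranks(perf_rows):
    """Recommended ranking: order the algorithms by their mean rank over the training datasets."""
    per_dataset = [ranks(row) for row in perf_rows]
    mean_rank = [mean(col) for col in zip(*per_dataset)]
    # A lower mean rank is better, so negate before ranking.
    return ranks([-m for m in mean_rank])

def leave_one_out(perf):
    """Mean correlation between the recommended and ideal rankings over all held-out datasets."""
    corrs = []
    for i, test_row in enumerate(perf):
        train = perf[:i] + perf[i + 1:]
        corrs.append(spearman(average_ranks(train), ranks(test_row)))
    return mean(corrs)

# Hypothetical accuracies: rows are datasets, columns are three algorithms.
perf = [
    [0.90, 0.85, 0.80],
    [0.88, 0.86, 0.70],
    [0.75, 0.80, 0.60],
    [0.92, 0.90, 0.85],
]
print(leave_one_out(perf))
```

Each dataset is held out in turn, a ranking is recommended from the remaining datasets, and that ranking is scored against the ideal ranking observed on the held-out dataset.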

1998

Dynamic discretization of continuous attributes

Authors
Gama, J; Torgo, L; Soares, C;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE-IBERAMIA 98

Abstract
Discretization of continuous attributes is an important task for certain types of machine learning algorithms. Bayesian approaches, for instance, require assumptions about data distributions. Decision trees, on the other hand, require sorting operations to deal with continuous attributes, which greatly increase learning times. This paper presents a new method of discretization, whose main characteristic is that it takes into account interdependencies between attributes. Detecting interdependencies can be seen as discovering redundant attributes. This means that our method performs attribute selection as a side effect of the discretization. Empirical evaluation on five benchmark datasets from the UCI repository, using C4.5 and naive Bayes, shows a consistent reduction in the number of features without loss of generalization accuracy.

2006

Data mining for business applications: KDD-2006 workshop

Authors
Ghani, R; Soares, C;

Publication
SIGKDD Explorations

Abstract

2011

Uncertainty Sampling-Based Active Selection of Datasetoids for Meta-learning

Authors
Prudencio, RBC; Soares, C; Ludermir, TB;

Publication
ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2011, PT II

Abstract
Several meta-learning approaches have been developed for the problem of algorithm selection. In this context, it is of central importance to collect a sufficient number of datasets to be used as meta-examples in order to provide reliable results. Recently, some proposals to generate datasets have addressed this issue with successful results. These proposals include datasetoids, which is a simple manipulation method to obtain new datasets from existing ones. However, the increase in the number of datasets raises another issue: in order to generate meta-examples for training, it is necessary to estimate the performance of the algorithms on the datasets. This typically requires running all candidate algorithms on all datasets, which is computationally very expensive. One approach to address this problem is the use of an active learning approach to meta-learning, termed active meta-learning. In this paper we investigate the combined use of an active meta-learning approach based on an uncertainty score and datasetoids. Based on our results, we conclude that our method achieves very good accuracy with as little as 10% to 20% of the meta-examples labeled.
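The abstract does not define its uncertainty score, so the sketch below uses a generic stand-in: prediction entropy, a common choice in uncertainty sampling. The dataset names, probabilities, and function names are hypothetical. The idea shown is only the selection step: among unlabeled meta-examples, pick the one whose predicted best algorithm is least certain, and run the candidate algorithms on that dataset next.

```python
import math

def entropy(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def select_most_uncertain(candidates):
    """Return the name of the unlabeled meta-example with the most uncertain prediction."""
    return max(candidates, key=lambda name: entropy(candidates[name]))

# Hypothetical predicted probabilities that each of two algorithms is the best one.
candidates = {
    "dataset_a": [0.95, 0.05],
    "dataset_b": [0.55, 0.45],
    "dataset_c": [0.80, 0.20],
}
next_to_label = select_most_uncertain(candidates)
print(next_to_label)
```

Labeling only the selected meta-examples avoids running every algorithm on every dataset, which is the cost the abstract identifies.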
