2003
Authors
Brazdil, PB; Soares, C; Da Costa, JP;
Publication
MACHINE LEARNING
Abstract
We present a meta-learning method to support selection of candidate learning algorithms. It uses a k-Nearest Neighbor algorithm to identify the datasets that are most similar to the one at hand. The distance between datasets is assessed using a relatively small set of data characteristics, which was selected to represent properties that affect algorithm performance. The performance of the candidate algorithms on those datasets is used to generate a recommendation to the user in the form of a ranking. The performance is assessed using a multicriteria evaluation measure that takes not only accuracy, but also time into account. As it is not common in Machine Learning to work with rankings, we had to identify and adapt existing statistical techniques to devise an appropriate evaluation methodology. Using that methodology, we show that the meta-learning method presented leads to significantly better rankings than the baseline ranking method. The evaluation methodology is general and can be adapted to other ranking problems. Although here we have concentrated on ranking classification algorithms, the meta-learning framework presented can provide assistance in the selection of combinations of methods or more complex problem solving strategies.
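The k-NN step described in the abstract lends itself to a compact sketch. The following is a minimal illustration, assuming a stored table of dataset meta-features and a per-dataset performance table; the names meta_features and perf, the Euclidean distance, and the mean aggregation are illustrative choices, not the paper's exact setup:

```python
import numpy as np

def rank_algorithms(new_meta, meta_features, perf, k=3):
    """Recommend a ranking of candidate algorithms for a new dataset.

    new_meta      -- meta-feature vector of the dataset at hand
    meta_features -- (n_datasets, n_features) array of stored meta-features
    perf          -- (n_datasets, n_algorithms) array of evaluation scores
                     (higher is better, e.g. a multicriteria measure)
    """
    # Identify the k stored datasets most similar to the one at hand
    # (Euclidean distance here; features would typically be normalised).
    dists = np.linalg.norm(meta_features - new_meta, axis=1)
    neighbours = np.argsort(dists)[:k]

    # Aggregate the candidates' performance on the neighbouring
    # datasets and return the algorithms ordered best-first.
    mean_perf = perf[neighbours].mean(axis=0)
    return np.argsort(-mean_perf)
```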
2000
Authors
Soares, C; Brazdil, PB;
Publication
LECTURE NOTES IN COMPUTER SCIENCE
Abstract
Given the wide variety of available classification algorithms and the volume of data today's organizations need to analyze, the selection of the right algorithm to use on a new problem is an important issue. In this paper we present a combination of techniques to address this problem. The first one, zooming, analyzes a given dataset and selects relevant (similar) datasets that were processed by the candidate algorithms in the past. This process is based on the concept of distance, calculated on the basis of several dataset characteristics. The information about the performance of the candidate algorithms on the selected datasets is then processed by a second technique, a ranking method. Such a method uses performance information to generate advice in the form of a ranking, indicating which algorithms should be applied in which order. Here we propose the adjusted ratio of ratios ranking method. This method takes into account not only accuracy but also the time performance of the candidate algorithms. The generalization power of this ranking method is analyzed. For this purpose, an appropriate methodology is defined. The experimental results indicate that on average better results are obtained with zooming than without it.
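A common formulation of the adjusted ratio of ratios combines the ratio of success rates of two algorithms with a penalty on their time ratio. The sketch below follows that general form; the parameter name acc_tradeoff and the logarithm base are assumptions for illustration, not taken verbatim from the paper:

```python
import math

def arr(acc_p, acc_q, time_p, time_q, acc_tradeoff=0.1):
    """Adjusted ratio of ratios of algorithm p over algorithm q
    on a single dataset.

    acc_tradeoff plays the role of the amount of accuracy the
    user is willing to trade for a speedup; the exact functional
    form may differ in detail from the paper.
    """
    success_rate_ratio = acc_p / acc_q
    # Penalise slower algorithms; the log dampens large time ratios.
    time_penalty = 1.0 + acc_tradeoff * math.log10(time_p / time_q)
    return success_rate_ratio / time_penalty
```

Values above 1 favour algorithm p over algorithm q; aggregating these pairwise scores across datasets and opponents yields the overall ranking.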
2000
Authors
Brazdil, PB; Soares, C;
Publication
MACHINE LEARNING: ECML 2000
Abstract
We investigate the problem of using past performance information to select an algorithm for a given classification problem. We present three ranking methods for that purpose: average ranks, success rate ratios and significant wins. We also analyze the problem of evaluating and comparing these methods. The evaluation technique used is based on a leave-one-out procedure. On each iteration, the method generates a ranking using the results obtained by the algorithms on the training datasets. This ranking is then evaluated by calculating its distance from the ideal ranking built using the performance information on the test dataset. The distance measure adopted here, average correlation, is based on Spearman's rank correlation coefficient. To compare ranking methods, a combination of Friedman's test and Dunn's multiple comparison procedure is adopted. When applied to the methods presented here, these tests indicate that the success rate ratios and average ranks methods perform better than significant wins.
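The leave-one-out evaluation described in the abstract can be sketched as follows. Here rank_method stands in for any of the three ranking methods, and the performance-table layout is an assumption for illustration:

```python
import numpy as np
from scipy.stats import spearmanr

def average_correlation(perf, rank_method):
    """Leave-one-out evaluation of a ranking method.

    perf        -- (n_datasets, n_algorithms) performance table
    rank_method -- function mapping a performance table to a
                   vector of algorithm ranks (1 = best)
    """
    scores = []
    for i in range(len(perf)):
        # Rank the algorithms using all datasets except the i-th.
        train = np.delete(perf, i, axis=0)
        recommended = rank_method(train)

        # Ideal ranking: algorithms ordered by their performance
        # on the held-out (test) dataset.
        ideal = (-perf[i]).argsort().argsort() + 1

        # Spearman's rank correlation (1.0 = identical orderings).
        rs, _ = spearmanr(recommended, ideal)
        scores.append(rs)
    return np.mean(scores)
```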
2010
Authors
Utgoff, PE; Cussens, J; Kramer, S; Jain, S; Stephan, F; Raedt, LD; Todorovski, L; Flener, P; Schmid, U; Vilalta, R; Giraud-Carrier, C; Brazdil, P; Soares, C; Keogh, E; Smart, WD; Abbeel, P; Ng, AY;
Publication
Encyclopedia of Machine Learning
2010
Authors
Fürnkranz, J; Chan, PK; Craw, S; Sammut, C; Uther, W; Ratnaparkhi, A; Jin, X; Han, J; Yang, Y; Morik, K; Dorigo, M; Birattari, M; Stützle, T; Brazdil, P; Vilalta, R; Giraud-Carrier, C; Soares, C; Rissanen, J; Baxter, RA; Bruha, I; Baxter, RA; Webb, GI; Torgo, L; Banerjee, A; Shan, H; Ray, S; Tadepalli, P; Shoham, Y; Powers, R; Shoham, Y; Powers, R; Webb, GI; Ray, S; Scott, S; Blockeel, H; De Raedt, L;
Publication
Encyclopedia of Machine Learning
2002
Authors
Peng, YH; Flach, PA; Soares, C; Brazdil, P;
Publication
DISCOVERY SCIENCE, PROCEEDINGS
Abstract
This paper presents new measures, based on the induced decision tree, to characterise datasets for meta-learning in order to select appropriate learning algorithms. The main idea is to capture the characteristics of a dataset from the structural shape and size of the decision tree induced from it. In total, 15 measures are proposed to describe the structure of a decision tree. Their effectiveness is illustrated through extensive experiments, comparing them with the results obtained by existing data characterisation techniques, including the data characteristics tool (DCT), the most widely used technique in meta-learning, and landmarking, the most recently developed method.
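As an illustration of the idea, one might derive shape-and-size descriptors from an induced tree as in the sketch below (using scikit-learn). These generic measures are stand-ins for the kind of structural meta-features described, not the paper's 15 measures:

```python
from sklearn.tree import DecisionTreeClassifier

def tree_shape_features(X, y):
    """Induce a decision tree on the dataset and describe its
    structural shape and size as candidate meta-features."""
    tree = DecisionTreeClassifier(random_state=0).fit(X, y).tree_

    # Leaves are nodes with no left child in sklearn's tree arrays.
    n_leaves = int((tree.children_left == -1).sum())

    return {
        "n_nodes": tree.node_count,
        "n_leaves": n_leaves,
        "max_depth": tree.max_depth,
        "nodes_per_attribute": tree.node_count / X.shape[1],
        "nodes_per_instance": tree.node_count / X.shape[0],
    }
```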