Publications

Publications by Pavel Brazdil

2012

Selecting classification algorithms with active testing on similar datasets

Authors
Leite, R; Brazdil, P; Vanschoren, J;

Publication
CEUR Workshop Proceedings

Abstract
Given the large number of data mining algorithms, their combinations (e.g. ensembles) and possible parameter settings, finding the most adequate method to analyze a new dataset becomes an ever more challenging task. This is because in many cases testing all possibly useful alternatives quickly becomes prohibitively expensive. In this paper we propose a novel technique, called active testing, that intelligently selects the most useful cross-validation tests. It proceeds in a tournament-style fashion, in each round selecting and testing the algorithm that is most likely to outperform the best algorithm of the previous round on the new dataset. This 'most promising' competitor is chosen based on a history of prior duels between both algorithms on similar datasets. Each new cross-validation test will contribute information to a better estimate of dataset similarity, and thus help to better predict which algorithms are most promising on the new dataset. We also follow a different path to estimate dataset similarity based on data characteristics. We have evaluated this approach using a set of 292 algorithm-parameter combinations on 76 UCI datasets for classification. The results show that active testing will quickly yield an algorithm whose performance is very close to the optimum, after relatively few tests. It also provides a better solution than previously proposed methods. The variants of our method that rely on cross-validation tests to estimate dataset similarity provide better solutions than those that rely on data characteristics.
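The tournament loop described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `active_testing`, `history`, `evaluate` and `n_tests` are hypothetical, similarity is approximated as the share of already-tested algorithm pairs that are ranked the same way on a prior dataset and on the new one, and the "most promising" competitor is taken to be the one with the largest similarity-weighted sum of past wins over the current best.

```python
from itertools import combinations


def active_testing(history, evaluate, n_tests):
    """Sketch of a tournament-style selection loop (all names are assumptions).

    history:  {dataset: {algorithm: score}} for prior datasets, higher is better.
    evaluate: callable that runs one cross-validation test on the new dataset.
    Returns (best_algorithm, scores_of_tested_algorithms).
    """
    algorithms = sorted(next(iter(history.values())))

    # Start from the algorithm with the best mean score on the prior datasets.
    def mean_score(a):
        return sum(scores[a] for scores in history.values()) / len(history)

    best = max(algorithms, key=mean_score)
    tested = {best: evaluate(best)}

    def similarity(scores):
        # Share of tested pairs ordered the same way on the prior and new dataset.
        pairs = list(combinations(tested, 2))
        if not pairs:
            return 1.0  # nothing to compare yet: treat all datasets alike
        agree = sum(
            (scores[a] > scores[b]) == (tested[a] > tested[b]) for a, b in pairs
        )
        return agree / len(pairs)

    for _ in range(n_tests - 1):
        candidates = [a for a in algorithms if a not in tested]
        if not candidates:
            break
        weights = {d: similarity(s) for d, s in history.items()}

        def expected_gain(a):
            # Similarity-weighted sum of past wins over the current best.
            return sum(
                weights[d] * max(s[a] - s[best], 0.0) for d, s in history.items()
            )

        challenger = max(candidates, key=expected_gain)
        tested[challenger] = evaluate(challenger)  # one new cross-validation test
        if tested[challenger] > tested[best]:
            best = challenger
    return best, tested
```

Each call to `evaluate` stands for one cross-validation test, so `n_tests` bounds the total cost; every new result also refines the similarity weights used in the next round.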


1998

Redundant Covering with Global Evaluation in the RC1 Inductive Learner

Authors
Lopes, Alneu de Andrade; Brazdil, Pavel;

Publication
Advances in Artificial Intelligence, 14th Brazilian Symposium on Artificial Intelligence, SBIA '98, Porto Alegre, Brazil, November 4-6, 1998, Proceedings

Abstract
This paper presents an inductive method that learns a logic program represented as an ordered list of clauses. The input consists of a training set of positive examples and background knowledge represented intensionally as a logic program. Our method starts by constructing the explanations of all the positive examples in terms of background knowledge, linking the input to the output arguments. These are used as candidate hypotheses and organized, by relation of generality, into a set of hierarchies (forest). In the second step the candidate hypotheses are analysed with the aim of establishing their effective coverage. In the third step all the inconsistencies are evaluated. This analysis makes it possible to add, at each step, the best hypothesis to the theory. The method was applied to learn the past tense of English verbs. The method presented achieves more accurate results than the previous work by Mooney and Califf [7]. © Springer-Verlag Berlin Heidelberg 1998.


2004

Introduction to the special issue on meta-learning

Authors
Giraud-Carrier, C; Vilalta, R; Brazdil, P;

Publication
Machine Learning

Abstract

1991

Learning to Relate Terms in a Multiple Agent Environment

Authors
Brazdil, P; Muggleton, S;

Publication
Machine Learning - EWSL-91, European Working Session on Learning, Porto, Portugal, March 6-8, 1991, Proceedings

Abstract

2007

Learning paraphrases from WNS corpora

Authors
Cordeiro, J; Dias, G; Brazdil, P;

Publication
Proceedings of the Twentieth International Florida Artificial Intelligence Research Society Conference, FLAIRS 2007

Abstract
Paraphrase detection can be seen as the task of aligning sentences that convey the same information yet are written in different forms. Such resources are important for automatically learning text-to-text rewriting rules. In this paper, we present a new metric for unsupervised detection of paraphrases and apply it in the context of clustering of paraphrases. An exhaustive evaluation is conducted over a set of standard paraphrase corpora and real-world web news stories (WNS) corpora. The results are promising as they outperform state-of-the-art measures developed for similar tasks.


2007

A Metric for Paraphrase Detection

Authors
Cordeiro, J; Dias, G; Brazdil, P;

Publication
2007 International Multi-Conference on Computing in the Global Information Technology (ICCGI'07)

Abstract