Details

  • Name

    Rui Leite
  • Role

    External Research Collaborator
  • Since

    1st January 2010
Publications

2021

Exploiting Performance-based Similarity between Datasets in Metalearning

Authors
Leite, R; Brazdil, P;

Publication
AAAI Workshop on Meta-Learning and MetaDL Challenge, MetaDL@AAAI 2021, virtual, February 9, 2021.

Abstract

2018

An agent-based model for detection in economic networks

Authors
Brito, J; Campos, P; Leite, R;

Publication
Communications in Computer and Information Science

Abstract
The economic impact of fraud is wide and fraud can be a critical problem when the prevention procedures are not robust. In this paper we create a model to detect fraudulent transactions, and then use a classification algorithm to assess if the agent is fraud prone or not. The model (BOND) is based on the analytics of an economic network of agents of three types: individuals, businesses and financial intermediaries. From the dataset of transactions, a sliding window of rows previously aggregated per agent has been used and machine learning (classification) algorithms have been applied. Results show that it is possible to predict the behavior of agents, based on previous transactions. © 2018, Springer International Publishing AG, part of Springer Nature.
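The abstract above mentions a sliding window over rows previously aggregated per agent. A minimal sketch of that kind of feature construction could look as follows. The `(period, agent, amount)` transaction layout, the function names, and the choice of summed amounts as the aggregate are all assumptions made for illustration, not details taken from the paper.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

# Hypothetical transaction layout: (period, agent id, amount).
Transaction = Tuple[int, str, float]


def per_agent_period_totals(
    transactions: Sequence[Transaction],
) -> Dict[Tuple[str, int], float]:
    """Aggregate transactions into one total amount per (agent, period)."""
    totals: Dict[Tuple[str, int], float] = defaultdict(float)
    for period, agent, amount in transactions:
        totals[(agent, period)] += amount
    return totals


def sliding_window_features(
    transactions: Sequence[Transaction],
    periods: Sequence[int],
    width: int,
) -> List[Tuple[str, int, List[float]]]:
    """Build one row per agent and window: (agent, last period, totals in window).

    Periods without any transaction for an agent count as 0.0.
    """
    totals = per_agent_period_totals(transactions)
    agents = sorted({agent for _, agent, _ in transactions})
    rows = []
    for agent in agents:
        series = [totals.get((agent, p), 0.0) for p in periods]
        for start in range(len(series) - width + 1):
            window = series[start:start + width]
            rows.append((agent, periods[start + width - 1], window))
    return rows
```

Each resulting row could then be paired with a label and passed to any classification algorithm. The labelling step is not described in the abstract and is left out here.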

2015

Risks deter but pleasures allure: Is pleasure more important?

Authors
Chao, LW; Szrek, H; Leite, R; Peltzer, K; Ramlagan, S;

Publication
Judgment and Decision Making

Abstract
The pursuit of unhealthy behaviors, such as smoking or binge drinking, not only carries various downside risks, but also provides pleasure. A parsimonious model, used in the literature to explain the decision to pursue an unhealthy activity, represents that decision as a tradeoff between risks and benefits. We build on this literature by surveying a rural population in South Africa to elicit the perceived riskiness and the perceived pleasure for various risky activities and to examine how these perceptions relate to the pursuit of four specific unhealthy behaviors: frequent smoking, problem drinking, seatbelt nonuse, and risky sex. We show that perceived pleasure is a significant predictor for three of the behaviors and that perceived riskiness is a significant predictor for two of them. We also show that the correlation between the riskiness rating and behavior is significantly different from the correlation between the pleasure rating and behavior for three of the four behaviors. Finally, we show that the effect of pleasure is significantly greater than the effect of riskiness in determining drinking and risky sex, while the effects of pleasure and riskiness are not different from each other in determining smoking and seatbelt nonuse. We discuss how our findings can be used to inform the design of health promotion strategies.

2012

Selecting classification algorithms with active testing on similar datasets

Authors
Leite, R; Brazdil, P; Vanschoren, J;

Publication
CEUR Workshop Proceedings

Abstract
Given the large number of data mining algorithms, their combinations (e.g. ensembles) and possible parameter settings, finding the most adequate method to analyze a new dataset becomes an ever more challenging task. This is because in many cases testing all possibly useful alternatives quickly becomes prohibitively expensive. In this paper we propose a novel technique, called active testing, that intelligently selects the most useful cross-validation tests. It proceeds in a tournament-style fashion, in each round selecting and testing the algorithm that is most likely to outperform the best algorithm of the previous round on the new dataset. This 'most promising' competitor is chosen based on a history of prior duels between both algorithms on similar datasets. Each new cross-validation test will contribute information to a better estimate of dataset similarity, and thus better predict which algorithms are most promising on the new dataset. We also follow a different path to estimate dataset similarity based on data characteristics. We have evaluated this approach using a set of 292 algorithm-parameter combinations on 76 UCI datasets for classification. The results show that active testing will quickly yield an algorithm whose performance is very close to the optimum, after relatively few tests. It also provides a better solution than previously proposed methods. The variants of our method that rely on cross-validation tests to estimate dataset similarity provide better solutions than those that rely on data characteristics.
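The tournament-style loop described in this abstract can be illustrated with a short sketch. It is a simplified reading of the abstract, not the authors' implementation: the similarity weight (agreement of past and new outcomes against the current best), the gain estimate, and the names `similarity`, `active_testing`, `history` and `evaluate` are assumptions introduced here. `evaluate` stands in for one cross-validation test on the new dataset.

```python
from typing import Callable, Dict, Tuple

# dataset name -> algorithm name -> past performance (e.g. accuracy)
History = Dict[str, Dict[str, float]]


def similarity(past: Dict[str, float], tested: Dict[str, float], best: str) -> float:
    """Assumed similarity: 1 plus the number of tested algorithms whose duel
    against `best` had the same outcome on the past dataset as on the new one."""
    agree = 0
    for alg, score in tested.items():
        if alg == best:
            continue
        if (past[alg] > past[best]) == (score > tested[best]):
            agree += 1
    return 1.0 + agree


def active_testing(
    history: History,
    evaluate: Callable[[str], float],
    start: str,
    max_tests: int,
) -> Tuple[str, Dict[str, float]]:
    """Repeatedly test the untested algorithm that looks most likely to beat
    the current best, judged by past duels on similarity-weighted datasets."""
    algorithms = sorted(next(iter(history.values())))
    tested = {start: evaluate(start)}
    best = start
    for _ in range(max_tests):
        untested = [a for a in algorithms if a not in tested]
        if not untested:
            break
        weights = {d: similarity(p, tested, best) for d, p in history.items()}

        def expected_gain(alg: str) -> float:
            # Weighted sum of past improvements of `alg` over the current best.
            return sum(
                w * max(0.0, history[d][alg] - history[d][best])
                for d, w in weights.items()
            )

        challenger = max(untested, key=expected_gain)
        tested[challenger] = evaluate(challenger)  # one new test
        if tested[challenger] > tested[best]:
            best = challenger
    return best, tested
```

Every new test updates `tested`, which in turn changes the similarity weights used in the next round. This mirrors the abstract's point that each cross-validation test improves the estimate of dataset similarity.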

2010

Active Testing Strategy to Predict the Best Classification Algorithm via Sampling and Metalearning

Authors
Leite, R; Brazdil, P;

Publication
ECAI 2010 - 19th European Conference on Artificial Intelligence

Abstract
Currently many classification algorithms exist and there is no algorithm that would outperform all the others in all tasks. Therefore it is of interest to determine which classification algorithm is the best one for a given task. Although direct comparisons can be made for any given problem using a cross-validation evaluation, it is desirable to avoid this, as the computational costs are significant. We describe a method which relies on relatively fast pairwise comparisons involving two algorithms. This method exploits sampling landmarks, that is, information about learning curves, in addition to classical data characteristics. One key feature of this method is an iterative procedure for extending the series of experiments used to gather new information in the form of sampling landmarks. Metalearning also plays a vital role. The comparisons between various pairs of algorithms are repeated and the result is represented in the form of a partially ordered ranking. Evaluation is done by comparing the predicted partial order of algorithms to the partial order representing the supposedly correct result. The results of our analysis show that the method has good performance and could be of help in practical applications.