Publications - INESC TEC

Publications

Publications by Pavel Brazdil

2007

New Functions for Unsupervised Asymmetrical Paraphrase Detection

Authors
Cordeiro, J; Dias, G; Brazdil, P;

Publication
JSW

Abstract

1984

USE OF METALOGICAL PRIMITIVES IN COMMUNICATION.

Authors
Brazdil, P;

Publication

Abstract
The object of this study is to consider how one could enable two or more Prolog-like systems to talk about problem solutions, and how more complex modes of communication could be defined in this framework. A 'reference list' associated with each clause indicates to which system the clause belongs (and where it came from). If a reference list is associated with an expression, it determines where this expression is to be sent if the appropriate action is taken. A special metapredicate is used to associate the answer (or answers) directly with each problem solved, so that failures, for example, are represented explicitly. The metapredicates sent to other systems may include the existing metapredicates of Prolog, the new metapredicates introduced, or conjunctions and disjunctions of metapredicates.

2005

Imitation networks and organizational survival in the Portuguese industry

Authors
Campos, P; Brazdil, P;

Publication
2005 Portuguese Conference on Artificial Intelligence, Proceedings

Abstract
This paper aims to evaluate the impact of imitation networks on organizations' survival rates within a Portuguese industrial cluster. We used a multi-agent framework to represent the industrial cluster, its firms and the rules underlying the imitation strategies. Several experiments were based on the density dependence model, where vital rates are related to the size of the population (population density). We have concluded that imitation seems to improve the vital dynamics of the population and that present information about a firm is enough to establish an imitation network.

2007

An iterative process for building learning curves and predicting relative performance of classifiers

Authors
Leite, R; Brazdil, P;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS

Abstract
This paper concerns the problem of predicting the relative performance of classification algorithms. Our approach requires that experiments are conducted on small samples. The information gathered is used to identify the nearest learning curve for which the sampling procedure was fully carried out. This allows the generation of a prediction regarding the relative performance of the algorithms. The method automatically establishes how many samples are needed and their sizes. This is done iteratively by taking into account the results of all previous experiments - both on other datasets and on the new dataset obtained so far. Experimental evaluation has shown that the method achieves better performance than previous approaches.
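
The matching step described in this abstract, finding the nearest fully sampled learning curve for a partial curve measured on a new dataset, can be illustrated with a minimal Python sketch. The function names, the squared-difference distance, and all accuracy values are assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
# Hypothetical sketch: the distance measure and all numbers are assumptions,
# not taken from the paper.

def curve_distance(partial, full):
    """Sum of squared differences over the sample sizes present in `partial`."""
    return sum((acc - full[i]) ** 2 for i, acc in enumerate(partial))


def nearest_curve(partial, stored_curves):
    """Return the name of the stored dataset whose curve is closest to `partial`."""
    return min(stored_curves,
               key=lambda name: curve_distance(partial, stored_curves[name]))


def predict_better_algorithm(partial_a, partial_b, curves_a, curves_b):
    """Predict which algorithm wins once the full dataset is used.

    Each partial curve is matched to its nearest stored curve, and the final
    points of the two matched curves are compared.
    """
    final_a = curves_a[nearest_curve(partial_a, curves_a)][-1]
    final_b = curves_b[nearest_curve(partial_b, curves_b)][-1]
    return "A" if final_a >= final_b else "B"


# Made-up accuracies at increasing sample sizes on two previously seen datasets.
curves_a = {"d1": [0.60, 0.70, 0.80, 0.85], "d2": [0.50, 0.55, 0.60, 0.62]}
curves_b = {"d1": [0.65, 0.72, 0.78, 0.80], "d2": [0.55, 0.65, 0.75, 0.90]}

# Only the two smallest samples have been evaluated on the new dataset.
winner = predict_better_algorithm([0.61, 0.69], [0.56, 0.66], curves_a, curves_b)
```

In the iterative scheme the abstract describes, further samples would be added to the partial curves until the prediction is judged reliable; that stopping rule is not modelled here.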

2007

Does SVM really scale up to large bag of words feature spaces?

Authors
Colas, F; Paclik, P; Kok, JN; Brazdil, P;

Publication
ADVANCES IN INTELLIGENT DATA ANALYSIS VII, PROCEEDINGS

Abstract
We are concerned with the problem of learning classification rules in text categorization, where many authors have presented Support Vector Machines (SVM) as the leading classification method. A number of studies, however, have repeatedly pointed out that in some situations SVM is outperformed by simpler methods such as naive Bayes or the nearest-neighbor rule. In this paper, we aim at developing a better understanding of SVM behaviour in typical text categorization problems represented by sparse bag of words feature spaces. We study in detail the performance and the number of support vectors when varying the training set size, the number of features and, unlike existing studies, also the SVM free parameter C, which is the upper bound on the Lagrange multipliers in the SVM dual. We show that SVM solutions with small C are high performers. However, most training documents are then bounded support vectors sharing the same weight C. Thus, SVM reduces to a nearest mean classifier; this raises an interesting question about the merits of SVM in sparse bag of words feature spaces. Additionally, SVM suffers from performance deterioration for particular combinations of training set size and number of features.
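
The reduction to a nearest mean classifier can be made explicit with a short derivation. It assumes the standard soft-margin SVM dual with a linear kernel and the idealized case in which every training document is a bounded support vector (the abstract says "most"). The symbols $n_+$, $n_-$, $\mu_+$ and $\mu_-$ (class sizes and class means) are introduced here for illustration and do not appear in the abstract.

```latex
\begin{gather}
w = \sum_{i=1}^{n} \alpha_i y_i x_i, \qquad
0 \le \alpha_i \le C, \qquad
\sum_{i=1}^{n} \alpha_i y_i = 0 \\
\alpha_i = C \ \text{for all } i
\;\Longrightarrow\;
C\,(n_+ - n_-) = 0
\;\Longrightarrow\;
n_+ = n_- \\
w = C \Big( \sum_{i:\, y_i = +1} x_i \;-\; \sum_{i:\, y_i = -1} x_i \Big)
  = C\, n_+ \,(\mu_+ - \mu_-)
\end{gather}
```

Under these assumptions the weight vector points along the difference of the two class means, which is the direction used by a nearest mean classifier; the bias term may still differ between the two classifiers.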

2007

Cost-sensitive decision trees applied to medical data

Authors
Freitas, A; Costa Pereira, A; Brazdil, P;

Publication
DATA WAREHOUSING AND KNOWLEDGE DISCOVERY, PROCEEDINGS

Abstract
Classification plays an important role in medicine, especially for medical diagnosis. Health applications often require classifiers that minimize the total cost, including misclassification costs and test costs. In fact, there are many reasons for considering costs in medicine, as diagnostic tests are not free and health budgets are limited. Our aim with this work was to define, implement and test a strategy for cost-sensitive learning. We defined an algorithm for decision tree induction that considers costs, including test costs, delayed costs and costs associated with risk. We then applied our strategy to train and evaluate cost-sensitive decision trees on medical data. The resulting trees can be tested under several strategies, including group costs, common costs, and individual costs. Using the "risk" factor, it is possible to penalize invasive or delayed tests and obtain patient-friendly decision trees.
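
One way to picture how test costs and a risk factor can steer attribute selection during tree induction is the following minimal Python sketch. The scoring rule (information gain divided by test cost times risk factor), the function names, and the toy patient data are all assumptions made for illustration; the abstract does not state the authors' actual splitting criterion.

```python
import math


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(rows, labels, attribute):
    """Reduction in entropy obtained by splitting on `attribute`."""
    total = len(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    return entropy(labels) - remainder


def cost_adjusted_score(rows, labels, attribute, test_cost, risk_factor=1.0):
    """Assumed criterion: information gain per unit of risk-weighted test cost."""
    return information_gain(rows, labels, attribute) / (test_cost * risk_factor)


# Made-up data: the biopsy separates the classes perfectly, the blood test does not.
rows = [
    {"blood_test": "high", "biopsy": "pos"},
    {"blood_test": "high", "biopsy": "pos"},
    {"blood_test": "high", "biopsy": "neg"},
    {"blood_test": "low", "biopsy": "neg"},
]
labels = ["sick", "sick", "healthy", "healthy"]

# Hypothetical costs; a risk factor above 1 penalizes the invasive test.
tests = {
    "biopsy": {"cost": 100.0, "risk": 2.0},
    "blood_test": {"cost": 10.0, "risk": 1.0},
}

best_by_gain = max(tests, key=lambda a: information_gain(rows, labels, a))
best_by_cost = max(
    tests,
    key=lambda a: cost_adjusted_score(rows, labels, a, tests[a]["cost"], tests[a]["risk"]),
)
```

With these made-up numbers, plain information gain selects the biopsy, while the cost-adjusted score selects the cheaper, less invasive blood test.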
