Publications

Publications by LIAAD

2016

Active learning and data manipulation techniques for generating training examples in meta-learning

Authors
Sousa, AFM; Prudencio, RBC; Ludermir, TB; Soares, C;

Publication
NEUROCOMPUTING

Abstract
Algorithm selection is an important task in different domains of knowledge. Meta-learning treats this task by adopting a supervised learning strategy. Training examples in meta-learning (called meta-examples) are generated from experiments performed with a pool of candidate algorithms on a number of problems, usually collected from data repositories or synthetically generated. A meta-learner is then applied to acquire knowledge relating features of the problems to the best algorithms in terms of performance. In this paper, we address an important aspect of meta-learning, which is to produce a significant number of relevant meta-examples. Generating a high-quality set of meta-examples can be difficult due to the low availability of real datasets in some domains and the high computational cost of labelling the meta-examples. In the current work, we focus on the generation of meta-examples for meta-learning by combining: (1) a promising approach to generate new datasets (called datasetoids) by manipulating existing ones; and (2) active learning methods to select the most relevant datasets previously generated. The datasetoids approach is adopted to augment the number of useful problem instances for meta-example construction. However, not all generated problems are equally relevant. Active meta-learning is then used to select only the most informative instances to be labelled. Experiments were performed with different scenarios, meta-learning algorithms and dataset selection strategies. Our experiments revealed that it is possible to reduce the computational cost of generating meta-examples while maintaining good meta-learning performance.

2016

Advances in Intelligent Data Analysis XV - 15th International Symposium, IDA 2016, Stockholm, Sweden, October 13-15, 2016, Proceedings

Authors
Boström, Henrik; Knobbe, Arno J.; Soares, Carlos; Papapetrou, Panagiotis;

Publication
IDA

Abstract

2016

Sentiment Aggregate Functions for Political Opinion Polling using Microblog Streams

Authors
Saleiro, P; Gomes, L; Soares, C;

Publication
Proceedings of the Ninth International C* Conference on Computer Science & Software Engineering, C3S2E '16, Porto, Portugal, July 20-22, 2016

Abstract
The automatic content analysis of mass media in the social sciences has become necessary and possible with the rise of social media and computational power. One particularly promising avenue of research concerns the use of sentiment analysis in microblog streams. However, one of the main challenges consists in aggregating sentiment polarity in a timely fashion so that it can be fed to the prediction method. We investigated a large set of sentiment aggregate functions and performed a regression analysis using political opinion polls as the gold standard. Our dataset contains nearly 233 000 tweets, classified according to their polarity (positive, negative or neutral), regarding the five main Portuguese political leaders during the Portuguese bailout (2011-2014). Results show that different sentiment aggregate functions exhibit different feature importance over time, while the error remains almost unchanged. © 2016 ACM.
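To make the idea of a sentiment aggregate function concrete, here is a minimal Python sketch. It assumes each tweet about one political leader in one time window already carries a polarity label. The four aggregates shown (share of positive, share of negative, net sentiment, smoothed log ratio) are common illustrative choices and are not claimed to be the functions studied in the paper; the function name `aggregate_sentiment` is hypothetical.

```python
from collections import Counter
from math import log


def aggregate_sentiment(labels):
    """Turn polarity labels for one target and one time window into features.

    `labels` is an iterable of "positive", "negative" or "neutral" strings.
    The aggregates below are illustrative assumptions, not the paper's set.
    """
    counts = Counter(labels)
    pos = counts["positive"]
    neg = counts["negative"]
    neu = counts["neutral"]
    total = pos + neg + neu
    if total == 0:
        # No tweets in this window: return neutral values for every feature.
        return {
            "share_positive": 0.0,
            "share_negative": 0.0,
            "net_sentiment": 0.0,
            "log_ratio": 0.0,
        }
    return {
        # Fraction of all tweets that are positive / negative.
        "share_positive": pos / total,
        "share_negative": neg / total,
        # Balance between positive and negative, ignoring neutral tweets.
        "net_sentiment": (pos - neg) / (pos + neg) if pos + neg else 0.0,
        # Add-one smoothed log ratio, defined even when a count is zero.
        "log_ratio": log((pos + 1) / (neg + 1)),
    }


window = ["positive"] * 3 + ["negative"] + ["neutral"] * 4
features = aggregate_sentiment(window)
```

Features like these, computed per leader and per polling period, could then serve as inputs to a regression model whose target is the poll result.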

2016

Preface

Authors
Boström, H; Knobbe, A; Soares, C; Papapetrou, P;

Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Abstract

2016

An online learning approach to eliminate Bus Bunching in real-time

Authors
Moreira Matias, L; Cats, O; Gama, J; Mendes Moreira, J; de Sousa, JF;

Publication
APPLIED SOFT COMPUTING

Abstract
Recent advances in telecommunications created new opportunities for monitoring public transport operations in real-time. This paper presents an automatic control framework to mitigate the Bus Bunching phenomenon in real-time. The framework combines distinct Machine Learning principles and methods to extract valuable information from raw location-based data. State-of-the-art tools and methodologies such as Regression Analysis, Probabilistic Reasoning and Perceptron learning with Stochastic Gradient Descent constitute the building blocks of this predictive methodology. The prediction's output is then used to select and deploy a corrective action to automatically prevent Bus Bunching. The performance of the proposed method is evaluated using data collected from 18 bus routes in Porto, Portugal over a period of one year. Simulation results demonstrate that the proposed method can potentially reduce bunching by 68% and decrease average passenger waiting times by 4.5%, without prolonging in-vehicle times. The proposed system could be embedded in a decision support system to improve control room operations. © 2016 Published by Elsevier B.V.

2016

Classification systems in dynamic environments: an overview

Authors
Pinage, FA; dos Santos, EM; Portela da Gama, JMP;

Publication
WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY

Abstract
Data mining and machine learning algorithms can be employed to perform a variety of tasks. However, since most of these problems may depend on environments that change over time, performing classification tasks in dynamic environments has been a challenge in the data mining research domain in recent decades. Currently, in the literature, the most common strategies used to detect changes are based on accuracy monitoring, which relies on previous knowledge of the data in order to identify whether or not correct classifications are provided. However, such feedback can be infeasible in practical problems. In this work, we present a comprehensive overview of current machine learning/data mining approaches proposed to deal with dynamic environment problems. The objective is to highlight the main drawbacks and open issues, as well as future directions and problems worthy of investigation. In addition, we provide the definitions of the main terms used to represent this problem in the literature, such as concept drift and novelty detection. WIREs Data Mining Knowl Discov 2016, 6:156-166. doi: 10.1002/widm.1184
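The accuracy-monitoring strategy mentioned in the abstract can be illustrated with a deliberately simplified Python sketch. It assumes the true label of every classified instance eventually becomes available, which is exactly the feedback the abstract notes may be infeasible in practice. The class name `AccuracyMonitor`, the window size and the tolerance are hypothetical choices for illustration; this is not a specific detector from the surveyed literature.

```python
from collections import deque


class AccuracyMonitor:
    """Flag a possible concept drift when windowed accuracy drops.

    Simplified illustration: compares accuracy over the most recent
    `window` predictions with the best windowed accuracy seen so far.
    """

    def __init__(self, window=50, tolerance=0.15):
        self.window = window
        self.tolerance = tolerance
        self.recent = deque(maxlen=window)  # 1 = correct, 0 = wrong
        self.baseline = None                # best windowed accuracy so far

    def update(self, correct):
        """Record one prediction outcome; return True if drift is flagged."""
        self.recent.append(1 if correct else 0)
        if len(self.recent) < self.window:
            return False  # not enough evidence yet
        accuracy = sum(self.recent) / self.window
        if self.baseline is None:
            self.baseline = accuracy
            return False
        if accuracy < self.baseline - self.tolerance:
            # Accuracy fell well below the baseline: signal drift and
            # start over, so the retrained model gets a fresh baseline.
            self.recent.clear()
            self.baseline = None
            return True
        self.baseline = max(self.baseline, accuracy)
        return False


monitor = AccuracyMonitor(window=10, tolerance=0.25)
stable = [monitor.update(True) for _ in range(20)]
after_change = [monitor.update(False) for _ in range(3)]
```

In this run the first 20 correct predictions raise no alarm, and the third consecutive error pushes windowed accuracy to 0.7, below the 0.75 threshold, so drift is flagged at that point.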
