Publications

Publications by LIAAD

2020

Proceedings of AI4Narratives - Workshop on Artificial Intelligence for Narratives in conjunction with the 29th International Joint Conference on Artificial Intelligence and the 17th Pacific Rim International Conference on Artificial Intelligence (IJCAI 2020), Yokohama, Japan, January 7th and 8th, 2021 (online event due to Covid-19 outbreak)

Authors
Jorge, AM; Campos, R; Jatowt, A; Aizawa, A;

Publication
AI4Narratives@IJCAI

2020

Report on the third international workshop on narrative extraction from texts (Text2Story 2020)

Authors
Campos, R; Jorge, AM; Jatowt, A; Bhatia, S; Pasquali, A; Cordeiro, JP; Rocha, C; Mansouri, B; Santana, BS;

Publication
SIGIR Forum

2020

ECIR 2020 workshops: assessing the impact of going online

Authors
Nunes, S; Little, S; Bhatia, S; Boratto, L; Cabanac, G; Campos, R; Couto, FM; Faralli, S; Frommholz, I; Jatowt, A; Jorge, A; Marras, M; Mayr, P; Stilo, G;

Publication
SIGIR Forum

2020

A Study on Imbalanced Data Streams

Authors
Aminian, E; Ribeiro, RP; Gama, J;

Publication
Machine Learning and Knowledge Discovery in Databases, ECML PKDD 2019, Part II

Abstract
Data are growing fast in today's world, and a large portion of them arrives in the form of streams. In many situations, data streams are imbalanced, which makes them difficult to handle with classical data mining methods. Nevertheless, mining these special kinds of streams is one of the most attractive research areas. In this paper, we propose two algorithms for learning from imbalanced regression data streams. Both methods are based on Chebyshev's inequality, but they use it in different ways. The first method under-samples the examples with frequent target values, while the second over-samples the examples with rare and extreme target values. This way, the learner focuses on the rare and more difficult cases. We applied our methods to train regression models using two benchmark datasets and two well-known regression algorithms: Perceptron and FIMT-DD. The results of our simulations indicate the usefulness of the proposed methods.
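The abstract above does not spell out the two sampling rules, so the following is only a minimal sketch of how Chebyshev's inequality, P(|Y - mean| >= k * std) <= 1/k^2, could drive under- and over-sampling on a stream. The names `RunningStats`, `deviation_k`, `keep_probability` and `replication_count`, and the exact rules they implement, are assumptions made for illustration and are not taken from the paper.

```python
import math


class RunningStats:
    """Welford's online mean and variance for a stream of target values."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, y):
        self.n += 1
        delta = y - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (y - self.mean)

    def std(self):
        # Sample standard deviation; 0.0 until at least two values are seen.
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0


def deviation_k(stats, y):
    """Distance of y from the running mean, in standard deviations."""
    s = stats.std()
    return abs(y - stats.mean) / s if s > 0 else 0.0


def keep_probability(k):
    """Assumed under-sampling rule: keep an example with probability
    1 - min(1, 1/k^2), so examples close to the mean are mostly dropped."""
    if k <= 1.0:
        return 0.0  # the Chebyshev bound is vacuous (>= 1) here
    return 1.0 - 1.0 / (k * k)


def replication_count(k):
    """Assumed over-sampling rule: present an example ceil(k) times,
    and at least once, so extreme targets are seen more often."""
    return max(1, math.ceil(k))
```

In a streaming loop, one would compute `deviation_k` for each incoming target, apply one of the two rules before training the learner, and then call `update`. Note that this sketch drops every example within one standard deviation of the mean under the under-sampling rule; a practical variant would probably keep a small fraction of them.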

2020

Using Property-Based Testing to Generate Feedback for C Programming Exercises

Authors
Vasconcelos, PB; Ribeiro, RP;

Publication
First International Computer Programming Education Conference, ICPEC 2020, June 25-26, 2020, ESMAD, Vila do Conde, Portugal (Virtual Conference).

Abstract
This paper reports on the use of property-based testing for providing feedback on C programming exercises. Test cases are generated automatically from properties specified in a test script; this not only makes it possible to conduct many tests (and thus potentially find more mistakes), but also allows failed test cases to be simplified automatically. We present experimental validation gathered in an introductory C programming course during the fall semester of 2018, which shows significant positive correlations between getting feedback during the semester and the students' results in the final exam. We also discuss some limitations regarding feedback for undefined behaviors in the C language. 2012 ACM Subject Classification: Social and professional topics → Student assessment; Software and its engineering → Software testing and debugging; Software and its engineering → Domain specific languages.
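To illustrate the two ideas named in the abstract above (generating test cases from a property, then simplifying a failing case), here is a minimal self-contained sketch in Python. It is not the tool described in the paper, which targets C exercises. The buggy `student_max` submission, the property, and the shrinking strategy are all hypothetical.

```python
import random


def student_max(xs):
    """Hypothetical buggy submission: wrong when every value is negative."""
    best = 0
    for x in xs:
        if x > best:
            best = x
    return best


def prop_max(xs):
    """Property: the result is an element of xs and is >= every element."""
    m = student_max(xs)
    return m in xs and all(m >= x for x in xs)


def shrink(xs):
    """Yield simpler candidates: drop one element, or halve one toward zero."""
    if len(xs) > 1:
        for i in range(len(xs)):
            yield xs[:i] + xs[i + 1:]
    for i, x in enumerate(xs):
        if x != 0:
            yield xs[:i] + [int(x / 2)] + xs[i + 1:]


def find_counterexample(prop, trials=200, seed=0):
    """Test prop on random non-empty lists; return a shrunk failing input,
    or None if every trial passes."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.randint(-100, 100) for _ in range(rng.randint(1, 10))]
        if prop(xs):
            continue
        # Greedy shrinking: keep moving to the first simpler failing input.
        improved = True
        while improved:
            improved = False
            for candidate in shrink(xs):
                if not prop(candidate):
                    xs = candidate
                    improved = True
                    break
        return xs
    return None
```

Because this submission fails exactly on lists whose values are all negative, any counterexample found here shrinks to `[-1]`, which is far easier for a student to reason about than the random list that first triggered the failure.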

2020

Imbalanced regression and extreme value prediction

Authors
Ribeiro, RP; Moniz, N;

Publication
Machine Learning

Abstract
Research in imbalanced domain learning has almost exclusively focused on solving classification tasks for accurate prediction of cases labelled with a rare class. Approaches for addressing such problems in regression tasks are still scarce due to two main factors. First, standard regression tasks assume each domain value as equally important. Second, standard evaluation metrics focus on assessing the performance of models on the most common values of data distributions. In this paper, we present an approach to tackle imbalanced regression tasks where the objective is to predict extreme (rare) values. We propose an approach to formalise such tasks and to optimise/evaluate predictive models, overcoming the factors mentioned and issues in related work. We present an automatic and non-parametric method to obtain relevance functions, building on the concept of relevance as the mapping of target values into non-uniform domain preferences. Then, we propose SERA, a new evaluation metric capable of assessing the effectiveness and of optimising models towards the prediction of extreme values while penalising severe model bias. An experimental study demonstrates how SERA provides valid and useful insights into the performance of models in imbalanced regression tasks.
