Publications

Publications by AI

2021

A Data-Driven Simulator for Assessing Decision-Making in Soccer

Authors
Mendes-Neves, T; Mendes-Moreira, J; Rossetti, RJF;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE (EPIA 2021)

Abstract
Decision-making is one of the crucial factors in soccer (association football). The current focus is on analyzing data sets rather than posing what-if questions about the game. We propose simulation-based methods that allow us to answer these questions. To avoid simulating complex human physics and ball interactions, we use data to build machine learning models that form the basis of an event-based soccer simulator. This simulator is compatible with the OpenAI Gym API. We introduce tools that allow us to explore and gather insights about soccer, such as (1) calculating the risk/reward ratios for sequences of actions, (2) manually defining playing criteria, and (3) discovering strategies through Reinforcement Learning.
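As a rough illustration of what an event-based, Gym-style interface can look like, the sketch below defines a toy environment with the classic `reset()` / `step(action)` contract and estimates the outcome rates of a fixed action sequence by Monte Carlo. It is not the authors' simulator: the class `ToyEventSoccerEnv`, the three zones, the action set, and every probability are hypothetical values invented for the example, whereas the paper builds its models from data.

```python
import random
from typing import Dict, List, Tuple


class ToyEventSoccerEnv:
    """Hypothetical event-based environment with a Gym-like reset/step API."""

    ZONES = ["defense", "midfield", "attack"]
    ACTIONS = ["pass", "dribble", "shoot"]
    # Made-up probabilities, for illustration only (not taken from the paper).
    ADVANCE_P = {"pass": 0.8, "dribble": 0.6}
    GOAL_P = {"defense": 0.001, "midfield": 0.02, "attack": 0.15}

    def __init__(self, seed: int = 0) -> None:
        self.rng = random.Random(seed)
        self.zone = 0

    def reset(self) -> int:
        """Start a new possession in the defensive zone."""
        self.zone = 0
        return self.zone

    def step(self, action: int) -> Tuple[int, float, bool, Dict[str, str]]:
        """Apply one event; return (observation, reward, done, info)."""
        name = self.ACTIONS[action]
        if name == "shoot":
            goal = self.rng.random() < self.GOAL_P[self.ZONES[self.zone]]
            event = "goal" if goal else "miss"
            return self.zone, (1.0 if goal else 0.0), True, {"event": event}
        if self.rng.random() < self.ADVANCE_P[name]:
            # Successful pass or dribble moves the ball one zone forward.
            self.zone = min(self.zone + 1, len(self.ZONES) - 1)
            return self.zone, 0.0, False, {"event": name + "_ok"}
        # Failed pass or dribble ends the possession.
        return self.zone, 0.0, True, {"event": "turnover"}


def estimate_sequence(actions: List[int], episodes: int = 10000,
                      seed: int = 0) -> Tuple[float, float]:
    """Monte Carlo estimate of (goal rate, turnover rate) for a fixed sequence."""
    env = ToyEventSoccerEnv(seed)
    goals = 0
    turnovers = 0
    for _ in range(episodes):
        env.reset()
        for action in actions:
            _, reward, done, info = env.step(action)
            if done:
                goals += int(reward > 0)
                turnovers += int(info["event"] == "turnover")
                break
    return goals / episodes, turnovers / episodes


if __name__ == "__main__":
    # pass, pass, shoot
    goal_rate, turnover_rate = estimate_sequence([0, 0, 2])
    print("goal rate:", goal_rate, "turnover rate:", turnover_rate)
```

Comparing the two rates for different sequences gives a crude risk/reward view in the spirit of item (1); a reinforcement learning agent could interact with the same `reset`/`step` interface.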


2021

Tensor decomposition for analysing time-evolving social networks: an overview

Authors
Fernandes, S; Fanaee T, H; Gama, J;

Publication
ARTIFICIAL INTELLIGENCE REVIEW

Abstract
Social networks are becoming larger and more complex as new ways of collecting social interaction data arise (namely from online social networks, mobile device sensors, ...). These networks are often large-scale and of high dimensionality. Therefore, dealing with such networks has become a challenging task. An intuitive way to deal with this complexity is to resort to tensors. In this context, the application of tensor decomposition has proven its usefulness in modelling and mining these networks: it has not only been applied for exploratory analysis (thus allowing the discovery of interaction patterns), but also for more demanding and elaborate tasks such as community detection and link prediction. In this work, we provide an overview of the methods based on tensor decomposition for the purpose of analysing time-evolving social networks from various perspectives: from community detection, link prediction and anomaly/event detection to network summarization and visualization. In more detail, we discuss the ideas exploited to carry out each social network analysis task as well as their limitations in order to give complete coverage of the topic.


2021

Hyper-parameter Optimization for Latent Spaces

Authors
Veloso, B; Caroprese, L; Konig, M; Teixeira, S; Manco, G; Hoos, HH; Gama, J;

Publication
MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2021: RESEARCH TRACK, PT III

Abstract
We present an online optimization method for time-evolving data streams that can automatically adapt the hyper-parameters of an embedding model. More specifically, we employ the Nelder-Mead algorithm, which uses a set of heuristics to produce and exploit several potentially good configurations, from which the best one is selected and deployed. This step is repeated whenever the distribution of the data changes. We evaluate our approach on streams of real-world as well as synthetic data, where the latter is generated in such a way that its characteristics change over time (concept drift). Overall, we achieve good performance in terms of accuracy compared to state-of-the-art AutoML techniques.
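The Nelder-Mead heuristics mentioned above (reflection, expansion, contraction, shrink) can be sketched in a few lines. This is a simplified textbook variant in plain Python, not the authors' online method: `toy_loss` is a hypothetical stand-in for the validation loss of an embedding model, and the two hyper-parameters and their optimum are invented for the example.

```python
from typing import Callable, List, Tuple


def nelder_mead(f: Callable[[List[float]], float], x0: List[float],
                step: float = 0.5, max_iter: int = 500,
                tol: float = 1e-10) -> Tuple[List[float], float]:
    """Minimise f with a simplified Nelder-Mead simplex search."""
    n = len(x0)
    # Initial simplex: x0 plus one point displaced along each axis.
    simplex = [list(x0)]
    for i in range(n):
        point = list(x0)
        point[i] += step
        simplex.append(point)
    values = [f(p) for p in simplex]

    for _ in range(max_iter):
        # Sort configurations from best (lowest loss) to worst.
        order = sorted(range(n + 1), key=lambda i: values[i])
        simplex = [simplex[i] for i in order]
        values = [values[i] for i in order]
        if abs(values[-1] - values[0]) < tol:
            break
        worst = simplex[-1]
        # Centroid of every point except the worst one.
        centroid = [sum(p[j] for p in simplex[:-1]) / n for j in range(n)]
        # Reflection: mirror the worst point through the centroid.
        reflected = [c + (c - w) for c, w in zip(centroid, worst)]
        f_reflected = f(reflected)
        if values[0] <= f_reflected < values[-2]:
            simplex[-1], values[-1] = reflected, f_reflected
        elif f_reflected < values[0]:
            # Expansion: try moving further in the promising direction.
            expanded = [c + 2.0 * (c - w) for c, w in zip(centroid, worst)]
            f_expanded = f(expanded)
            if f_expanded < f_reflected:
                simplex[-1], values[-1] = expanded, f_expanded
            else:
                simplex[-1], values[-1] = reflected, f_reflected
        else:
            # Contraction: pull the worst point towards the centroid.
            contracted = [c + 0.5 * (w - c) for c, w in zip(centroid, worst)]
            f_contracted = f(contracted)
            if f_contracted < values[-1]:
                simplex[-1], values[-1] = contracted, f_contracted
            else:
                # Shrink: move every point halfway towards the best one.
                best = simplex[0]
                simplex = [best] + [
                    [b + 0.5 * (x - b) for b, x in zip(best, p)]
                    for p in simplex[1:]
                ]
                values = [values[0]] + [f(p) for p in simplex[1:]]

    best_index = min(range(n + 1), key=lambda i: values[i])
    return simplex[best_index], values[best_index]


def toy_loss(params: List[float]) -> float:
    """Hypothetical loss surface with its minimum at (0.1, 3.0)."""
    return (params[0] - 0.1) ** 2 + (params[1] - 3.0) ** 2


if __name__ == "__main__":
    best_params, best_value = nelder_mead(toy_loss, [1.0, 1.0])
    print(best_params, best_value)
```

In a streaming setting, one plausible way to mirror the abstract is to call `nelder_mead` again, starting from the currently deployed configuration, each time a drift detector signals a change in the data distribution.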


2021

Predicting Predawn Leaf Water Potential up to Seven Days Using Machine Learning

Authors
Fares, AA; Vasconcelos, F; Mendes-Moreira, J; Ferreira, C;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE (EPIA 2021)

Abstract
Sustainable agricultural production requires a controlled usage of water, nutrients, and minerals from the environment. Different strategies of plant irrigation are being studied to control the balance between the quantity and quality of the fruit. Regarding efficient irrigation, particularly in deficit irrigation strategies, it is essential to act according to the water stress status of the plant. For example, in the vine, to improve the quality of the grapes, the plants are deprived of water until they reach a particular level of water stress before being re-watered at specified phenological stages. The water status inside the plant is estimated by measuring either the leaf water potential at predawn or the soil water potential along the root zones. Measuring soil water potential has the advantage of being independent of diurnal atmospheric variations. However, this method has many logistical problems, making it very hard to apply across a whole vineyard, especially a large one. In this study, the Predawn Leaf Water Potential (PLWP) is predicted daily by Machine Learning models using data such as grape variety, soil characteristics, irrigation schedules, and meteorological data. The benefits of these techniques are the reduction of the manual work of measuring PLWP and the capacity to implement those models on a larger scale by predicting PLWP up to 7 days ahead, which should enhance the ability to optimize the irrigation plan while keeping the quantity and quality of the crop under control.


2020

Imbalanced regression and extreme value prediction

Authors
Ribeiro, RP; Moniz, N;

Publication
MACHINE LEARNING

Abstract
Research in imbalanced domain learning has almost exclusively focused on solving classification tasks for accurate prediction of cases labelled with a rare class. Approaches for addressing such problems in regression tasks are still scarce due to two main factors. First, standard regression tasks assume each domain value as equally important. Second, standard evaluation metrics focus on assessing the performance of models on the most common values of data distributions. In this paper, we present an approach to tackle imbalanced regression tasks where the objective is to predict extreme (rare) values. We propose an approach to formalise such tasks and to optimise/evaluate predictive models, overcoming the factors mentioned and issues in related work. We present an automatic and non-parametric method to obtain relevance functions, building on the concept of relevance as the mapping of target values into non-uniform domain preferences. Then, we propose SERA, a new evaluation metric capable of assessing the effectiveness and of optimising models towards the prediction of extreme values while penalising severe model bias. An experimental study demonstrates how SERA provides valid and useful insights into the performance of models in imbalanced regression tasks.
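The abstract does not spell out the SERA formula, so the sketch below rests on an assumed definition: for a relevance function φ mapping target values to [0, 1], SER_t is the sum of squared errors over the cases with φ(y) ≥ t, and SERA is the integral of SER_t for t from 0 to 1, approximated here with the trapezoidal rule. The linear `ramp_relevance` function is a hypothetical placeholder and not the automatic non-parametric method the paper proposes.

```python
from typing import Callable, List


def sera(y_true: List[float], y_pred: List[float],
         relevance: Callable[[float], float], steps: int = 1000) -> float:
    """Approximate SERA (assumed definition) with the trapezoidal rule."""
    phi = [relevance(y) for y in y_true]
    squared_errors = [(p - y) ** 2 for y, p in zip(y_true, y_pred)]

    def ser(t: float) -> float:
        # Sum of squared errors over cases whose relevance is at least t.
        return sum(e for e, r in zip(squared_errors, phi) if r >= t)

    total = 0.0
    for k in range(steps):
        t_low = k / steps
        t_high = (k + 1) / steps
        total += 0.5 * (ser(t_low) + ser(t_high)) * (t_high - t_low)
    return total


def ramp_relevance(y: float) -> float:
    """Hypothetical relevance: 0 at y = 1, rising linearly to 1 at y = 10."""
    return min(1.0, max(0.0, (y - 1.0) / 9.0))


if __name__ == "__main__":
    y_true = [1.0, 2.0, 10.0]
    # Same squared error (9) in both cases, placed on a different target.
    miss_on_extreme = [1.0, 2.0, 7.0]
    miss_on_common = [4.0, 2.0, 10.0]
    print(sera(y_true, miss_on_extreme, ramp_relevance))
    print(sera(y_true, miss_on_common, ramp_relevance))
```

Both prediction vectors have the same mean squared error, yet under this definition the error on the extreme target contributes at every threshold t, while the error on the common target contributes only near t = 0.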


2020

Gradient Boosting Machine and LSTM Network for Online Harassment Detection and Categorization in Social Media

Authors
Pereira, FSF; Andrade, T; de Carvalho, ACPLF;

Publication
MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2019, PT II

Abstract
We present a solution submitted to the Social Media and Harassment Competition held in collaboration with the ECML PKDD 2019 Conference. The dataset used is a set of tweets, and the first task was the detection of harassment tweets. To deal with this problem, we proposed a solution based on a gradient tree-boosting algorithm. The second task was the categorization of harassment tweets according to the type of harassment, a multiclass classification problem. For this problem we proposed an LSTM network model. The solutions proposed for these tasks presented good predictive accuracy.
