Publications

Publications by Carlos Manuel Soares

2021

Novelty Detection in Physical Activity

Authors
Leite, B; Abdalrahman, A; Castro, J; Frade, J; Moreira, J; Soares, C;

Publication
ICAART: PROCEEDINGS OF THE 13TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE - VOL 2

Abstract
Artificial Intelligence (AI) is continuously improving several aspects of our daily lives. Gadgets and monitoring devices are now widely used for health and physical activity monitoring. By analyzing large amounts of data and applying Machine Learning (ML) techniques, we have been able to draw useful conclusions in various contexts. Activity Recognition is one of them, in which it is possible to recognize and monitor our daily actions. Traditional systems focus only on detecting pre-established activities according to previously configured parameters, not on detecting novel ones. However, when applying activity recognizers in real-world applications, it is necessary to detect new activities that were not considered during the training of the model. We propose a method for Novelty Detection in the context of physical activity. Our solution is based on the establishment of a threshold confidence value, which determines whether an activity is novel or not. We built and trained our models by experimenting with three different algorithms and four threshold values. The best results were obtained by using the Random Forest algorithm with a threshold value of 0.8, resulting in 90.9% accuracy and 85.1% precision.
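
The threshold rule described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `classify_with_novelty`, the `NOVEL` label, and the assumption that the classifier returns one probability per known activity are all hypothetical. An activity is flagged as novel when the highest class probability falls below the threshold.

```python
from typing import Dict

NOVEL = "novel"  # hypothetical label for activities not seen during training


def classify_with_novelty(probabilities: Dict[str, float], threshold: float = 0.8) -> str:
    """Return the most likely known activity, or NOVEL if confidence is too low.

    `probabilities` maps each known activity to the classifier's confidence.
    The 0.8 default mirrors the best threshold reported in the abstract.
    """
    if not probabilities:
        raise ValueError("probabilities must not be empty")
    # Pick the known activity with the highest confidence.
    best_label = max(probabilities, key=probabilities.get)
    # Below the threshold, the prediction is treated as a novel activity.
    if probabilities[best_label] < threshold:
        return NOVEL
    return best_label
```

For example, a prediction of 0.9 for "walking" would be accepted as "walking", while an even 0.5/0.5 split between two known activities would be reported as novel.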


2020

µ-cf2vec: Representation Learning for Personalized Algorithm Selection in Recommender Systems

Authors
Pereira, TS; Cunha, T; Soares, C;

Publication
20th International Conference on Data Mining Workshops, ICDM Workshops 2020, Sorrento, Italy, November 17-20, 2020

Abstract
Collaborative Filtering (CF) has become the standard approach to solve recommender system problems. Collaborative Filtering algorithms try to make predictions about the interests of a user by collecting the personal interests of multiple users. There are multiple CF algorithms, each one of them with its own biases. It is the Machine Learning practitioner that has to choose the best algorithm for each task beforehand. In Recommender Systems (RS), different algorithms have different performance for different users within the same dataset. Meta Learning has been used to choose the best algorithm for a given problem. Meta Learning is usually applied to select algorithms for a whole dataset. Adapting it to select the algorithm for a single user in an RS involves several challenges. The most important is the design of the metafeatures which, in typical meta learning, characterize datasets, while here they must characterize a single user. This work presents a new meta-learning based framework named µ-cf2vec to select the best algorithm for each user. We propose using Representation Learning (RL) techniques to extract the metafeatures. Representation Learning tries to extract representations that can be reused in other learning tasks. In this work we also implement the framework using different RL techniques to evaluate which one can be more useful to solve this task. At the meta level, the meta learning model will use the metafeatures to extract knowledge that will be used to predict the best algorithm for each user. We evaluated an implementation of this framework using the MovieLens 20M dataset. Our implementation achieved consistent gains at the meta level; however, at the base level we only achieved marginal gains. © 2020 IEEE.


2021

Micro-MetaStream: Algorithm selection for time-changing data

Authors
Rossi, ALD; Soares, C; de Souza, BF; de Carvalho, ACPDF;

Publication
INFORMATION SCIENCES

Abstract
Data stream mining needs to deal with scenarios where data distribution can change over time. As a result, different learning algorithms can be more suitable in different time periods. This paper proposes micro-MetaStream, a meta-learning based method to recommend the most suitable learning algorithm for each new example arriving in a data stream. It is an evolution of MetaStream, which recommends learning algorithms for batches of examples. By using a unitary granularity, micro-MetaStream is able to respond more efficiently to changes in data distribution than its predecessor. The meta-data combines meta-features, characteristics describing recent data, with base-level features, the original variables of the new example. In experiments on real-world regression data streams, micro-MetaStream outperformed MetaStream and a baseline method at the meta-level and frequently improved the predictive performance at the base-level.


2021

Discovery Science - 24th International Conference, DS 2021, Halifax, NS, Canada, October 11-13, 2021, Proceedings

Authors
Soares, C; Torgo, L;

Publication
DS

Abstract

2021

Empirical Study on the Impact of Different Sets of Parameters of Gradient Boosting Algorithms for Time-Series Forecasting with LightGBM

Authors
Barros, F; Cerqueira, V; Soares, C;

Publication
PRICAI 2021: Trends in Artificial Intelligence - 18th Pacific Rim International Conference on Artificial Intelligence, PRICAI 2021, Hanoi, Vietnam, November 8-12, 2021, Proceedings, Part I

Abstract
LightGBM has proven to be an effective forecasting algorithm by winning the M5 forecasting competition. However, given the sensitivity of LightGBM to hyperparameters, it is likely that their default values are not optimal. This work aims to answer whether it is essential to tune the hyperparameters of LightGBM to obtain better accuracy in time series forecasting and whether it can be done efficiently. Our experiments consisted of data collection and processing, hyperparameter generation, and testing. We observed that on the 58 time series tested, the mean squared error is reduced by a maximum of 17.45% when using randomly generated configurations in contrast to using the default one. Additionally, we studied the performance of the individual hyperparameters. Based on the results obtained, we propose an alternative set of default LightGBM hyperparameter values for forecasting with time series data. © 2021, Springer Nature Switzerland AG.
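
The comparison between randomly generated configurations and a default one can be sketched as below. This is an illustrative sketch under stated assumptions, not the study's code: `random_search`, `relative_reduction`, the uniform sampling ranges, and the `evaluate` callback (a stand-in for training a model and measuring its mean squared error) are all hypothetical.

```python
import random
from typing import Callable, Dict, Optional, Tuple

Config = Dict[str, float]


def random_search(
    evaluate: Callable[[Config], float],
    space: Dict[str, Tuple[float, float]],
    n_trials: int,
    seed: int = 0,
) -> Tuple[Optional[Config], float]:
    """Sample configurations uniformly and keep the one with the lowest error."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    best_config, best_error = None, float("inf")
    for _ in range(n_trials):
        # Draw each hyperparameter independently from its (low, high) range.
        config = {name: rng.uniform(low, high) for name, (low, high) in space.items()}
        error = evaluate(config)
        if error < best_error:
            best_config, best_error = config, error
    return best_config, best_error


def relative_reduction(default_error: float, tuned_error: float) -> float:
    """Percentage by which the tuned error improves on the default error."""
    return 100.0 * (default_error - tuned_error) / default_error
```

With this definition, a default error of 2.0 and a tuned error of 1.5 correspond to a 25% reduction.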


2021

Promoting Fairness through Hyperparameter Optimization

Authors
Cruz, AF; Saleiro, P; Belem, C; Soares, C; Bizarro, P;

Publication
2021 21ST IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM 2021)

Abstract
Considerable research effort has been guided towards algorithmic fairness but real-world adoption of bias reduction techniques is still scarce. Existing methods are either metric- or model-specific, require access to sensitive attributes at inference time, or carry high development or deployment costs. This work explores the unfairness that emerges when optimizing ML models solely for predictive performance, and how to mitigate it with a simple and easily deployed intervention: fairness-aware hyperparameter optimization (HO). We propose and evaluate fairness-aware variants of three popular HO algorithms: Fair Random Search, Fair TPE, and Fairband. We validate our approach on a real-world bank account opening fraud case study, as well as on three datasets from the fairness literature. Results show that, without extra training cost, it is feasible to find models with 111% mean fairness increase and just 6% decrease in performance when compared with fairness-blind HO.
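
One simple way to make model selection fairness-aware is to score each evaluated configuration by a weighted combination of performance and fairness. The sketch below illustrates that idea only; the `Trial` type, `select_trial`, and the `alpha` weight are assumptions made here, and the abstract does not state that Fair Random Search, Fair TPE, or Fairband work this way.

```python
from typing import List, NamedTuple


class Trial(NamedTuple):
    """One evaluated hyperparameter configuration (hypothetical structure)."""
    name: str
    performance: float  # higher is better, assumed to lie in [0, 1]
    fairness: float     # higher is better, assumed to lie in [0, 1]


def select_trial(trials: List[Trial], alpha: float = 0.5) -> Trial:
    """Pick the trial maximizing alpha * performance + (1 - alpha) * fairness.

    alpha = 1.0 reproduces fairness-blind selection; lower values give
    fairness more weight.
    """
    if not trials:
        raise ValueError("trials must not be empty")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return max(trials, key=lambda t: alpha * t.performance + (1 - alpha) * t.fairness)
```

Because both metrics are already computed for every trained model, this kind of selection adds no training cost; only the ranking of candidates changes.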
