Publications

Publications by Carlos Manuel Soares

2024

GASTeNv2: Generative Adversarial Stress Testing Networks with Gaussian Loss

Authors
Teixeira, C; Gomes, I; Cunha, L; Soares, C; van Rijn, JN;

Publication
Progress in Artificial Intelligence - 23rd EPIA Conference on Artificial Intelligence, EPIA 2024, Viana do Castelo, Portugal, September 3-6, 2024, Proceedings, Part II

Abstract
As machine learning technologies are increasingly adopted, the demand for responsible AI practices to ensure transparency and accountability grows. To better understand the decision-making processes of machine learning models, GASTeN was developed to generate realistic yet ambiguous synthetic data near a classifier’s decision boundary. However, the results were inconsistent, with few images in the low-confidence region and noise. Therefore, we propose a new GASTeN version with a modified architecture and a novel loss function. This new loss function incorporates a multi-objective measure with a Gaussian loss centered on the classifier probability, targeting the decision boundary. Our study found that while the original GASTeN architecture yields the highest Fréchet Inception Distance (FID) scores, the updated version achieves lower Average Confusion Distance (ACD) values and consistent performance across low-confidence regions. Both architectures produce realistic and ambiguous images, but the updated one is more reliable, with no instances of GAN mode collapse. Additionally, the introduction of the Gaussian loss enhanced this architecture by allowing for adjustable tolerance in image generation around the decision boundary. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.


2024

An Empirical Evaluation of DeepAR for Univariate Time Series Forecasting

Authors
Gomes, RU; Soares, C; Reis, LP;

Publication
Progress in Artificial Intelligence - 23rd EPIA Conference on Artificial Intelligence, EPIA 2024, Viana do Castelo, Portugal, September 3-6, 2024, Proceedings, Part III

Abstract
DeepAR is a popular probabilistic time series forecasting algorithm. According to the authors, DeepAR is particularly suitable to build global models using hundreds of related time series. For this reason, it is a common expectation that DeepAR obtains poor results in univariate forecasting [10]. However, there are no empirical studies that clearly support this. Here, we compare the performance of DeepAR with standard forecasting models to assess its performance regarding 1 step-ahead forecasts. We use 100 time series from the M4 competition to compare univariate DeepAR with univariate LSTM and SARIMAX models, both for point and quantile forecasts. Results show that DeepAR obtains good results, which contradicts common perception. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.


2024

Lag Selection for Univariate Time Series Forecasting Using Deep Learning: An Empirical Study

Authors
Leites, J; Cerqueira, V; Soares, C;

Publication
Progress in Artificial Intelligence - 23rd EPIA Conference on Artificial Intelligence, EPIA 2024, Viana do Castelo, Portugal, September 3-6, 2024, Proceedings, Part III

Abstract
Most forecasting methods use recent past observations (lags) to model the future values of univariate time series. Selecting an adequate number of lags is important for training accurate forecasting models. Several approaches and heuristics have been devised to solve this task. However, there is no consensus about what the best approach is. Besides, lag selection procedures have been developed based on local models and classical forecasting techniques such as ARIMA. We bridge this gap in the literature by carrying out an extensive empirical analysis of different lag selection methods. We focus on deep learning methods trained in a global approach, i.e., on datasets comprising multiple univariate time series. Specifically, we use NHITS, a recently proposed architecture that has shown competitive forecasting performance. The experiments were carried out using three benchmark databases that contain a total of 2411 univariate time series. The results indicate that the lag size is a relevant parameter for accurate forecasts. In particular, excessively small or excessively large lag sizes have a considerable negative impact on forecasting performance. Cross-validation approaches show the best performance for lag selection, but this performance is comparable with simple heuristics. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.


2024

Enhancing Algorithm Performance Understanding through tsMorph: Generating Semi-Synthetic Time Series for Robust Forecasting Evaluation

Authors
Santos, M; de Carvalho, ACPLF; Soares, C;

Publication
Proceedings of the 2nd Workshop on Fairness and Bias in AI co-located with 27th European Conference on Artificial Intelligence (ECAI 2024), Santiago de Compostela, Spain, October 20th, 2024.

Abstract
We have never produced as much data as today, and tomorrow we will probably produce even more. The increase is due not only to the larger number of data sources, but also because the source can continuously produce more recent data. The discovery of temporal patterns in continuously generated data is the main goal in many forecasting tasks, such as the average value of a currency or the average temperature in a city, in the next day. In these tasks, it is assumed that the time difference between two consecutive values produced by the same source is constant, and the sequence of values forms a time series. The importance, and the very large number, of time series forecasting tasks make them one of the most popular data analysis applications, which have been dealt with by a large number of different methods. Despite its popularity, there is a dearth of research aimed at comprehending the conditions under which these methods present high or poor forecasting performances. Empirical studies, although common, are challenged by the limited availability of time series datasets, restricting the extraction of reliable insights. To address this limitation, we present tsMorph, a tool for generating semi-synthetic time series through dataset morphing. tsMorph works by creating a sequence of datasets from two original datasets. The characteristics of the generated datasets progressively depart from those of one of the datasets and converge toward the attributes of the other dataset. This method provides a valuable alternative for obtaining substantial datasets. In this paper, we show the benefits of tsMorph by assessing the predictive performance of the Long Short-Term Memory Network and DeepAR forecasting algorithms. The time series used for the experiments come from the NN5 Competition. The experimental results provide important insights. Notably, the performances of the two algorithms improve proportionally with the frequency of the time series.
These experiments confirm that tsMorph can be an effective tool for better understanding the behaviour of forecasting algorithms, delivering a pathway to overcoming the limitations posed by empirical studies and enabling more extensive and reliable experiments. Furthermore, tsMorph can promote Responsible Artificial Intelligence by emphasising characteristics of time series where forecasting algorithms may not perform well, thereby highlighting potential limitations. © 2024 Copyright for this paper by its authors.


2024

Fair-OBNC: Correcting Label Noise for Fairer Datasets

Authors
Silva, IOe; Jesus, SM; Ferreira, HM; Saleiro, P; Sousa, I; Bizarro, P; Soares, C;

Publication
ECAI 2024 - 27th European Conference on Artificial Intelligence, 19-24 October 2024, Santiago de Compostela, Spain - Including 13th Conference on Prestigious Applications of Intelligent Systems (PAIS 2024)

Abstract
Data used by automated decision-making systems, such as Machine Learning models, often reflects discriminatory behavior that occurred in the past. These biases in the training data are sometimes related to label noise, such as in COMPAS, where more African-American offenders are wrongly labeled as having a higher risk of recidivism when compared to their White counterparts. Models trained on such biased data may perpetuate or even aggravate the biases with respect to sensitive information, such as gender, race, or age. However, while multiple label noise correction approaches are available in the literature, these focus on model performance exclusively. In this work, we propose Fair-OBNC, a label noise correction method with fairness considerations, to produce training datasets with measurable demographic parity. The presented method adapts Ordering-Based Noise Correction, with an adjusted criterion of ordering, based both on the margin of error of an ensemble, and the potential increase in the observed demographic parity of the dataset. We evaluate Fair-OBNC against other different pre-processing techniques, under different scenarios of controlled label noise. Our results show that the proposed method is the overall better alternative within the pool of label correction methods, being capable of attaining better reconstructions of the original labels. Models trained in the corrected data have an increase, on average, of 150% in demographic parity, when compared to models trained in data with noisy labels, across the considered levels of label noise. © 2024 The Authors.


2025

Meta-learning and Data Augmentation for Stress Testing Forecasting Models

Authors
Inácio, R; Cerqueira, V; Barandas, M; Soares, C;

Publication
Advances in Intelligent Data Analysis XXIII - 23rd International Symposium on Intelligent Data Analysis, IDA 2025, Konstanz, Germany, May 7-9, 2025, Proceedings

Abstract
The effectiveness of time series forecasting models can be hampered by conditions in the input space that lead them to underperform. When those are met, negative behaviours, such as higher-than-usual errors or increased uncertainty, are shown. Traditionally, stress testing is applied to assess how models respond to adverse, but plausible scenarios, providing insights on how to improve their robustness and reliability. This paper builds upon this technique by contributing with a novel framework called MAST (Meta-learning and data Augmentation for Stress Testing). In particular, MAST is a meta-learning approach that predicts the probability that a given model will perform poorly on a given time series based on a set of statistical features. This way, instead of designing new stress scenarios, this method uses the information provided by instances that led to decreases in forecasting performance. An additional contribution is a novel time series data augmentation technique based on oversampling, which improves the information about stress factors in the input space and thereby elevates the classification capabilities of the method. We conducted experiments using 6 benchmark datasets containing a total of 97,829 time series. The results suggest that MAST is able to identify conditions that lead to large errors effectively. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
