Details

  • Name

    Carlos Manuel Soares
  • Role

    External Research Collaborator
  • Since

    1st January 2008
Publications

2025

GASTeNv2: Generative Adversarial Stress Testing Networks with Gaussian Loss

Authors
Teixeira, C; Gomes, I; Cunha, L; Soares, C; van Rijn, N;

Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Abstract
As machine learning technologies are increasingly adopted, the demand for responsible AI practices to ensure transparency and accountability grows. To better understand the decision-making processes of machine learning models, GASTeN was developed to generate realistic yet ambiguous synthetic data near a classifier’s decision boundary. However, the results were inconsistent, with few images in the low-confidence region and noise. Therefore, we propose a new GASTeN version with a modified architecture and a novel loss function. This new loss function incorporates a multi-objective measure with a Gaussian loss centered on the classifier probability, targeting the decision boundary. Our study found that while the original GASTeN architecture yields the highest Fréchet Inception Distance (FID) scores, the updated version achieves lower Average Confusion Distance (ACD) values and consistent performance across low-confidence regions. Both architectures produce realistic and ambiguous images, but the updated one is more reliable, with no instances of GAN mode collapse. Additionally, the introduction of the Gaussian loss enhanced this architecture by allowing for adjustable tolerance in image generation around the decision boundary. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
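
To illustrate the Gaussian loss idea described in this abstract, the sketch below shows one way a Gaussian-shaped penalty centred on a target classifier probability could look. The function name, the default centre of 0.5, the `sigma` tolerance parameter and the exact functional form are assumptions made for illustration; the abstract does not state the formula used in the paper.

```python
import math


def gaussian_boundary_loss(prob, center=0.5, sigma=0.1):
    """Hypothetical penalty: 0 when prob == center, approaching 1 far from it.

    `sigma` plays the role of an adjustable tolerance: a larger value
    penalises probabilities near the centre less strongly.
    """
    return 1.0 - math.exp(-((prob - center) ** 2) / (2.0 * sigma ** 2))


def mean_boundary_loss(probs, center=0.5, sigma=0.1):
    """Average the penalty over a batch of classifier probabilities."""
    return sum(gaussian_boundary_loss(p, center, sigma) for p in probs) / len(probs)
```

In a multi-objective setting, a term like this would be combined with the usual generator objective, so that generated images are pushed towards the classifier's low-confidence region while remaining realistic.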

2025

An Empirical Evaluation of DeepAR for Univariate Time Series Forecasting

Authors
Urjais Gomes, R; Soares, C; Reis, LP;

Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Abstract
DeepAR is a popular probabilistic time series forecasting algorithm. According to the authors, DeepAR is particularly suitable to build global models using hundreds of related time series. For this reason, it is a common expectation that DeepAR obtains poor results in univariate forecasting [10]. However, there are no empirical studies that clearly support this. Here, we compare the performance of DeepAR with standard forecasting models to assess its performance regarding one-step-ahead forecasts. We use 100 time series from the M4 competition to compare univariate DeepAR with univariate LSTM and SARIMAX models, both for point and quantile forecasts. Results show that DeepAR obtains good results, which contradicts common perception. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.

2024

Multidimensional subgroup discovery on event logs

Authors
Ribeiro, J; Fontes, T; Soares, C; Borges, JL;

Publication
Expert Systems with Applications

Abstract
Subgroup discovery (SD) aims at finding significant subgroups of a given population of individuals characterized by statistically unusual properties of interest. SD on event logs provides insight into particular behaviors of processes, which may be a valuable complement to the traditional process analysis techniques, especially for low-structured processes. This paper proposes a scalable and efficient method to search significant SD rules on frequent sequences of events, exploiting their multidimensional nature. With this method, it is intended to identify significant subsequences of events where the distribution of values of some target aspect is significantly different than the same distribution for the entire event log. A publicly available real-life event log of a Dutch hospital is used as a running example to demonstrate the applicability of our method. The proposed approach was applied on a real-life case study based on the public transport of a medium size European city (Porto, Portugal), for which the event data consists of 133 million smartcard travel validations from buses, trams and trains. The results include a characterization of mobility flows over multiple aspects, as well as the identification of unexpected behaviors in the flow of commuters (public transport). The generated knowledge provided a useful insight into the behavior of travelers, which can be applied at operational, tactical and strategic business levels, enhancing the current view of the transport services to transport authorities and operators.

2024

VEST: automatic feature engineering for forecasting

Authors
Cerqueira, V; Moniz, N; Soares, C;

Publication
Machine Learning

Abstract
Time series forecasting is a challenging task with applications in a wide range of domains. Auto-regression is one of the most common approaches to address these problems. Accordingly, observations are modelled by multiple regression using their past lags as predictor variables. We investigate the extension of auto-regressive processes using statistics which summarise the recent past dynamics of time series. The result of our research is a novel framework called VEST, designed to perform feature engineering using univariate and numeric time series automatically. The proposed approach works in three main steps. First, recent observations are mapped onto different representations. Second, each representation is summarised by statistical functions. Finally, a filter is applied for feature selection. We discovered that combining the features generated by VEST with auto-regression significantly improves forecasting performance in a database composed of 90 time series with high sampling frequency. However, we also found that there are no improvements when the framework is applied for multi-step forecasting or in time series with low sample size. VEST is publicly available online.
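
The three steps named in this abstract (map recent observations onto representations, summarise each representation, filter the resulting features) can be sketched roughly as follows. This is not the VEST package or its API: the function names, the two representations (raw values and first differences), the two summary statistics and the correlation-based filter with its `min_abs_corr` threshold are all assumptions chosen to keep the example small.

```python
import math
from statistics import mean, pstdev


def embed(series, n_lags):
    """Auto-regressive embedding: each window of n_lags values predicts the next one.

    n_lags should be at least 2 so that first differences are non-empty.
    """
    windows = [series[i:i + n_lags] for i in range(len(series) - n_lags)]
    targets = series[n_lags:]
    return windows, targets


def first_differences(window):
    return [b - a for a, b in zip(window, window[1:])]


# Step 1 (assumed representations) and step 2 (assumed summary statistics).
REPRESENTATIONS = {"raw": list, "diff": first_differences}
SUMMARIES = {"mean": mean, "std": pstdev}


def summarise(windows):
    """Summarise every representation of every window with every statistic."""
    features = {}
    for rep_name, rep in REPRESENTATIONS.items():
        for stat_name, stat in SUMMARIES.items():
            features[f"{rep_name}_{stat_name}"] = [stat(rep(w)) for w in windows]
    return features


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def select(features, targets, min_abs_corr=0.1):
    """Step 3 (assumed filter): keep features correlated with the target."""
    return {name: values for name, values in features.items()
            if abs(pearson(values, targets)) >= min_abs_corr}


# Usage: extend the lagged values with the selected summary features.
series = [math.sin(t / 5.0) for t in range(200)]
windows, targets = embed(series, n_lags=10)
kept = select(summarise(windows), targets)
design_rows = [list(w) + [kept[name][i] for name in kept]
               for i, w in enumerate(windows)]
```

Each row of `design_rows` holds the lagged values followed by the selected summary features, which is the sense in which such features extend a plain auto-regressive model.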

2024

Systematic Analysis of the Impact of Label Noise Correction on ML Fairness

Authors
Silva, IOE; Soares, C; Sousa, I; Ghani, R;

Publication
Advances in Artificial Intelligence, AI 2023, Pt II

Abstract
Arbitrary, inconsistent, or faulty decision-making raises serious concerns, and preventing unfair models is an increasingly important challenge in Machine Learning. Data often reflect past discriminatory behavior, and models trained on such data may reflect bias on sensitive attributes, such as gender, race, or age. One approach to developing fair models is to preprocess the training data to remove the underlying biases while preserving the relevant information, for example, by correcting biased labels. While multiple label noise correction methods are available, the information about their behavior in identifying discrimination is very limited. In this work, we develop an empirical methodology to systematically evaluate the effectiveness of label noise correction techniques in ensuring the fairness of models trained on biased datasets. Our methodology involves manipulating the amount of label noise and can be used with fairness benchmarks but also with standard ML datasets. We apply the methodology to analyze six label noise correction methods according to several fairness metrics on standard OpenML datasets. Our results suggest that the Hybrid Label Noise Correction [20] method achieves the best trade-off between predictive performance and fairness. Clustering-Based Correction [14] can reduce discrimination the most, however, at the cost of lower predictive performance.
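
The idea of manipulating the amount of label noise and then measuring fairness can be illustrated with a small sketch. The noise model here (flipping positive labels to negative for one group with a given probability) and the metric (demographic parity difference) are assumptions picked for illustration; the abstract does not specify the paper's noise-injection scheme or which fairness metrics it uses, and all function and parameter names are hypothetical.

```python
import random


def inject_group_noise(labels, groups, noise_rate, target_group=1, seed=0):
    """Flip positive labels of `target_group` to negative with probability `noise_rate`."""
    rng = random.Random(seed)
    noisy = []
    for y, g in zip(labels, groups):
        if g == target_group and y == 1 and rng.random() < noise_rate:
            noisy.append(0)
        else:
            noisy.append(y)
    return noisy


def positive_rate(labels, groups, group):
    """Share of positive labels among members of `group` (group must be non-empty)."""
    members = [y for y, g in zip(labels, groups) if g == group]
    return sum(members) / len(members)


def demographic_parity_difference(labels, groups):
    """Absolute gap in positive rate between group 0 and group 1."""
    return abs(positive_rate(labels, groups, 0) - positive_rate(labels, groups, 1))
```

Sweeping `noise_rate` from 0 to 1, applying a correction method to the noisy labels, and comparing the metric before and after would give one simple version of the kind of controlled evaluation the abstract describes.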

Supervised theses

2024

A Framework to Interpret Multiple Related Rule-based Models

Author
Pedro Rodrigo Caetano Strecht Ribeiro

Institution
UP-FEUP

2024

Enhancing Forecasting using Read & Write Recurrent Neural Networks

Author
Yassine Baghoussi

Institution
UP-FEUP

2019

Dataset morphing to analyze the performance of recommender systems

Author
André Gomes Ferreira Araújo Correia

Institution
UP-FEUP

2019

Automatic Interpretation of Promotional Leaflets in Retail for Pricing Strategy

Author
António Maria Aires Pereira Teixeira de Melo

Institution
UP-FEUP