2025
Authors
Leal, F; Veloso, B; Malheiro, B; Burguillo, JC;
Publication
EXPERT SYSTEMS
Abstract
Crowdsourced data streams are popular and extremely valuable in several domains, notably tourism. Tourism crowdsourcing platforms rely on past tourist and business inputs to provide tailored recommendations to current users in real time. The continuous, open, dynamic and non-curated nature of the crowd-originated data demands specific stream mining techniques to support online profiling, recommendation, change detection and adaptation, explanation and evaluation. The sought techniques must not only continuously improve and adapt profiles and models, but also be transparent, overcome biases, prioritize preferences, and handle huge data volumes, all in real time. This article surveys the state of the art in adaptive and explainable stream recommendation, extends the taxonomy of explainable recommendations from the offline to the stream-based scenario, and identifies future research opportunities.
2025
Authors
Arianna Teixeira Pereira; Janielle Da Silva Lago; Yvelyne Bianca Iunes Santos; Bruno Miguel Delindro Veloso; Norma Ely Santos Beltrão;
Publication
Revista de Gestão Social e Ambiental
Abstract
2025
Authors
García-Méndez, S; de Arriba-Pérez, F; Leal, F; Veloso, B; Malheiro, B; Burguillo-Rial, JC;
Publication
SCIENTIFIC REPORTS
Abstract
The public transportation sector generates large volumes of sensor data that, if analyzed adequately, can help anticipate failures and initiate maintenance actions, thereby enhancing quality and productivity. This work contributes to a real-time data-driven predictive maintenance solution for Intelligent Transportation Systems. The proposed method implements a processing pipeline comprised of sample pre-processing, incremental classification with Machine Learning models, and outcome explanation. This novel online processing pipeline has two main highlights: (i) a dedicated sample pre-processing module, which builds statistical and frequency-related features on the fly, and (ii) an explainability module. This work is the first to perform online fault prediction with natural language and visual explainability. The experiments were performed with the Metropt data set from the metro operator of Porto, Portugal. The results are above 98 % for f-measure and 99 % for accuracy. In the context of railway predictive maintenance, achieving these high values is crucial due to the practical and operational implications of accurate failure prediction. In the specific case of a high f-measure, this ensures that the system maintains an optimal balance between detecting the highest possible number of real faults and minimizing false alarms, which is crucial for maximizing service availability. Furthermore, the accuracy obtained enables reliability, directly impacting cost reduction and increased safety. The analysis demonstrates that the pipeline maintains high performance even in the presence of class imbalance and noise, and its explanations effectively reflect the decision-making process. These findings validate the methodological soundness of the approach and confirm its practical applicability for supporting proactive maintenance decisions in real-world railway operations. 
Therefore, by identifying the early signs of failure, this pipeline enables decision-makers to understand the underlying problems and act swiftly and accordingly.
2025
Authors
Alcoforado, A; Ferraz, TP; Okamura, LHT; Veloso, BM; Costa, AHR; Fama, IC; Bueno, BD;
Publication
LINGUAMATICA
Abstract
Acquiring high-quality annotated data remains one of the most significant challenges in Natural Language Processing (NLP), especially for supervised learning approaches. In scenarios where pre-existing labeled data is unavailable, common solutions like crowdsourcing and zero-shot approaches often fall short, suffering from limitations such as the need for large datasets and a lack of guarantees regarding annotation quality. Traditionally, data for human annotation has been selected randomly, a practice that is not only costly and inefficient but also prone to bias, particularly in imbalanced datasets where minority classes are underrepresented. To address these challenges, this work introduces an automatic and informed data selection architecture designed to minimize the volume of required annotations while maximizing the diversity and representativeness of the selected data. Among the evaluated methods, Reverse Semantic Search (RSS) demonstrated superior performance, consistently outperforming random sampling in imbalanced scenarios and enhancing the effectiveness of trained classifiers. Furthermore, we compared RSS with other clustering-based approaches, providing insights into their respective strengths and weaknesses.
2025
Authors
Barbosa, I; Gama, J; Veloso, B;
Publication
Progress in Artificial Intelligence - 24th EPIA Conference on Artificial Intelligence, EPIA 2025, Faro, Portugal, October 1-3, 2025, Proceedings, Part II
Abstract
2025
Authors
Méndez, SG; Arriba Pérez, Fd; Leal, F; Veloso, B; Malheiro, B; Burguillo Rial, JC;
Publication
CoRR
Abstract