Publications - INESC TEC

Publications

Publications by LIAAD

2025

Online Learning from Capricious Data streams with Flexible Hoeffding Tree

Authors
Zhao, RR; Sun, JB; Gama, J; Jiang, J;

Publication
40TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING

Abstract
Capricious data streams make no assumptions about feature space dynamics and are mainly handled using feature correlation, linear classifiers, or ensembles of trees. These approaches have deficiencies such as limited learning capacity, high time cost, and low interpretability. To improve effectiveness and efficiency, this paper handles capricious data streams with a single tree; the proposed algorithm is named OCFHT (Online learning from Capricious data streams with Flexible Hoeffding Tree). OCFHT does not rely on correlation patterns among features and can achieve non-linear modeling. Experiments on 18 public datasets show that it is both more accurate and faster than state-of-the-art algorithms.


2025

Data Science: Foundations and Applications - 29th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2025, Sydney, Australia, June 10-13, 2025, Proceedings, Part VII

Authors
Wu, X; Spiliopoulou, M; Wang, C; Kumar, V; Cao, L; Zhou, X; Pang, G; Gama, J;

Publication
PAKDD (7)


2025

Data Science: Foundations and Applications - 29th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2025, Sydney, NSW, Australia, June 10-13, 2025, Proceedings, Part VI

Authors
Wu, X; Spiliopoulou, M; Wang, C; Kumar, V; Cao, L; Zhou, X; Pang, G; Gama, J;

Publication
PAKDD (6)


2025

Salvador Urban Network Transportation (SUNT): A Landmark Spatiotemporal Dataset for Public Transportation

Authors
Ferreira, MV; Souza, M; Rios, TN; Fernandes, IFC; Nery, J; Gama, J; Bifet, A; Rios, RA;

Publication
SCIENTIFIC DATA

Abstract
Efficient public transportation management is essential for the development of large urban centers, providing several benefits such as comprehensive coverage of population mobility, reduced transport costs, better control of traffic congestion, and a significant reduction in environmental impact by limiting gas emissions and pollution. Realizing these benefits requires a deep understanding of population and transit patterns and the adoption of approaches that model multiple relations and characteristics efficiently. This work addresses these challenges by providing a novel dataset that includes various public transportation components from three different systems: regular buses, subway, and BRT (Bus Rapid Transit). Our dataset comprises daily information from about 700,000 passengers in Salvador, one of Brazil's largest cities, and local public transportation data with approximately 2,000 vehicles operating across nearly 400 lines, connecting almost 3,000 stops and stations. With data collected from March 2024 to March 2025 at a sampling interval of less than one minute, SUNT stands as one of the largest, most comprehensive, and openly available urban datasets in the literature.


2025

Interpretable Predictive Maintenance: Combining Anomaly Detection with Quantitative Root Cause Analysis

Authors
Barbosa, I; Gama, J; Veloso, B;

Publication
Progress in Artificial Intelligence - 24th EPIA Conference on Artificial Intelligence, EPIA 2025, Faro, Portugal, October 1-3, 2025, Proceedings, Part II


2025

Effect of AI on Innovation Capacity in the context of Industry 5.0: Findings from a Qualitative study

Authors
Bécue, A; Gama, J; Brito, PQ;

Publication
Strategic Business Research

Abstract