Publications by João Gama

2023

Why Industry 5.0 Needs XAI 2.0?

Authors
Bobek, S; Nowaczyk, S; Gama, J; Pashami, S; Ribeiro, RP; Taghiyarrenani, Z; Veloso, B; Rajaoarisoa, LH; Szelazek, M; Nalepa, GJ;

Publication
Joint Proceedings of the xAI-2023 Late-breaking Work, Demos and Doctoral Consortium co-located with the 1st World Conference on eXplainable Artificial Intelligence (xAI-2023), Lisbon, Portugal, July 26-28, 2023.

Abstract
Advances in artificial intelligence trigger transformations that make more and more companies enter Industry 4.0 and 5.0 eras. In many cases, these transformations are gradual and performed in a bottom-up manner. This means that in the first step, the industrial hardware is upgraded to collect as much data as possible without actual planning of the utilization of the information. Furthermore, the data storage and processing infrastructure is prepared to keep large volumes of historical data accessible for further analysis. Only in the last step are methods for processing the data developed to improve or gain more insight into the industrial and business processes. Such a pipeline makes many companies face a problem with huge amounts of data, an incomplete understanding of how the existing knowledge is represented in the data, under which conditions the knowledge no longer holds, or what new phenomena are hidden inside the data. We argue that this gap needs to be addressed by the next generation of XAI methods which should be expert-oriented and focused on knowledge generation tasks rather than model debugging. The paper is based on the findings of the EU CHIST-ERA project on Explainable Predictive Maintenance (XPM).

2023

Topic Model with Contextual Outlier Handling: a Study on Electronic Invoice Product Descriptions

Authors
Andrade, C; Ribeiro, RP; Gama, J;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE, EPIA 2023, PT I

Abstract
E-commerce has become an essential aspect of modern life, providing consumers worldwide with convenience and accessibility. However, the high volume of short and noisy product descriptions in text streams of massive e-commerce platforms translates into an increased number of clusters, presenting challenges for standard model-based stream clustering algorithms. This is the case for a dataset extracted from the Brazilian NF-e Project containing electronic invoice product descriptions, including many product clusters. While LDA-based clustering methods have been shown to be crucial, they have been mainly evaluated on datasets with few clusters. We propose the Topic Model with Contextual Outlier Handling (TMCOH) method to overcome this limitation. This method combines the Dirichlet Process, specific word representation, and contextual outlier detection techniques to recycle identified outliers, aiming to integrate them into appropriate clusters later on. The experimental results for our case study demonstrate the effectiveness of TMCOH when compared to state-of-the-art methods and its potential for application to text clustering in large datasets.

2023

Pollution Emission Patterns of Transportation in Porto, Portugal Through Network Analysis

Authors
Andrade, T; Shaji, N; Ribeiro, RP; Gama, J;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE, EPIA 2023, PT I

Abstract
Over the past few decades, road transportation emissions have increased. Vehicles are among the most significant sources of pollutants in urban areas. As such, several studies and public policies have emerged to address the issue. Estimating greenhouse gas emissions and air quality over space and time is crucial for human health and mitigating climate change. In this study, we demonstrate that it is feasible to utilize raw GPS data to measure regional pollution levels. By applying feature engineering techniques and using a microscopic emissions model to calculate vehicle-specific power (VSP) and various specific pollutants, we identify areas with higher emission levels attributable to a fleet of taxis in Porto, Portugal. Additionally, we conduct network analysis to uncover correlations between emission levels and the structural characteristics of the transportation network. These findings can potentially identify emission clusters based on the network's connectivity and contribute to developing an emission inventory for a city like Porto.

2023

Bayesian Federated Learning: A Survey

Authors
Cao, L; Chen, H; Fan, X; Gama, J; Ong, YS; Kumar, V;

Publication
CoRR

Abstract
Federated learning (FL) demonstrates its advantages in integrating distributed infrastructure, communication, computing and learning in a privacy-preserving manner. However, the robustness and capabilities of existing FL methods are challenged by limited and dynamic data and conditions, complexities including heterogeneities and uncertainties, and analytical explainability. Bayesian federated learning (BFL) has emerged as a promising approach to address these issues. This survey presents a critical overview of BFL, including its basic concepts, its relations to Bayesian learning in the context of FL, and a taxonomy of BFL from both Bayesian and federated perspectives. We categorize and discuss client- and server-side and FL-based BFL methods and their pros and cons. We also discuss the limitations of the existing BFL methods and the future directions of BFL research needed to address the intricate requirements of real-life FL applications.

2023

Knowledge-driven Analytics and Systems Impacting Human Quality of Life- Neurosymbolic AI, Explainable AI and Beyond

Authors
Ukil, A; Gama, J; Jara, AJ; Marin, L;

Publication
PROCEEDINGS OF THE 32ND ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2023

Abstract
The management of knowledge-driven artificial intelligence technologies is essential in order to evaluate their impact on human life and society. Social networks and technology use can have a negative impact on us physically, emotionally, socially and mentally. On the other hand, intelligent systems can have a positive effect on people's lives. Currently, we are witnessing the power of large language models (LLMs) like ChatGPT and their influence on society. The objective of the workshop is to contribute to the advancement of intelligent technologies designed to address the human condition. This could include precise and personalized medicine, better care for elderly people, reducing private data leaks, using AI to manage resources better, using AI to predict risks, augmenting human capabilities, and more. The workshop also aims to present research findings and perspectives that demonstrate how knowledge-enabled technologies and applications improve human well-being. It focuses on the impacts of Artificial Intelligence (AI) research at different levels of granularity: the micro level, where the daily or regular functioning of human life is affected, and the macro level, where the long-term or far-future effects of AI on people's lives and human society could be substantial. In conclusion, this workshop explores how AI research can potentially address the most pressing challenges facing modern societies, and how knowledge management can potentially contribute to these solutions.

2024

Improving hyper-parameter self-tuning for data streams by adapting an evolutionary approach

Authors
Moya, AR; Veloso, B; Gama, J; Ventura, S;

Publication
DATA MINING AND KNOWLEDGE DISCOVERY

Abstract
Hyper-parameter tuning of machine learning models has become a crucial task in achieving optimal results in terms of performance. Several researchers have explored the optimisation task during the last decades to reach a state-of-the-art method. However, most of them focus on batch or offline learning, where data distributions do not change arbitrarily over time. On the other hand, dealing with data streams and online learning is a challenging problem. In fact, as technology advances, sophisticated techniques for processing these data streams become more important. Thus, improving hyper-parameter self-tuning during online learning of these machine learning models is crucial. To this end, in this paper, we present MESSPT, an evolutionary algorithm for self-hyper-parameter tuning for data streams. We apply Differential Evolution to dynamically-sized samples, requiring a single pass over the data to train and evaluate models and choose the best configurations. We limit the number of configurations to be evaluated, which necessarily has to be reduced, thus making this evolutionary approach a micro-evolutionary one. Furthermore, we control how our evolutionary algorithm deals with concept drift. Experiments on different learning tasks and over well-known datasets show that our proposed MESSPT outperforms the state-of-the-art on hyper-parameter tuning for data streams.
