Publications

Publications by LIAAD

2023

Error Analysis on Industry Data: Using Weak Segment Detection for Local Model Agnostic Prediction Intervals

Authors
Mamede, R; Paiva, N; Gama, J;

Publication
Discovery Science - 26th International Conference, DS 2023, Porto, Portugal, October 9-11, 2023, Proceedings

Abstract
Machine Learning has been overtaken by a growing necessity to explain and understand decisions made by trained models as regulation and consumer awareness have increased. Alongside understanding the inner workings of a model comes the task of verifying how adequately we can model a problem with the learned functions. Traditional global assessment functions lack the granularity required to understand local differences in performance in different regions of the feature space, where the model can have problems adapting. Residual Analysis adds a layer of model understanding by interpreting prediction residuals in an exploratory manner. However, this task can be unfeasible for high-dimensionality datasets through hypotheses and visualizations alone. In this work, we use weak interpretable learners to identify regions of high prediction error in the feature space. We achieve this by examining the absolute residuals of predictions made by trained regressors. This methodology retains the interpretability of the identified regions. It allows practitioners to have tools to formulate hypotheses surrounding model failure on particular regions for future model tuning, data collection, or data augmentation on critical cohorts of data. We present a way of including information on different levels of model uncertainty in the feature space through the use of locally fitted Model Agnostic Prediction Intervals (MAPIE) in the identified regions, comparing this approach with other common forms of conformal predictions which do not take into account findings from weak segment identification, by assessing local and global coverage of the prediction intervals. To demonstrate the practical application of our approach, we present a real-world industry use case in the context of inbound retention call-centre operations for a Telecom Provider to determine optimal pairing between a customer and an available assistant through the prediction of contracted revenue.
© 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
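The idea in the abstract above can be illustrated with a small sketch. It is not the authors' implementation: it assumes a depth-one decision stump as the weak interpretable learner and plain split-conformal quantiles in place of the MAPIE library, and the names best_stump and conformal_halfwidth, as well as the toy calibration data, are hypothetical.

```python
from math import ceil


def _sse(group):
    """Sum of squared deviations from the group mean."""
    mean = sum(group) / len(group)
    return sum((r - mean) ** 2 for r in group)


def best_stump(X, abs_res):
    """Return the (feature index, threshold) of the single split that best
    separates low from high absolute residuals (lowest within-group SSE)."""
    best = None
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [r for row, r in zip(X, abs_res) if row[j] <= t]
            right = [r for row, r in zip(X, abs_res) if row[j] > t]
            score = _sse(left) + _sse(right)
            if best is None or score < best[0]:
                best = (score, j, t)
    if best is None:
        raise ValueError("no feature has two distinct values to split on")
    return best[1], best[2]


def conformal_halfwidth(scores, alpha):
    """Split-conformal interval half-width: the ceil((n + 1) * (1 - alpha))-th
    smallest calibration score, or infinity when n is too small."""
    n = len(scores)
    k = ceil((n + 1) * (1 - alpha))
    return float("inf") if k > n else sorted(scores)[k - 1]


# Toy calibration set (made up): the regressor is assumed to err more when
# feature 0 is large, while feature 1 carries no information about the error.
X_cal = [[0.1, 3], [0.2, 1], [0.3, 4], [0.4, 2],
         [0.6, 3], [0.7, 1], [0.8, 4], [0.9, 2]]
abs_res = [0.1, 0.2, 0.1, 0.2, 1.0, 1.2, 0.9, 1.1]

feature, threshold = best_stump(X_cal, abs_res)
left_scores = [r for row, r in zip(X_cal, abs_res) if row[feature] <= threshold]
right_scores = [r for row, r in zip(X_cal, abs_res) if row[feature] > threshold]

alpha = 0.25
global_width = conformal_halfwidth(abs_res, alpha)      # one width everywhere
left_width = conformal_halfwidth(left_scores, alpha)    # width in the easy segment
right_width = conformal_halfwidth(right_scores, alpha)  # width in the weak segment
```

On this toy data the global width is dominated by the weak segment, so a single interval is needlessly wide where the model is accurate; the per-segment widths are what a locally fitted approach would report instead.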

2023

Which Way to Go - Finding Frequent Trajectories Through Clustering

Authors
Andrade, T; Gama, J;

Publication
Discovery Science - 26th International Conference, DS 2023, Porto, Portugal, October 9-11, 2023, Proceedings

Abstract
Trajectory clustering is one of the most important issues in mobility patterns data mining. It is applied in several cases such as hot-spots detection, urban transportation control, animal migration movements, and tourist visiting routes among others. In this paper, we describe how to identify the most frequent trajectories from raw GPS data. By making use of the Ramer-Douglas-Peucker (RDP) mechanism we simplify the trajectories in order to obtain fewer points to check without losing information. We construct a similarity matrix by using the Fréchet distance metric and then employ density-based clustering to find the most similar trajectories. We perform experiments over three real-world datasets collected in the city of Porto, Portugal, and in Beijing, China, and check the results of the most frequent trajectories for the top-k origin-destination pairs of the moves.
© 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
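The pipeline described in the abstract above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the paper's code: coordinates are treated as planar points, the discrete Fréchet distance (dynamic-programming form) stands in for whichever Fréchet variant the authors used, the RDP step measures distance to the chord segment, and the function names and toy tracks are hypothetical.

```python
from math import hypot


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return hypot(px - ax, py - ay)
    # Projection parameter of p onto the line, clamped to stay on the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return hypot(px - (ax + t * dx), py - (ay + t * dy))


def rdp(points, epsilon):
    """Ramer-Douglas-Peucker: drop points closer than epsilon to the chord."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax > epsilon:
        # Keep the farthest point and simplify both halves recursively.
        left = rdp(points[: index + 1], epsilon)
        right = rdp(points[index:], epsilon)
        return left[:-1] + right
    return [first, last]


def discrete_frechet(p, q):
    """Discrete Fréchet distance between two polylines via dynamic programming."""
    n, m = len(p), len(q)
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = hypot(p[i][0] - q[j][0], p[i][1] - q[j][1])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                ca[i][j] = max(min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1]), d)
    return ca[-1][-1]


# Toy tracks (made up): a and b run in parallel, c takes a long detour.
track_a = [(0, 0), (1, 0.01), (2, 0), (3, 0.01), (4, 0)]
track_b = [(0, 1), (2, 1), (4, 1)]
track_c = [(0, 0), (2, 5), (4, 0)]

simplified = [rdp(t, epsilon=0.1) for t in (track_a, track_b, track_c)]
distance_matrix = [[discrete_frechet(s, t) for t in simplified] for s in simplified]
```

The resulting distance_matrix is the kind of input a density-based clusterer accepts; for instance, scikit-learn's DBSCAN can be run with metric="precomputed" on such a matrix, although the choice of clusterer and its parameters here would be an assumption.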

2023

Bayesian Federated Learning: A Survey

Authors
Cao, LB; Chen, H; Fan, XH; Gama, J; Ong, YS; Kumar, V;

Publication
PROCEEDINGS OF THE THIRTY-SECOND INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2023

Abstract
Federated learning (FL) demonstrates its advantages in integrating distributed infrastructure, communication, computing and learning in a privacy-preserving manner. However, the robustness and capabilities of existing FL methods are challenged by limited and dynamic data and conditions, complexities including heterogeneities and uncertainties, and analytical explainability. Bayesian federated learning (BFL) has emerged as a promising approach to address these issues. This survey presents a critical overview of BFL, including its basic concepts, its relations to Bayesian learning in the context of FL, and a taxonomy of BFL from both Bayesian and federated perspectives. We categorize and discuss client- and server-side and FL-based BFL methods and their pros and cons. The limitations of the existing BFL methods and the future directions of BFL research further address the intricate requirements of real-life FL applications.

2023

Knowledge-driven Analytics and Systems Impacting Human Quality of Life - Neurosymbolic AI, Explainable AI and Beyond

Authors
Ukil, A; Gama, J; Jara, AJ; Marin, L;

Publication
PROCEEDINGS OF THE 32ND ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2023

Abstract
The management of knowledge-driven artificial intelligence technologies is essential in order to evaluate their impact on human life and society. Social networks and tech use can have a negative impact on us physically, emotionally, socially and mentally. On the other hand, intelligent systems can have a positive effect on people's lives. Currently, we are witnessing the power of large language models (LLMs) like ChatGPT and their influence on society. The objective of the workshop is to contribute to the advancement of intelligent technologies designed to address the human condition. This could include precise and personalized medicine, better care for elderly people, reducing private data leaks, using AI to manage resources better, using AI to predict risks, augmenting human capabilities, and more. The workshop's objective is to present research findings and perspectives that demonstrate how knowledge-enabled technologies and applications improve human well-being. This workshop focuses on the impacts of Artificial Intelligence (AI) research at different levels of granularity: the micro level, where the daily or regular functioning of human life is affected, and the macro level, where the long-term or far-future effects of artificial intelligence on people's lives and human society could be considerable. In conclusion, this workshop explores how AI research can potentially address the most pressing challenges facing modern societies, and how knowledge management can potentially contribute to these solutions.

2023

Fairness Analysis in Causal Models: An Application to Public Procurement

Authors
Teixeira, S; Nogueira, AR; Gama, J;

Publication
Machine Learning and Principles and Practice of Knowledge Discovery in Databases - International Workshops of ECML PKDD 2023, Turin, Italy, September 18-22, 2023, Revised Selected Papers, Part II

Abstract
Data-driven decision models based on Artificial Intelligence (AI) have been widely used in the public and private sectors. These models present challenges and are intended to be fair, effective and transparent in public interest areas. Bias, fairness and government transparency are aspects that significantly impact the functioning of a democratic society. They shape the relationship between the government and its citizens, influencing trust, accountability, and the equitable treatment of individuals and groups. Data-driven decision models can be biased at several process stages, contributing to injustices. Our research purpose is to understand fairness in the use of causal discovery for public procurement. By analysing Portuguese public contracts data, we aim i) to predict the place of execution of public contracts using the PC algorithm with sp_mi, smc_χ² and mc_χ² conditional independence tests; ii) to analyse and compare the fairness in those scenarios using Predictive Parity Rate, Proportional Parity, Demographic Parity and Accuracy Parity metrics. By addressing fairness concerns, we aim to enhance responsible data-driven decision models. We conclude that, in our case, fairness metrics make an assessment more local than global due to causality pathways. We also observe that the Proportional Parity metric is the one with the lowest variance among all metrics and one with the highest precision, and this reinforces the observation that the Agency category is the one that is furthest apart in terms of the proportion of the groups.
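To make the group-wise comparison in the abstract above concrete, here is a minimal sketch of per-group rates for a binary prediction. It uses common textbook definitions (positive prediction rate, accuracy, and precision per group), which may differ from the exact definitions behind the metrics named in the paper; the function name group_rates, the group labels, and the toy labels are hypothetical.

```python
from collections import defaultdict


def group_rates(y_true, y_pred, groups):
    """Per-group positive prediction rate, accuracy and precision for
    binary labels (1 = positive, 0 = negative)."""
    counts = defaultdict(lambda: {"n": 0, "pred_pos": 0, "correct": 0, "true_pos": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        c = counts[g]
        c["n"] += 1
        c["pred_pos"] += int(p == 1)
        c["correct"] += int(t == p)
        c["true_pos"] += int(t == 1 and p == 1)
    report = {}
    for g, c in counts.items():
        report[g] = {
            # Share of the group that receives a positive prediction.
            "positive_rate": c["pred_pos"] / c["n"],
            # Share of the group that is predicted correctly.
            "accuracy": c["correct"] / c["n"],
            # Share of positive predictions that are correct (None if there are none).
            "precision": c["true_pos"] / c["pred_pos"] if c["pred_pos"] else None,
        }
    return report


# Toy data (made up): two groups of four contracts each.
y_true = [1, 0, 1, 0, 1, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

report = group_rates(y_true, y_pred, groups)
# Gap in positive prediction rate between the two groups; zero would mean parity.
positive_rate_gap = abs(report["A"]["positive_rate"] - report["B"]["positive_rate"])
```

Comparing each rate across groups, rather than looking at one pooled score, is what lets an assessment surface the kind of local differences the abstract refers to.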

2023

Idioblasts accumulating anticancer alkaloids in Catharanthus roseus leaves are a unique cell type

Authors
Guedes, JG; Ribeiro, R; Carqueijeiro, I; Guimaraes, AL; Bispo, C; Archer, J; Azevedo, H; Fonseca, NA; Sottomayor, M;

Publication

Abstract
Catharanthus roseus leaves produce a range of monoterpenoid indole alkaloids (MIAs) that include low levels of the anticancer drugs vinblastine and vincristine. The MIA pathway displays a complex architecture spanning different subcellular and cell-type localizations and is under complex regulation. As a result, the development of strategies to increase the levels of the anticancer MIAs has remained elusive. The pathway involves mesophyll specialised idioblasts where the late unsolved biosynthetic steps are thought to occur. Here, protoplasts of C. roseus leaf idioblasts were isolated by fluorescence-activated cell sorting, and their differential alkaloid and transcriptomic profiles were characterised. This involved the assembly of an improved C. roseus transcriptome from short- and long-read data, IDIO+. It was observed that C. roseus mesophyll idioblasts possess a distinctive transcriptomic profile associated with protection against biotic and abiotic stresses, and indicative that this cell type is a carbon sink, in contrast with surrounding mesophyll cells. Moreover, it is shown that idioblasts are a hotspot of alkaloid accumulation, suggesting that their transcriptome may hold the keys to the in-depth understanding of the MIA pathway and the success of strategies leading to higher levels of the anticancer drugs.
