Publicacoes - INESC TEC

Publicações

Publicações por Jaime Cardoso

2023

BOLD: Blood-gas and Oximetry Linked Dataset - Open Source Research

Autores
Matos, J; Struja, T; Gallifant, J; Nakayama, LF; Charpignon, M; Liu, X; Economou-Zavlanos, N; Cardoso, JS; Johnson, KS; Bhavsar, N; Gichoya, JW; Celi, LA; Wong, AI;

Publicação

Abstract
Pulse oximeters measure peripheral arterial oxygen saturation (SpO2) noninvasively, while the gold standard (SaO2) involves arterial blood gas measurement. There are known racial and ethnic disparities in their performance. BOLD is a new comprehensive dataset that aims to underscore the importance of addressing biases in pulse oximetry accuracy, which disproportionately affect darker-skinned patients. The dataset was created by harmonizing three Electronic Health Record databases (MIMIC-III, MIMIC-IV, eICU-CRD) comprising Intensive Care Unit stays of US patients. Paired SpO2 and SaO2 measurements were time-aligned and combined with various other sociodemographic and parameters to provide a detailed representation of each patient. BOLD includes 49,099 paired measurements, within a 5-minute window and with oxygen saturation levels between 70-100%. Minority racial and ethnic groups account for ~25% of the data - a proportion seldom achieved in previous studies. The codebase is publicly available. Given the prevalent use of pulse oximeters in the hospital and at home, we hope that BOLD will be leveraged to develop debiasing algorithms that can result in more equitable healthcare solutions.

FecharLer Abstract

2023

Evaluating the Performance of Explanation Methods on Ordinal Regression CNN Models

Autores
Barbero-Gómez, J; Cruz, R; Cardoso, JS; Gutiérrez, PA; Hervás-Martínez, C;

Publicação
ADVANCES IN COMPUTATIONAL INTELLIGENCE, IWANN 2023, PT II

Abstract
This paper introduces an evaluation procedure to validate the efficacy of explanation methods for Convolutional Neural Network (CNN) models in ordinal regression tasks. Two ordinal methods are contrasted against a baseline using cross-entropy, across four datasets. A statistical analysis demonstrates that attribution methods, such as Grad-CAM and IBA, perform significantly better when used with ordinal regression CNN models compared to a baseline approach in most ordinal and nominal metrics. The study suggests that incorporating ordinal information into the attribution map construction process may improve the explanations further.

FecharLer Abstract

2023

Detecting Concepts and Generating Captions from Medical Images: Contributions of the VCMI Team to ImageCLEFmedical Caption 2023

Autores
Torto, IR; Patrício, C; Montenegro, H; Gonçalves, T; Cardoso, JS;

Publicação
Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2023), Thessaloniki, Greece, September 18th to 21st, 2023.

Abstract
This paper presents the main contributions of the VCMI Team to the ImageCLEFmedical Caption 2023 task. We addressed both the concept detection and caption prediction tasks. Regarding concept detection, our team employed different approaches to assign concepts to medical images: multi-label classification, adversarial training, autoregressive modelling, image retrieval, and concept retrieval. We also developed three model ensembles merging the results of some of the proposed methods. Our best submission obtained an F1-score of 0.4998, ranking 3rd among nine teams. Regarding the caption prediction task, our team explored two main approaches based on image retrieval and language generation. The language generation approaches, based on a vision model as the encoder and a language model as the decoder, yielded the best results, allowing us to rank 5th among thirteen teams, with a BERTScore of 0.6147. © 2023 Copyright for this paper by its authors.

FecharLer Abstract

2024

Active Supervision: Human in the Loop

Autores
Cruz, RPM; Shihavuddin, ASM; Maruf, MH; Cardoso, JS;

Publicação
PROGRESS IN PATTERN RECOGNITION, IMAGE ANALYSIS, COMPUTER VISION, AND APPLICATIONS, CIARP 2023, PT I

Abstract
After the learning process, certain types of images may not be modeled correctly because they were not well represented in the training set. These failures can then be compensated for by collecting more images from the real-world and incorporating them into the learning process - an expensive process known as active learning. The proposed twist, called active supervision, uses the model itself to change the existing images in the direction where the boundary is less defined and requests feedback from the user on how the new image should be labeled. Experiments in the context of class imbalance show the technique is able to increase model performance in rare classes. Active human supervision helps provide crucial information to the model during training that the training set lacks.

FecharLer Abstract

2024

Explaining Bounding Boxes in Deep Object Detectors Using Post Hoc Methods for Autonomous Driving Systems

Autores
Nogueira, C; Fernandes, L; Fernandes, JND; Cardoso, JS;

Publicação
SENSORS

Abstract
Deep learning has rapidly increased in popularity, leading to the development of perception solutions for autonomous driving. The latter field leverages techniques developed for computer vision in other domains for accomplishing perception tasks such as object detection. However, the black-box nature of deep neural models and the complexity of the autonomous driving context motivates the study of explainability in these models that perform perception tasks. Moreover, this work explores explainable AI techniques for the object detection task in the context of autonomous driving. An extensive and detailed comparison is carried out between gradient-based and perturbation-based methods (e.g., D-RISE). Moreover, several experimental setups are used with different backbone architectures and different datasets to observe the influence of these aspects in the explanations. All the techniques explored consist of saliency methods, making their interpretation and evaluation primarily visual. Nevertheless, numerical assessment methods are also used. Overall, D-RISE and guided backpropagation obtain more localized explanations. However, D-RISE highlights more meaningful regions, providing more human-understandable explanations. To the best of our knowledge, this is the first approach to obtaining explanations focusing on the regression of the bounding box coordinates.

FecharLer Abstract

2024

Intrinsic Explainability for End-to-End Object Detection

Autores
Fernandes, L; Fernandes, JND; Calado, M; Pinto, JR; Cerqueira, R; Cardoso, JS;

Publicação
IEEE ACCESS

Abstract
Deep Learning models are automating many daily routine tasks, indicating that in the future, even high-risk tasks will be automated, such as healthcare and automated driving areas. However, due to the complexity of such deep learning models, it is challenging to understand their reasoning. Furthermore, the black box nature of the designed deep learning models may undermine public confidence in critical areas. Current efforts on intrinsically interpretable models focus only on classification tasks, leaving a gap in models for object detection. Therefore, this paper proposes a deep learning model that is intrinsically explainable for the object detection task. The chosen design for such a model is a combination of the well-known Faster-RCNN model with the ProtoPNet model. For the Explainable AI experiments, the chosen performance metric was the similarity score from the ProtoPNet model. Our experiments show that this combination leads to a deep learning model that is able to explain its classifications, with similarity scores, using a visual bag of words, which are called prototypes, that are learned during the training process. Furthermore, the adoption of such an explainable method does not seem to hinder the performance of the proposed model, which achieved a mAP of 69% in the KITTI dataset and a mAP of 66% in the GRAZPEDWRI-DX dataset. Moreover, our explanations have shown a high reliability on the similarity score.

FecharLer Abstract