Publications

Publications by CTM

2023

Evaluating Privacy on Synthetic Images Generated using GANs: Contributions of the VCMI Team to ImageCLEFmedical GANs 2023

Authors
Montenegro, H; Neto, PC; Patrício, C; Torto, IR; Gonçalves, T; Teixeira, LF;

Publication
Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2023), Thessaloniki, Greece, September 18th to 21st, 2023.

Abstract
This paper presents the main contributions of the VCMI Team to the ImageCLEFmedical GANs 2023 task. This task aims to evaluate whether synthetic medical images generated using Generative Adversarial Networks (GANs) contain identifiable characteristics of the training data. We propose various approaches to classify a set of real images as having been used or not used in the training of the model that generated a set of synthetic images. We use similarity-based approaches to classify the real images based on their similarity to the generated ones. We develop autoencoders to classify the images through outlier detection techniques. Finally, we develop patch-based methods that operate on patches extracted from real and generated images to measure their similarity. On the development dataset, we attained an F1-score of 0.846 and an accuracy of 0.850 using an autoencoder-based method. On the test dataset, a similarity-based approach achieved the best results, with an F1-score of 0.801 and an accuracy of 0.810. The empirical results support the hypothesis that medical data generated using deep generative models trained without privacy constraints threatens the privacy of patients in the training data. © 2023 Copyright for this paper by its authors.

2023

Attention-Based Regularisation for Improved Generalisability in Medical Multi-Centre Data

Authors
Silva, D; Agrotis, G; Tan, RB; Teixeira, LF; Silva, W;

Publication
International Conference on Machine Learning and Applications, ICMLA 2023, Jacksonville, FL, USA, December 15-17, 2023

Abstract
Deep Learning models are tremendously valuable in several prediction tasks, and their use in the medical field is spreading rapidly, especially in computer vision tasks that evaluate the content of X-rays, CTs or MRIs. These methods can save a significant amount of time for doctors in patient diagnostics and help in treatment planning. However, these models are highly sensitive to confounders in the training data and generally suffer a performance hit when dealing with out-of-distribution data, affecting their reliability and scalability in different medical institutions. Deep Learning research on medical datasets may overlook essential details regarding the image acquisition procedure and the preprocessing steps. This work proposes a data-centric approach, exploring the potential of attention maps as a regularisation technique to improve robustness and generalisation. We use image metadata and explore self-attention maps and contrastive learning to promote feature space invariance to image disturbance. Experiments were conducted using Chest X-ray datasets that are publicly available. Some datasets contained information about the windowing settings applied by the radiologist, acting as a source of variability. The proposed model was tested and outperformed the baseline in out-of-distribution data, serving as a proof of concept. © 2023 IEEE.

2023

Towards Concept-based Interpretability of Skin Lesion Diagnosis using Vision-Language Models

Authors
Patrício, C; Teixeira, LF; Neves, JC;

Publication
CoRR

Abstract

2023

Benchmarking edge computing devices for grape bunches and trunks detection using accelerated object detection single shot multibox deep learning models

Authors
Magalhaes, SC; dos Santos, FN; Machado, P; Moreira, AP; Dias, J;

Publication
ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE

Abstract
Purpose: Visual perception enables robots to perceive the environment. Visual data is processed using computer vision algorithms that are usually time-expensive and require powerful devices to process the visual data in real-time, which is unfeasible for open-field robots with limited energy. This work benchmarks the performance of different heterogeneous platforms for object detection in real-time. This research benchmarks three architectures: embedded GPU-Graphical Processing Units (such as NVIDIA Jetson Nano 2 GB and 4 GB, and NVIDIA Jetson TX2), TPU-Tensor Processing Unit (such as Coral Dev Board TPU), and DPU-Deep Learning Processor Unit (such as in AMD-Xilinx ZCU104 Development Board, and AMD-Xilinx Kria KV260 Starter Kit). Methods: The authors used the RetinaNet ResNet-50 fine-tuned using the natural VineSet dataset. Afterwards, the trained model was converted and compiled for target-specific hardware formats to improve the execution efficiency. Conclusions and Results: The platforms were assessed in terms of performance of the evaluation metrics and efficiency (time of inference). Graphical Processing Units (GPUs) were the slowest devices, running at 3 FPS to 5 FPS, and Field Programmable Gate Arrays (FPGAs) were the fastest devices, running at 14 FPS to 25 FPS. The efficiency of the Tensor Processing Unit (TPU) is irrelevant and similar to NVIDIA Jetson TX2. TPU and GPU are the most power-efficient, consuming about 5 W. The performance differences, in the evaluation metrics, across devices are irrelevant and have an F1 of about 70 % and mean Average Precision (mAP) of about 60 %.

2023

Synthesizing Human Activity for Data Generation

Authors
Romero, A; Carvalho, P; Corte-Real, L; Pereira, A;

Publication
JOURNAL OF IMAGING

Abstract
The problem of gathering sufficiently representative data, such as those about human actions, shapes, and facial expressions, is costly and time-consuming and also requires training robust models. This has led to the creation of techniques such as transfer learning or data augmentation. However, these are often insufficient. To address this, we propose a semi-automated mechanism that allows the generation and editing of visual scenes with synthetic humans performing various actions, with features such as background modification and manual adjustments of the 3D avatars to allow users to create data with greater variability. We also propose an evaluation methodology for assessing the results obtained using our method, which is two-fold: (i) the usage of an action classifier on the output data resulting from the mechanism and (ii) the generation of masks of the avatars and the actors to compare them through segmentation. The avatars were robust to occlusion, and their actions were recognizable and accurate to their respective input actors. The results also showed that even though the action classifier concentrates on the pose and movement of the synthetic humans, it strongly depends on contextual information to precisely recognize the actions. Generating the avatars for complex activities also proved problematic for action recognition and the clean and precise formation of the masks.

2023

Special Issue on Novel Applications of Artificial Intelligence in Medicine and Health

Authors
Pereira, T; Cunha, A; Oliveira, HP;

Publication
APPLIED SCIENCES-BASEL

Abstract
Artificial Intelligence (AI) is one of the big hopes for the future of a positive revolution in the use of medical data to improve clinical routine and personalized medicine [...]
