Publications - INESC TEC

Publications by Jaime Cardoso

2021

Privacy-Preserving Generative Adversarial Network for Case-Based Explainability in Medical Image Analysis

Authors
Montenegro, H; Silva, W; Cardoso, JS;

Publication
IEEE ACCESS

Abstract
Although Deep Learning models have achieved incredible results in medical image classification tasks, their lack of interpretability hinders their deployment in the clinical context. Case-based interpretability provides intuitive explanations, as it is a much more human-like approach than saliency-map-based interpretability. Nonetheless, since one is dealing with sensitive visual data, there is a high risk of exposing personal identity, threatening the individuals' privacy. In this work, we propose a privacy-preserving generative adversarial network for the privatization of case-based explanations. We address the weaknesses of current privacy-preserving methods for visual data from three perspectives: realism, privacy, and explanatory value. We also introduce a counterfactual module in our Generative Adversarial Network that provides counterfactual case-based explanations in addition to standard factual explanations. Experiments were performed on a biometric dataset and a medical dataset, demonstrating the network's potential to preserve the privacy of all subjects and keep its explanatory evidence while also maintaining a decent level of intelligibility.


2020

Automotive Interior Sensing - Towards a Synergetic Approach between Anomaly Detection and Action Recognition Strategies

Authors
Augusto, P; Cardoso, JS; Fonseca, J;

Publication
4th IEEE International Conference on Image Processing, Applications and Systems, IPAS 2020, Virtual Event, Italy, December 9-11, 2020

Abstract
With the appearance of Shared Autonomous Vehicles, there will no longer be a driver responsible for maintaining the car interior and the well-being of passengers. To counter this, it is imperative to have a system that is able to detect any abnormal behaviors, more specifically, violence between passengers. Traditional action recognition algorithms build models around known interactions, but activities can be so diverse that having a dataset that incorporates most use cases is unattainable. While action recognition models are normally trained on all the defined activities and directly output a score that classifies the likelihood of violence, video anomaly detection algorithms present themselves as an alternative approach to build a good discriminative model, since usually only non-violent examples are needed. This work focuses on anomaly detection and action recognition algorithms trained, validated and tested on a subset of human behavior video sequences from Bosch's internal datasets. The anomaly detection network learns how to properly reconstruct normal frame sequences so that, during testing, each sequence can be classified as normal or abnormal based on its reconstruction error. From these errors, regularity scores are inferred, showing the predicted regularity of each frame. The resulting framework is a viable addition to traditional action recognition algorithms, since it can work as a tool for detecting unknown actions and strange or violent behaviors, and can aid in understanding the meaning of such human interactions.
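
The step from reconstruction errors to regularity scores described in the abstract above could be sketched roughly as follows. This is only an illustration: the abstract does not give the formula, so the min-max normalization, the function names `regularity_scores` and `flag_abnormal`, the threshold of 0.5, and the sample error values are all assumptions rather than the authors' method.

```python
def regularity_scores(errors):
    """Map per-frame reconstruction errors to scores in [0, 1].

    Assumed min-max normalization: the frame with the lowest error gets
    score 1.0 (most regular) and the frame with the highest error gets 0.0.
    """
    lo, hi = min(errors), max(errors)
    if hi == lo:
        # All frames reconstruct equally well; treat them all as regular.
        return [1.0 for _ in errors]
    return [1.0 - (e - lo) / (hi - lo) for e in errors]


def flag_abnormal(scores, threshold=0.5):
    """Mark frames whose regularity falls below a hypothetical threshold."""
    return [s < threshold for s in scores]


# Made-up per-frame reconstruction errors for a six-frame sequence.
errors = [1.0, 2.0, 1.5, 7.0, 9.0, 1.0]
scores = regularity_scores(errors)
flags = flag_abnormal(scores)
```

In practice the threshold would have to be chosen on validation data, and the errors would come from the trained reconstruction network rather than from a hand-written list.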


2021

Optimizing Person Re-Identification Using Generated Attention Masks

Authors
Capozzi, L; Pinto, JR; Cardoso, JS; Rebelo, A;

Publication
Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications - 25th Iberoamerican Congress, CIARP 2021, Porto, Portugal, May 10-13, 2021, Revised Selected Papers

Abstract
The task of person re-identification has important applications in security and surveillance systems. It is a challenging problem since there can be many differences between pictures belonging to the same person, such as lighting, camera position, variation in poses and occlusions. The use of Deep Learning has contributed greatly towards more effective and accurate systems. Many works use attention mechanisms to force the models to focus on less distinctive areas, in order to improve performance in situations where important information may be missing. This paper proposes a new, more flexible method for calculating these masks, using a U-Net which receives a picture and outputs a mask representing the most distinctive areas of the picture. Results show that the method achieves an accuracy comparable or superior to that of state-of-the-art methods.


2021

A Study on Annotation Efficient Learning Methods for Segmentation in Prostate Histopathological Images

Authors
Costa, P; Campilho, A; Cardoso, JS;

Publication
Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications - 25th Iberoamerican Congress, CIARP 2021, Porto, Portugal, May 10-13, 2021, Revised Selected Papers

Abstract
Cancer is a leading cause of death worldwide. The detection and diagnosis of most cancers are confirmed by a tissue biopsy that is analyzed via the optical microscope. These samples are then scanned to giga-pixel sized images for further digital processing by pathologists. An automated method to segment the malignant regions of these images could be of great interest to detect cancer earlier and increase the agreement between specialists. However, annotating these giga-pixel images is very expensive, time-consuming and error-prone. We evaluate four existing annotation-efficient methods, including transfer learning and self-supervised learning approaches. The best performing approach was to pretrain a model to colourize a grayscale histopathological image and then finetune that model on a dataset with manually annotated examples. This method was able to improve the Intersection over Union from 0.2702 to 0.3702.
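
For readers unfamiliar with the Intersection over Union metric reported above, a minimal sketch for binary segmentation masks is given here. The function name `intersection_over_union`, the nested-list mask representation, the convention of returning 1.0 when both masks are empty, and the toy masks are assumptions for illustration; the abstract does not state how the authors computed or averaged the metric.

```python
def intersection_over_union(pred, target):
    """Compute IoU between two binary masks given as equally sized nested lists.

    IoU = |pred AND target| / |pred OR target|.
    Assumed convention: two empty masks count as a perfect match (1.0).
    """
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    return intersection / union if union else 1.0


# Toy 2x3 masks: 1 marks a pixel labelled as malignant.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
iou = intersection_over_union(pred, target)  # 2 shared pixels out of 4 in the union
```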


2021

Deep Ordinal Focus Assessment for Whole Slide Images

Authors
Albuquerque, T; Moreira, A; Cardoso, JS;

Publication
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2021)

Abstract
Medical image quality assessment plays an important role not only in the design and manufacturing processes of image acquisition but also in the optimization of decision support systems. This work introduces a new deep ordinal learning approach for focus assessment in whole slide images. From the blurred image to the focused image there is an ordinal progression that contains relevant knowledge for more robust learning of the models. With this new method, it is possible to infer quality without losing ordinal information about focus, since ordinal losses were used for training instead of the nominal cross-entropy loss. Our proposed model is compared with other state-of-the-art methods present in the literature. A first conclusion is that data-driven methods are beneficial compared with knowledge-based methods. Additionally, the proposed model is found to be the top performer in several metrics. The best performing model scores an accuracy of 94.4% for a 12-class classification problem in the FocusPath database.
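
The difference between a nominal and an ordinal loss can be illustrated with a small sketch. The abstract does not say which ordinal losses were used, so the cumulative binary encoding below is a swapped-in, commonly used technique and not necessarily the authors' choice; the names `ordinal_targets`, `ordinal_bce` and `decode`, the four-class setup, and the probability values are hypothetical.

```python
import math


def ordinal_targets(label, num_classes):
    """Encode class `label` as num_classes - 1 binary targets: is label > k?"""
    return [1.0 if label > k else 0.0 for k in range(num_classes - 1)]


def ordinal_bce(probs, label):
    """Sum of binary cross-entropies over the cumulative 'label > k' outputs."""
    targets = ordinal_targets(label, len(probs) + 1)
    eps = 1e-12  # guards against log(0)
    return -sum(
        t * math.log(p + eps) + (1.0 - t) * math.log(1.0 - p + eps)
        for p, t in zip(probs, targets)
    )


def decode(probs, threshold=0.5):
    """Predicted class = number of 'label > k' outputs above the threshold."""
    return sum(p > threshold for p in probs)


# Hypothetical model outputs for a 4-class focus scale (0 = sharpest).
probs = [0.9, 0.8, 0.2]
predicted = decode(probs)            # decodes to class 2
near_miss = ordinal_bce(probs, 3)    # true class is one step away
far_miss = ordinal_bce(probs, 0)     # true class is two steps away
```

With this encoding, a prediction that is one focus level off is penalized less than one that is several levels off, which is the kind of ordinal information a plain cross-entropy over unordered classes ignores.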


2021

End-to-End Deep Sketch-to-Photo Matching Enforcing Realistic Photo Generation

Authors
Capozzi, L; Pinto, JR; Cardoso, JS; Rebelo, A;

Publication
Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications - 25th Iberoamerican Congress, CIARP 2021, Porto, Portugal, May 10-13, 2021, Revised Selected Papers

Abstract
Locating suspects using forensic sketches posted in public spaces, news, and social media can be a difficult task. Recent methods that use computer vision to improve this process present limitations, as they either do not use end-to-end networks for sketch recognition in police databases (which generally improve performance) and/or do not offer a photo-realistic representation of the sketch that could be used as an alternative if the automatic matching process fails. This paper proposes a method that combines these two properties, using a conditional generative adversarial network (cGAN) and a pre-trained face recognition network that are jointly optimised as an end-to-end model. While the model can identify a short list of potential suspects in a given database, the cGAN offers an intermediate realistic face representation to support an alternative manual matching process. Evaluation on sketch-photo pairs from the CUFS, CUFSF and CelebA databases reveals that the proposed method outperforms the state-of-the-art in most tasks, and that forcing an intermediate photo-realistic representation only results in a small performance decrease.
