2021
Authors
Sequeira, AF; Goncalves, T; Silva, W; Pinto, JR; Cardoso, JS;
Publication
IET BIOMETRICS
Abstract
Biometric recognition and presentation attack detection (PAD) methods strongly rely on deep learning algorithms. Though often more accurate, these models operate as complex black boxes. Interpretability tools are now being used to delve deeper into the operation of these methods, which is why this work advocates their integration in the PAD scenario. Building upon previous work, a face PAD model based on convolutional neural networks was implemented and evaluated both through traditional PAD metrics and with interpretability tools. The stability of the explanations obtained when testing models against attacks that were known or unknown during training is also evaluated. To overcome the limitations of direct comparison, a suitable representation of the explanations is constructed to quantify how much two explanations differ from each other. From the point of view of interpretability, the results obtained in intra- and inter-class comparisons led to the conclusion that the presence of more attacks during training has a positive effect on the generalisation and robustness of the models. This exploratory study confirms the urgency of establishing new approaches in biometrics that incorporate interpretability tools. Moreover, there is a need for methodologies to assess and compare the quality of explanations.
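As a rough illustration of what quantifying the difference between two explanations can look like, the sketch below maps two saliency maps into a common normalised representation and scores their dissimilarity with a cosine-based distance; the normalisation and scoring choices are assumptions made for illustration, not the representation proposed in the paper.

```python
import numpy as np

def explanation_distance(expl_a: np.ndarray, expl_b: np.ndarray) -> float:
    """Toy dissimilarity between two saliency maps of equal shape.

    Each map is min-max normalised and flattened; the score is one minus the
    cosine similarity, so 0 means identical relevance patterns and values
    near 1 mean very different ones. Illustrative only, not the paper's metric.
    """
    def normalise(m: np.ndarray) -> np.ndarray:
        m = m.astype(np.float64)
        rng = m.max() - m.min()
        return (m - m.min()) / rng if rng > 0 else np.zeros_like(m)

    a = normalise(expl_a).ravel()
    b = normalise(expl_b).ravel()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0:
        return 0.0 if np.allclose(a, b) else 1.0
    return float(1.0 - np.dot(a, b) / denom)

# Example: intra-class comparison of two (random, placeholder) explanations.
rng = np.random.default_rng(0)
print(explanation_distance(rng.random((224, 224)), rng.random((224, 224))))
```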
2022
Authors
Neto, PC; Sequeira, AF; Cardoso, JS;
Publication
2022 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION WORKSHOPS (WACVW 2022)
Abstract
Presentation attacks are recurrent threats to biometric systems, in which impostors attempt to bypass them. Humans often use background information as contextual cues for their visual system. Yet, regarding face-based systems, the background is often discarded, since face presentation attack detection (PAD) models are mostly trained with face crops. This work presents a comparative study of face PAD models (including multi-task learning, adversarial training and dynamic frame selection) in two settings: with and without crops. The results show that the performance is consistently better when the background is present in the images. The proposed multi-task methodology beats the state-of-the-art results on the ROSE-Youtu dataset by a large margin, with an equal error rate of 0.2%. Furthermore, we analyze the models' predictions with Grad-CAM++ with the aim of investigating to what extent the models focus on background elements that are known to be useful for human inspection. From this analysis, we conclude that background cues are not relevant across all attacks, showing the capability of the model to leverage background information only when necessary.
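For readers unfamiliar with this kind of analysis, the minimal sketch below shows how a Grad-CAM++ heatmap can be produced for an un-cropped input using the third-party pytorch-grad-cam package; the torchvision ResNet-18 stand-in, the two-class head, and the random input are assumptions, not the models or data used in the paper.

```python
import numpy as np
import torch
from torchvision.models import resnet18
from pytorch_grad_cam import GradCAMPlusPlus
from pytorch_grad_cam.utils.model_targets import ClassifierOutputTarget
from pytorch_grad_cam.utils.image import show_cam_on_image

model = resnet18(weights=None)                        # placeholder for a trained PAD model
model.fc = torch.nn.Linear(model.fc.in_features, 2)   # bona fide vs. attack head (assumed)
model.eval()

# One normalised input frame, kept un-cropped so the background stays visible.
input_tensor = torch.randn(1, 3, 224, 224)
rgb_img = np.random.rand(224, 224, 3).astype(np.float32)  # original image in [0, 1]

# Compute the Grad-CAM++ heatmap for the "attack" class and overlay it.
cam = GradCAMPlusPlus(model=model, target_layers=[model.layer4[-1]])
heatmap = cam(input_tensor=input_tensor,
              targets=[ClassifierOutputTarget(1)])[0]  # class 1 = "attack" (assumed)
overlay = show_cam_on_image(rgb_img, heatmap, use_rgb=True)
```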
2022
Authors
Gouveia, PF; Oliveira, HP; Monteiro, JP; Teixeira, JF; Silva, NL; Pinto, D; Mavioso, C; Anacleto, J; Martinho, M; Duarte, I; Cardoso, JS; Cardoso, F; Cardoso, MJ;
Publication
EUROPEAN SURGICAL RESEARCH
Abstract
Introduction: Breast volume estimation is considered crucial for breast cancer surgery planning, yet a single, easy, and reproducible method to estimate breast volume is not available. This study aims to evaluate, in patients proposed for mastectomy, the accuracy of breast volume calculation from a low-cost 3D surface scan (Microsoft Kinect) compared to breast MRI and the water displacement technique. Material and Methods: Patients with a Tis/T1-T3 breast cancer proposed for mastectomy between July 2015 and March 2017 were assessed for inclusion in the study. Breast volume calculations were performed using the 3D surface scan, breast MRI, and the water displacement technique. Agreement between the volumes obtained with the different methods was assessed with the Spearman and Pearson correlation coefficients. Results: Eighteen patients with invasive breast cancer were included in the study and submitted to mastectomy. The level of agreement of the 3D breast volume with the mastectomy specimen and breast MRI volumes was evaluated. For the mastectomy specimen volume, an average (standard deviation) of 0.823 (0.027) and 0.875 (0.026) was obtained for the Pearson and Spearman correlations, respectively. With respect to the MRI annotation, we obtained 0.828 (0.038) and 0.715 (0.018). Discussion: Although the values obtained by the methodologies still differ, the strong linear correlation suggests that 3D breast volume measurement using a low-cost surface scan device is feasible and can approximate both the MRI breast volume and the mastectomy specimen volume with sufficient accuracy. Conclusion: 3D breast volume measurement using a depth-sensor low-cost surface scan device is feasible and can parallel MRI breast and mastectomy specimen volumes with sufficient accuracy. The remaining differences between methods require further development before clinical applicability is reached. A possible approach could be the fusion of breast MRI and the 3D surface scan to harmonize anatomic limits and improve volume delimitation.
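The agreement analysis reduces to computing Pearson and Spearman correlation coefficients between paired volume estimates; a minimal sketch with SciPy and entirely made-up example volumes (not the study's data) is shown below.

```python
# Illustrative only: agreement between two sets of paired volume estimates (mL).
from scipy.stats import pearsonr, spearmanr

scan_3d_volume = [450.0, 520.0, 610.0, 380.0, 700.0]    # hypothetical 3D-scan estimates
specimen_volume = [470.0, 515.0, 640.0, 360.0, 725.0]   # hypothetical mastectomy specimens

pearson_r, _ = pearsonr(scan_3d_volume, specimen_volume)
spearman_rho, _ = spearmanr(scan_3d_volume, specimen_volume)
print(f"Pearson r = {pearson_r:.3f}, Spearman rho = {spearman_rho:.3f}")
```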
2022
Authors
Albuquerque, T; Cruz, R; Cardoso, JS;
Publication
MATHEMATICS
Abstract
Ordinal classification tasks are present in a large number of different domains. However, common losses for deep neural networks, such as cross-entropy, do not properly weight the relative ordering between classes. For that reason, many losses have been proposed in the literature which model the output probabilities as following a unimodal distribution. This manuscript reviews many of these losses on three different datasets and suggests a potential improvement that focuses the unimodal constraint on the neighborhood around the true class, allowing for a more flexible distribution, aptly called quasi-unimodal loss. For this purpose, two constraints are proposed: the first concerns the relative order of the top-three probabilities, and the second ensures that the remaining output probabilities are not higher than the top three. Therefore, gradient descent focuses on improving the decision boundary around the true class, to the detriment of the more distant classes. The proposed loss is found to be competitive in several cases.
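A possible reading of these two constraints is sketched below as a PyTorch loss that adds hinge-style penalties to the cross-entropy; the penalty form and the weighting factor lam are illustrative assumptions, not the exact formulation of the quasi-unimodal loss.

```python
import torch
import torch.nn.functional as F

def quasi_unimodal_loss(logits: torch.Tensor, target: torch.Tensor,
                        lam: float = 1.0) -> torch.Tensor:
    """logits: (batch, K) ordinal scores; target: (batch,) true class indices."""
    probs = F.softmax(logits, dim=1)
    batch, K = probs.shape
    ce = F.cross_entropy(logits, target)

    idx = torch.arange(K, device=logits.device).unsqueeze(0)  # (1, K)
    t = target.unsqueeze(1)                                   # (batch, 1)
    p_true = probs.gather(1, t)                                # (batch, 1)

    # Constraint 1 (assumed form): immediate neighbours of the true class
    # should not have a higher probability than the true class itself.
    neigh = (idx - t).abs() == 1
    penalty1 = F.relu(probs - p_true)[neigh].sum() / batch

    # Constraint 2 (assumed form): classes outside the top-three neighbourhood
    # should not exceed the smallest probability inside it.
    inside = (idx - t).abs() <= 1
    min_inside = probs.masked_fill(~inside, 1.0).min(dim=1, keepdim=True).values
    penalty2 = F.relu(probs - min_inside)[~inside].sum() / batch

    return ce + lam * (penalty1 + penalty2)

# Example usage:
# logits = torch.randn(8, 5, requires_grad=True); target = torch.randint(0, 5, (8,))
# quasi_unimodal_loss(logits, target).backward()
```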
2022
Authors
Montenegro, H; Silva, W; Gaudio, A; Fredrikson, M; Smailagic, A; Cardoso, JS;
Publication
IEEE ACCESS
Abstract
Deep Learning achieves state-of-the-art results in many domains, yet its black-box nature limits its application to real-world contexts. An intuitive way to improve the interpretability of Deep Learning models is by explaining their decisions with similar cases. However, case-based explanations cannot be used in contexts where the data exposes personal identity, as they may compromise the privacy of individuals. In this work, we identify the main limitations and challenges in the anonymization of case-based explanations of image data through a survey on case-based interpretability and image anonymization methods. We empirically analyze the anonymization methods with regard to their capacity to remove personally identifiable information while preserving relevant semantic properties of the data. Through this analysis, we conclude that most privacy-preserving methods are not good enough to be applied to case-based explanations. To promote research on this topic, we formalize the privacy protection of visual case-based explanations as a multi-objective problem that preserves privacy, intelligibility, and relevant explanatory evidence regarding a predictive task. We empirically verify the potential of interpretability saliency maps as qualitative evaluation tools for anonymization. Finally, we identify and propose new lines of research to guide future work in the generation of privacy-preserving case-based explanations.
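One way to write such a multi-objective formulation, with notation that is purely illustrative rather than the paper's, is:

```latex
% Illustrative notation only: G is the anonymisation mapping, x the original
% case-based explanation, and f the predictive model being explained.
\[
\min_{G}\;\Big(
  \underbrace{\mathcal{L}_{\mathrm{priv}}\big(G(x),\,x\big)}_{\text{identity removal}},\;
  \underbrace{\mathcal{L}_{\mathrm{int}}\big(G(x)\big)}_{\text{intelligibility}},\;
  \underbrace{\mathcal{L}_{\mathrm{exp}}\big(G(x),\,x;\,f\big)}_{\text{explanatory evidence}}
\Big)
\]
```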
2022
Authors
Rio-Torto, I; Cardoso, JS; Teixeira, LF;
Publication
PATTERN RECOGNITION AND IMAGE ANALYSIS (IBPRIA 2022)
Abstract
The growing importance of the Explainable Artificial Intelligence (XAI) field has led to the proposal of several methods for producing visual heatmaps of the classification decisions of deep learning models. However, visual explanations are not sufficient because different end-users have different backgrounds and preferences. Natural language explanations (NLEs) are inherently understandable by humans and, thus, can complement visual explanations. Therefore, we introduce a novel architecture based on multimodal Transformers to enable the generation of NLEs for image classification tasks. Contrary to the current literature, which models NLE generation as a supervised image captioning problem, we propose to learn to generate these textual explanations without their direct supervision, by starting from image captions and evolving to classification-relevant text. Preliminary experiments on a novel dataset where there is a clear demarcation between captions and NLEs show the potential of the approach and shed light on how it can be improved.