Publicacoes - INESC TEC

Publicações

Publicações por Jaime Cardoso

2021

FocusFace: Multi-task Contrastive Learning for Masked Face Recognition

Autores
Neto, PC; Boutros, F; Pinto, JR; Damer, N; Sequeira, AF; Cardoso, JS;

Publicação
2021 16TH IEEE INTERNATIONAL CONFERENCE ON AUTOMATIC FACE AND GESTURE RECOGNITION (FG 2021)

Abstract
SARS-CoV-2 has presented direct and indirect challenges to the scientific community. One of the most prominent indirect challenges advents from the mandatory use of face masks in a large number of countries. Face recognition methods struggle to perform identity verification with similar accuracy on masked and unmasked individuals. It has been shown that the performance of these methods drops considerably in the presence of face masks, especially if the reference image is unmasked. We propose FocusFace, a multi-task architecture that uses contrastive learning to be able to accurately perform masked face recognition. The proposed architecture is designed to be trained from scratch or to work on top of state-of-the-art face recognition methods without sacrificing the capabilities of a existing models in conventional face recognition tasks. We also explore different approaches to design the contrastive learning module. Results are presented in terms of masked-masked (MM) and unmasked-masked (U-M) face verification performance. For both settings, the results are on par with published methods, but for M-M specifically, the proposed method was able to outperform all the solutions that it was compared to. We further show that when using our method on top of already existing methods the training computational costs decrease significantly while retaining similar performances. The implementation and the trained models are available at GitHub.

FecharLer Abstract

2022

Lesion Volume Quantification Using Two Convolutional Neural Networks in MRIs of Multiple Sclerosis Patients

Autores
de Oliveira, M; Piacenti Silva, M; da Rocha, FCG; Santos, JM; Cardoso, JD; Lisboa, PN;

Publicação
DIAGNOSTICS

Abstract
Background: Multiple sclerosis (MS) is a neurologic disease of the central nervous system which affects almost three million people worldwide. MS is characterized by a demyelination process that leads to brain lesions, allowing these affected areas to be visualized with magnetic resonance imaging (MRI). Deep learning techniques, especially computational algorithms based on convolutional neural networks (CNNs), have become a frequently used algorithm that performs feature self-learning and enables segmentation of structures in the image useful for quantitative analysis of MRIs, including quantitative analysis of MS. To obtain quantitative information about lesion volume, it is important to perform proper image preprocessing and accurate segmentation. Therefore, we propose a method for volumetric quantification of lesions on MRIs of MS patients using automatic segmentation of the brain and lesions by two CNNs. Methods: We used CNNs at two different moments: the first to perform brain extraction, and the second for lesion segmentation. This study includes four independent MRI datasets: one for training the brain segmentation models, two for training the lesion segmentation model, and one for testing. Results: The proposed brain detection architecture using binary cross-entropy as the loss function achieved a 0.9786 Dice coefficient, 0.9969 accuracy, 0.9851 precision, 0.9851 sensitivity, and 0.9985 specificity. In the second proposed framework for brain lesion segmentation, we obtained a 0.8893 Dice coefficient, 0.9996 accuracy, 0.9376 precision, 0.8609 sensitivity, and 0.9999 specificity. After quantifying the lesion volume of all patients from the test group using our proposed method, we obtained a mean value of 17,582 mm(3). Conclusions: We concluded that the proposed algorithm achieved accurate lesion detection and segmentation with reproducibility corresponding to state-of-the-art software tools and manual segmentation. We believe that this quantification method can add value to treatment monitoring and routine clinical evaluation of MS patients.

FecharLer Abstract

2022

Tackling unsupervised multi-source domain adaptation with optimism and consistency

Autores
Pernes, D; Cardoso, JS;

Publicação
EXPERT SYSTEMS WITH APPLICATIONS

Abstract
It has been known for a while that the problem of multi-source domain adaptation can be regarded as a single source domain adaptation task where the source domain corresponds to a mixture of the original source domains. Nonetheless, how to adjust the mixture distribution weights remains an open question. Moreover, most existing work on this topic focuses only on minimizing the error on the source domains and achieving domain-invariant representations, which is insufficient to ensure low error on the target domain. In this work, we present a novel framework that addresses both problems and beats the current state of the art by using a mildly optimistic objective function and consistency regularization on the target samples.

FecharLer Abstract

2022

Streamlining Action Recognition in Autonomous Shared Vehicles with an Audiovisual Cascade Strategy

Autores
Pinto, JR; Carvalho, P; Pinto, C; Sousa, A; Capozzi, L; Cardoso, JS;

Publicação
PROCEEDINGS OF THE 17TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS (VISAPP), VOL 5

Abstract
With the advent of self-driving cars, and big companies such as Waymo or Bosch pushing forward into fully driverless transportation services, the in-vehicle behaviour of passengers must be monitored to ensure safety and comfort. The use of audio-visual information is attractive by its spatio-temporal richness as well as non-invasive nature, but faces tile likely constraints posed by available hardware and energy consumption. Hence new strategies are required to improve the usage of these scarce resources. We propose the processing of audio and visual data in a cascade pipeline for in-vehicle action recognition. The data is processed by modality-specific sub-modules. with subsequent ones being used when a confident classification is not reached. Experiments show an interesting accuracy-acceleration trade-off when compared with a parallel pipeline with late fusion, presenting potential for industrial applications on embedded devices.

FecharLer Abstract

2021

Hidden Markov models on a self-organizing map for anomaly detection in 802.11 wireless networks

Autores
Allahdadi, A; Pernes, D; Cardoso, JS; Morla, R;

Publicação
NEURAL COMPUTING & APPLICATIONS

Abstract
The present work introduces a hybrid integration of the self-organizing map and the hidden Markov model (HMM) for anomaly detection in 802.11 wireless networks. The self-organizing hidden Markov model map (SOHMMM) deals with the spatial connections of HMMs, along with the inherent temporal dependencies of data sequences. In essence, an HMM is associated with each neuron of the SOHMMM lattice. In this paper, the SOHMMM algorithm is employed for anomaly detection in 802.11 wireless access point usage data. Furthermore, we extend the SOHMMM online gradient descent unsupervised learning algorithm for multivariate Gaussian emissions. The experimental analysis uses two types of data: synthetic data to investigate the accuracy and convergence of the SOHMMM algorithm and wireless simulation data to verify the significance and efficiency of the algorithm in anomaly detection. The sensitivity and specificity of the SOHMMM algorithm in anomaly detection are compared to two other approaches, namely HMM initialized with universal background model (HMM-UBM) and SOHMMM with zero neighborhood (Z-SOHMMM). The results from the wireless simulation experiments show that SOHMMM outperformed the aforementioned approaches in all the presented anomalous scenarios.

FecharLer Abstract

2021

An exploratory study of interpretability for face presentation attack detection

Autores
Sequeira, AF; Goncalves, T; Silva, W; Pinto, JR; Cardoso, JS;

Publicação
IET BIOMETRICS

Abstract
Biometric recognition and presentation attack detection (PAD) methods strongly rely on deep learning algorithms. Though often more accurate, these models operate as complex black boxes. Interpretability tools are now being used to delve deeper into the operation of these methods, which is why this work advocates their integration in the PAD scenario. Building upon previous work, a face PAD model based on convolutional neural networks was implemented and evaluated both through traditional PAD metrics and with interpretability tools. An evaluation on the stability of the explanations obtained from testing models with attacks known and unknown in the learning step is made. To overcome the limitations of direct comparison, a suitable representation of the explanations is constructed to quantify how much two explanations differ from each other. From the point of view of interpretability, the results obtained in intra and inter class comparisons led to the conclusion that the presence of more attacks during training has a positive effect in the generalisation and robustness of the models. This is an exploratory study that confirms the urge to establish new approaches in biometrics that incorporate interpretability tools. Moreover, there is a need for methodologies to assess and compare the quality of explanations.

FecharLer Abstract